[security] Potential for XXE-type exploits when parsing untrusted documents #37

GoogleCodeExporter · 2016-02-11T23:48:12Z

What steps will reproduce the problem?

It may be possible to leak local files or make network requests when processing 
a malicious document, per 
https://www.owasp.org/index.php/XML_External_Entity_%28XXE%29_Processing

The lxml parser's default options are not suitable for use on untrusted 
input, and pykml.parser does not expose them for reconfiguration.

See PoC/example below.

What is the expected output? What do you see instead?

parser.parse() should probably default to using an lxml.etree.XMLParser 
instance with resolve_entities=False (and maybe no_network=True) to avoid 
malicious entity expansion.

http://lxml.de/parsing.html#parser-options details the available options.
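
As a rough illustration of the suggested options, here is a minimal sketch that uses lxml directly rather than pykml. The names MALICIOUS and safe_parser are only illustrative; the document is the same OWASP-style sample used in the PoC further down.

```python
from lxml import etree

# Same shape as the OWASP sample: the entity points at a local file.
MALICIOUS = (
    b'<?xml version="1.0" encoding="UTF-8"?>'
    b'<!DOCTYPE foo [ <!ELEMENT foo ANY > '
    b'<!ENTITY xxe SYSTEM "file:///etc/passwd" >]>'
    b'<foo>&xxe;</foo>'
)

# resolve_entities=False leaves entity references unexpanded;
# no_network=True blocks network access while looking up external documents.
safe_parser = etree.XMLParser(resolve_entities=False, no_network=True)

root = etree.fromstring(MALICIOUS, parser=safe_parser)

# The reference is kept as an entity node instead of being replaced
# by the contents of the file it points at.
serialized = etree.tostring(root)
print(serialized)
```

With these options the serialized output should still contain the literal &xxe; reference rather than the file contents.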

What version of the product are you using? On what operating system?

OSX 10.9.1
Python 2.7.6 (MacPorts)
pykml==0.1.0

Please provide any additional information below.

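# Simple PoC from OWASP sample document.
from lxml import etree
from pykml import parser

# Byte strings: lxml rejects unicode input that carries an encoding declaration.
doc = parser.fromstring(b'<?xml version="1.0" encoding="UTF-8"?>'
                        b'<!DOCTYPE foo [ <!ELEMENT foo ANY > '
                        b'<!ENTITY xxe SYSTEM "file:///etc/passwd" >]>'
                        b'<foo>&xxe;</foo>')
# When entities are resolved, the output contains the contents of /etc/passwd.
print(etree.tostring(doc, pretty_print=True))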


---

Mitigation:

I'm unaware of any KML documents that actually use entities, so just modifying 
the parse/fromstring functions to use a parser with resolve_entities disabled 
may be sufficient. 

An alternative may be to add a new parse_safe() method, or to add optional 
kwargs to the existing methods so that users can provide their own Parser 
object or set options on it.
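
A minimal sketch of what such a helper might look like, written against lxml.objectify rather than pykml itself. The function name fromstring_safe and its parser keyword are hypothetical and not part of pykml's API; it assumes an objectify-based parser is an acceptable stand-in for whatever pykml uses internally.

```python
from lxml import objectify


def fromstring_safe(text, parser=None):
    """Hypothetical helper: parse XML with entity resolution disabled.

    Callers may pass their own parser object; otherwise a restrictive
    default is built. This is an illustration, not pykml's actual API.
    """
    if parser is None:
        # objectify.makeparser() accepts the same keyword options
        # as etree.XMLParser.
        parser = objectify.makeparser(resolve_entities=False,
                                      no_network=True)
    return objectify.fromstring(text, parser=parser)


# Usage with a small, benign KML fragment.
kml = (b'<kml xmlns="http://www.opengis.net/kml/2.2">'
       b'<Placemark><name>Example</name></Placemark></kml>')
root = fromstring_safe(kml)
print(root.Placemark.name.text)
```

Defaulting to the restrictive parser while still accepting a caller-supplied one keeps the safe behaviour as the default and leaves an escape hatch for anyone who really needs entities.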

If anything other than the first option is chosen, the docs should be updated 
with a prominent warning about the risks of handling untrusted input without 
precautions.

An example kml input that also passes kml22gx.xsd schema validation is attached.

Original issue reported on code.google.com by [email protected] on 8 Feb 2014 at 2:26

Attachments:

kmlxxe.kml


GoogleCodeExporter added Priority-Medium auto-migrated Type-Defect labels Feb 11, 2016
