[security] Potential for XXE-type exploits when parsing untrusted documents #37

Open
GoogleCodeExporter opened this issue Feb 11, 2016 · 0 comments


What steps will reproduce the problem?

It may be possible to leak local files or make network requests when processing 
a malicious document, per 
https://www.owasp.org/index.php/XML_External_Entity_%28XXE%29_Processing

The default options to the lxml parser are not suitable for use on untrusted 
inputs, and pykml.parser does not expose them for reconfiguration.

See PoC/example below.

What is the expected output? What do you see instead?

parser.parse() should probably default to using an lxml.etree.XMLParser 
instance with resolve_entities=False (and maybe no_network=True) to avoid 
malicious entity expansion.

http://lxml.de/parsing.html#parser-options details the available options.
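
As a rough illustration of the parser configuration suggested above, here is a minimal sketch that uses lxml directly rather than pykml. The helper names make_hardened_parser and parse_untrusted are hypothetical and not part of pykml or lxml; the sketch assumes Python 3 and points the entity at a throwaway temp file instead of /etc/passwd.

```python
import os
import tempfile
from pathlib import Path

from lxml import etree


def make_hardened_parser():
    # Hypothetical helper: an lxml parser that leaves entity references
    # unexpanded and refuses network access while loading related files.
    return etree.XMLParser(resolve_entities=False, no_network=True)


def parse_untrusted(data):
    # Hypothetical helper: parse bytes with the hardened parser.
    return etree.fromstring(data, parser=make_hardened_parser())


# Stand-in for a sensitive local file, so the demo does not touch /etc/passwd.
SECRET = b"local-file-contents"
with tempfile.NamedTemporaryFile(delete=False, suffix=".txt") as fh:
    fh.write(SECRET)
    secret_path = fh.name

# Same shape as the OWASP sample document, aimed at the temp file.
payload = (
    '<?xml version="1.0" encoding="UTF-8"?>'
    '<!DOCTYPE foo [ <!ELEMENT foo ANY > '
    '<!ENTITY xxe SYSTEM "%s" >]>'
    '<foo>&xxe;</foo>' % Path(secret_path).as_uri()
).encode("utf-8")

root = parse_untrusted(payload)
serialized = etree.tostring(root)
os.unlink(secret_path)

# The file contents should not have been pulled into the document.
print(SECRET in serialized)
```

Whether pykml could adopt exactly these options without side effects (for example with schema validation enabled) would still need checking against its own parse/fromstring code.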

What version of the product are you using? On what operating system?

OSX 10.9.1
Python 2.7.6 (MacPorts)
pykml==0.1.0

Please provide any additional information below.

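# Simple PoC from OWASP sample document.
# Byte-string literals keep the encoding declaration acceptable to lxml
# on Python 3 as well as Python 2.
from lxml import etree
from pykml import parser
doc = parser.fromstring(b'<?xml version="1.0" encoding="UTF-8"?>'
                        b'<!DOCTYPE foo [ <!ELEMENT foo ANY > '
                        b'<!ENTITY xxe SYSTEM "file:///etc/passwd" >]>'
                        b'<foo>&xxe;</foo>')
# If the entity is resolved, the output would include the file's contents.
print(etree.tostring(doc, pretty_print=True))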


---

Mitigation:

I'm unaware of any KML documents that actually use entities, so just modifying 
the parse/fromstring functions to use a parser with entity resolution disabled 
(resolve_entities=False) may be sufficient. 

An alternative may be to add a new parse_safe() method, or to add optional 
kwargs to the existing methods so that users can supply their own Parser 
object or set options on it.
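
A minimal sketch of what the kwarg-based alternative might look like, written against lxml.etree only. The function name fromstring_safe and its parser argument are hypothetical and do not describe pykml's actual API; the sketch assumes Python 3.

```python
from lxml import etree


def fromstring_safe(text, parser=None):
    # Hypothetical wrapper: callers may pass their own parser; otherwise a
    # parser with entity resolution and network access disabled is used.
    if parser is None:
        parser = etree.XMLParser(resolve_entities=False, no_network=True)
    return etree.fromstring(text, parser=parser)


# Default behaviour: the hardened parser is created for the caller.
default_root = fromstring_safe(b"<kml><Document><name>x</name></Document></kml>")

# Caller-supplied parser: still hardened, but with an extra option set.
custom_parser = etree.XMLParser(resolve_entities=False, remove_comments=True)
custom_root = fromstring_safe(b"<kml><!-- note --><name>x</name></kml>",
                              parser=custom_parser)
print(default_root.tag, len(custom_root))
```

Defaulting to the restrictive parser keeps existing call sites safe, while the optional argument leaves room for callers that need different options.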

If any option other than the first is chosen, the docs should be updated with 
a prominent warning about the risks of handling untrusted input without 
precautions.

An example KML input that also passes kml22gx.xsd schema validation is attached.

Original issue reported on code.google.com by [email protected] on 8 Feb 2014 at 2:26

Attachments:
