A SYSTEMATIC ANALYSIS OF XSS SANITIZATION IN WEB APPLICATION FRAMEWORKS Joel Weinberger, Prateek Saxena, Devdatta Akhawe, Matthew Finifter, Richard Shin, and Dawn Song University of California, Berkeley
Content Injection <div class="comment"> <iframe src="http://www.voteobama.com"></iframe> </div>
Web Frameworks • Systems to aid the development of web applications • Dynamically generated pages on the server • Templates for code reuse • Untrusted data dynamically inserted into programs • User responses, SQL data, third party code, etc.
Code in Web Frameworks <html> <p>hello, world</p> </html>
Code in Web Frameworks <html> <?php echo "<p>hello, world</p>"; ?> </html>
Code in Web Frameworks <html> <?php echo $USERDATA ?> </html> What happens if $USERDATA = <script>doEvil()</script>?
Code in Web Frameworks <html> <script>doEvil()</script> </html>
Sanitization The encoding or elimination of dangerous constructs in untrusted data.
Contributions • Build a detailed model of the browser to explain subtleties in data sanitization • Evaluate the effectiveness of auto sanitization in popular web frameworks • Evaluate the ability of frameworks to sanitize different contexts • Evaluate the tools of frameworks in relation to what web applications actually use and need
Sanitization Example • "<p>" + "<script>doEvil()</script>" + "</p>" (the middle string is untrusted)
Sanitization Example "<p>" + sanitizeHTML("<script>doEvil()</script>") + "</p>" Result: <p> doEvil() </p>
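A minimal sketch of an HTML-context sanitizer, assuming Python's standard library. The name sanitize_html is hypothetical and stands in for the slide's sanitizeHTML. Note that this version encodes the dangerous characters, whereas the slide shows the tags being eliminated; both fall under the definition of sanitization given earlier.

```python
import html


def sanitize_html(untrusted: str) -> str:
    """Encode characters that are special in HTML tag content."""
    # html.escape replaces &, < and >; with quote=True it also
    # replaces double and single quotes.
    return html.escape(untrusted, quote=True)


untrusted = "<script>doEvil()</script>"
page = "<p>" + sanitize_html(untrusted) + "</p>"
print(page)  # <p>&lt;script&gt;doEvil()&lt;/script&gt;</p>
```

The browser renders the encoded string as text, so the script never runs.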
Are we done? "<a href='" + sanitizeHTML("javascript:…") + "' />" HTML context sanitizer <a href='javascript:…'/> URI Context, not HTML
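The mismatch can be sketched as follows, assuming Python's standard library. HTML encoding leaves a javascript: URL untouched because it contains no HTML-special characters, so a URI-context sanitizer has to inspect the scheme. The function name sanitize_href, the allowlist of schemes, the "#" fallback, and the example.com URL are all illustrative assumptions.

```python
import html
from urllib.parse import urlsplit

# Assumption: only absolute http(s) links are acceptable here.
ALLOWED_SCHEMES = {"http", "https"}


def sanitize_href(untrusted: str) -> str:
    """Return a value that is safe inside href='...', or '#' if rejected."""
    cleaned = untrusted.strip()
    try:
        scheme = urlsplit(cleaned).scheme.lower()
    except ValueError:
        # urlsplit raises ValueError on some malformed URLs.
        return "#"
    if scheme not in ALLOWED_SCHEMES:
        return "#"
    # The value still sits in an HTML attribute, so encode it for that too.
    return html.escape(cleaned, quote=True)


# The HTML-context sanitizer alone does not change the dangerous URL.
assert html.escape("javascript:alert(1)") == "javascript:alert(1)"

print(sanitize_href("javascript:alert(1)"))           # #
print(sanitize_href("https://example.com/?a=1&b=2"))  # https://example.com/?a=1&amp;b=2
```

An allowlist is used rather than a blocklist so that any scheme the parser does not recognize as http or https is rejected by default.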
Now are we done? <div onclick='displayComment("SANITIZED_ATTRIBUTE")'> </div> What if SANITIZED_ATTRIBUTE = "); stealInfo(" ?
Now are we done? <div onclick='displayComment("SANITIZED_ATTRIBUTE")'> </div> <div onclick='displayComment(""); stealInfo("")'> </div>
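Here the untrusted value sits in two nested contexts: a JavaScript string inside an HTML attribute. One way to handle this, sketched below with Python's standard library, is to encode for the inner context first and the outer context second. The helper name js_string_in_attribute is hypothetical, and unlike the slide's template, the helper emits the surrounding JavaScript quotes itself.

```python
import html
import json


def js_string_in_attribute(untrusted: str) -> str:
    """Encode a value as a JS string literal placed inside an HTML attribute."""
    # Inner context: json.dumps yields a double-quoted literal with
    # quotes and backslashes escaped.
    js_literal = json.dumps(untrusted)
    # Outer context: encode the literal for the HTML attribute, so that
    # neither quote character can terminate the attribute.
    return html.escape(js_literal, quote=True)


payload = '"); stealInfo("'
attr = js_string_in_attribute(payload)
markup = "<div onclick='displayComment(" + attr + ")'></div>"
print(markup)
```

After the browser HTML-decodes the attribute, the JavaScript parser sees a single string argument, so the payload cannot close the call to displayComment.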
Browser Model OMG!!!
Framework and Application Evaluation • What support for auto sanitization do frameworks provide? • What support for context sensitivity do frameworks provide? • Does the support of frameworks match the requirements of web applications?
Using Auto Sanitization {% if header.sortable %} <a href="{{header.url}}"> {% endif %} Django doesn’t know how to auto sanitize this context!
Overriding Auto Sanitization {% if header.sortable %} <a href="{{header.url | escape}}"> {% endif %} Whoops! Wrong sanitizer.
Auto Sanitization Support (chart: No Auto Sanitization: 7; HTML Context Only: 4; Context-Aware Auto Sanitization: 3) • Examined 14 different frameworks • 7 have no auto sanitization support at all • 4 provide auto sanitization for HTML contexts only • 3 automatically determine the correct context and which sanitizer to apply • …although they may support only a limited number of contexts
Sanitization Context Support • HTML Tag Context: 14 • URI Attribute (excluding scheme): 14 • URI Attribute (including scheme): 4 • JS String: 4 • JS Number or Boolean: 1 • Style Attribute or Tag: 2 • Examined 14 different frameworks • Only 1 handled all of these contexts • Numbers indicate sanitizer support for a context regardless of auto sanitization support
Contexts Used By Web Applications • HTML Tag Context: 8/8 • URI Attribute (excluding scheme): 7/8 • URI Attribute (including scheme): 7/8 • JS String, Number, or Boolean: 6/8 • Style Attribute or Tag: 8/8 • Web applications (all in PHP): • RoundCube, Drupal, Joomla, WordPress, MediaWiki, phpBB3, OpenEMR, Moodle • Ranged from ~19k LOC to ~530k LOC
Further Complexity in Sanitization Policies wordpress/post_comment.php • Input: "<img src='…'></img>" • From a User: "" • From an Admin: "<img src='…'></img>"
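A role-dependent policy of this kind might look like the sketch below, assuming Python's standard library. This is an illustration, not WordPress's actual code: the names sanitize_comment and is_admin are hypothetical, and 'x' stands in for the elided src value.

```python
import html
from html.parser import HTMLParser


class _TextOnly(HTMLParser):
    """Collect only text content, dropping every tag."""

    def __init__(self) -> None:
        super().__init__(convert_charrefs=True)
        self.parts: list[str] = []

    def handle_data(self, data: str) -> None:
        self.parts.append(data)


def sanitize_comment(comment: str, is_admin: bool) -> str:
    # Assumed policy: admins are trusted to post raw markup.
    if is_admin:
        return comment
    # Other users: eliminate tags, then encode whatever text remains.
    parser = _TextOnly()
    parser.feed(comment)
    parser.close()
    return html.escape("".join(parser.parts))


comment = "<img src='x'></img>"
print(repr(sanitize_comment(comment, is_admin=False)))  # ''
print(repr(sanitize_comment(comment, is_admin=True)))   # "<img src='x'></img>"
```

The same input therefore yields different output depending on who submitted it, which is the kind of policy a single framework-wide sanitizer cannot express.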
Evaluation Summary • Auto sanitization alone is insufficient • Frameworks lack sufficient expressivity • Web applications already use more features than frameworks provide
Take Aways • Defining correct sanitization policies is hard • And it’s in the browser spec! • Frameworks can do more • More sanitizer contexts, better automation, etc. • Is sanitization the best form of policy going forward?