Protection and Communication Abstractions for Web Browsers in MashupOS
Helen J. Wang, Xiaofeng Fan, Jon Howell (MSR), Collin Jackson (Stanford)
February 2008
… but most of all, Samy is my hero
Outline
• The problem
• The MashupOS project
• Protection
• Communication
• Implementation and demo
• Evaluation
• Related work
• Conclusions
Client Mashups
• Web content has evolved from single-principal services to multi-principal services, rivaling that of desktop PCs.
• A principal is a domain.
Browsers Remain Single-Principal Systems
[Figure: the page http://integrator.com/ embedding <iframe src="http://provider.com/p.html"></iframe>, with the cross-domain path marked X; and the page http://integrator.com/ including <script src="http://provider.com/p.js"></script>]
• The Same Origin Policy (SOP), an all-or-nothing trust model:
  – No cross-domain interactions allowed
  – (External) scripts run with the privilege of the enclosing page
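The all-or-nothing check behind the SOP can be sketched as an origin comparison. This is a minimal sketch, assuming the usual definition of an origin as the scheme, host, and port of a URL; the function name `sameOrigin` is hypothetical, and the URLs are the ones shown on the slide.

```javascript
// Sketch only: compare two URLs the way the Same Origin Policy does,
// assuming an origin is the (scheme, host, port) triple.
function sameOrigin(a, b) {
  const ua = new URL(a);
  const ub = new URL(b);
  // URL.origin serializes scheme, host, and port together.
  return ua.origin === ub.origin;
}

// The integrator's page and the provider's frame are different principals.
const frameAllowed = sameOrigin("http://integrator.com/", "http://provider.com/p.html");
// Two documents from the integrator are the same principal.
const selfAllowed = sameOrigin("http://integrator.com/", "http://integrator.com/a.html");
```

Note that a script included with <script src> never reaches this check: it simply runs as the enclosing page, which is the other half of the all-or-nothing model.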
Insufficiency of the SOP
• Sacrifices security for functionality when a page includes an external script without fully trusting it
• E.g., inline gadgets in the iGoogle and Live gadget aggregators
Insufficiency of the SOP, Cont.
• Third-party content sanitization is hard
  – Cross-site scripting (XSS):
    • Unchecked user input in a generated page
    • E.g., the Samy worm infected 1 million MySpace.com users in 20 hours
• Root cause:
  – The injected scripts run with the page’s privilege
[Figure: “Samy is my hero”]
Insufficiency of the SOP, Cont.
• Sacrifices functionality for security when a page denies scripts in third-party content
• E.g., MySpace.com disallows scripts in user profiles
The MashupOS Project
• Enable the browser to be a multi-principal OS
• Focus of this paper: protection and communication abstractions
• Protection:
  – Provide default isolation boundaries
• Communications:
  – Allow service-specific, fine-grained access control across isolation boundaries
Design Principles
• Match all common trust levels to balance ease of use and security
  – Goal: enable programmers to build robust services
  – Non-goal: make it impossible for programmers to shoot themselves in the foot
• Easy adoption and no unintended behaviors
Outline
✓ The problem
✓ The MashupOS project
• Protection
• Communication
• Implementation and demo
• Evaluation
• Related work
• Conclusions
A Principal’s Resources
• Memory:
  – Heap of script objects, including DOM objects that control the display
• Persistent state:
  – Cookies, etc.
• Remote data access:
  – XMLHttpRequest
Trust Relationship between Providers and Integrators
[Figure: the page http://i.com/ (HTML from i.com) embeds <iframe src="http://p.com/c.html"></iframe>; panel labels: i.com, p.com, Internet, XHR; blocked paths are marked X]
Content Semantics / Abstraction / Runas:
– No, No: Isolated / <Frame> / p.com
Trust Relationship between Providers and Integrators
[Figure: the page http://i.com/ includes <script src="http://p.com/c.js"></script>; panel labels: i.com, p.com, Internet, Script, XHR]
Content Semantics / Abstraction / Runas:
– No, No: Isolated / <Frame> / p.com
– Yes: Open / <Script> / i.com
Trust Relationship between Providers and Integrators
[Figure: the page http://i.com/; panel labels: i.com, p.com, Internet; one path is marked X]
Content Semantics / Abstraction / Runas:
– No, No: Isolated / <Frame> / p.com
– Yes: Open / <Script> / i.com
– No, Yes:
Trust Relationship between Providers and Integrators
[Figure: the page http://i.com/ embeds <sandbox src="http://p.com/c.html"></sandbox>; panel labels: i.com, p.com, Internet, Unauth, XHR; XHR paths are marked X]
Content Semantics / Abstraction / Runas:
– No, No: Isolated / <Frame> / p.com
– Yes: Open / <Script> / i.com
– No, Yes, No: Unauthorized / <Sandbox>, <OpenSandbox> / None
Unauthorized content is not authorized to access any principal’s resources.
Properties of Sandbox
• Asymmetric access
  – Access: reading/writing script global objects, function invocations, modifying/creating DOM elements inside the sandbox
• Invoking a sandbox’s function is done in the context of the sandbox
  – setuid("unauthorized") before invocation and setuid("enclosingPagePrincipal") upon exit
• The enclosing page cannot pass non-sandbox object references into the sandbox.
  – Programmers can put needed objects inside the sandbox
• Private vs. open sandboxes
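The invocation rule above can be illustrated with a toy model. This is a sketch under stated assumptions, not the MashupOS implementation: `currentPrincipal`, `ToySandbox`, `define`, and `invoke` are hypothetical names, and the principal switch stands in for the setuid steps on the slide.

```javascript
// Sketch only: a toy model of invoking a function inside a sandbox.
// All names here are hypothetical.
let currentPrincipal = "i.com"; // the enclosing page's principal

class ToySandbox {
  constructor() {
    this.scope = {}; // objects that live inside the sandbox
  }
  // The enclosing page may create objects inside the sandbox.
  define(name, value) {
    this.scope[name] = value;
  }
  // Call a sandbox function in the sandbox's context.
  invoke(name, ...args) {
    for (const arg of args) {
      const isReference =
        arg !== null && (typeof arg === "object" || typeof arg === "function");
      // Reject references to objects that are not already inside the sandbox.
      if (isReference && !Object.values(this.scope).includes(arg)) {
        throw new TypeError("cannot pass a non-sandbox object reference");
      }
    }
    const enclosing = currentPrincipal;
    currentPrincipal = "unauthorized"; // setuid("unauthorized") before invocation
    try {
      return this.scope[name](...args);
    } finally {
      currentPrincipal = enclosing; // back to the enclosing page's principal on exit
    }
  }
}

const sandbox = new ToySandbox();
sandbox.define("whoAmI", () => currentPrincipal);
const seenInside = sandbox.invoke("whoAmI");
const seenAfter = currentPrincipal;
```

The `finally` clause matters: the enclosing page's principal is restored even when the sandboxed function throws.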
Private Sandbox
<sandbox src="file"> Content if tag not supported </sandbox>
• Belongs to a domain and can only be accessed by that domain
  – E.g., private location history marked on a map
• Private sandboxes cannot access one another even when nested
  – Otherwise, a malicious script could nest another private sandbox and access its private content
Open Sandbox
<OpenSandbox src="file"> Content if tag not supported </OpenSandbox>
• Can be accessed by any domain
• Can access its descendant open sandboxes, which is important for third-party service composition
  – E.g., an e-mail containing a map: we don’t want the e-mail to tamper with hotmail.com, and we don’t want the map library to tamper with the e-mail
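The access rules for private and open sandboxes can be written as a single predicate. This is a minimal sketch: the object shapes and the names `canAccess` and `isDescendant` are hypothetical, and the case of a private sandbox reaching into a nested open sandbox is denied here as an assumption, since the slides do not address it.

```javascript
// Sketch only: the stated sandbox access rules as a predicate.
// A page is { kind: "page", domain }.
// A sandbox is { kind: "private" | "open", owner, parent }, where `owner` is the
// domain a private sandbox belongs to and `parent` is the enclosing sandbox or null.
// `target` is always a sandbox.
function isDescendant(node, ancestor) {
  for (let p = node.parent; p; p = p.parent) {
    if (p === ancestor) return true;
  }
  return false;
}

function canAccess(accessor, target) {
  if (accessor.kind === "page") {
    // Open sandboxes are accessible to any domain; private ones only to their owner.
    return target.kind === "open" || target.owner === accessor.domain;
  }
  if (accessor.kind === "open") {
    // An open sandbox can access its descendant open sandboxes.
    return target.kind === "open" && isDescendant(target, accessor);
  }
  // Private sandboxes cannot access one another, even when nested.
  // Assumption: they cannot access nested open sandboxes either.
  return false;
}

// The e-mail example: hotmail.com hosts an e-mail that contains a map.
const hotmail = { kind: "page", domain: "hotmail.com" };
const email = { kind: "open", owner: null, parent: null };
const map = { kind: "open", owner: null, parent: email };
```

With these objects, `canAccess(hotmail, email)` and `canAccess(email, map)` hold, while `canAccess(map, email)` does not, so the map library cannot tamper with the e-mail.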
Provider-Browser Protocol for Unauthorized Content
• Unauthorized content must be sandboxed and must not be renderable by frames
  – Otherwise, unauthorized content would run as the principal of the frame
• MIME protocol seems to be what we want:
  – Require providers to prefix the unauthorized content subtype with x-privateUnauthorized+ or x-openUnauthorized+
  – E.g., text/html becomes text/x-privateUnauthorized+html
  – Verified that Firefox cannot render these content types with <frame> and <script>
  – But IE’s MIME sniffing sometimes allows rendering
• Alternative: encraption (e.g., Base64 encoding)
• Prevent providers from unintentionally publishing unauthorized content as other types of content:
  – Constrain sandbox to take only unauthorized content
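The subtype-prefix convention can be sketched as two small helpers, one a provider might use to label content and one a sandbox might use to accept only unauthorized content. The helper names are hypothetical; only the `x-privateUnauthorized+` and `x-openUnauthorized+` prefixes come from the slide.

```javascript
// Sketch only: derive and recognize the unauthorized-content MIME subtypes.
// `toUnauthorizedMime` and `isUnauthorizedMime` are hypothetical helpers.
function toUnauthorizedMime(mimeType, kind) {
  if (kind !== "private" && kind !== "open") {
    throw new RangeError("kind must be 'private' or 'open'");
  }
  const [type, subtype] = mimeType.split("/");
  if (!type || !subtype) {
    throw new TypeError("expected a MIME type of the form type/subtype");
  }
  // Prefix the subtype, e.g. html -> x-privateUnauthorized+html
  return `${type}/x-${kind}Unauthorized+${subtype}`;
}

function isUnauthorizedMime(mimeType) {
  // True only for subtypes carrying one of the two prefixes.
  return /^[^/]+\/x-(private|open)Unauthorized\+.+$/.test(mimeType);
}

const labeled = toUnauthorizedMime("text/html", "private");
```

A sandbox that calls `isUnauthorizedMime` before rendering would refuse plain `text/html`, which is the constraint the last bullet describes.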
Key Benefits of Sandbox
• Safe mashups with ease
• Beneficial to host third-party content as unauthorized content
Sandbox for Safe Mashups with Ease
http://Mashup.com/index.htm
<script>
// local script to Mashup.com
// calling functions in a.js and b.js
</script>
X <script src="a.com/a.js"> </script>
<div id="displayAreaForA"> … </div>
X <script src="b.com/b.js"> </script>
Hosting Third-Party Content as Unauthorized Content
• Combats cross-site scripting attacks in a fundamental way
  – Put user input into a sandbox
  – Does not have to sacrifice functionality
• Helps with Web spam
  – Discount the score of hyperlinks in third-party content
Outline
✓ The problem
✓ The MashupOS project
✓ Protection
• Communication
• Implementation & demo
• Evaluation
• Related work
• Conclusions
Communications
• Message passing across the isolation boundaries enables custom, fine-grained access control
[Figure: CommRequest between content from a.com and b.com; labels: Unauthorized, Isolated]
CommRequest
• Server:
server = new CommServer();
server.listenTo("aPort", requestHandlerFunction);
• Client:
req = new CommRequest();
req.open("INVOKE", "local:http://bob.com//aPort", isSynchronous);
req.send(requestData);
req.onreadystatechange = function () { … }
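To make the call pattern concrete, here is an in-memory mock of the two objects. It is a sketch, not the MashupOS implementation: the port registry, the constructor arguments (a real browser would supply the domains itself), the `request.domain` field, `responseBody`, and the always-synchronous delivery are all assumptions made for illustration.

```javascript
// Sketch only: an in-memory stand-in for CommServer and CommRequest.
// Hypothetical registry of listening ports, keyed by "domain//port".
const ports = new Map();

class CommServer {
  constructor(domain) {
    this.domain = domain; // assumption: passed in explicitly in this mock
  }
  listenTo(port, handler) {
    ports.set(`${this.domain}//${port}`, handler);
  }
}

class CommRequest {
  constructor(sourceDomain) {
    this.sourceDomain = sourceDomain; // used to label the request's source
    this.readyState = 0;
    this.responseBody = undefined;
    this.onreadystatechange = null;
  }
  open(method, url, isSynchronous) {
    const prefix = "local:http://";
    if (method !== "INVOKE" || !url.startsWith(prefix)) {
      throw new Error("this mock only supports INVOKE on local: URLs");
    }
    this.target = url.slice(prefix.length); // e.g. "bob.com//aPort"
    this.isSynchronous = isSynchronous; // recorded only; delivery is always synchronous here
    this.readyState = 1;
  }
  send(body) {
    const handler = ports.get(this.target);
    if (!handler) throw new Error(`no listener on ${this.target}`);
    // Copy data by value so no object reference crosses the boundary.
    const copy = body === undefined ? undefined : JSON.parse(JSON.stringify(body));
    this.responseBody = handler({ domain: this.sourceDomain, body: copy });
    this.readyState = 4;
    if (this.onreadystatechange) this.onreadystatechange();
  }
}

const server = new CommServer("bob.com");
let seenDomain = null;
server.listenTo("aPort", (request) => {
  seenDomain = request.domain; // the handler can base access control on the source
  return request.body + 1;
});

const req = new CommRequest("a.com");
req.open("INVOKE", "local:http://bob.com//aPort", true);
req.send(7);
```

Because the handler sees who is calling, each service can apply its own access-control decision per request, which is the point of routing cross-boundary interaction through messages.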
CommRequest vs. XMLHttpRequest
• Cross domain
• Source labeled
• No cookies sent
• “Server” can be on client
• Reply from remote server tagged with special MIME type
• Syntax similar to socket API and XHR
Outline
✓ The problem
✓ The MashupOS project
✓ Protection
✓ Communication
• Implementation & demo
• Evaluation
• Related work
• Conclusions
Implementation
[Figure: architecture with components MashupOS MIME Filter, HTML Layout Engine, MashupOS Script Engine Proxy, and Script Engine; labels: original HTML, MashupOS-transformed HTML, script execution, DOM object access, DOM object update]
• Use frames as our building blocks, but we apply our access control
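The idea of interposing on DOM object access can be illustrated with a JavaScript `Proxy`. This is only a loose analogy under stated assumptions: the prototype's script engine proxy sits between the layout engine and the script engine, whereas this sketch wraps a plain object, and `guard` and `activePrincipal` are hypothetical names.

```javascript
// Sketch only: interpose on object access and apply an access check.
let activePrincipal = "i.com"; // principal of the currently running script

function guard(target, ownerPrincipal) {
  const check = () => {
    if (activePrincipal !== ownerPrincipal) {
      throw new Error(`${activePrincipal} may not access objects of ${ownerPrincipal}`);
    }
  };
  return new Proxy(target, {
    // Reads are mediated.
    get(obj, prop) {
      check();
      return obj[prop];
    },
    // Updates are mediated too.
    set(obj, prop, value) {
      check();
      obj[prop] = value;
      return true;
    },
  });
}

const frameDocument = guard({ title: "p.com content" }, "p.com");

let denied = false;
try {
  frameDocument.title; // an i.com script touching a p.com object
} catch (e) {
  denied = true;
}

activePrincipal = "p.com";
const titleForOwner = frameDocument.title;
```

The check here is a plain same-principal test; the sandbox rules would replace it with a policy that lets the enclosing page in while keeping sandboxed content out.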
Evaluation: Showcase Application
• PhotoLoc, a photo location service
  – Mashes up Google’s map service and Flickr’s geo-tagged photo gallery service
  – Maps out the locations where photographs were taken
• PhotoLoc trusts neither Flickr nor Google Maps
PhotoLoc/index.htm
<script>
function setPhotoLoc(request) {
  var coordinate = request.body;
  var latitude = getLatitude(coordinate);
  var longitude = getLongitude(coordinate);
  G.map.setCenter(new GLatLng(latitude, longitude), 6); // Direct access
}
var svr = new CommServer();
svr.listenTo("recvLocationPort", setPhotoLoc);
</script>
<Sandbox src="f.uhtml" id=F> </Sandbox>
<Sandbox src="g.uhtml" id=G> </Sandbox>
[Figure annotation: CommRequest, shown between the script and the sandboxes]
Demo
Evaluation: Prototype Performance
• Microbenchmarks for the script engine proxy
  – Negligible overhead for no or moderate DOM manipulation
  – 33% to 82% overhead with heavy DOM manipulation
• Macrobenchmark measures overall page-loading time using the top 500 pages from the top click-through search results of MSN Search from 2005
  – Shows no impact
• Anticipate an in-browser implementation to have low overhead
Outline
✓ The problem
✓ The MashupOS project
✓ Protection
✓ Communication
✓ Implementation & demo
✓ Evaluation
• Related work
• Conclusions
Related Work
• Crockford’s <Module>
  – Symmetric isolation with socket-like communication with the enclosing page
• Wahbe et al.’s Software Fault Isolation
  – Asymmetric access, though never leveraged
  – Primary goal was to avoid context switches for untrusted code in a process
• Cox et al.’s Tahoma browser operating system uses VMs to
  – Protect the host system from the browser and web services
  – Protect web applications (a set of web sites) from one another
Future Work
• Robust implementation of the protection model
• Tools to detect whether a browser extension violates the browser’s protection model
• Tools for ensuring proper segregation of different content types
• Resource management, OS facilities
Conclusions
• Web content involves multiple principals
• Browsers remain a single-principal platform
• The missing protection abstraction: unauthorized content and <sandbox>
  – Enables safe mashups with ease
  – Combats cross-site scripting in a fundamental way
• CommRequest allows fine-grained access control across isolation boundaries
• Practical for deployment
Thank you!