Web Caching
Web Caching?
- Fetching something over the network is both slow and expensive.
  - Large responses require many round trips between the client and the server.
  - This requires extra processing from the browser and incurs extra costs for the visitor (bandwidth) and for the visited server.
- The ability to cache and reuse previously fetched resources is a critical aspect of performance optimization.
- Almost every browser comes with an implementation of an HTTP cache.
  - When you visit a web page, the browser stores the page in its cache so that it loads faster on subsequent visits.
- All you need to do is ensure that each server response provides the correct HTTP header directives.
  - These headers instruct the browser on when, and for how long, it can cache the response.
Web Caching? (cont.)
- Web caching stores copies of recently accessed web pages; pages are delivered from the cache when they are requested again.
  - Browser caches
  - Proxy caches
- Why cache?
  - Shorter response time
  - Reduced bandwidth requirement
  - Reduced load on servers
  - Access control and logging
Browser Caching vs. Proxy Caching
- Browser caching
  - Local hard drive space stores representations of viewed content.
  - Usefulness:
    - Recently viewed pages (Back/Forward button)
    - Commonly used images
- Proxy caching
  - Similar concept, but for multiple users.
  - Usually implemented on a firewall or on a separate device; such intermediaries are known as proxies.
  - Usefulness:
    - Latency and network traffic are reduced.
Cache Controlling
- HTML meta tags
  - Written in the <head> section of an HTML page.
  - Can mark an expiration date or mark the page as un-cacheable.
  - Only used by some browser caches and not seen by proxy caches.
  - For example, to disable the browser cache you can use:
    <meta http-equiv="Cache-Control" content="no-store" />
- HTTP headers
  - Automatically created by the web server
  - Sent before the HTML
  - Seen by browser and proxy caches
- HTTP defining mechanisms
  - Freshness: content can be loaded from the cache without having to check with the origin server.
  - Validation: the process of confirming with the origin server whether cached content is still valid to load.
HTTP Response Headers Example
Sample response header:
  HTTP/1.1 200 OK
  Date: Fri, 20 Oct 2017 13:19:41 GMT
  Server: Microsoft-IIS/8.5
  Cache-Control: max-age=3600, must-revalidate
  Expires: Fri, 20 Oct 2017 14:19:41 GMT
  Last-Modified: Mon, 16 Oct 2017 02:28:12 GMT
  ETag: "3e86-410-3596fbbc"
  Content-Length: 1024
  Content-Type: text/html
- Freshness: Cache-Control, Expires
- Validation: Last-Modified, ETag
Expires Header and Freshness
  Expires: Fri, 20 Oct 2017 14:19:41 GMT
- Indicates how long the representation is fresh.
  - After this time passes, the cache will contact the origin server to see whether there have been any changes.
- Beneficial for static page images as well as for regularly changing content.
- The clocks of the web server and the cache must be synchronized.
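The freshness check implied by the Expires header can be sketched in a few lines of Python. This is only an illustration, not how any particular browser implements it: the function name is_fresh and the two sample clock readings are assumptions, and a real cache would also honor Cache-Control: max-age, which takes precedence over Expires.

```python
from datetime import datetime, timezone
from email.utils import parsedate_to_datetime


def is_fresh(expires_header: str, now: datetime) -> bool:
    """Return True while the cached copy may be served without contacting the origin.

    Hypothetical helper: it looks only at Expires and ignores Cache-Control.
    """
    expires = parsedate_to_datetime(expires_header)  # timezone-aware for "GMT" dates
    return now < expires


expires = "Fri, 20 Oct 2017 14:19:41 GMT"
before = datetime(2017, 10, 20, 13, 30, 0, tzinfo=timezone.utc)  # assumed cache clock
after = datetime(2017, 10, 20, 15, 0, 0, tzinfo=timezone.utc)    # assumed cache clock

print(is_fresh(expires, before))  # True
print(is_fresh(expires, after))   # False
```

The comparison is only meaningful if the cache's clock agrees with the server's, which is why the two must be synchronized.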
Last-Modified and Validation
  Last-Modified: Mon, 16 Oct 2017 02:28:12 GMT
- The cache is validated by looking at the last time the document was altered (Last-Modified).
- A request with an If-Modified-Since header is sent to the origin server.
  - If changes have been made since the given date, the entire document is returned.
  - Otherwise, the server answers 304 Not Modified and the cached document can be loaded.
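The server side of this exchange might look roughly like the sketch below. The function name status_for_request is hypothetical, and the logic is reduced to the date comparison alone; a real server would also handle malformed dates and other conditional headers.

```python
from email.utils import parsedate_to_datetime
from typing import Optional


def status_for_request(last_modified: str, if_modified_since: Optional[str]) -> int:
    """Return the status code an origin server would send (illustrative only)."""
    if if_modified_since is None:
        return 200  # unconditional request: send the full document
    resource_time = parsedate_to_datetime(last_modified)
    cached_time = parsedate_to_datetime(if_modified_since)
    if resource_time <= cached_time:
        return 304  # not modified since the cached copy: no body needed
    return 200      # modified: send the entire document again


last_modified = "Mon, 16 Oct 2017 02:28:12 GMT"
print(status_for_request(last_modified, None))                             # 200
print(status_for_request(last_modified, "Mon, 16 Oct 2017 02:28:12 GMT"))  # 304
print(status_for_request(last_modified, "Sun, 15 Oct 2017 00:00:00 GMT"))  # 200
```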
ETag and Validation
  ETag: "3e86-410-3596fbbc"
- ETags are unique identifiers created by the server.
- The ETag changes each time the representation is altered on the origin server.
- A request with an If-None-Match header is sent to the server, and a simple comparison is used to validate the content.
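A minimal sketch of ETag validation follows. Deriving the tag from a SHA-256 hash of the body is an assumption made for illustration; servers are free to build ETags however they like, and the names make_etag and status_for are hypothetical.

```python
import hashlib
from typing import Optional


def make_etag(body: bytes) -> str:
    """Build a quoted ETag from the content (assumed scheme, not a standard one)."""
    return '"' + hashlib.sha256(body).hexdigest()[:16] + '"'


def status_for(body: bytes, if_none_match: Optional[str]) -> int:
    """Compare the client's token with the current one: 304 on match, else 200."""
    return 304 if if_none_match == make_etag(body) else 200


token = make_etag(b"hello")
print(status_for(b"hello", token))   # 304: representation unchanged
print(status_for(b"hello!", token))  # 200: representation altered, tag differs
```

Because the tag here depends only on the content, any change to the representation produces a different tag, which is what makes the simple equality comparison sufficient.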
Browser Caching Example
- When the server returns a response, it includes a set of HTTP headers, e.g. content type, length, caching directives, and validation token.
- First request: in this example the server
  - returns a 1024-byte response,
  - instructs the client to cache it for up to 120 seconds,
  - and provides a validation token ("x234dff") that can be used after the response has expired to check whether the resource has been modified.
- Second request: the ETag validation token enables efficient resource update checks.
  - No data is transferred if the resource has not changed.
  - The token is used as a fingerprint of the file contents.
  - On the second request, the client only needs to send the token to the server.
  - The server checks the token against the current resource.
  - If the token hasn't changed, the server returns a "304 Not Modified" response.
  - This means the cached copy is still current and can be renewed for another 120 seconds.
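The two requests in this example can be simulated with a toy cache. Everything here is a stand-in: fake_server replaces a real origin, the path "/file" is made up, and time is passed in as a plain number of seconds so the behavior is easy to follow.

```python
from dataclasses import dataclass

MAX_AGE = 120  # seconds, as in the example


@dataclass
class Entry:
    body: bytes
    etag: str
    stored_at: float


def fake_server(if_none_match, current_etag='"x234dff"'):
    """Stand-in origin server: returns (status, body, etag)."""
    if if_none_match == current_etag:
        return 304, b"", current_etag         # token matches: no body transferred
    return 200, b"x" * 1024, current_etag     # full 1024-byte response


def fetch(cache: dict, now: float):
    entry = cache.get("/file")
    if entry is not None and now - entry.stored_at < MAX_AGE:
        return "cache", entry.body            # still fresh: no request at all
    status, body, etag = fake_server(entry.etag if entry else None)
    if status == 304 and entry is not None:
        entry.stored_at = now                 # renew for another 120 seconds
        return "revalidated", entry.body
    cache["/file"] = Entry(body, etag, now)
    return "network", body


cache = {}
print(fetch(cache, 0)[0])    # network
print(fetch(cache, 60)[0])   # cache
print(fetch(cache, 130)[0])  # revalidated
```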
Cache-Control
- Each resource can define its caching policy via the Cache-Control HTTP header.
- Cache-Control directives control
  - who can cache the response,
  - under which conditions,
  - and for how long.
- "no-cache"
  - The returned response can't be used to satisfy a subsequent request to the same URL without first checking with the server whether the response has changed.
  - With an ETag token present, "no-cache" incurs a round trip to validate the cached response, but eliminates the download if the resource has not changed.
- "no-store"
  - Disallows the browser and all intermediate caches from storing any version of the returned response.
  - Thus, every time the user requests this URL, a request is sent to the server and a full response is downloaded.
- "max-age"
  - Specifies the maximum time in seconds that the fetched response is allowed to be reused, counted from the time of the request.
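To make the directives concrete, here is a small, deliberately naive parser for a Cache-Control value, plus two helpers that apply the rules above. The function names are hypothetical, and the comma split does not handle quoted arguments that themselves contain commas.

```python
from typing import Dict, Optional


def parse_cache_control(value: str) -> Dict[str, Optional[str]]:
    """Split a Cache-Control value into {directive: argument or None}."""
    directives: Dict[str, Optional[str]] = {}
    for part in value.split(","):
        part = part.strip()
        if not part:
            continue
        name, _, arg = part.partition("=")
        directives[name.strip().lower()] = arg.strip().strip('"') or None
    return directives


def may_store(directives: Dict[str, Optional[str]]) -> bool:
    """no-store forbids keeping any copy of the response."""
    return "no-store" not in directives


def must_revalidate_first(directives: Dict[str, Optional[str]]) -> bool:
    """no-cache allows storing, but requires checking with the server before reuse."""
    return "no-cache" in directives


policy = parse_cache_control("max-age=3600, must-revalidate")
print(policy)  # {'max-age': '3600', 'must-revalidate': None}
print(may_store(parse_cache_control("no-store")))  # False
```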
Web Caching: Proxy
What is a Web Proxy (Web Cache)?
- A proxy is a host which relays web access requests from clients.
- Used when clients do not access the web directly.
- Used for security, logging, accounting, and performance.
- Typically a Web cache is purchased and installed by an ISP.
  - For example, a university might install a cache on its campus network and configure all of the campus browsers to point to the cache.
[Figure: browser, proxy, web]
Web caches (proxy server)
Goal: satisfy the client request without involving the origin server.
- The user sets the browser to access the Web via the cache.
- The browser sends all HTTP requests to the cache.
  - If the object is in the cache: the cache returns the object.
  - Else: the cache requests the object from the origin server, then returns the object to the client.
[Figure: clients, proxy server, and origin servers exchanging HTTP requests and responses]
Web caches (proxy server)
1. The browser:
   - establishes a TCP connection to the Web cache and
   - sends an HTTP request for the object to the Web cache.
2. The Web cache:
   - checks to see if it has a copy of the object stored locally.
   - If it does:
     - the Web cache returns the object within an HTTP response message to the client browser.
     - No request to the origin server is made.
Web caches (proxy server)
3. If the Web cache does not have the object:
   - The Web cache:
     - opens a TCP connection to the origin server,
     - then sends an HTTP request for the object into the cache-to-server TCP connection.
   - The origin server:
     - after receiving this request, sends the object within an HTTP response to the Web cache.
   - When the Web cache receives the object:
     - it stores a copy in its local storage and
     - sends a copy, within an HTTP response message, to the client browser (over the existing TCP connection between the client browser and the Web cache).
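The hit and miss paths described in these steps can be sketched as a small in-memory class. This is a simplification under stated assumptions: the class name WebCache is hypothetical, the origin is any callable that maps a URL to bytes, and TCP connections, freshness, and validation are all left out.

```python
from typing import Callable, Dict


class WebCache:
    """Toy proxy cache: serves hits locally, forwards misses to the origin."""

    def __init__(self, fetch_from_origin: Callable[[str], bytes]):
        self._fetch = fetch_from_origin      # stands in for the cache-to-server request
        self._store: Dict[str, bytes] = {}   # local storage of objects
        self.origin_requests = 0

    def get(self, url: str) -> bytes:
        if url in self._store:
            return self._store[url]          # hit: no request to the origin server
        self.origin_requests += 1
        body = self._fetch(url)              # miss: ask the origin server
        self._store[url] = body              # keep a copy for later requests
        return body                          # and send a copy to the client


proxy = WebCache(lambda url: ("object at " + url).encode())
proxy.get("http://example.com/a")
proxy.get("http://example.com/a")
print(proxy.origin_requests)  # 1
```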
Web Caches (proxy server)
- Note that a cache is both a server and a client at the same time.
  - When it receives requests from and sends responses to a browser, it is a server.
  - When it sends requests to and receives responses from an origin server, it is a client.
- Web caching has seen deployment in the Internet for two reasons:
  - First, a Web cache can substantially reduce the response time for a client request.
  - Second, Web caches can substantially reduce traffic on an institution's access link to the Internet (reducing costs).
- Web caches can substantially reduce Web traffic in the Internet as a whole, thereby improving performance for all applications.
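As a rough worked example of the response-time argument, the average delay seen by clients is a weighted mix of the hit and miss cases. The hit rate and the two delays below are made-up numbers chosen only to show the arithmetic, and the function name is hypothetical.

```python
def average_response_time(hit_rate: float, cache_delay: float, origin_delay: float) -> float:
    """Expected delay: hits are served by the cache, misses go to the origin server."""
    return hit_rate * cache_delay + (1.0 - hit_rate) * origin_delay


# Assumed values: 40% of requests hit the cache, 0.01 s via cache, 2.0 s via origin.
avg = average_response_time(hit_rate=0.4, cache_delay=0.01, origin_delay=2.0)
print(f"{avg:.3f}")  # 1.204
```

Under these assumed numbers the average drops from 2.0 s without a cache to about 1.2 s with one, and the same 40% of requests never cross the access link.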
Summary: Web Caching
- A cache acts as both client and server:
  - server for the original requesting client
  - client to the origin server
- Typically the cache is installed by an ISP (university, company, residential ISP).
- Why Web caching?
  - Reduce response time for client requests.
  - Reduce traffic on an institution's access link.
  - The Internet is dense with caches: this enables "poor" content providers to effectively deliver content (so too does P2P file sharing).
Conditional GET
- Goal: don't send the object if the cache has an up-to-date cached version.
  - no object transmission delay
  - lower link utilization
- Cache: specify the date of the cached copy in the HTTP request:
    If-modified-since: <date>
- Server: the response contains no object if the cached copy is up to date:
    HTTP/1.0 304 Not Modified
[Figure: message exchange between client and server]
- Object not modified since <date>: the HTTP request carries If-modified-since: <date>; the HTTP response is HTTP/1.0 304 Not Modified.
- Object modified after <date>: the HTTP request carries If-modified-since: <date>; the HTTP response is HTTP/1.0 200 OK followed by <data>.
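Seen from the cache's side, a conditional GET might be sketched as follows. The names conditional_get and fake_origin are hypothetical, the cached entry is a plain dict, and the stand-in origin compares the date strings for equality instead of parsing them, which a real server would not do.

```python
from typing import Callable, Dict, Optional, Tuple

Response = Tuple[int, Dict[str, str], bytes]  # (status, headers, body)


def conditional_get(cached: Optional[dict], send: Callable[[Dict[str, str]], Response]) -> dict:
    """Revalidate a cached entry, or fetch the object if nothing is cached."""
    headers: Dict[str, str] = {}
    if cached is not None:
        headers["If-Modified-Since"] = cached["last_modified"]
    status, resp_headers, body = send(headers)
    if status == 304 and cached is not None:
        return cached  # no object transmitted: reuse the cached copy
    return {"last_modified": resp_headers.get("Last-Modified", ""), "body": body}


def fake_origin(headers: Dict[str, str]) -> Response:
    """Stand-in origin server with a fixed modification date (simplified matching)."""
    last_modified = "Mon, 16 Oct 2017 02:28:12 GMT"
    if headers.get("If-Modified-Since") == last_modified:
        return 304, {}, b""
    return 200, {"Last-Modified": last_modified}, b"<data>"


first = conditional_get(None, fake_origin)    # 200 OK, object transferred
second = conditional_get(first, fake_origin)  # 304 Not Modified, cached copy reused
print(second is first)  # True
```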