HTTP
CSx 760 Computer Networks
The Web: Some Jargon
- Web page:
  - consists of “objects”
  - addressed by a URL
- Most Web pages consist of:
  - base HTML page, and
  - several referenced objects
- URL has three components: host name, port number and path name:
  snowball.cs.uga.edu:80/~cs6760/index.html
- User agent for Web is called a browser:
  - MS Internet Explorer
  - Netscape Navigator
- Server for Web is called Web server:
  - Apache (open source)
  - MS Internet Information Server
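The three URL components can be pulled apart with Python's standard library. This is a minimal sketch; the `http://` scheme prefix is an assumption added so that the parser recognizes the host part, since the slide shows the URL without a scheme.

```python
from urllib.parse import urlsplit

# The slide's URL, with an assumed "http://" scheme in front.
url = "http://snowball.cs.uga.edu:80/~cs6760/index.html"

parts = urlsplit(url)
host = parts.hostname   # host name
port = parts.port       # port number (an int, or None if absent)
path = parts.path       # path name

print(host, port, path)
```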
The Web: the HTTP Protocol
HTTP: hypertext transfer protocol
- Web’s application layer protocol
- client/server model
  - client: browser that requests, receives, “displays” Web objects
  - server: Web server sends objects in response to requests
- http 1.0: RFC 1945
- http 1.1: RFC 2068
[Figure: a Windows machine running Explorer and a Linux machine running Navigator each exchange http requests and responses with a server running the Apache Web server]
The HTTP Protocol: Message Flow
HTTP: TCP transport service:
- client initiates TCP connection (creates socket) to server, port 80
- server accepts TCP connection from client
- http messages (application-layer protocol messages) exchanged between browser (http client) and Web server (http server)
- TCP connection closed
http is “stateless”
- server maintains no information about past client requests
Aside: protocols that maintain “state” are complex!
- past history (state) must be maintained
- if server/client crashes, their views of “state” may be inconsistent, must be reconciled
- When is “state” desired?
HTTP Example
Suppose user enters URL www.cs.uga.edu/index.html
1a. http client initiates TCP connection to http server (process) at www.cs.uga.edu. Port 80 is default for http server.
1b. http server at host www.cs.uga.edu, waiting for TCP connection at port 80, “accepts” connection, notifying client
2. http client sends http request message (containing URL) into TCP connection socket
3. http server receives request message, forms response message containing requested object (index.html), sends message into socket (the sending rate starts low and ramps up, which is called slow start)
HTTP Example (cont.)
4. http server closes TCP connection.
5. http client receives response message containing html file, parses html file, finds embedded images
6. Steps 1-5 repeated for each of the embedded images
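Steps 1 to 5 can be sketched with a plain TCP socket. To keep the example runnable without network access, it talks to a throwaway server on 127.0.0.1 instead of www.cs.uga.edu; the handler class, the page body, and the `fetch` helper are all hypothetical names made up for this illustration.

```python
import socket
import threading
from http.server import BaseHTTPRequestHandler, HTTPServer

class OneFileHandler(BaseHTTPRequestHandler):
    """Hypothetical server side: answers every GET with one small page."""

    def do_GET(self):
        body = b"<html><img src='a.gif'></html>"
        self.send_response(200)
        self.send_header("Content-Type", "text/html")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)   # step 3: response goes into the socket

    def log_message(self, *args):
        pass                     # keep the demo quiet

def fetch(host, port, path):
    """Client side of one non-persistent exchange."""
    with socket.create_connection((host, port)) as sock:      # step 1a
        request = f"GET {path} HTTP/1.0\r\nHost: {host}\r\n\r\n"
        sock.sendall(request.encode("ascii"))                 # step 2
        chunks = []
        while True:
            data = sock.recv(4096)
            if not data:         # step 4: server closed the connection
                break
            chunks.append(data)
    return b"".join(chunks)      # step 5: caller parses the response

# Step 1b: server waits for connections (port 0 = any free port).
server = HTTPServer(("127.0.0.1", 0), OneFileHandler)
threading.Thread(target=server.serve_forever, daemon=True).start()

response = fetch("127.0.0.1", server.server_port, "/index.html")
server.shutdown()
server.server_close()

status_line = response.split(b"\r\n", 1)[0]
print(status_line.decode("ascii"))
```

Step 6 would call `fetch` once more for each embedded image, opening a new TCP connection every time.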
HTTP Request Message: General Format
[Figure: general format of an HTTP request message]
The HTTP Protocol: Message Format
- Two types of HTTP messages: request, response
- HTTP request message:
  - ASCII (human-readable format)
  - request line (GET, POST, HEAD commands)
  - header lines
  - carriage return, line feed indicates end of message

    GET /~cs6760/index.html HTTP/1.1
    Host: snowball.cs.uga.edu
    Connection: close
    User-agent: Mozilla/4.0
    Accept: text/html, image/gif, image/jpeg
    Accept-language: en
    (extra carriage return, line feed)
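The request above can be assembled byte for byte. This is a minimal sketch; `build_request` is a hypothetical helper, not part of any library.

```python
def build_request(method, path, headers):
    """Join request line and header lines with CRLF, then end with a blank line."""
    lines = [f"{method} {path} HTTP/1.1"]
    lines += [f"{name}: {value}" for name, value in headers.items()]
    # Every line ends in CRLF; the extra CRLF marks the end of the message.
    return ("\r\n".join(lines) + "\r\n\r\n").encode("ascii")

message = build_request("GET", "/~cs6760/index.html", {
    "Host": "snowball.cs.uga.edu",
    "Connection": "close",
    "User-agent": "Mozilla/4.0",
    "Accept": "text/html, image/gif, image/jpeg",
    "Accept-language": "en",
})

print(message.decode("ascii"))
```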
HTTP Message Format: Response
- status line (protocol, status code, status phrase)
- header lines
- data, e.g., requested html file

    HTTP/1.1 200 OK
    Date: Sun, 06 Aug 2002 12:00:15 GMT
    Server: Apache/1.3.0 (Unix)
    Last-Modified: Wed, 2 Oct 2002 …
    Content-Length: 6821
    Content-Type: text/html

    data data ...
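Splitting a response into its three parts follows directly from the layout above. This is a sketch only; `parse_response` is a hypothetical helper, and the sample bytes are a shortened version of the slide's response with a 4-byte body.

```python
def parse_response(raw):
    """Split raw response bytes into status line fields, headers, and data."""
    head, _, body = raw.partition(b"\r\n\r\n")      # blank line ends the headers
    status_line, *header_lines = head.decode("ascii").split("\r\n")
    version, code, phrase = status_line.split(" ", 2)
    headers = {}
    for line in header_lines:
        name, _, value = line.partition(":")
        headers[name.strip().lower()] = value.strip()
    return version, int(code), phrase, headers, body

raw = (b"HTTP/1.1 200 OK\r\n"
       b"Server: Apache/1.3.0 (Unix)\r\n"
       b"Content-Length: 4\r\n"
       b"Content-Type: text/html\r\n"
       b"\r\n"
       b"data")

version, code, phrase, headers, body = parse_response(raw)
print(version, code, phrase, headers["content-type"], body)
```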
HTTP Response Status Codes
In first line of server->client response message. A few sample codes:
- 200 OK
  - request succeeded, requested object later in this message
- 301 Moved Permanently
  - requested object moved, new location specified later in this message (Location:)
- 400 Bad Request
  - request message not understood by server
- 404 Not Found
  - requested document not found on this server
- 505 HTTP Version Not Supported
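Python's standard `http.HTTPStatus` enum carries the same codes and phrases, so the sample list can be checked programmatically. The `describe` helper and its class labels are hypothetical names for this sketch; it only handles the 2xx to 5xx classes.

```python
from http import HTTPStatus

SAMPLE_CODES = (200, 301, 400, 404, 505)

def describe(code):
    """Return 'code phrase (class)', where the class comes from the first digit."""
    status = HTTPStatus(code)
    kind = {2: "success", 3: "redirection",
            4: "client error", 5: "server error"}[code // 100]
    return f"{status.value} {status.phrase} ({kind})"

for code in SAMPLE_CODES:
    print(describe(code))
```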
More about Message Format
- Why encode HTTP messages in ASCII, why not use binary?
- We lose a few bytes to encoding on the wire and in RAM …
- But a human-readable format is very important for high-level network protocols
  - it allows typing interactively to the system or sending scripts directly into the system
Trying out HTTP (client side) for yourself
1. Telnet to your favorite Web server:
   telnet www.cs.uga.edu 80
   Opens TCP connection to port 80 (default http server port) at www.cs.uga.edu. Anything typed in is sent to port 80 at www.cs.uga.edu
2. Type in a GET http request:
   GET /index.html HTTP/1.0
   By typing this in (hit carriage return twice), you send this minimal (but complete) GET request to http server
3. Look at response message sent by http server!
Response time modeling
Definition of RTT: time for a small packet to travel from client to server and back.
Web response time:
- one RTT to initiate TCP connection
- one RTT for HTTP request and first few bytes of HTTP response to return
- file transmission time
total = 2 RTT + transmit time
[Figure: client/server timeline: initiate TCP connection (one RTT), request file (one RTT), time to transmit file, file received]
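The model is simple enough to evaluate directly. In this sketch the transmit time is taken as object size divided by link rate, and the numbers (50 ms RTT, 1 Mbit object, 10 Mbps link) are made-up values for illustration, not figures from the slides.

```python
def response_time(rtt_s, object_bits, link_bps):
    """total = 2 RTT + transmit time, with transmit time = size / rate."""
    transmit_time = object_bits / link_bps
    return 2 * rtt_s + transmit_time

# Hypothetical numbers: 50 ms RTT, 1 Mbit object, 10 Mbps link.
total = response_time(rtt_s=0.05, object_bits=1_000_000, link_bps=10_000_000)
print(f"{total:.3f} s")
```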
Non-Persistent and Persistent Connections
Non-persistent
- HTTP/1.0
- server parses request, responds, and closes TCP connection
- 2 RTTs to fetch each object
- Each object transfer suffers from slow start
But most 1.0 browsers use parallel TCP connections.
Persistent
- default for HTTP/1.1
- on same TCP connection: server parses request, responds, parses new request, …
- Client sends requests for all referenced objects as soon as it receives base HTML.
- Fewer RTTs and less slow start.
Pipelining
Persistent without pipelining:
- client issues new request only when previous response has been received
- one RTT for each referenced object
Persistent with pipelining:
- default in HTTP/1.1
  - “It is unfortunately not well supported by many web servers and proxies”, therefore Mozilla (and probably all other browsers) disable it by default
- client sends requests as soon as it encounters a referenced object
- as little as one RTT for all the referenced objects
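The RTT counts for the three modes can be compared for a page with a base HTML file plus n referenced objects. This sketch makes simplifying assumptions: transmit time and slow start are ignored, connections are used one at a time (no parallel connections), and the function names are hypothetical.

```python
def rtts_nonpersistent(n):
    """New TCP connection per object: 2 RTTs each, for base page + n objects."""
    return 2 * (1 + n)

def rtts_persistent(n):
    """One connection: 2 RTTs for the base page, then 1 RTT per object."""
    return 2 + n

def rtts_pipelined(n):
    """One connection: 2 RTTs for the base page, then 1 RTT for all objects."""
    return 2 + (1 if n > 0 else 0)

n = 10  # hypothetical number of referenced objects
print(rtts_nonpersistent(n), rtts_persistent(n), rtts_pipelined(n))
```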
User-Server Interaction: Authentication
Authentication goal: control access to server documents
- stateless: client must present authorization in each request
- authorization: typically name, password
  - Authorization: header line in request
  - if no authorization presented, server refuses access, sends WWW-authenticate: header line in response
Browser caches name & password so that user does not have to repeatedly enter it.
[Figure: client/server timeline: usual http request msg; 401: authorization req. with WWW-authenticate:; usual http request msg + Authorization: line; usual http response msg; usual http request msg + Authorization: line; usual http response msg]
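The exchange can be sketched as two small functions. The slides do not name an authorization scheme; this example assumes the common "Basic" scheme (name and password joined by a colon and base64-encoded), and the function names and account table are hypothetical.

```python
import base64

def authorization_value(name, password):
    """Client side: value for the Authorization: header (Basic scheme assumed)."""
    token = base64.b64encode(f"{name}:{password}".encode("utf-8"))
    return "Basic " + token.decode("ascii")

def server_status(request_headers, accounts):
    """Server side: 200 if credentials match, else 401 (authorization required)."""
    scheme, _, token = request_headers.get("Authorization", "").partition(" ")
    if scheme != "Basic":
        return 401               # nothing presented: refuse access
    try:
        decoded = base64.b64decode(token).decode("utf-8")
    except ValueError:           # malformed base64 or bytes
        return 401
    name, _, password = decoded.partition(":")
    return 200 if accounts.get(name) == password else 401

accounts = {"alice": "secret"}   # hypothetical server-side account table
first_try = server_status({}, accounts)
second_try = server_status(
    {"Authorization": authorization_value("alice", "secret")}, accounts)
print(first_try, second_try)
```

Because the server is stateless, the client repeats the same Authorization: line in every later request, which is why the browser caches the name and password.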
User-server Interaction: Cookies
- server sends “cookie” to client in response msg
  Set-cookie: 1678453
- client presents cookie in later requests
  Cookie: 1678453
- server matches presented cookie with server-stored info
  - authentication
  - remembering user preferences, previous choices
[Figure: client/server timeline: usual http request msg; usual http response + Set-cookie: #; usual http request msg + Cookie: #; cookie-specific action at server; usual http response msg]
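The server side of this exchange amounts to a lookup table keyed by cookie value. A minimal sketch follows; `CookieServer`, the visit counter it stores, and the header dictionaries are hypothetical simplifications, and real Set-cookie lines carry name=value pairs and attributes.

```python
import itertools

class CookieServer:
    """Hands out a cookie on first contact, then recognizes it on later requests."""

    def __init__(self):
        self._next_id = itertools.count(1678453)   # first cookie as on the slide
        self._state = {}                           # server-stored info per cookie

    def handle(self, request_headers):
        """Return (response_headers, stored_info) for one request."""
        cookie = request_headers.get("Cookie")
        if cookie in self._state:                  # cookie-specific action
            self._state[cookie]["visits"] += 1
            return {}, self._state[cookie]
        new_id = str(next(self._next_id))
        self._state[new_id] = {"visits": 1}
        return {"Set-cookie": new_id}, self._state[new_id]

server = CookieServer()
first_headers, _ = server.handle({})                                   # no cookie yet
later_headers, info = server.handle({"Cookie": first_headers["Set-cookie"]})
print(first_headers, later_headers, info)
```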
HTTPS
- Secure Socket Layer (SSL)
- Certification Authority (CA)
- Symmetric Key Cryptography
- Asymmetric Key Cryptography
  - Public Key & Private Key
- What if
  - Sniffing?
  - Replaying?
[Figure: client/server timeline:
 1. client sends https request msg (cryptographic preferences)
 2. server returns cryptographic preference, public key, and CA certificate
 3. if the client has the CA’s public key, then the client can verify the CA certificate
 4. client generates a random symmetric key, encrypts it using the server’s public key
 5. server extracts the symmetric key and encrypts further communication with it]
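In practice the handshake is carried out by a library rather than by hand. The sketch below uses Python's standard `ssl` module, which implements TLS, the successor to SSL; the key exchange it negotiates may differ from the simplified flow on the slide. `open_https` is a hypothetical helper and is defined but not called here, so the example runs without network access.

```python
import socket
import ssl

# The default context loads the system's list of trusted CAs and
# requires the server's certificate to verify against one of them.
context = ssl.create_default_context()

verifies_certificates = context.verify_mode == ssl.CERT_REQUIRED
checks_host_name = context.check_hostname

def open_https(host, port=443):
    """Connect and run the handshake; raises ssl.SSLError if verification fails."""
    sock = socket.create_connection((host, port))
    return context.wrap_socket(sock, server_hostname=host)

print(verifies_certificates, checks_host_name)
```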
Certification Example
- A digital certificate contains the name of a company, web site or individual, along with a cryptographic key that can be used to encrypt information that must be sent to that individual
- A list of CAs is pre-loaded in your browser
User-Server Interaction: Conditional GET
- Goal: don’t send object if client has up-to-date stored (cached) version
- client: specify date of cached copy in http request
  If-modified-since: <date>
- server: response contains no object if cached copy up-to-date:
  HTTP/1.0 304 Not Modified
[Figure: client/server timeline. Object not modified: http request msg with If-modified-since: <date>, then http response HTTP/1.0 304 Not Modified. Object modified: http request msg with If-modified-since: <date>, then http response HTTP/1.1 200 OK … <data>]
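The server's decision reduces to one date comparison. This is a sketch under stated assumptions: `conditional_get` is a hypothetical helper, the dates are invented for the example, and HTTP dates are parsed with the standard `email.utils.parsedate_to_datetime`.

```python
from email.utils import parsedate_to_datetime

def conditional_get(request_headers, last_modified, body):
    """Return (status code, data); data is empty when the cached copy is current."""
    since = request_headers.get("If-modified-since")
    if since is not None:
        if parsedate_to_datetime(last_modified) <= parsedate_to_datetime(since):
            return 304, b""      # Not Modified: no object in the response
    return 200, body             # send the (possibly newer) object

# Hypothetical dates for the example.
OLD = "Mon, 01 Jan 2001 00:00:00 GMT"
NEW = "Tue, 02 Jan 2001 00:00:00 GMT"

unchanged = conditional_get({"If-modified-since": NEW}, last_modified=OLD, body=b"<data>")
changed = conditional_get({"If-modified-since": OLD}, last_modified=NEW, body=b"<data>")
print(unchanged, changed)
```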
Web Caches (Proxy Server)
Goal: satisfy client request without involving origin server
- user sets browser: Web accesses via web cache
- client sends all http requests to web cache
  - if object at web cache, web cache immediately returns object in http response
  - else requests object from origin server, then returns http response to client
- Why do we need proxy?
[Figure: two clients exchange http requests and responses with a proxy server, which exchanges http requests and responses with origin servers]
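The if/else rule above is a classic lookup-then-fetch cache. The sketch below keeps objects in a dictionary forever; the `WebCache` class and the stand-in origin function are hypothetical, and a real proxy would also handle expiry and conditional GETs.

```python
class WebCache:
    """Serves objects from its store, contacting the origin only on a miss."""

    def __init__(self, fetch_from_origin):
        self._fetch = fetch_from_origin   # callable: url -> object bytes
        self._store = {}
        self.origin_requests = 0          # how often the origin was involved

    def get(self, url):
        if url not in self._store:        # miss: request object from origin
            self.origin_requests += 1
            self._store[url] = self._fetch(url)
        return self._store[url]           # hit: return immediately

def fake_origin(url):
    """Stand-in for the origin server, so the example needs no network."""
    return f"object at {url}".encode("ascii")

cache = WebCache(fake_origin)
first = cache.get("www.cs.uga.edu/index.html")
second = cache.get("www.cs.uga.edu/index.html")
print(cache.origin_requests)
```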
Why Web Caching?
Assume: cache is “close” to client (e.g., in same network)
- smaller response time: cache “closer” to client
- decrease traffic to distant servers
  - link out of institutional/local ISP network often bottleneck
[Figure: institutional network (10 Mbps LAN) with institutional cache, connected by a 1.5 Mbps access link to the public Internet and origin servers]
Content distribution networks (CDNs)
- The content providers are the CDN customers.
Content replication
- CDN company installs hundreds of CDN servers throughout Internet
  - in lower-tier ISPs, close to users
- CDN replicates its customers’ content in CDN servers. When provider updates content, CDN updates servers
[Figure: origin server in North America feeds a CDN distribution node, which feeds CDN servers in S. America, Europe, and Asia]
CDN example
origin server
- www.foo.com
- distributes HTML
- Replaces: http://www.foo.com/sports/ruth.gif with http://www.cdn.com/www.foo.com/sports/ruth.gif
CDN company
- cdn.com
- distributes gif files
- uses its authoritative DNS server to route (redirect) requests
[Figure: (1) HTTP request for www.foo.com/sports.html to the origin server; (2) DNS query for www.cdn.com to the CDN’s authoritative DNS server; (3) HTTP request for www.cdn.com/www.foo.com/sports/ruth.gif to a nearby CDN server]
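The URL replacement done by the origin server is a mechanical rewrite: the original host becomes the first path component under the CDN's host. A minimal sketch, with `to_cdn_url` as a hypothetical helper name:

```python
from urllib.parse import urlsplit

def to_cdn_url(url, cdn_host="www.cdn.com"):
    """Move the original host into the path and point the URL at the CDN."""
    parts = urlsplit(url)
    return f"{parts.scheme}://{cdn_host}/{parts.netloc}{parts.path}"

rewritten = to_cdn_url("http://www.foo.com/sports/ruth.gif")
print(rewritten)
```

The browser then resolves www.cdn.com through the CDN's authoritative DNS server, which is where the choice of a nearby CDN server is made.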
Content Distribution Networks
[Figure: clients, a redirector, and surrogate (cache) servers holding copies of objects A, B, and C from origin servers AAA.COM, BBB.COM, and CCC.COM]
Partial Replication on CDN
[Figure: clients, a redirector, and surrogate servers holding copies of objects A, B, and C from origin servers AAA.COM, BBB.COM, and CCC.COM]
Partial Replication on CDN (cont.)
[Figure: same topology, with each surrogate server holding only a subset of objects A, B, and C]
Factors Affecting Redirection
- Goals
  - Increase system throughput under load
  - Reduce response latency perceived by clients
- Server load
  - Pick least loaded server
- Network proximity
  - Pick closest server
- Cache locality
  - Pick a server that has just served the object
These factors often conflict. What’s the tradeoff across different loads?
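One way to see the conflict is to write down a concrete policy and vary the load limit. The policy below is an arbitrary illustration, not a method from the slides: exclude servers above a load threshold, then prefer cache locality, then proximity, then lower load. The server records and the `pick_server` name are hypothetical.

```python
def pick_server(servers, obj, load_threshold=0.8):
    """Pick a server name; servers are dicts with name, load, distance, cached."""
    # Server load: skip overloaded servers unless every server is overloaded.
    usable = [s for s in servers if s["load"] < load_threshold] or servers
    # Cache locality first (False sorts before True), then proximity, then load.
    best = min(usable, key=lambda s: (obj not in s["cached"], s["distance"], s["load"]))
    return best["name"]

servers = [
    {"name": "near", "load": 0.95, "distance": 1, "cached": {"ruth.gif"}},
    {"name": "mid",  "load": 0.30, "distance": 5, "cached": {"ruth.gif"}},
    {"name": "far",  "load": 0.10, "distance": 9, "cached": set()},
]

under_load = pick_server(servers, "ruth.gif")                      # load limit applies
no_limit = pick_server(servers, "ruth.gif", load_threshold=1.0)    # proximity wins
print(under_load, no_limit)
```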
More about CDNs
routing requests
- CDN creates a “map”, indicating distances between leaf ISPs and CDN nodes
- when query arrives at authoritative DNS server:
  - server determines ISP from which query originates
  - uses “map” to determine best CDN server
not just Web pages
- streaming stored audio/video
- streaming real-time audio/video
  - CDN nodes create application-layer overlay network
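The “map” lookup can be sketched as a nested table from ISP to per-node distance. All ISP names, node names, and distance values here are invented for illustration, and "best" is taken to mean smallest distance.

```python
# Hypothetical map: distance from each leaf ISP to each CDN node.
distance_map = {
    "isp-georgia": {"cdn-n-america": 10, "cdn-europe": 90, "cdn-asia": 150},
    "isp-berlin":  {"cdn-n-america": 95, "cdn-europe": 8,  "cdn-asia": 120},
}

def best_cdn_server(isp):
    """Authoritative DNS server's choice: the CDN node closest to the querying ISP."""
    distances = distance_map[isp]
    return min(distances, key=distances.get)

print(best_cdn_server("isp-georgia"), best_cdn_server("isp-berlin"))
```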
Summary
- HTTP
  - Stateless application-layer protocol
  - HTTP/1.1: persistent connections & pipelining
  - Authentication, Cookies, and HTTPS
  - HTTP is not limited to web server/browser
  - Proxy and CDN for performance improvement