Web Technologies
Martin Kruliš
by Martin Kruliš (v 1.3) 3. 10. 2019
World Wide Web
• What is WWW?
  ◦ WWW is NOT the Internet
    ▪ It is the most used Internet service, though
  ◦ Started as an experiment of CERN physicists
  ◦ Soon became a platform for information exchange
    ▪ … and business
    ▪ … and communication
    ▪ … and entertainment
    ▪ …
  ◦ Now, the WWW provides a full-fledged environment for applications that are accessible from anywhere
World Wide Web
• Ancient History
  ◦ Dr. Vannevar Bush
    ▪ Human brain operates with associations
    ▪ Designed the concept of the MEMEX
    ▪ A device that was never constructed
    ▪ Published in the “As We May Think” paper (1945)
  ◦ Theodore Nelson
    ▪ First used the word hypertext
    ▪ I.e., text interlinked with associations
    ▪ Xanadu
    ▪ System for sharing information
    ▪ Implemented as a prototype
World Wide Web
• History
  ◦ Tim Berners-Lee
    ▪ Created a system for sharing data (proposed in 1989)
    ▪ For the community of physicists at CERN
    ▪ Simple textual data only
  ◦ NCSA Mosaic
    ▪ First widely popular graphical browser, by Marc Andreessen and Eric Bina
    ▪ Development started in 1992; released to the public in 1993
    ▪ Its code was licensed (via Spyglass) to Microsoft …
    ▪ … and became the basis of Internet Explorer, released in 1995
World Wide Web
• History
  ◦ 1996 – The browser wars started
    ▪ Internet Explorer vs. Netscape Navigator
  ◦ 1999 – HTML 4.01, the last revision of HTML 4
  ◦ 2002 – The first ideas of “Web 2.0”
    ▪ Content is created en masse by users
  ◦ 2004~2006 – Introduction of AJAX applications
    ▪ The Web becomes much more interactive
  ◦ 2010 – HTML5 enters the scene
    ▪ An attempt to eliminate Flash, Silverlight, …
  ◦ 2013~2015 – Moving towards Single Page Apps
Client-Server Architecture
[Diagram: Client ↔ Internet ↔ Server ↔ Database]
• Client
  ◦ HTML (text)
  ◦ Pictures
  ◦ CSS
  ◦ Embedded Objects (Flash)
  ◦ Scripting (JavaScript)
  ◦ XMLHttpRequest (AJAX, AJAJ)
  ◦ HTML5
  ◦ …
• Internet
  ◦ HTTP (0.9, 1.0, 1.1, 2)
  ◦ HTTPS
  ◦ Long-held HTTP (Comet)
  ◦ WebSockets
• Server
  ◦ Serving Plaintext
  ◦ Binary Content
  ◦ Dynamic Content (CGI)
  ◦ Scripting (PHP)
  ◦ AJAX, AJAJ
  ◦ Caching, HPC, Cloud Solutions
  ◦ WebSockets Integration
  ◦ Node.js
  ◦ …
Accessing Web Pages
• How does it work?
  [Diagram: Browser (Client) ↔ DNS Server, Client ↔ Server]
  ◦ The address bar of the browser holds http://www.ksi.mff.cuni.cz/cs/lide.php
  ◦ The DNS server translates www.ksi.mff.cuni.cz to 195.113.20.128
  ◦ The client creates a TCP connection to 195.113.20.128, port 80
  ◦ Client and server then communicate using the HTTP protocol
Resource Identification
• Uniform Resource Identifier (URI)
  ◦ Identification string with a specific format
      <scheme>:<hierarchical_part>?<query>#<fragment>
  ◦ Query and fragment parts are optional
• Uniform Resource Locator (URL)
  ◦ A URI that describes the location of a resource
      protocol://username:password@domain:port/path?p1=v1&p2=v2#element_id
  ◦ Real-world example
      http://webik.ms.mff.cuni.cz/~krulis#last_part
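The URL components listed above can be pulled apart with Python's standard urllib.parse module. This is only a small sketch: the helper name describe_url and the second URL (example.org, user, secret, port 8443) are illustrative assumptions, not part of the slides.

```python
from urllib.parse import parse_qs, urlsplit


def describe_url(url: str) -> dict:
    """Split a URL into the components named on the slide."""
    parts = urlsplit(url)
    return {
        "scheme": parts.scheme,
        "username": parts.username,
        "password": parts.password,
        "host": parts.hostname,
        "port": parts.port,  # None when the URL does not state a port
        "path": parts.path,
        "query": parse_qs(parts.query),
        "fragment": parts.fragment,
    }


# The real-world example from the slide.
real = describe_url("http://webik.ms.mff.cuni.cz/~krulis#last_part")
# A made-up URL that exercises every component of the general pattern.
full = describe_url("https://user:secret@example.org:8443/path?p1=v1&p2=v2#element_id")
print(real)
print(full)
```

Note that the fragment is handled by the client only; it is not sent to the server as part of the request.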
HTTP
• Hyper-Text Transfer Protocol
  ◦ Simple text-based protocol
    ▪ Operates over a TCP channel
  ◦ Designed for data retrieval
    ▪ Originally for plain text data
    ▪ Extended to support any type and encoding (MIME)
  ◦ The client sends an HTTP request
    ▪ Specifying the details of the requested content
  ◦ The server replies with an HTTP response
    ▪ Usually containing the requested data
  ◦ HTTP/1.1 (RFC 2616) and HTTP/2 (RFC 7540)
HTTP
• Hyper-Text Transfer Protocol
  [Diagram: Client (Browser) ↔ Web Server]
  ◦ TCP channel established
  ◦ Client sends an HTTP request: headers (what the client wants), cookies, POSTed form data
  ◦ Web server loads/generates the content
  ◦ Server sends an HTTP response: headers specifying the response, and the content the user wanted (e.g., an HTML file)
  ◦ TCP channel closed…?
HTTP Details
• HTTP Request
  ◦ Request line (1st line)
      Method Request-URI HTTP-version
      GET /index.html HTTP/1.1
    ▪ Method
      ▪ GET – retrieve data from server
      ▪ POST – send data to server
      ▪ HEAD – retrieve response headers only
      ▪ PUT, DELETE, … – used in special cases
      ▪ The method also defines the semantics of the request (e.g., GET requests must be nullipotent, i.e., have no side effects on the server)
    ▪ Request URI
      ▪ A path (possibly with a query) or an absolute URI
      ▪ Specifying the requested content
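To make the structure of the request line concrete, here is a minimal sketch that parses and builds one. The names RequestLine, parse_request_line and build_request, and the host example.org, are hypothetical helpers for illustration; a real server would also validate the method and handle many more edge cases.

```python
from typing import NamedTuple


class RequestLine(NamedTuple):
    method: str
    target: str
    version: str


def parse_request_line(line: str) -> RequestLine:
    """Split 'Method Request-URI HTTP-version' into its three fields."""
    parts = line.rstrip("\r\n").split(" ")
    if len(parts) != 3:
        raise ValueError(f"malformed request line: {line!r}")
    method, target, version = parts
    if not version.startswith("HTTP/"):
        raise ValueError(f"unexpected protocol version: {version!r}")
    return RequestLine(method, target, version)


def build_request(method: str, target: str, host: str) -> bytes:
    """Serialize a minimal HTTP/1.1 request: request line, Host header, empty line."""
    return f"{method} {target} HTTP/1.1\r\nHost: {host}\r\n\r\n".encode("ascii")


print(parse_request_line("GET /index.html HTTP/1.1\r\n"))
print(build_request("GET", "/index.html", "example.org"))
```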
HTTP Details
• HTTP Request Headers
  ◦ Host – the host domain name
  ◦ Accept
    ▪ Which data types are acceptable (for a response)
    ▪ Accept-Charset, Accept-Encoding, Accept-Language
  ◦ Range – byte range of the content
  ◦ If-* – conditional requests
    ▪ If-Modified-Since, If-Range, …
  ◦ User-Agent – browser information
  ◦ Authorization – user credentials
  ◦ …
HTTP Details
• HTTP Response
  ◦ Status line (1st line)
      HTTP-version Status-code Reason-phrase
      HTTP/1.1 404 Not Found
    ▪ Status Codes
      ▪ 1xx – Informational
      ▪ 2xx – Success
        200 OK, 204 No Content, 206 Partial Content
      ▪ 3xx – Redirection
        301 Moved Permanently, 307 Temporary Redirect
      ▪ 4xx – Client-side errors
      ▪ 5xx – Server-side errors
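The first digit of a status code determines its class, which the following sketch illustrates with the standard http.HTTPStatus enumeration. The function name describe_status and the wording of the class labels are assumptions made for this example.

```python
from http import HTTPStatus

# Class labels keyed by the first digit of the status code.
STATUS_CLASSES = {
    1: "Informational",
    2: "Success",
    3: "Redirection",
    4: "Client error",
    5: "Server error",
}


def describe_status(code: int) -> str:
    """Return 'code reason-phrase (class)' for a status code."""
    status_class = STATUS_CLASSES.get(code // 100, "Unknown")
    try:
        phrase = HTTPStatus(code).phrase
    except ValueError:
        # The code is not in Python's registry; the class still applies.
        phrase = "Unregistered code"
    return f"{code} {phrase} ({status_class})"


for code in (200, 206, 301, 307, 404):
    print(describe_status(code))
```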
HTTP Details
• HTTP Response Headers
  ◦ Content-Type – type of the data in the body (MIME)
  ◦ Content-Encoding – how the content is encoded for transfer
  ◦ Content-Length – body length in bytes
  ◦ Cache-Control – rules for caching the content
  ◦ Expires – when the content ceases to be valid
  ◦ Location – new URL (in case of 3xx redirects)
  ◦ Connection – rules for maintaining the TCP connection
HTTP Details
• Multipurpose Internet Mail Extensions (MIME)
  ◦ Format type extension
  ◦ Originally designed for mail
  ◦ Content-Type: type/subtype
    ▪ application (application/pdf)
    ▪ audio (audio/mpeg)
    ▪ image (image/png, image/jpeg)
    ▪ text (text/plain, text/html, text/css)
    ▪ video (video/mpeg)
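Because MIME comes from e-mail, Python's standard email package can be used to take a Content-Type value apart. The sketch below assumes a hypothetical helper parse_content_type; the header values are examples, not taken from the slides.

```python
from email.message import Message
from typing import Optional, Tuple


def parse_content_type(value: str) -> Tuple[str, str, Optional[str]]:
    """Return (type, subtype, charset) of a Content-Type header value."""
    msg = Message()
    msg["Content-Type"] = value
    return (
        msg.get_content_maintype(),
        msg.get_content_subtype(),
        msg.get_content_charset(),  # None when no charset parameter is given
    )


print(parse_content_type("text/html; charset=UTF-8"))
print(parse_content_type("image/png"))
```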
HTTP Example
• Request
    GET / HTTP/1.1
    Host: www.ksi.mff.cuni.cz
    User-Agent: Mozilla/5.0 (Windows NT 6.1; WOW64; rv:23.0) Gecko/20100101 Firefox/23.0
    Accept: text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8
    Accept-Language: cs,en-us;q=0.7,en;q=0.3
    Accept-Encoding: gzip, deflate
    Connection: keep-alive
• Response
    HTTP/1.1 200 OK
    Date: Mon, 16 Sep 2013 16:11:02 GMT
    Server: Apache/2.2.16 (Debian)
    X-Powered-By: PHP/5.3.3-7+squeeze15
    Expires: Thu, 19 Nov 1981 08:52:00 GMT
    Cache-Control: no-store, no-cache, must-revalidate, post-check=0, pre-check=0
    Pragma: no-cache
    Vary: Accept-Encoding
    Content-Encoding: gzip
    Content-Length: 3005
    Keep-Alive: timeout=15, max=100
    Connection: Keep-Alive
    Content-Type: text/html

    ...binary content of GZIPed HTML file...
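A response like the one above consists of a status line, headers, an empty line, and a body that may be compressed. The following sketch parses such a message in memory and undoes the gzip Content-Encoding. The function parse_response and the tiny HTML page are made up for the example; it ignores details such as chunked transfer and repeated headers.

```python
import gzip
from typing import Dict, Tuple


def parse_response(raw: bytes) -> Tuple[int, str, Dict[str, str], bytes]:
    """Split a raw HTTP/1.1 response into status code, reason, headers and body."""
    head, _, body = raw.partition(b"\r\n\r\n")
    lines = head.decode("iso-8859-1").split("\r\n")
    _version, code, reason = lines[0].split(" ", 2)
    headers = {}
    for line in lines[1:]:
        name, _, value = line.partition(":")
        headers[name.strip().lower()] = value.strip()  # header names are case-insensitive
    if headers.get("content-encoding") == "gzip":
        body = gzip.decompress(body)
    return int(code), reason, headers, body


# Build a small gzip-compressed response to feed the parser.
page = b"<html><body>Hello</body></html>"
compressed = gzip.compress(page)
raw = (
    b"HTTP/1.1 200 OK\r\n"
    b"Content-Type: text/html\r\n"
    b"Content-Encoding: gzip\r\n"
    b"Content-Length: " + str(len(compressed)).encode("ascii") + b"\r\n"
    b"\r\n"
) + compressed

status, reason, headers, body = parse_response(raw)
print(status, reason, headers["content-type"], body)
```

Content-Length counts the bytes on the wire, so it describes the compressed body, not the decoded page.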
HTTP
• Typical Web Page Loading
  [Diagram: Client (Browser) ↔ Web Server]
  ◦ TCP channel established
  ◦ HTML document is transferred first
  ◦ CSS styles, images, scripts, … follow (pipelining)
Web Server
• Serving Static Pages
  [Diagram: Client ↔ Internet ↔ Web Server (Apache configuration) ↔ /var/www/myweb/index.html]
  ◦ HTTP Request
      GET /myweb/index.html ...
  ◦ HTTP Response
      HTTP/1.1 200 OK
      Content-Length: 1019
      Content-Type: text/html; ...

      <contents of index.html>
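Serving a static page boils down to mapping the request path to a file under the document root and refusing paths that escape it. The sketch below shows one way to do that mapping; resolve_static is a hypothetical helper, and a temporary directory stands in for /var/www so the example can run anywhere. It is not how Apache itself is implemented.

```python
import tempfile
from pathlib import Path
from typing import Optional
from urllib.parse import unquote, urlsplit


def resolve_static(doc_root: Path, request_target: str) -> Optional[Path]:
    """Map a request target to a file under doc_root, or None if not servable."""
    root = doc_root.resolve()
    url_path = unquote(urlsplit(request_target).path)  # drop query, decode %xx
    candidate = (root / url_path.lstrip("/")).resolve()
    if root not in candidate.parents:
        return None  # the path tried to leave the document root
    return candidate if candidate.is_file() else None


with tempfile.TemporaryDirectory() as tmp:
    doc_root = Path(tmp)
    (doc_root / "myweb").mkdir()
    (doc_root / "myweb" / "index.html").write_text("<html></html>")

    found = resolve_static(doc_root, "/myweb/index.html")
    found_name = found.name if found else None
    missing = resolve_static(doc_root, "/myweb/nope.html")
    escaped = resolve_static(doc_root, "/../etc/passwd")
    print(found_name, missing, escaped)
```

A server would answer 200 OK with the file contents in the first case and 404 Not Found in the other two.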
Web Server
• Common Gateway Interface
  [Diagram: Client ↔ Internet ↔ Web Server ↔ (stdin/stdout) ↔ /var/www/myweb/app.cgi]
  ◦ HTTP Request
      GET /myweb/app.cgi ...
  ◦ HTTP Response
      HTTP/1.1 200 OK
      Content-Length: 2049
      Content-Type: text/html; ...

      <contents generated by cgi>
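Under CGI, the web server starts an external program, passes request details in environment variables (and the request body on stdin), and forwards whatever the program writes to stdout. A minimal sketch of such a program follows; the function name cgi_response and the name query parameter are assumptions for illustration only.

```python
import html
import os
import sys
from typing import Mapping
from urllib.parse import parse_qs


def cgi_response(environ: Mapping[str, str]) -> str:
    """Produce CGI output: response headers, an empty line, then the body."""
    query = parse_qs(environ.get("QUERY_STRING", ""))
    name = query.get("name", ["world"])[0]
    # Escape user input before embedding it in HTML.
    body = f"<html><body><p>Hello, {html.escape(name)}!</p></body></html>"
    return "Content-Type: text/html; charset=utf-8\r\n\r\n" + body


if __name__ == "__main__":
    # When run by a web server, the CGI variables arrive via the environment.
    sys.stdout.write(cgi_response(os.environ))
```

The script emits only headers and body; the status line of the final HTTP response is added by the web server.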
Web Server
• Integrating Scripting Modules
  [Diagram: Client ↔ Internet ↔ Web Server (mod_php) ↔ /var/www/myweb/index.php]
  ◦ HTTP Request
      GET /myweb/index.php ...
  ◦ HTTP Response
      HTTP/1.1 200 OK
      Content-Length: 1984
      Content-Type: text/html; ...

      <contents generated by php>
Web Servers
• Apache HTTP Server
  ◦ Historically the most used web server (the ~65% share is a dated figure and has since declined)
  ◦ Highly configurable, with modular architecture
• Microsoft IIS
  ◦ Deployed with Microsoft products
• Nginx
  ◦ Originated in Russia, now widely used worldwide
• Lighttpd
• Node.js – lightweight HTTP server
  ◦ A JavaScript runtime with an HTTP server package
HTTP Secure
• HTTPS
  ◦ Inserts an SSL/TLS layer between TCP and HTTP
  ◦ SSL/TLS provides transparent encryption
    ▪ Asymmetric cryptography is used for authentication and key exchange; the data itself is encrypted with a symmetric session key
  ◦ X.509 certificates are used
    ▪ A certificate carries the public key; the matching private key stays secret with the owner
    ▪ A certificate has additional info (e.g., a domain name)
    ▪ Every certificate must be signed by another certificate
      ▪ By a certificate of a trustworthy authority
      ▪ By itself (self-signed certificate)
    ▪ A certificate is verified before its key is used
    ▪ Usually only the server has a certificate
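On the client side, certificate verification is typically delegated to a TLS library. As a rough sketch, Python's ssl module creates a context that requires a valid certificate chain and a matching host name by default. The function fetch_peer_certificate is a hypothetical helper; it is defined but not called here, because calling it needs network access.

```python
import socket
import ssl

# Default client context: trusts the system's certificate authorities,
# requires a certificate from the server, and checks the host name.
context = ssl.create_default_context()


def fetch_peer_certificate(host: str, port: int = 443) -> dict:
    """Open a TLS connection and return the verified server certificate."""
    with socket.create_connection((host, port), timeout=10) as tcp:
        # The handshake (including certificate verification) happens here;
        # it raises ssl.SSLCertVerificationError if verification fails.
        with context.wrap_socket(tcp, server_hostname=host) as tls:
            return tls.getpeercert()


print(context.verify_mode == ssl.CERT_REQUIRED, context.check_hostname)
```

A self-signed server certificate would be rejected by this context unless it is explicitly added to the trusted set.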
HTTP Secure
• The SSL/TLS Handshake
  [Diagram: Client (Browser) ↔ Web Server]
  ◦ TCP channel established
  ◦ Certificate (without the private key) is sent to the client
  ◦ Client verifies the certificate
  ◦ If the certificate is accepted, the client finishes the SSL/TLS handshake, using the public key to safely send data to the server
HTTP Problematic Issues
• HTTP Expected Usage
  ◦ Downloading content from the server
  ◦ Uploading small amounts of data to the server
• The Most Problematic Issues
  ◦ Stateless communication
    ▪ Each request is treated without a context
  ◦ Client-initiated protocol
    ▪ Server cannot initiate a dialog (e.g., send updates)
  ◦ Non-persistent connections
    ▪ An HTTP (TCP) connection is not maintained for long
Maintaining Application State
• Solution
  ◦ Additional layer that maintains a session
  ◦ Session identification must be stored at both ends
• Session Support
  ◦ Cookies
    ▪ Textual key-value pairs stored in the browser
    ▪ Associated with sites, transparently sent with each request
  ◦ Browser Storage
    ▪ JavaScript APIs sessionStorage and localStorage
  ◦ PHP Sessions API
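The idea of a session identifier stored at both ends can be sketched with the standard http.cookies module: the server keeps the session data, and the browser keeps only the identifier in a cookie. The names SESSIONS, start_session, load_session and the cookie name session_id are assumptions for this example; a real application would add expiry, the Secure attribute, and persistent storage.

```python
import secrets
from http.cookies import SimpleCookie
from typing import Dict, Optional, Tuple

# Server-side store: session identifier -> session data.
SESSIONS: Dict[str, dict] = {}


def start_session(data: dict) -> Tuple[str, str]:
    """Create a session and return (session id, Set-Cookie header line)."""
    session_id = secrets.token_hex(16)  # unguessable identifier
    SESSIONS[session_id] = dict(data)
    cookie = SimpleCookie()
    cookie["session_id"] = session_id
    cookie["session_id"]["path"] = "/"
    cookie["session_id"]["httponly"] = True  # hide the cookie from scripts
    return session_id, cookie.output(header="Set-Cookie:")


def load_session(cookie_header: str) -> Optional[dict]:
    """Look up the session named by the Cookie header of a later request."""
    cookie = SimpleCookie()
    cookie.load(cookie_header)
    morsel = cookie.get("session_id")
    return SESSIONS.get(morsel.value) if morsel else None


sid, set_cookie_line = start_session({"user": "alice"})
print(set_cookie_line)
print(load_session("session_id=" + sid))
```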
SPDY
• SPDY (speedy) Protocol
  ◦ Designed by Google (announced in 2009; standardization effort from 2012)
  ◦ Open protocol that improves web content transportation (especially latency)
    ▪ Basically a modification of the HTTP protocol
  ◦ Most important features
    ▪ One TCP connection per client (advanced multiplexing)
    ▪ Intensive compression (including headers)
    ▪ Server may push content in advance
      ▪ E.g., sending page-related data before the request
    ▪ Focus on security (by TLS encryption)
HTTP/2
• New Version of HTTP
  ◦ Based on the SPDY protocol
    ▪ Google abandoned SPDY in favor of HTTP/2 in 2015
    ▪ First draft was a copy of the SPDY specification
  ◦ Differences from SPDY
    ▪ TLS optional (defined by the URI scheme, http/https)
    ▪ Faster and more secure compression
    ▪ Multi-host multiplexing
    ▪ Improved prioritization
  ◦ Implementation
    ▪ Currently supported by major browsers and sites
Bidirectional Communication
• Comet
  [Diagram: Client (Browser) ↔ Web Server]
  ◦ Client starts an asynchronous HTTP request
  ◦ Server postpones the response if there is nothing to report
  ◦ After a timeout, an empty response is sent
  ◦ Client immediately issues a new request
  ◦ When a reportable event occurs, an event notification is sent
  ◦ Client processes the event and issues another request …
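The Comet (long polling) cycle can be simulated without any networking: a blocking queue read with a timeout plays the role of the postponed response. This is only an in-process sketch; long_poll and client_loop are hypothetical names, and None stands for the empty response sent after a timeout.

```python
import queue
from typing import List, Optional


def long_poll(events: "queue.Queue[str]", timeout: float) -> Optional[str]:
    """Server side: hold the request until an event arrives or the timeout expires."""
    try:
        return events.get(timeout=timeout)
    except queue.Empty:
        return None  # nothing to report: empty response


def client_loop(events: "queue.Queue[str]", rounds: int, timeout: float) -> List[str]:
    """Client side: issue a new request right after every response."""
    received = []
    for _ in range(rounds):
        event = long_poll(events, timeout)
        if event is not None:
            received.append(event)  # process the event, then poll again
    return received


pending: "queue.Queue[str]" = queue.Queue()
pending.put("new message")
print(client_loop(pending, rounds=2, timeout=0.01))
```

In a real deployment each long_poll call would be an HTTP request held open by the server, which is why Comet ties up one connection per waiting client.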
WebSocket Protocol
• Extension of HTTP(S) Protocols
  ◦ Two-way communication
  ◦ Persistent connections
  ◦ Layered over a TCP or SSL/TLS connection
• Protocol Properties
  ◦ Defined in detail in RFC 6455
  ◦ Handshake is compatible with the HTTP handshake
  ◦ Simple message-based communication
    ▪ User can specify custom sub-protocols (i.e., the contents and semantics of the messages)
Bidirectional Communication
• WebSockets
  [Diagram: Client (Browser) ↔ Web Server]
  ◦ Client sends an HTTP “upgrade” request
  ◦ Server responds with 101 Switching Protocols
  ◦ The WebSocket protocol replaces the HTTP protocol on the TCP channel
  ◦ A WebSocket message can be sent at any time by either party
  ◦ Messages on the client side are sent/processed by a script
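In the upgrade handshake, the server proves that it understood the request by hashing the client's Sec-WebSocket-Key together with a fixed GUID from RFC 6455. The sketch below computes that value and formats the 101 response; the function names websocket_accept and upgrade_response are assumptions, while the GUID and the sample key come from the RFC.

```python
import base64
import hashlib

# Fixed GUID defined by RFC 6455 for the opening handshake.
WEBSOCKET_GUID = "258EAFA5-E914-47DA-95CA-C5AB0DC85B11"


def websocket_accept(sec_websocket_key: str) -> str:
    """Compute the Sec-WebSocket-Accept value for a client's key."""
    digest = hashlib.sha1((sec_websocket_key + WEBSOCKET_GUID).encode("ascii")).digest()
    return base64.b64encode(digest).decode("ascii")


def upgrade_response(sec_websocket_key: str) -> str:
    """Format the server's reply to an HTTP upgrade request."""
    return (
        "HTTP/1.1 101 Switching Protocols\r\n"
        "Upgrade: websocket\r\n"
        "Connection: Upgrade\r\n"
        f"Sec-WebSocket-Accept: {websocket_accept(sec_websocket_key)}\r\n"
        "\r\n"
    )


# Sample key from RFC 6455.
print(upgrade_response("dGhlIHNhbXBsZSBub25jZQ=="))
```

After this response, both sides stop speaking HTTP on the connection and exchange WebSocket frames instead.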
WebRTC
• Web Real-Time Communication
  ◦ API for direct P2P communication between browsers
  ◦ Originally designed for audiovisual data (videophone)
  ◦ A signaling channel (AJAX, WS, …) is required for establishing the connection
  ◦ RTC data are then passed directly or via TURN servers
Discussion