Web Technologies
Martin Kruliš
by Martin Kruliš (v 1.0), 4. 11. 2020
World Wide Web
• What is WWW?
  ◦ WWW is NOT the Internet
    ▪ It is the most used Internet service, though
  ◦ Started as an experiment of CERN physicists
  ◦ Soon became a platform for information exchange
    ▪ … and business
    ▪ … and communication
    ▪ … and entertainment
  ◦ Today, the WWW provides a full-fledged environment for applications that are accessible from anywhere
World Wide Web
• Ancient History
  ◦ Dr. Vannevar Bush
    ▪ Human brain operates with associations
    ▪ Designed the concept of MEMEX
      - A device that was never constructed
    ▪ Published in the essay “As We May Think” (1945)
  ◦ Theodore Nelson
    ▪ First used the word hypertext
      - I.e., text interlinked with associations
    ▪ Xanadu
      - System for sharing information
      - Implemented as a prototype
World Wide Web
• History
  ◦ Tim Berners-Lee
    ▪ Proposed a system for sharing data (1989)
    ▪ For the community of physicists at CERN
    ▪ Simple textual data only
  ◦ NCSA Mosaic
    ▪ First widely popular graphical browser, by Marc Andreessen and Eric Bina
    ▪ Development started in 1992; released to the public in 1993
    ▪ Licensed to Microsoft (via Spyglass) …
    ▪ … and used as the basis of Internet Explorer, released in 1995
World Wide Web
• History
  ◦ 1996 – The browser wars started
    ▪ Internet Explorer vs. Netscape Navigator
  ◦ 1999 – HTML 4.01, the last revision of HTML 4
  ◦ 2001 – The collapse of the “dot-com bubble”
  ◦ 2002 – The first ideas of “Web 2.0”
    ▪ The content is massively created by users
  ◦ 2004–2006 – Introduction of AJAX applications
    ▪ The web is becoming much more interactive
  ◦ 2010 – HTML5 enters the scene
    ▪ An attempt to eliminate Flash, Silverlight, …
World Wide Web: The Perspective of Time
[Figure: content dataflow between providers and consumers over time: textual content → static webpages → dynamic webpages → browser becomes a platform for online applications]
Client-Server Architecture
• Client
  ◦ HTML (text)
  ◦ Pictures
  ◦ CSS
  ◦ Embedded Objects (Flash)
  ◦ Scripting (JavaScript)
  ◦ XMLHttpRequest (AJAX, AJAJ)
  ◦ HTML5
  ◦ …
• Internet
  ◦ HTTP (0.9, 1.0, 1.1)
  ◦ HTTPS
  ◦ Long-held HTTP (Comet)
  ◦ WebSockets
• Server (backed by a database)
  ◦ Serving Plaintext
  ◦ Binary Content
  ◦ Dynamic Content (CGI)
  ◦ Scripting (PHP)
  ◦ AJAX, AJAJ
  ◦ Caching, HPC, Cloud Solutions
  ◦ WebSockets Integration
  ◦ Node.js
  ◦ …
Accessing Web Pages
• How does it work?
  ◦ The user enters http://www.ksi.mff.cuni.cz/cs/lide.php in the browser's address bar
  ◦ The browser asks a DNS server for the address of www.ksi.mff.cuni.cz and gets 195.113.20.128
  ◦ The client creates a TCP connection to 195.113.20.128, port 80
  ◦ Client and server then communicate using the HTTP protocol
Resource Identification
• Uniform Resource Identifier (URI)
  ◦ Identification string with a specific format
      <scheme>:<hierarchical_part>?<query>#<fragment>
  ◦ Query and fragment parts are optional
• Uniform Resource Locator (URL)
  ◦ A URI that describes the location of a resource
      protocol://username:password@domain:port/path?p1=v1&p2=v2#element_id
  ◦ Real-world example
      http://webik.ms.mff.cuni.cz/~krulis#last_part
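The URL pattern above can be taken apart with Python's standard `urllib.parse` module. This is only a sketch: the URL below (host `example.com`, credentials, port) is a made-up illustration that follows the pattern from the slide, not an address from the lecture.

```python
from urllib.parse import parse_qs, urlsplit

# Hypothetical URL following the generic pattern
# protocol://username:password@domain:port/path?p1=v1&p2=v2#element_id
url = "http://user:secret@example.com:8080/path/page.html?p1=v1&p2=v2#element_id"

parts = urlsplit(url)
components = {
    "scheme": parts.scheme,          # "protocol" in the slide's notation
    "username": parts.username,
    "password": parts.password,
    "host": parts.hostname,
    "port": parts.port,              # already converted to an int
    "path": parts.path,
    "query": parse_qs(parts.query),  # each key maps to a list of values
    "fragment": parts.fragment,      # never sent to the server by browsers
}
```

Printing `components` shows each part separately; note that the query values come back as lists, because a key may repeat in a query string.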
HTTP
• Hyper-Text Transfer Protocol
  ◦ Simple text-based protocol
    ▪ Operates over a TCP channel
  ◦ Designed for data retrieval
    ▪ Originally for plain-text data
    ▪ Extended to support any type and encoding (MIME)
  ◦ The client sends an HTTP request
    ▪ Specifying the details of the requested content
  ◦ The server replies with an HTTP response
    ▪ Usually containing the requested data
  ◦ HTTP/1.1 was defined in RFC 2616 (since superseded by RFC 7230–7235); HTTP/2 is defined in RFC 7540
HTTP
• Hyper-Text Transfer Protocol
  ◦ TCP channel is established between the client (browser) and the web server
  ◦ Client sends an HTTP request
    ▪ Headers (what the client wants), cookies, POSTed form data
  ◦ Server loads or generates the content
  ◦ Server sends an HTTP response
    ▪ Headers describing the response, and the content the user wanted (e.g., an HTML file)
  ◦ TCP channel closed…?
HTTP Details
• HTTP Request
  ◦ Request line (1st line)
      Method Request-URI HTTP-version
      GET /index.html HTTP/1.1
    ▪ Method
      - GET – retrieve data from the server
      - POST – send data to the server
      - HEAD – retrieve response headers only
      - PUT, DELETE, … – used in special cases
    ▪ Request URI
      - Relative or absolute path
      - Specifying the requested content
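The three fields of the request line can be split apart in a few lines of Python. This is a minimal sketch; the names `RequestLine` and `parse_request_line` are made up for this illustration, and a real server would validate the input far more strictly.

```python
from typing import NamedTuple


class RequestLine(NamedTuple):
    method: str    # e.g. GET, POST, HEAD
    target: str    # the Request-URI
    version: str   # e.g. HTTP/1.1


def parse_request_line(line: str) -> RequestLine:
    """Split 'Method Request-URI HTTP-version' into its three fields."""
    # Unpacking raises ValueError if the line does not have exactly 3 fields.
    method, target, version = line.rstrip("\r\n").split(" ")
    if not version.startswith("HTTP/"):
        raise ValueError(f"unexpected HTTP version: {version!r}")
    return RequestLine(method, target, version)


request = parse_request_line("GET /index.html HTTP/1.1\r\n")
```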
HTTP Details
• HTTP Request Headers
  ◦ Host – the host domain name
  ◦ Accept – acceptable data types for the response
    ▪ Also Accept-Charset, Accept-Encoding, Accept-Language
  ◦ Range – byte range of the content
  ◦ If-* – make the request conditional
    ▪ If-Modified-Since, If-Range, …
  ◦ User-Agent – browser information
  ◦ Authorization – user credentials
  ◦ …
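The Accept header lists media types with optional `q` weights (a missing weight means 1.0). The sketch below shows how a server might rank them; `parse_accept` is a hypothetical helper written for this illustration, and it ignores the finer points of the real negotiation rules (for example, specificity of wildcards).

```python
def parse_accept(header: str) -> list[tuple[str, float]]:
    """Return (media type, quality) pairs, most preferred first."""
    prefs = []
    for item in header.split(","):
        media_type, *params = [part.strip() for part in item.split(";")]
        quality = 1.0  # default weight when no q parameter is given
        for param in params:
            name, _, value = param.partition("=")
            if name.strip() == "q":
                quality = float(value)
        prefs.append((media_type, quality))
    # The sort is stable, so equal weights keep their original order.
    return sorted(prefs, key=lambda pref: pref[1], reverse=True)


accept = parse_accept(
    "text/html, application/xhtml+xml, application/xml;q=0.9, */*;q=0.8"
)
```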
HTTP Details
• HTTP Response
  ◦ Status line (1st line)
      HTTP-version Status-code Reason-phrase
      HTTP/1.1 404 Not Found
    ▪ Status Codes
      - 1xx – Informational
      - 2xx – Success
          200 OK, 204 No Content, 206 Partial Content
      - 3xx – Redirections
          301 Moved Permanently, 307 Temporary Redirect
      - 4xx – Client-side errors
      - 5xx – Server-side errors
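The first digit of the status code determines its class, which is easy to show with Python's standard `http.HTTPStatus` enumeration. The helper `describe_status` and the `STATUS_CLASSES` table are names invented for this sketch.

```python
from http import HTTPStatus

# Class names keyed by the first digit of the status code.
STATUS_CLASSES = {
    1: "Informational",
    2: "Success",
    3: "Redirection",
    4: "Client error",
    5: "Server error",
}


def describe_status(code: int) -> str:
    """Return 'code reason-phrase (class)' for a registered status code."""
    status = HTTPStatus(code)  # raises ValueError for unknown codes
    return f"{status.value} {status.phrase} ({STATUS_CLASSES[code // 100]})"
```

For example, `describe_status(404)` yields the same code and reason phrase as the status line shown on the slide.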
HTTP Details
• HTTP Response Headers
  ◦ Content-Type – type of the response data (MIME)
  ◦ Content-Encoding – how the content is encoded for transfer
  ◦ Content-Length – body length in bytes
  ◦ Cache-Control – rules for caching the content
  ◦ Expires – when the content ceases to be valid
  ◦ Location – new URL (in case of 3xx redirects)
  ◦ Connection – rules for maintaining the TCP connection
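To make the response headers concrete, the sketch below assembles a complete HTTP/1.1 response by hand. `build_response` is a hypothetical helper, and the chosen header values (no caching, closing the connection) are assumptions made for this illustration only.

```python
from email.utils import formatdate


def build_response(body: str, content_type: str = "text/html; charset=utf-8") -> bytes:
    """Assemble status line, headers, an empty line and the body."""
    payload = body.encode("utf-8")
    head = [
        "HTTP/1.1 200 OK",
        f"Date: {formatdate(usegmt=True)}",    # current time in HTTP date format
        f"Content-Type: {content_type}",
        f"Content-Length: {len(payload)}",     # counted in bytes, not characters
        "Cache-Control: no-cache",
        "Connection: close",
    ]
    return "\r\n".join(head).encode("ascii") + b"\r\n\r\n" + payload


response = build_response("Kruliš")
```

The body has six characters but seven bytes in UTF-8, so the Content-Length header says 7.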
HTTP Details
• Multipurpose Internet Mail Extensions (MIME)
  ◦ Format type extension
  ◦ Originally designed for mail
  ◦ Content-Type: type/subtype
    ▪ application (application/pdf)
    ▪ audio (audio/mpeg)
    ▪ image (image/png, image/jpeg)
    ▪ text (text/plain, text/html, text/css)
    ▪ video (video/mpeg)
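Because MIME comes from e-mail, Python's standard `email` package can parse a Content-Type value directly. The following is a small sketch; `parse_content_type` is a name made up for this example.

```python
from email.message import Message
from typing import Optional, Tuple


def parse_content_type(value: str) -> Tuple[str, str, Optional[str]]:
    """Return (type, subtype, charset) of a Content-Type header value."""
    msg = Message()
    msg["Content-Type"] = value
    return (
        msg.get_content_maintype(),
        msg.get_content_subtype(),
        msg.get_content_charset(),  # lower-cased, or None if not given
    )


html_type = parse_content_type("text/html; charset=UTF-8")
png_type = parse_content_type("image/png")
```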
HTTP Example
Request:
    GET / HTTP/1.1
    Host: www.ksi.mff.cuni.cz
    User-Agent: Mozilla/5.0 (Windows NT 6.1; WOW64; rv:23.0) Gecko/20100101 Firefox/23.0
    Accept: text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8
    Accept-Language: cs,en-us;q=0.7,en;q=0.3
    Accept-Encoding: gzip, deflate
    Connection: keep-alive
Response:
    HTTP/1.1 200 OK
    Date: Mon, 16 Sep 2013 16:11:02 GMT
    Server: Apache/2.2.16 (Debian)
    X-Powered-By: PHP/5.3.3-7+squeeze15
    Expires: Thu, 19 Nov 1981 08:52:00 GMT
    Cache-Control: no-store, no-cache, must-revalidate, post-check=0, pre-check=0
    Pragma: no-cache
    Vary: Accept-Encoding
    Content-Encoding: gzip
    Content-Length: 3005
    Keep-Alive: timeout=15, max=100
    Connection: Keep-Alive
    Content-Type: text/html

    ... binary content of GZIPed HTML file ...
HTTP
• Typical Web Page Loading
  ◦ TCP channel is established between the client (browser) and the web server
  ◦ The HTML document is requested and transferred first
  ◦ CSS styles, images, scripts, … referenced by the document are requested next (pipelining)
Web Server
• Serving Static Pages
  ◦ HTTP request from the client (via the Internet)
      GET /myweb/index.html ...
  ◦ The web server maps the URL to a file according to the Apache configuration
      /var/www/myweb/index.html
  ◦ HTTP response
      HTTP/1.1 200 OK
      Content-Length: 1019
      Content-Type: text/html; ...

      <contents of index.html>
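The mapping from a request target to a file under the document root can be sketched as follows. This is not how Apache implements it; `resolve_static_path` is a hypothetical helper, and the `index.html` default for directory requests is an assumption of this example. The point is that `..` segments must not escape the document root.

```python
import posixpath
from urllib.parse import unquote, urlsplit


def resolve_static_path(doc_root: str, request_target: str) -> str:
    """Map a request target to a file path that stays inside doc_root."""
    # Drop the query string and decode %xx escapes.
    url_path = unquote(urlsplit(request_target).path)
    # Anchoring at "/" makes normpath discard any ".." that climbs above the root.
    normalized = posixpath.normpath("/" + url_path.lstrip("/"))
    # Assumed default document for directory requests (illustrative only).
    if url_path.endswith("/") or normalized == "/":
        normalized = posixpath.join(normalized, "index.html")
    return posixpath.join(doc_root, normalized.lstrip("/"))
```

With the document root `/var/www`, the request `GET /myweb/index.html` from the slide resolves to `/var/www/myweb/index.html`.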
Web Server
• Common Gateway Interface
  ◦ HTTP request from the client (via the Internet)
      GET /myweb/app.cgi ...
  ◦ The web server executes /var/www/myweb/app.cgi and communicates with it via stdin and stdout
  ◦ HTTP response
      HTTP/1.1 200 OK
      Content-Length: 2049
      Content-Type: text/html; ...

      <contents generated by cgi>
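A CGI program receives request metadata in environment variables (for example `QUERY_STRING`), the request body on stdin, and writes headers, an empty line and the body to stdout. The sketch below stands in for the `app.cgi` from the slide; its behaviour (greeting a `name` parameter) is entirely made up for illustration.

```python
import html
import os
import sys
from urllib.parse import parse_qs


def cgi_response(environ) -> str:
    """Produce the CGI output: headers, an empty line, then the body."""
    params = parse_qs(environ.get("QUERY_STRING", ""))
    name = params.get("name", ["world"])[0]  # hypothetical parameter
    # Escape user input before placing it into HTML.
    body = f"<html><body><h1>Hello, {html.escape(name)}!</h1></body></html>\n"
    return f"Content-Type: text/html; charset=utf-8\r\n\r\n{body}"


if __name__ == "__main__":
    # The web server captures stdout and turns it into the HTTP response.
    sys.stdout.write(cgi_response(os.environ))
```

The web server adds the status line and remaining headers itself; the script only has to state the content type.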
Web Server
• Integrating Scripting Modules
  ◦ HTTP request from the client (via the Internet)
      GET /myweb/index.php ...
  ◦ The web server passes /var/www/myweb/index.php to the mod_php module
  ◦ HTTP response
      HTTP/1.1 200 OK
      Content-Length: 1984
      Content-Type: text/html; ...

      <contents generated by php>
Web Servers
• Apache HTTP Server
  ◦ The most often used web server (~65%)
  ◦ Highly configurable, with modular architecture
• Microsoft IIS
  ◦ Deployed with Microsoft products
• Nginx
  ◦ Widely used in Russia
• Lighttpd
• Node.js – lightweight HTTP server
  ◦ A JavaScript runtime with an HTTP server package
Apache
• Configuration
  ◦ Global configuration (e.g., in /etc/apache2)
    ▪ General
    ▪ Modules
    ▪ Sites
  ◦ Local configuration
    ▪ In .htaccess files
    ▪ Per-directory, with nesting rules
HTTP Secure
• HTTPS
  ◦ Inserts an SSL/TLS layer between TCP and HTTP
  ◦ SSL/TLS provides transparent encryption
    ▪ Asymmetric cryptography authenticates the server and establishes keys; the data itself is encrypted symmetrically
  ◦ X.509 certificates are used
    ▪ A certificate carries the public key; the matching private key stays secret with its owner
    ▪ A certificate has additional info (e.g., a domain name)
    ▪ Every certificate must be signed by another certificate
      - By a certificate of a trustworthy authority
      - By itself (self-signed certificate)
    ▪ A certificate is verified before its key is used
    ▪ Usually only the server has a certificate
HTTP Secure
• The SSL/TLS Handshake
  ◦ TCP channel is established between the client (browser) and the web server
  ◦ The certificate (without the private key) is sent to the client
  ◦ The client verifies the certificate
  ◦ If the certificate is accepted, the client finishes the SSL/TLS handshake, using the public key to safely send data to the server
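From the client's point of view, certificate verification is mostly a matter of configuration. The sketch below uses Python's standard `ssl` module; `open_tls` is a hypothetical helper that is defined but not called here, since calling it requires network access.

```python
import socket
import ssl

# Default client settings: the server certificate must chain to a trusted
# authority and must match the requested host name.
context = ssl.create_default_context()


def open_tls(host: str, port: int = 443):
    """Connect, perform the TLS handshake, return protocol version and certificate."""
    with socket.create_connection((host, port)) as tcp:            # TCP channel
        # wrap_socket() runs the handshake; it raises
        # ssl.SSLCertVerificationError if the certificate is rejected.
        with context.wrap_socket(tcp, server_hostname=host) as tls:
            return tls.version(), tls.getpeercert()
```

A self-signed certificate fails this check unless it is added to the trusted set explicitly, which matches the signing rules described on the previous slide.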
HTTP Problematic Issues
• HTTP Expected Usage
  ◦ Downloading content from the server
  ◦ Uploading small amounts of data to the server
• The Most Problematic Issues
  ◦ Stateless communication
    ▪ Each request is treated without a context
  ◦ Client-initiated protocol
    ▪ Server cannot initiate a dialog (e.g., send updates)
  ◦ Non-persistent connections
    ▪ An HTTP connection is not maintained for long
Maintaining Status Info
• Solution
  ◦ Additional layer that maintains a session
  ◦ Session identification must be stored at both ends
• Session Support
  ◦ Cookies
    ▪ Text key-value pairs stored in the browser
    ▪ Associated with sites, transparently sent with each request
  ◦ Browser Storage
    ▪ JavaScript APIs sessionStorage and localStorage
  ◦ PHP Sessions API
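The idea of storing a session identifier at both ends can be sketched with Python's standard `http.cookies` module. Everything here is illustrative: the cookie name `SESSID`, the in-memory `SESSIONS` store and both helper functions are assumptions of this example, not the PHP Sessions API.

```python
import secrets
from http.cookies import SimpleCookie

# Server-side end: session data keyed by session id (in memory, illustrative).
SESSIONS: dict = {}


def start_session() -> str:
    """Create a session; return the value for a Set-Cookie response header."""
    session_id = secrets.token_hex(16)  # unguessable identifier
    SESSIONS[session_id] = {}
    cookie = SimpleCookie()
    cookie["SESSID"] = session_id       # hypothetical cookie name
    cookie["SESSID"]["path"] = "/"
    cookie["SESSID"]["httponly"] = True
    return cookie["SESSID"].OutputString()


def load_session(cookie_header: str):
    """Find the session named by the Cookie request header, or None."""
    morsel = SimpleCookie(cookie_header).get("SESSID")
    return SESSIONS.get(morsel.value) if morsel is not None else None
```

The browser stores the cookie and sends it back with each request, so the otherwise stateless server can find the matching session data again.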
Bidirectional Communication
• Comet
  ◦ Client starts an asynchronous HTTP request
  ◦ Server postpones the response if there is nothing to report
  ◦ After a timeout, an empty response is sent
    ▪ Client immediately issues a new request
  ◦ When a reportable event occurs, an event notification is sent
    ▪ Client processes the event and issues another request
  ◦ …
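The server side of one Comet (long-polling) request can be modelled with a blocking queue. This is a simulation without any real HTTP; `long_poll` is a hypothetical handler, and using status 204 for the empty response is an assumption of this sketch.

```python
import queue


def long_poll(events: "queue.Queue[str]", timeout: float) -> dict:
    """Handle one long-held request: wait for an event or time out."""
    try:
        # The response is postponed while there is nothing to report.
        event = events.get(timeout=timeout)
    except queue.Empty:
        # Timeout: send an empty response; the client will ask again.
        return {"status": 204, "body": None}
    # A reportable event occurred: notify the client.
    return {"status": 200, "body": event}
```

The client simply calls the handler in a loop, issuing a new request as soon as the previous one returns, whether it carried an event or not.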
WebSocket Protocol
• Extension of HTTP(S) Protocols
  ◦ Two-way communication
  ◦ Persistent connections
  ◦ Layered over a TCP or SSL/TLS connection
• Protocol Properties
  ◦ Defined in detail in RFC 6455
  ◦ Handshake is compatible with the HTTP handshake
  ◦ Simple message-based communication
    ▪ User can specify custom sub-protocols (i.e., the contents and semantics of the messages)
Bidirectional Communication
• WebSockets
  ◦ Client sends an HTTP “upgrade” request
  ◦ Server responds with 101 Switching Protocols
  ◦ WebSocket protocol replaces the HTTP protocol on the TCP channel
  ◦ A WebSocket message can be sent at any time by either party
  ◦ Messages on the client side are sent/processed by a script
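In the upgrade handshake, the server proves it understood the request by hashing the client's `Sec-WebSocket-Key` together with a fixed GUID from RFC 6455 and returning the result in `Sec-WebSocket-Accept`. A sketch of the server's part follows; the function names are made up for this example.

```python
import base64
import hashlib

# Fixed GUID defined by RFC 6455 for the opening handshake.
WS_GUID = "258EAFA5-E914-47DA-95CA-C5AB0DC85B11"


def websocket_accept(sec_websocket_key: str) -> str:
    """Compute the Sec-WebSocket-Accept value for a client's key."""
    digest = hashlib.sha1((sec_websocket_key + WS_GUID).encode("ascii")).digest()
    return base64.b64encode(digest).decode("ascii")


def upgrade_response(sec_websocket_key: str) -> str:
    """Build the '101 Switching Protocols' reply to an upgrade request."""
    return (
        "HTTP/1.1 101 Switching Protocols\r\n"
        "Upgrade: websocket\r\n"
        "Connection: Upgrade\r\n"
        f"Sec-WebSocket-Accept: {websocket_accept(sec_websocket_key)}\r\n"
        "\r\n"
    )
```

After this reply, both sides stop speaking HTTP on the connection and exchange WebSocket frames instead.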
WebRTC
• Web Real-Time Communication
  ◦ API for direct P2P communication between browsers
  ◦ Originally designed for audiovisual data (videophone)
  ◦ A signaling channel (AJAX, WS, …) is required for establishing the connection
  ◦ RTC data are then passed directly or via TURN servers
Discussion