Intelligent Information Systems 1 Internet History Gio Wiederhold

Intelligent Information Systems
1. Internet History
Gio Wiederhold
EPFL, April-June 2000, at 14:15 - 15:15, room INJ 211
9/15/2020 EPFL 1 H - Gio spring 2000 1

Schedule for Seminar Course
• Presentations in English -- but I'll try to manage discussions in French and/or German.
• I plan to cover the material in an integrating fashion, drawing from concepts in databases, artificial intelligence, software engineering, and business principles.
  1. 13/4 Historical background, enabling technology: ARPA, Internet, DB, OO, AI, IR, XML.
  2. 27/4 Search engines and methods (recall, precision, overload, semantic problems).
  3. 4/5 Digital libraries, information resources. Value of services, copyright.
  4. 11/5 E-commerce. Client-servers. Portals. Payment mechanisms, dynamic pricing.
  5. 19/5 Mediated systems. Functions, interfaces, and standards. Intelligence in processing. Role of humans and automation, maintenance.
  6. 26/5 Software composition. Distribution of functions. Parallelism. [ww D. Beringer]
  7. 31/5 Application to Bioinformatics.
  8. 15/6 Semantic Interoperation. (Changed from original plan)
  9. 22/6 Privacy protection and security. Security mediation.
  10. 29/6 Educational challenges. Expected changes in teaching and learning. Summary and projection for the future.
• Feedback and comments are appreciated.

The origin: ARPAnet
• Motivation
  – Share expensive computing resources funded at 5 principal research sites by ARPA
  – services needed
    • TELNET -- remote execution control
    • FTP -- file transfer needed for TELNET
    • messaging for synchronization -- email - SMTP
  – requirements
    • handle heterogeneity
    • survivability

(D)ARPA
• (Defense)
  – internal motivation, >, < f(political climate)
• Advanced Research
  – not undertaken by industry (by need)
• Projects
  – limited time, intense support
• Agency
  – started 1958 - post Sputnik - rocket science
  – Information science started ~1967 IPTO

Technologies
• Platform, representation independence
  – character codes differ by host: ASCII (7 bits), BCD (6), EBCDIC (8), binary (any size)
• Packeting
  – limits buffer lengths, allows rerouting
• Dynamic path determination
  – nodes decide next best node -- now by DNServers
  – (versus other systems, initially)
    • Uunet required specifying all nodes
    • NASA network had direct connections
    • VMnet central directory
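The representation problem in the first bullet can be illustrated with a minimal Python sketch. It assumes Python's built-in `cp037` codec as a stand-in for EBCDIC; that is only one of several EBCDIC code pages, and the slides do not say which variant any site used.

```python
# Same text held in two host representations: 7-bit ASCII and 8-bit EBCDIC.
# Assumption: code page 037 stands in for "EBCDIC" here.
text = "ARPA"

ascii_bytes = text.encode("ascii")    # representation on an ASCII host
ebcdic_bytes = text.encode("cp037")   # representation on an EBCDIC host

# The byte values differ, so raw bytes cannot be exchanged unchanged.
print(ascii_bytes.hex(), ebcdic_bytes.hex())


def ebcdic_to_ascii(data: bytes) -> bytes:
    """Translate by decoding to characters, then re-encoding for the other host."""
    return data.decode("cp037").encode("ascii")


print(ebcdic_to_ascii(ebcdic_bytes) == ascii_bytes)
```

Going through a common character representation, rather than mapping bytes pairwise between every two hosts, is one way to keep the number of translations manageable as more machine types join.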

Growth - exponential
• 1969 - 5 nodes, 4 computing sites
• 1972 - ~12 nodes, 37 sites
• ~1976 - ad hoc gateways to other nets for email
• 1979 - many computer scientists have/need access
• 1981 - Stanford & Xerox router / gateway protocols
• 1985 - domain naming x.y.typ / Internet Protocol (4 segment) addresses
• 1991 - base for NREN, NSF backbone; except for x.y.mil
• 1992 - commercial domains permitted - ICANN established
• 1993 - 15 M users, 3 M paying
• 1994 - Digital Library initiative
• 1995 - fully commercial operation, research use by grants
• 1996 - NSF research initiative New Generation Internet
• 1999 - 2.2 M sites on Internet, 288 M public pages

Initial configuration
• BBN - development node - IMP [Bob Kahn]
  – Lockheed mini-computer
• SRI - documentation node - RFCs [Jon Postel]
  – DEC PDP-10
• UCLA - network science node [Leonard Kleinrock]
  – IBM 360
• UCSB - software node [Feldman, …]
  – SDS Sigma 7 (~ 360 architecture)
• Utah - graphics hardware node [Ivan Sutherland, Evans]
  – SDS 940
[Diagram: nodes BBN, SRI, SB, UT, LA]
Each node has multiple connections to other nodes
Nodes can serve more than one computing site

Packeting and IMPs
IMP (interface message processor) must deal with
• Limited memory
• Unreliable communications
• Long sessions & big files
TCP Transmission Control Protocol: Packeting and Packet Switching (used in AlohaNet, 1962 [Abramson])
• splits up messages, files into independently portable units: packets { from, to, number, size, data }
Each node reads header, makes forwarding decisions based on a table (can change dynamically)
[Diagram: nodes A, B, C, G joined by links b, c, g; message M travels from A to G as packets { A, G, 1, 256, ijkl } and { A, G, 2, 202, mno }]
Table at A: A here, B use b, C use c, D use b, E use c, F use c, G use b
Table at B: A use b, B here, C use b
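The packet layout and the table-driven forwarding described on this slide can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the IMP or TCP implementation: the names `Packet`, `packetize`, `reassemble` and `next_link` are hypothetical, sizes are counted in characters, and only the forwarding table for node A is taken from the slide.

```python
from dataclasses import dataclass


@dataclass
class Packet:
    src: str      # "from" in the slide's header
    dst: str      # "to"
    number: int   # position of this packet within the message
    size: int     # length of the data field (characters, in this sketch)
    data: str


def packetize(src, dst, message, max_size):
    """Split a message into independently forwardable packets."""
    chunks = [message[i:i + max_size] for i in range(0, len(message), max_size)]
    return [Packet(src, dst, n, len(c), c) for n, c in enumerate(chunks, start=1)]


def reassemble(packets):
    """Packets may arrive in any order; the number field restores the message."""
    return "".join(p.data for p in sorted(packets, key=lambda p: p.number))


# Forwarding table at node A, as listed on the slide: destination -> outgoing link.
TABLE_A = {"A": "here", "B": "b", "C": "c", "D": "b", "E": "c", "F": "c", "G": "b"}


def next_link(table, packet):
    """Read the header and look up the outgoing link for the destination."""
    return table[packet.dst]


packets = packetize("A", "G", "ijklmno", max_size=4)
print([(p.number, p.size, p.data) for p in packets])
print(next_link(TABLE_A, packets[0]))
print(reassemble(list(reversed(packets))))
```

Because each node consults only its own table, replacing an entry (say, pointing G at link c instead of b) reroutes later packets without the sender knowing, which is the dynamic behavior the slide refers to.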

Development
Informal, distributed over the user community
• Requests For Comment (RFCs): collected at SRI; adopted when they made sense to enough participants, as demonstrated by prototypes; can become standards
• RFCs available now at Network Solutions Inc
• allowed growth without a central authority
• Is that a generalizable principle?

More early participants
• IMPs could handle 4?? computer sites
  – (i.e., at SRI: SRI PDP-10, Stanford SAIL PDP-6, SUMEX)
• added Terminal Interface Processors (TIPs)
  – for terminals (AT&T TTY, DEC VT100, …) only
• More IMPs, TIPs, but restricted to ARPA contractors

Other networks, other technologies
• IBM VMnet internal, then external IBM customers
  – central naming authority
• NASA for sharing its satellite data processing
  – high bandwidth, mainly Telnetting
• UUnet for Unix users and ARPAnet sites
  – periodic forwarding, each message names all intermediate nodes
  – access to Europe (fast via SEISMO in Norway)

Email ?!
• Initially - at Telnet login show system status
  – local time, up/busy/down, special situations
• Add arbitrary messages
• As need for remote computing diminished, use of E-mail increased - new communication medium
Formalized 1982 by RFC 821 Simple Mail Transfer Protocol
• Serendipitous major social / research benefit
• Many Related Functions
  – bulletin boards ...

ETHERNET
Novel protocol for broadcast medium [Metcalfe, Boggs, Shoch]
• Also developed in Aloha net (Hawaiian islands)
• collision detection (CD) protocol
  – no synchronization, fully distributed
  – relatively long latency in space and on wire causes collisions -- crossed, mixed signals
  – listen while sending, when collision detected stop both!
  – resend with exponential backoff (wish humans would do that)
  – simple and stable to fairly high utilization
    • outperformed at high rate theoretically only
Used for local networks, with gateways to Internet
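The "resend with exponential backoff" step can be sketched as follows. The constants follow the commonly cited values for classic 10 Mbit/s Ethernet (51.2 µs slot time, window capped after 10 collisions, frame dropped after 16 attempts); the slides give no numbers, so treat them as assumptions. `channel_is_free` is a hypothetical callback standing in for carrier sense plus collision detection.

```python
import random

SLOT_TIME_US = 51.2   # assumed slot time (classic 10 Mbit/s Ethernet)
MAX_EXPONENT = 10     # assumed cap: window stops doubling after 10 collisions
MAX_ATTEMPTS = 16     # assumed limit: frame is dropped after 16 failed attempts


def backoff_slots(collisions, rng=random):
    """Pick a random wait, in slot times, after the given number of collisions."""
    k = min(collisions, MAX_EXPONENT)
    return rng.randrange(2 ** k)   # uniform over 0 .. 2**k - 1


def send(channel_is_free, rng=random):
    """Try to send one frame.

    Returns the total backoff time in microseconds, or None if the frame
    is dropped after MAX_ATTEMPTS collisions.
    """
    waited = 0.0
    for attempt in range(1, MAX_ATTEMPTS + 1):
        if channel_is_free():
            return waited
        # Collision: both senders stop, then each waits a random time
        # drawn from a window that doubles with every further collision.
        waited += backoff_slots(attempt, rng) * SLOT_TIME_US
    return None
```

The doubling window is what keeps the scheme fully distributed: stations never coordinate, yet under heavier load they spread their retries over longer intervals on their own.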

INTERNET
Backbone providers
  – UUNet (1993), SPRINT, AT&T, MCI, GT&E, Worldcom + European PT&Ts
  – serve regional networks, large users (CNN, …)
• share resources by free peering at gateway nodes
• Regional subnets and ISPs distribute bandwidth to
  1. Consumers (#=n) may pay / are seen as having value
     • Metcalfe’s law: the value of a net is ~ n^2
  2. Smaller ISPs
All dealers oversubscribe: Sell more bandwidth than they buy
  – count on fractional use
  – don’t need to buy for intra-region / intra-ISP traffic
• Peripheral buffering services can reduce traffic further (MIT Akamai)
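The two quantitative ideas on this slide, Metcalfe's law and oversubscription, can be made concrete with a little arithmetic. The sketch below is only an illustration: counting distinct user pairs is one common way to motivate the ~ n^2 growth, and the function names and all bandwidth figures are hypothetical.

```python
def pair_count(n):
    """Distinct pairs among n users; grows as ~ n**2 / 2."""
    return n * (n - 1) // 2


def oversubscription_ratio(sold_mbps, bought_mbps):
    """How many times the purchased upstream capacity has been resold."""
    return sold_mbps / bought_mbps


def expected_upstream_load(sold_mbps, active_fraction, local_fraction):
    """Upstream demand when only part of the sold capacity is in use
    and part of the traffic never leaves the region or ISP."""
    return sold_mbps * active_fraction * (1 - local_fraction)


# Doubling the users roughly quadruples the number of possible pairs.
print(pair_count(10), pair_count(20))

# Hypothetical ISP: sells 1000 Mbit/s to customers, buys 100 Mbit/s upstream.
print(oversubscription_ratio(1000, 100))
print(expected_upstream_load(1000, active_fraction=0.1, local_fraction=0.2))
```

With these made-up figures the expected upstream load stays below the capacity bought, which is the bet every oversubscribing dealer is making.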

HTML
• HyperText Markup Language
  – sharing of physics preprints [Tim Berners-Lee @CERN]
  – markup = embedded format commands for layout
• Multi-part, multi-representation (text, figs) documents
• Markups per SGML + (hyper = external) links
  – SGML = Standard Generalized Markup Language, an IBM initiated standard for document markup
  – basic commands are size, font, color independent, to be interpreted by the publisher for report, book, manual, ...
Alternative to (a.o.)
• XEROX initiated Postscript (PS), Adobe PDF
  – exact bit-wise layout via executable script
• TeX markup detail (pretty math) [Knuth], LaTeX macros [Lamport]
  – generates device independent format (DVI), then PS
(also UNIX runoff, …)

Web
• Browsers for HTML
  – Mosaic [Andreessen, Bina at UIUC]
  – Netscape
  – …
• Search engines (Topic 2)

XML
Machine Processable!
• return to origin?
  – ARPAnet -- share heterogeneous machines
  – Email -- people-to-people
  – Digital Library -- people-to-machines
  – E-commerce (E2B) -- people-to-machines
    • client-server
  – Mediated -- people-to-services-to-machines
  – Business (B2B) -- machine-to-machine(s)
  – Business services -- machine-to-services-to-machines
  – Ubiquitous -- gadget-to-gadget
    • (embedded)
Future

Fin
Comments?
• what was new / what was old or boring?
• future emphasis
  – more technological detail?
  – more situational detail?
  – more extrapolation to the future