Networking Research: Moving from a Hardware Model to a Service Model
Henning Schulzrinne
Dept. of Computer Science, Columbia University
October 3, 2006

Overview
• Disclaimers
  – network systems research
  – grossly oversimplified
• Technology evolution
• Research – the big picture
• Networking research
• New research themes:
  – user-created content
  – programmability
  – TCO
Bell Labs

Lifecycle of technologies
• Traditional technology propagation: military → corporate → consumer
  – military: opex/capex doesn’t matter; expert support; “Can it be done?”
  – corporate: capex/opex sensitive, but amortized; expert support; “Can I afford it?”
  – consumer: capex sensitive; amateur; “Can my mother use it?”
• COTS (e.g., GPS)
• IM, digital photo

Evolution of VoIP
• 1996-2000: “amazing – the phone rings”
  – long-distance calling, ca. 1930
• 2000-2003: “does it do call transfer?”
  – catching up with the digital PBX
• 2004-: “how can I make it stop ringing?”
  – going beyond the black phone

IEEE Top 100 R&D Spenders (figure)

Does Research Pay?
• Benefits tend to accrue to everyone
  – despite patent protection
  – long-term industrial research seems to require a monopoly
    • Old Bell Labs, Microsoft
  – national research: workforce training
• But leadership requires research
  – vs. followers & copiers
• What is research?
  – Is Cisco doing Research? Is Google?
  – Is research anything that is published in a peer-reviewed journal?
  – Has research paid off for Microsoft?

Why Research?
• Need goal clarity
  – multiple goals ok, but they often differ between researcher and organization
• Science
  – new fundamental insights
• Engineering for the public good
  – e.g., open standards
• Support product engineering
  – new & better local products
• IPR development
  – capture value developed elsewhere
• Annual report gloss
  – should get funding from marketing

Science vs. Engineering
• Computer Science has identity crisis: applied math, experimental science or engineering?
• Applied math
  – general abstractions & elegant models
  – reality only a distant motivator
  – metric: can it be published in J Applied Probability?
• Experimental science
  – emphasis on general insights
  – measurements & models
  – often reflective: “analyze Gnutella structure”
  – point solutions
  – metric: does it fit Small World and is it self-similar? is it optimal?
• Engineering
  – emphasis on real-world impact
  – constrained by existing large systems
  – system solutions: needs to play nice with rest of the world
  – metrics: scalability, cost, maintainability, implementability
• Honesty about what we’re doing

Traditional research
• Inspired by physics or chemistry
• Physics: theory → experiment → lab bench prototype (semiconductor) → product
  – Communications: research → advanced development
• Necessary for hardware
• Dubious for software-intensive systems
  – rewrite several times (if not forgotten)
  – less qualified each time
  – BL example: Unix

Who’s the customer?
• Business units → carrier → consumer
• Goals may not be identical
  – BU: preserve investment, confirm earlier choices
    • ATM, SS7
  – Carrier: preserve product differentiation, business model, customer lock-in, monopoly rent, …
    • walled gardens, WAP, AAA, DRM, IMS, …
  – Consumer: fashion, functionality, cost
    • search engines, WiFi, MP3, Skype, web hosting, …
• Easier for some organizations
  – e.g., Google: direct customer is advertiser, but revenue driven by page views → consumer

Good ideas
• Myth: Good ideas will win
  – “Build a better mousetrap and the world will beat a path to your door.” (Ralph Waldo Emerson)
  – modern version: IEEE 802.11 will dig through IEEE Infocom proceedings to find your master paper
  – even most Sigcomm papers have had no (engineering) impact
• Myth: Just ahead of its time
  – it will take 10 years to have impact
  – reality: most papers either have immediate impact or none, ever
• Mediocre ideas with commitment win over brilliant ideas without
  – particularly if part of a larger system
  – cost of understanding ideas
  – possible encumbrances (patents)
  – researchers need to accompany their “children” through teenage years

Translation into Practice
• Relay model
  – research → advanced development → product
  – information loss rate of 95%?
  – lack of sense of ownership
  – hand-off: original owners have moved on to next project
• Google model
  – repeated, continuous refinement
  – public beta
  – no separate “research”
  – still has problems with polish & completion

Big Bet vs. 1000 Flowers
• Old research approach:
  – large bets, with huge, but rare pay-off
  – however, more often cancelled
  – nobody seems to be able to predict success early
  – NIH syndrome → “can’t be good”
• Another approach:
  – low-cost projects
  – more feasible for software
  – try it on customer (“beta”)
  – fail early and often
    • but threshold to success is lower
    • find alternate routes to success (e.g., spin off as open source project)

Standards Work
• Old approach
  – standards group goes to Geneva
  – Input: dinners
  – Output: PowerPoint
  – software groups convert finished standard into products (maybe)
• New approach
  – standards contributors directly develop (or supervise) libraries, prototypes and other tools
    • possibly in conjunction with academic research groups
  – early, pre-completion feedback
  – rapid early release possible → early implementation IPR
  – train development staff
  – participate in interop testing

Aside: Motivating Researchers
• Industry research attractors:
  – no fund-raising
  – better paid
  – larger, more mature research groups (not just grad students)
  – larger projects
    • not just 1-student NSF projects
  – impact on real products, not just papers
    • engineering motivator vs. science motivator
    • want to see real-world impact
• Danger of losing attractors
  – projects that get cancelled semi-randomly
  – legacy-support research
  – rapid, fashion-driven changes in wind direction
  – directional uncertainty (“why are we here?”)
  – discouragement of community building & maintenance

Some network technology crystal-ball gazing
• Killer applications
• Internet evolution
• Resource scarcity: bits → humans
• Programmability
• Reliability

Killer Application
• Carriers looking for killer application
  – justify huge infrastructure investment
  – “video conferencing” (*1950 – †2000)
  – ?
• “There is no killer application”
  – Network television blockbuster → YouTube hit
  – “Army of one”
  – Users create their own custom applications that are important to them
  – Little historical evidence that carriers (or equipment vendors) will find that application if it exists
• Killer app = application that kills the carrier

Service Providers
• Old-style service providers:
  – want to avoid being bit pipes only
    • but only really successful at that
  – major innovation: custom ring tones
• Boundary between service providers and software providers vanishing
  – “web 2.0”
  – software? service?
  – APIs

Internet and networks timeline (figure)
• Time axis: 1960, 1970, 1980, 1990, 2000, 2010
• Phases: theory, university prototypes, production use in research, commercial, early residential, broadband home
• Port speeds: 100 kb/s, 1 Mb/s, 10 Mb/s, 100 Mb/s, 1 Gb/s
• Internet protocols: email, ftp, DNS, RIP, UDP, TCP, SMTP, SNMP, finger, BGP, OSPF, Mbone, IPsec, HTTP, HTML, RTP, XML, OWL, SIP, Jabber
• Other labels: queuing, architecture, routing, cong. control, DQDB, ATM, QoS, VoD, p2p, ad-hoc, sensor

Networking research is fashion-driven (figure)
• Stages: workshop, white paper, DARPA/NSF $$, Nth EU framework, Sigcomm, Infocom, Mobicom, ICNP, networking courses, First (European) workshop on X, YAP on X, secondary conferences, trailing-edge research
• Topics: ATM, DQDB, QoS, active networks, mobile networks, wireless, ad-hoc, sensor

Cause of death for the next big thing (table)
• Technologies: QoS, multicast, mobile IP, active networks, IPsec, IPv6 (NAT)
• Causes of death:
  – not manageable across competing domains
  – not configurable by normal users (or apps writers)
  – no business model for ISPs
  – no initial gain
  – 80% solution in existing system
  – increase system vulnerability

Maturing network research
• Old questions:
  – Can we make X work over packet networks?
    • All major dedicated network applications (flight reservations, embedded systems, radio, TV, telephone, fax, messaging, …) are now available on IP
  – Can we get M/G/T bits to the end user?
  – Raw bits everywhere: “any media, anytime, anywhere”
• New questions:
  – Dependency on communications → Can we make the network reliable?
  – Can non-technical users use networks without becoming amateur sysadmins? → auto/zeroconfiguration, autonomous computing, self-healing networks, …
  – Can we prevent social and financial damage inflicted through networks (viruses, spam, DOS, identity theft, privacy violations, …)?

Impact of network research
• What’s promising/interesting – two different axes:
  – Intellectual merit → interesting analysis, broadly applicable, …
  – Satisfies practical needs → may not be a scientific breakthrough
• Field has few grand challenges and metrics
  – cf. speech understanding or face recognition
• Depends largely on external technology inputs
  – faster CPUs, better optical gear, compression
  – typical performance improvements in queueing: 20-50%
• Networking research impact
  – on deployed systems and protocols?
  – on understanding network behavior?
  – on other papers?
• Which of the 10,000 QoS papers had real impact?
• What papers were responsible for most important networking advances?
  – TCP, web?, email?

Recent network R&D successes
• Early networking: success = thousands of other researchers, 80% PhD
• Success now = millions of users, 1% PhD
  – iPod → ease of use
  – Blogs, Wiki, YouTube, Wikipedia → user-created content
    • ease of content creation for non-experts
  – PHP, Ruby-on-Rails
    • ease of development for non-experts
  – Skype
    • ease of configuration (none)
• Axiom: The chance of creating a successful new application is inversely proportional to the amount of formal network knowledge
  – HTTP/1.0 would flunk any network design exam

What’s fashionable (and not)
• Judging from Infocom submissions and NSF panels:
  – Security of any sort
  – Peer-to-peer networks
  – Sensor networks
  – Overlay networks
  – Network measurements
• Ideal paper
  – “Ad-hoc MIMO sensor network exploiting small-world phenomena in peer-to-peer overlays”
• What’s not:
  – QoS: scheduling, admission control, …
  – Active networks
  – Multicast

Infrastructure research questions: Scaling, Maintainability, Security, …
• Scaling
  – no major changes for 20+ years (link-state, DV, etc.)
  – two-layer (intra/inter) → other routing paradigms
• Maintainability
  – protocols and systems are not designed with fault diagnosis capabilities
  – e.g., “transparent” proxies, routing, DNS, hacked traceroute
• Security
  – secure routing protocols
  – DOS prevention (pushback, source discovery)

… and Reliability
• we don’t know precisely why network applications fail
  – components and backbones appear to be pretty reliable
  – but we measured 99.5% usable time, far below the 99.999% of telecom networks
  – lots of possible culprits, including DNS and carrier interconnects
• temporary overloads
• reduce operator errors
  – e.g., XCONF effort in IETF
  – inherently safe or fail-safe protocols?
• faster convergence in routing protocols
  – BGP up to 20-30 minutes!

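To put the two availability figures on this slide in more tangible terms, the sketch below converts an availability fraction into expected downtime per year. The function name and the 8760-hour year (leap years ignored) are assumptions made for illustration; only the 99.5% and 99.999% figures come from the slide.

```python
HOURS_PER_YEAR = 365 * 24  # 8760; assumption: leap years ignored


def downtime_hours_per_year(availability: float) -> float:
    """Expected unavailable hours per year for an availability fraction in [0, 1]."""
    if not 0.0 <= availability <= 1.0:
        raise ValueError("availability must be between 0 and 1")
    return (1.0 - availability) * HOURS_PER_YEAR


if __name__ == "__main__":
    # Figures from the slide: measured usable time vs. the telecom target.
    for label, availability in [("measured, 99.5%", 0.995),
                                ("telecom, 99.999%", 0.99999)]:
        hours = downtime_hours_per_year(availability)
        print(f"{label}: {hours:.2f} h/year ({hours * 60:.1f} min/year)")
```

Under these assumptions, 99.5% corresponds to roughly 43.8 hours of downtime per year, while 99.999% corresponds to a little over 5 minutes.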
Why do good ideas fail?
• Research: O(.), CPU overhead
  – “per-flow reservation (RSVP) doesn’t scale” → not the problem
  – at least now: routinely handle O(50,000) routing states
• Reality:
  – deployment costs of any new L3 technology are probably billions of $
  – coordination costs
    • The QoS problem is a lawyer problem, not an engineering problem
• Cost of failure:
  – conservative estimate (1 grad student year = 2 papers)
  – 10,000 QoS papers @ $20,000/paper → $200 million

          QoS      quality-of-service
  IEEE    10,377   12,876
  ACM      3,487    4,388

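The cost-of-failure estimate on this slide is simple arithmetic, sketched below. The three constants are the slide's own figures. The cost per grad student year is not stated on the slide; it is derived here from "2 papers per year at $20,000 per paper" and should be read as an implied value. The function names are made up for this sketch.

```python
PAPERS_PER_STUDENT_YEAR = 2   # slide: 1 grad student year = 2 papers
COST_PER_PAPER_USD = 20_000   # slide: $20,000/paper
QOS_PAPER_COUNT = 10_000      # slide: 10,000 QoS papers (rounded)


def total_cost_usd(papers: int, cost_per_paper: int) -> int:
    """Total spend if every paper costs the same amount."""
    return papers * cost_per_paper


def implied_student_year_cost_usd() -> int:
    """Implied (not stated) cost of one grad student year."""
    return PAPERS_PER_STUDENT_YEAR * COST_PER_PAPER_USD


if __name__ == "__main__":
    total = total_cost_usd(QOS_PAPER_COUNT, COST_PER_PAPER_USD)
    student_years = QOS_PAPER_COUNT // PAPERS_PER_STUDENT_YEAR
    print(f"total: ${total:,}")
    print(f"grad student years: {student_years:,}")
    print(f"implied cost per student year: ${implied_student_year_cost_usd():,}")
```

With these inputs the total is $200 million, spread over 5,000 grad student years.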
Resource Scarcity
• Old model:
  – scarce disk, memory, CPU, bandwidth
• New model:
  – disks at <$.50/GB, memory at <$150/GB
  – 22 million SMEs
  – 100 million households
  – system administrator: call center in India + teenager (or son/daughter)
• Missing:
  – automated installation
  – self-diagnosis
  – automated scaling across multiple servers
    • single-server web-app: trivial
    • 2-server web-app: circular master slave with custom file sync
  – automated backup and recovery

In more detail…
• Deployment problems
• Layer creep
• Simple and universal wins
• Scaling in human terms
• Cross-cutting concerns, e.g.,
  – CPU vs. human cycles
    • we optimize the $100 component, not the $100/hour labor
  – introspection
  – graceful upgrades
  – no policy magic

Transition in cost balance
• Total cost of ownership
  – Ethernet port cost → $10
  – about 80% of Columbia CS’s system support cost is staff cost
    • about $2500/person/year = 2 new PCs/year
    • much of the rest is backup & license for spam filters
• Does not count hours of employee or son/daughter time
• PC, Ethernet port and router cost seem to have reached plateau
  – just that the $10 now buys a 100 Mb/s port instead of 10 Mb/s

CRF 2007 budget precis
• toner, tapes; supplies (cables, tools, parts, …); misc. (shipping, books, …): 16,000; 8,000; 3,000
• software licenses: 22,000
• maintenance: 12,000
• cell phones, cable modems: 7,000
• hardware (servers): ~22,000
• file storage + backup, amortized: 20,000
• Staff (5), incl. fringe: 500,000
• Total: 618,000

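As a quick check of how staff cost dominates this budget, the sketch below computes the staff share and the cost per staff member from the staff and total figures on this slide. The helper name `share` is made up for this sketch, and the budget figures are assumed to be in US dollars.

```python
STAFF_COST_USD = 500_000    # slide: "Staff (5), incl. fringe"
STAFF_COUNT = 5             # slide: five staff members
TOTAL_BUDGET_USD = 618_000  # slide: budget total


def share(part: float, total: float) -> float:
    """Return part / total as a fraction; total must be positive."""
    if total <= 0:
        raise ValueError("total must be positive")
    return part / total


if __name__ == "__main__":
    print(f"staff share of budget: {share(STAFF_COST_USD, TOTAL_BUDGET_USD):.1%}")
    print(f"cost per staff member: ${STAFF_COST_USD / STAFF_COUNT:,.0f}")
```

The staff share comes out at about 81%, or $100,000 per staff member including fringe.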
User issues (guesses)
• Lack of trust
  – small mistakes → identity gone
  – waste time on spam, viruses, worms, spyware, …
• Lack of reliability
  – 99.5% instead of 99.999%
  – even IETF meeting can’t get reliable 802.11 connectivity
• Lack of symmetry
  – asymmetric bandwidth: ADSL
  – asymmetric addressing: NAT, firewalls → client(server) only, packet relaying via TURN or p2p
• Users as “Internet mechanics”
  – why does a user need to know whether to use IMAP or POP?
  – navigate circle of blame

Technical infrastructure issues
• Multi-homing and mobility
  – address vs. locator issues
• Large-scale Internet
  – secure routing
  – routing scaling (60,000 AS)
• Architecture
  – standardization delays now routinely 3-5 years for minor extensions
  – resistance to change at ≤ L4
  – difficulty in deploying new applications:
    • Internet service = outbound port 80 and 443

What has gone wrong?
• Familiar to anybody who has an old house…
• Entropy
  – as parts are added, complexity and interactions increase
• Changing assumptions
  – trust model: research colleagues → far more spammers and phishers than friends
    • AOL: 80% of email is spam
  – internationalization: internationalized domain names, email character sets
  – criticality: email research papers → transfers $B and dial “9-1-1”
  – economics: competing providers
    • “Internet does not route money” (Clark)
• Backfitting
  – had to backfit security, I18N, autoconfiguration, …
• Tear down the old house, gut interior or more wall paper?

The transformation of protocol stacks (figure)
• H. Zimmermann ca. 1980: application, presentation, session, transport, network, link, physical
• Internet ca. 1995 and Internet ca. 2005: application, SOAP, HTTP, TCP, IP, IP-in-IP, MPLS, PoE, 802.3, PoS, ATM

Simple wins (mostly)
• Examples:
  – Ethernet vs. all other L2 technologies
  – HTTP vs. HTTPng and all the other hypertext attempts
  – SMTP vs. X.400
  – SDP vs. SDPng
  – TLS vs. IPsec (simpler to re-use)
  – no QoS & MPLS vs. RSVP
  – DNS-SD (“Bonjour”) vs. SLP
  – SIP vs. H.323 (but conversely: SIP vs. Jabber, SIP vs. Asterisk)
  – the failure of almost all middleware
  – future: demise of 3G vs. plain SIP
• Efficiency is not important
  – BitTorrent, P2P searching, RSS, …

Customer Programmability
• Old model:
  – 1000s of 5ESS programmers in Naperville using one-of-a-kind language
  – successor model: JAIN, CAMEL
• New model:
  – carrier programmers are probably no smarter than best early-adopter customers
  – see Linksys WRT54G
  – Google maps mash-ups
  – Google/Yahoo libraries

Programmable Applications
• Old model:
  – mostly closed applications
  – sometimes SDKs, but highly complex
• New model:
  – every application designed for humans will be integrated into other applications
    • Microsoft started trend with Office SDKs and OLE, but limited
    • web APIs
    • XUL (web browsers)
  – applications as components

(My) guidelines for a new Internet
• Maintain success factors, such as
  – service transparency
  – low barrier to entry
  – narrow interfaces
• New guidelines
  – optimize human cycles, not CPU cycles
  – design for symmetry
  – security built-in, not bolted-on
  – everything can be mobile, including networks
  – sending me data is a privilege, not a right
  – reliability paramount
  – isolation of flows
• New possibilities:
  – another look at circuit switching?
  – knowledge and control (“signaling”) planes?
  – separate packet forwarding from control
  – better alignment of costs and benefit
  – better scaling for Internet-scale routing
  – more general services

Academic Collaboration
• Advantage: naïve
  – don’t know that something can’t be done and has never been done like that before
• “Speak truth to power”
• Not beholden to carrier business models
• Often exposed to more modern programming models
  – e.g., Eclipse vs. gcc
• Graduate student demographic is closer to future customers
  – digital natives vs. digital immigrants
• Can apply for NSF grants

Conclusion
• Clarity of goals and purpose
  – science, engineering for the greater good, annual report gloss or product development?
• Is networking research becoming like civil engineering: large, important infrastructure, but resistant to fundamental change?
• Challenges are in reliability and maintainability, rather than performance or packet-loss & jitter QoS
• As a community, need to learn more from our collective and individual mistakes…
  – Need series “The design mistakes in [(formerly) popular system or protocol]”