Data-Plane Networking
Robert Graham @ErrataRob
http://blog.erratasec.com/
http://bit.ly/14Nvtc3
What this talk is about
1. an idea
2. tools/code using that idea
“The bridge's collapse had a lasting effect on science and engineering. Its failure also boosted research in the field of bridge aerodynamics-aeroelastics, the study of which has influenced the designs of all the world's great long-span bridges built since 1940.” -- Wikipedia
“An International Ice Patrol was set up to monitor the presence of icebergs in the North Atlantic, and maritime safety regulations were harmonised internationally through the International Convention for the Safety of Life at Sea; both measures are still in force today.”
Changed international building codes:
- planes now considered a threat
- better fireproofing
- better escape
The lesson
- Take disasters seriously
- When bad things happen, we rewrite the textbooks
We have yet to rewrite the textbooks
Case study: “Bob Loblaw Law Blog”
- Professional sets up blog
- VPS, Apache, mod_php, WordPress
- Seems to work
- Over time, more visitors from his profession, more posts
- Over time, people start complaining that his blog is “down”
How to fix this?
- Bob keeps trying to find another hoster, but the only solutions he can find are Apache, mod_php, WordPress
- Gives up, goes with CloudFlare
spambots
- “spiders” index your site (like Googlebot)
- “scrapers” steal your content/images
- “comment spam”: “I really like your article www.makemoneyathome.com”
Scale
There is a fix to scale
- Asynchronous event-driven design instead of synchronous threads
- Result: same performance regardless of the number of threads
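As a rough illustration of the event-driven idea, here is a minimal sketch in Python, assuming a toy "upper-case the request" service. The names `serve_events` and `demo` are made up for this example, and local socket pairs stand in for real network connections. One thread waits on all sockets at once and only touches the ones that are ready, instead of parking one thread per connection.

```python
import selectors
import socket


def serve_events(server_socks, expected):
    """Answer requests on many sockets from a single thread."""
    sel = selectors.DefaultSelector()
    for sock in server_socks:
        sock.setblocking(False)
        sel.register(sock, selectors.EVENT_READ)
    handled = 0
    while handled < expected:
        events = sel.select(timeout=1.0)
        if not events:
            break  # nothing became readable; stop instead of spinning
        for key, _mask in events:
            data = key.fileobj.recv(4096)
            if data:
                key.fileobj.send(data.upper())  # toy "business logic"
                handled += 1
    sel.close()
    return handled


def demo(n=50):
    """Open n local connections, send one request on each, serve them all."""
    pairs = [socket.socketpair() for _ in range(n)]
    for i, (client, _server) in enumerate(pairs):
        client.send(b"req%d" % i)
    handled = serve_events([server for _client, server in pairs], n)
    replies = [client.recv(4096) for client, _server in pairs]
    for client, server in pairs:
        client.close()
        server.close()
    return handled, replies
```

The thread count stays at one no matter how many connections are open, which is the property the slide is pointing at.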
Hackers have already solved this
- Scrapers/indexers/bots are already event-driven
- Bob’s site goes down because hackers do “scale” and Apache doesn’t
AMP - Apache, MySQL, PHP
Late 1990s
- Problem: getting prototype to work
- Solution: Apache, mod_php, MySQL
2013
- Problem: getting production/operations to work
- Solution: nginx, FastCGI
If your web server isn’t scalable, bots will frequently take it down
Apache + mod_php does not work in production... not designed to be exposed to the Internet
Data-Plane
- nginx, lighttpd, varnish
- hardware devices (IPS)
Control-Plane
- Apache, mod_php
- Databases
- “Business logic”
Data-Plane DNS Control-Plane DNS
My projects
- “robdns” - DNS server: https://github.com/robertdavidgraham/robdns
- “masscan” - port scanner: https://github.com/robertdavidgraham/masscan
The Internet: it grows
Number of hosts
Gartner, August 2013
- Smartphones now outsell dumbphones
- 225 million smartphones sold last quarter
- 40% growth since 2012
One phone
- HTC One
- IEEE 802.11ac
- 83 Mbps upload speed
- 50,000 packets/second
http://www.geek.com/apple/2013-time-capsule-802-11-ac-and-htc-one-combine-for-extreme-wifi-speed-1561071/
BIND: struggles at 50 kpps
- Authoritative server for .net zone
DNS -- essential infrastructure
- BIND: 448074
- dnsmasq: 188925
- nominum: 67971
- PowerDNS: 12645
- unbound: 6246
- NSD: 4717
http://blog.erratasec.com/2013/09/im-scanning-udp53-right-now.html
wat
Why?
- BIND is designed for the control-plane
- BIND is not designed to be exposed to the Internet (data-plane)
Let’s rewrite the textbook!
Today’s textbooks
Let the operating system do the heavy lifting:
- TCP/IP stack
- Memory management
- Synchronous threads
- Thread synchronization
Tomorrow’s textbooks
The “exo-kernel” that bypasses the kernel
- Direct access to network hardware... and your own network stack
- Huge pages and custom memory management
- Event-driven (one thread per core)
- Lock-free thread synchronization
Exo-kernel: Custom network
- I’m using PF_RING DNA at the moment: http://ntop.org
How it works
- Disconnects network hardware from kernel
- Connects network hardware to application
- Theoretical speed: 100 Gbps, 100 million packets per second
Receive Side Scaling
- Splits one physical adapter into many virtual adapters
- One virtual adapter per thread, so threads don’t share data
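To make the splitting concrete, here is a simplified sketch of how a flow might be mapped to a queue. It is an assumption-laden stand-in: real adapters typically compute a keyed Toeplitz hash in hardware, whereas this example uses CRC32 purely for illustration, and `rss_queue` is a hypothetical name. What it does show is that every packet of the same flow lands on the same queue, and so on the same thread.

```python
import ipaddress
import struct
import zlib


def rss_queue(src_ip, src_port, dst_ip, dst_port, num_queues):
    """Pick a receive queue for a flow (CRC32 stand-in for the NIC's hash)."""
    flow = struct.pack(
        "!4sH4sH",
        ipaddress.IPv4Address(src_ip).packed,
        src_port,
        ipaddress.IPv4Address(dst_ip).packed,
        dst_port,
    )
    return zlib.crc32(flow) % num_queues


# Every packet of one flow maps to the same queue, so one thread owns it.
first = rss_queue("192.0.2.1", 40000, "198.51.100.7", 53, 4)
again = rss_queue("192.0.2.1", 40000, "198.51.100.7", 53, 4)
assert first == again
```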
Exo-kernel: Custom TCP/IP
Packets are easy
- 14 bytes Ethernet
- 20 bytes IP
- 8 bytes UDP or 20 bytes TCP
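The header sizes above can be checked with fixed-layout structs. The following is a minimal sketch, assuming IPv4 and TCP headers without options; the helper names `build_udp_frame` and `parse_udp_frame` are invented for this example, and checksums are left at zero rather than computed.

```python
import struct

ETH_HDR = struct.Struct("!6s6sH")          # dst MAC, src MAC, EtherType
IPV4_HDR = struct.Struct("!BBHHHBBH4s4s")  # IPv4 header, no options
UDP_HDR = struct.Struct("!HHHH")           # src port, dst port, length, checksum
TCP_HDR = struct.Struct("!HHIIBBHHH")      # TCP header, no options

ETHERTYPE_IPV4 = 0x0800
PROTO_UDP = 17


def build_udp_frame(sport, dport, payload):
    """Build an Ethernet/IPv4/UDP frame (checksums left as zero)."""
    eth = ETH_HDR.pack(b"\x02" * 6, b"\x04" * 6, ETHERTYPE_IPV4)
    total_len = IPV4_HDR.size + UDP_HDR.size + len(payload)
    ip = IPV4_HDR.pack(0x45, 0, total_len, 0, 0, 64, PROTO_UDP, 0,
                       bytes([10, 0, 0, 1]), bytes([10, 0, 0, 2]))
    udp = UDP_HDR.pack(sport, dport, UDP_HDR.size + len(payload), 0)
    return eth + ip + udp + payload


def parse_udp_frame(frame):
    """Return (sport, dport, payload), or None if not IPv4/UDP."""
    _dst, _src, ethertype = ETH_HDR.unpack_from(frame, 0)
    if ethertype != ETHERTYPE_IPV4:
        return None
    ip = IPV4_HDR.unpack_from(frame, ETH_HDR.size)
    ihl = (ip[0] & 0x0F) * 4          # header length in bytes
    if ip[6] != PROTO_UDP:            # protocol field
        return None
    udp_off = ETH_HDR.size + ihl
    sport, dport, _length, _csum = UDP_HDR.unpack_from(frame, udp_off)
    return sport, dport, frame[udp_off + UDP_HDR.size:]
```

A UDP packet therefore carries 14 + 20 + 8 = 42 bytes of headers before the payload starts.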
masscan: 25 million packets/second https: //github. com/robertdavidgraham/masscan
robdns: 3 million pps per core
- 6 million requests per second per core (repeated queries)
- 3 million requests per second per core (random queries)
Test: .com zone
- ~200 million domains
- requires about 16 gigabytes of RAM
- requires 32 megabytes of page tables
- CPU has only 8 megabytes of cache
- Result: a random DNS request misses the cache twice (once for the page-table entry, once for the data)
Exo-kernel: “huge pages”
4 KB page size
- 32 MB of page tables for 16 GB of memory
- random access = 200 nanoseconds
2 MB page size
- 64 KB of page tables for 16 GB of memory
- random access = 100 nanoseconds
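The page-table figures follow from simple arithmetic. This sketch assumes 8-byte page-table entries (as on x86-64) and counts only the lowest level of the table; `page_table_bytes` is a name chosen for this example.

```python
PTE_BYTES = 8          # assumption: 8-byte page-table entries (x86-64)
KIB = 1 << 10
MIB = 1 << 20
GIB = 1 << 30


def page_table_bytes(memory_bytes, page_bytes):
    """Size of the last-level page table needed to map memory_bytes."""
    return (memory_bytes // page_bytes) * PTE_BYTES


small = page_table_bytes(16 * GIB, 4 * KIB)   # 4M entries * 8 bytes
huge = page_table_bytes(16 * GIB, 2 * MIB)    # 8192 entries * 8 bytes
print(small // MIB, "MB of page tables with 4 KB pages")
print(huge // KIB, "KB of page tables with 2 MB pages")
```

With 2 MB pages the table is small enough to stay in the CPU cache, which would explain why the random-access cost drops from two cache misses to one.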
Exo-kernel: thread synchronization
Old textbook
- “mutex” to avoid memory corruption
- “futex” is a faster mutex
New textbook
- Avoid data sharing among threads
- Pass messages with “rings”
- Read-copy-update (RCU)
Exo-kernel: avoid sharing
Example: statistics
- each thread maintains its own separate counters
- a separate thread sums them up and reports the total
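The statistics example can be sketched as follows. `ShardedCounter` and `demo` are hypothetical names, and this is an illustration of the pattern rather than code from masscan or robdns. Each worker writes only its own slot, so no lock is needed; a reporter just sums the slots. (In C one would probably also pad each slot to its own cache line to avoid false sharing, which Python cannot express.)

```python
import threading


class ShardedCounter:
    """One counter slot per thread; only the owner writes its slot."""

    def __init__(self, num_threads):
        self.slots = [0] * num_threads

    def add(self, thread_index, n=1):
        self.slots[thread_index] += n   # single writer per slot: no lock

    def total(self):
        return sum(self.slots)          # the reporter only reads


def demo(num_threads=4, per_thread=10000):
    counter = ShardedCounter(num_threads)

    def worker(index):
        for _ in range(per_thread):
            counter.add(index)

    threads = [threading.Thread(target=worker, args=(i,))
               for i in range(num_threads)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return counter.total()
```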
Exo-kernel: ring buffers
masscan
- Split scan into a transmit thread and a receive thread
- Use a “ring” to pass messages between the two
- Use RSS queues on network hardware
- 4 hyperthreaded cores: 4 pairs of transmit/receive threads
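A single-producer/single-consumer ring can be sketched like this. It is not masscan's implementation: `SpscRing` and `demo` are names invented here, and the sketch relies on CPython's interpreter lock for memory ordering, where C code would need atomics or memory barriers. The point is that the producer only ever writes the tail index and the consumer only ever writes the head index, so neither needs a mutex.

```python
import threading
import time


class SpscRing:
    """Fixed-size ring for exactly one producer and one consumer."""

    def __init__(self, capacity):
        if capacity < 2 or capacity & (capacity - 1):
            raise ValueError("capacity must be a power of two")
        self._slots = [None] * capacity
        self._mask = capacity - 1
        self._head = 0   # next slot to read; written only by the consumer
        self._tail = 0   # next slot to write; written only by the producer

    def try_push(self, item):
        if self._tail - self._head == len(self._slots):
            return False                      # full
        self._slots[self._tail & self._mask] = item
        self._tail += 1                       # publish after the slot is filled
        return True

    def try_pop(self):
        if self._head == self._tail:
            return False, None                # empty
        item = self._slots[self._head & self._mask]
        self._head += 1
        return True, item


def demo(count=1000, capacity=64):
    ring = SpscRing(capacity)
    received = []

    def producer():
        for i in range(count):
            while not ring.try_push(i):
                time.sleep(0)                 # ring full: yield and retry

    def consumer():
        while len(received) < count:
            ok, item = ring.try_pop()
            if ok:
                received.append(item)
            else:
                time.sleep(0)                 # ring empty: yield and retry

    threads = [threading.Thread(target=producer),
               threading.Thread(target=consumer)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return received
```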
robdns and RCU
- RSS splits incoming requests to one thread per CPU core
- these threads are “read only” on the database
- Updates use “read-copy-update”
- while other threads are using the old copy...
- ...update creates a new copy of the record
- when done, swaps it
- reclaims the old copy when other threads are done
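The read-copy-update steps above can be approximated as follows. This is a loose sketch, not robdns code: `RcuZone` is a hypothetical class, it copies the whole table instead of a single record for simplicity, and Python's garbage collector stands in for the "reclaim when readers are done" step that a C implementation has to track explicitly.

```python
import threading


class RcuZone:
    """Readers never lock; writers copy, modify, then swap in the new copy."""

    def __init__(self, records):
        self._current = dict(records)
        self._write_lock = threading.Lock()   # serializes writers only

    def snapshot(self):
        return self._current                  # one reference read, no lock

    def lookup(self, name):
        return self.snapshot().get(name)

    def update(self, name, value):
        with self._write_lock:
            new = dict(self._current)         # copy
            new[name] = value                 # update the copy
            self._current = new               # swap: publish the new version
        # The old dict is freed once the last reader drops its reference.


zone = RcuZone({"example.com": "192.0.2.1"})
old = zone.snapshot()                         # a reader in mid-lookup
zone.update("example.com", "192.0.2.2")
print(old["example.com"], zone.lookup("example.com"))
```

A reader that grabbed the old copy keeps seeing consistent (if slightly stale) data, while new lookups see the update.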
masscan/robdns results
- Masscan is 10x faster than other asynchronous scanners, e.g. “zmap” is 1.3 mpps, “masscan” is 25 mpps
- Robdns is faster than other DNS servers
- 100x faster than BIND
- 10x faster than NSD, yadifa, knotdns
but this is data-plane
- Nmap does “host-at-a-time” scanning
- Masscan does “port-at-a-time”
- Requires more work on backend to correlate results
- Robdns has no database
- Acts only as a slave
- Mirrors what’s on the master
- BIND 10 makes a good master
How DNS was designed in the 1980s
- robdns
- BIND 10
Conclusions
We need to rewrite textbooks
- Apache/mod_php should not be exposed to the Internet
- BIND should not be exposed to the Internet
- Exo-kernel programming to achieve scale
- Assume you’ll be connected to a public Internet
This is what your server is doing now