DNS Dynamic Update Performance Study, 2014.10
- Slides: 25
The Purpose
• Dynamic update and zone transfer (XFR) are the key mechanisms for replicating and synchronizing zone data, so studying their performance limits helps estimate the efficiency of the whole DNS system.
• Provide operational practice for DNS operators.
• Suggest improvements to DNS standards and software implementations.
Test Method
• Data flow: primary master -> slave
• Generate a root zone file and record the initial SOA serial number s0.
• Record the current time t0 and start sending n update requests to the primary master without waiting for the server's acknowledgement of each one. Each request adds one new TLD, consisting of one NS record and its related glue record.
• At the same time, without waiting for the sending to finish, keep querying all three servers and record, for each server, the time t1 at which its SOA serial reaches s0 + n.
• For each server, the UPS (updates per second) is n / (t1 - t0).
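The measurement above can be sketched in a few lines of Python. This is only an illustration of the arithmetic and the polling loop, not the tool used in the study: `updates_per_second`, `wait_for_serial`, and the caller-supplied `get_serial` function (which would perform the real SOA query) are hypothetical names, and the serial comparison ignores 32-bit wraparound.

```python
import time
from typing import Callable, Dict, Iterable


def updates_per_second(n: int, t0: float, t1: float) -> float:
    """Return the update throughput: n updates applied between t0 and t1 (seconds)."""
    if t1 <= t0:
        raise ValueError("t1 must be later than t0")
    return n / (t1 - t0)


def wait_for_serial(
    servers: Iterable[str],
    get_serial: Callable[[str], int],
    target_serial: int,
    poll_interval: float = 0.01,
    timeout: float = 60.0,
    clock: Callable[[], float] = time.monotonic,
) -> Dict[str, float]:
    """Poll every server until its SOA serial reaches target_serial (s0 + n).

    Returns a mapping from server name to the clock reading at which the
    target serial was first observed. get_serial is a hypothetical callback
    supplied by the caller; it is assumed to return the server's current
    SOA serial. Serial wraparound is not handled in this sketch.
    """
    pending = set(servers)
    reached: Dict[str, float] = {}
    deadline = clock() + timeout
    while pending:
        for server in sorted(pending):
            if get_serial(server) >= target_serial:
                reached[server] = clock()
        pending.difference_update(reached)
        if pending:
            if clock() > deadline:
                raise TimeoutError(f"serial not reached on: {sorted(pending)}")
            time.sleep(poll_interval)
    return reached
```

With t0 taken from the same clock just before the first update is sent, the UPS of a server is `updates_per_second(n, t0, reached[server])`.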
Factors That May Affect Performance
• Zone size
• Query pressure on the slave node
• DNSSEC (it not only increases the zone size but also complicates the update process)
• Hard drive write performance
Test Environment
• Network topology
• Hardware configuration
• OS/DNS software
Network Topology
Hardware Configuration
Controller:
  OS: CentOS 6.4 x86_64
  CPU: Intel(R) Xeon(R) CPU E5-2403 v2 1.80 GHz
  Memory: DDR3 1333, 2 GB
  Hard drive: ST500DM002-1BD142, 7200 rpm, 16 MB cache
Primary master / master / slave:
  OS: CentOS 6.4 x86_64 / FreeBSD 10.0 x86_64
  CPU: Intel Xeon E3-1220 v2 3.1 GHz, 4 cores / 4 threads
  Memory: DDR3 1333 ECC, 32 GB
  Hard drive: ST500DM002-1BD142, 7200 rpm, 16 MB cache
DNS Software
• Primary master
  – BIND (9.9.5)
• Master
  – BIND (9.9.5)
• Slave
  – BIND (9.9.5)
  – NSD (3.2.18)
  – KNOT (1.5.1)
UPS vs TLD Count (without DNSSEC)
UPS vs TLD Count (with DNSSEC)
UPS vs QPS on Slave Node
Performance Analysis
• For the primary master, the update procedure is:
  – Generate the difference (update validation)
  – Apply the diff to the in-memory DB
  – Write to the journal file
  – Mark the zone as dirty and later synchronize the in-memory data with the zone file
  – Notify the other name servers
• The bottleneck is the hard drive write
  – To make every modification persistent, BIND makes sure the journal file is written to disk, using fsync.
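To get a feel for why a per-update fsync can dominate, the following minimal Python sketch appends records to a journal-like file with and without forcing each record to disk. It is not BIND's journal code; `append_records` and the record format are made up for illustration, and the measured rates depend entirely on the storage device and file system.

```python
import os
import tempfile
import time


def append_records(path: str, records: list, durable: bool) -> float:
    """Append records to a journal-like file and return the elapsed seconds.

    With durable=True, every record is flushed and fsync'ed before the next
    one is written, so each update waits for the storage device.
    """
    start = time.perf_counter()
    with open(path, "ab") as journal:
        for record in records:
            journal.write(record)
            if durable:
                journal.flush()             # push Python's buffer to the OS
                os.fsync(journal.fileno())  # ask the OS to commit the data to disk
    return time.perf_counter() - start


if __name__ == "__main__":
    # Hypothetical workload: 200 small records standing in for update diffs.
    records = [b"update %d\n" % i for i in range(200)]
    with tempfile.TemporaryDirectory() as tmp:
        for durable in (False, True):
            path = os.path.join(tmp, f"journal-{durable}.bin")
            elapsed = append_records(path, records, durable)
            print(f"durable={durable}: {len(records) / elapsed:.0f} records/s")
```

Because the writes for one zone happen one after another, the durable rate is roughly bounded by how many synchronous writes per second the device can complete.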
Is it better with an SSD?
Hardware Configuration
Primary master (mac pro):
  OS: OS X 10.9.5
  CPU: 2.4 GHz Intel Core i5
  Memory: 8 GB 1600 MHz DDR3
  Hard drive: APPLE SSD SD0256F Media
Slave (mac air):
  OS: OS X 10.9.5
  CPU: 2.7 GHz Intel Core i5
  Memory: 4 GB 1600 MHz DDR3
  Hard drive: APPLE SSD SD0256F Media
UPS vs TLD Count (without DNSSEC)
UPS vs TLD Count (with DNSSEC)
UPS vs QPS (UDP/DO)
Persistent DB vs Memory DB
• Like the root server system, most distributed DNS systems store RRs in a relational DB and use DNS servers to provide query and zone synchronization services.
• BIND was modified to skip generating the journal file and synchronizing the zone file with the in-memory DB, in order to improve performance.
• The following test results are based on the first test environment, with the modified BIND running on the primary master.
UPS vs TLD Count (without DNSSEC)
UPS vs TLD Count (with DNSSEC)
UPS vs QPS (UDP/DO)
Conclusion
• Updates to a single zone are applied sequentially, so multiple cores do not help.
• Without the persistence guarantee, dynamic update itself is quite efficient.
• DNSSEC decreases performance by 50%.
• For each hierarchy level, performance drops by 20~30%.
• If memory is sufficient, zone size has little impact on update performance.
• UDP query pressure also has little impact, mainly because computation and file descriptor resources are sufficient.
• For the slave node under update pressure, if KNOT receives an IXFR covering more than 1024 serial number changes, it falls back to AXFR, which increases transfer time and zone file synchronization time. This is why it is slower than NSD at some points, and the larger the zone, the slower the transfer.
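The fallback described in the last bullet can be pictured as a simple decision on the serial gap. The sketch below is an assumption-laden illustration, not KNOT's implementation: `serial_distance` and `choose_transfer` are hypothetical names, the 1024 threshold is the figure quoted above, and the remote serial is assumed to be ahead of the local one.

```python
SERIAL_MODULUS = 2 ** 32  # SOA serials are 32-bit unsigned values


def serial_distance(local_serial: int, remote_serial: int) -> int:
    """Number of serial increments from local to remote, allowing wraparound.

    Assumes the remote serial is ahead of (or equal to) the local one.
    """
    return (remote_serial - local_serial) % SERIAL_MODULUS


def choose_transfer(local_serial: int, remote_serial: int,
                    max_ixfr_changes: int = 1024) -> str:
    """Pick a transfer type from the serial gap (hypothetical policy).

    Returns "none" when the zone is current, "IXFR" when the gap is within
    max_ixfr_changes, and "AXFR" when the gap exceeds it.
    """
    distance = serial_distance(local_serial, remote_serial)
    if distance == 0:
        return "none"
    if distance > max_ixfr_changes:
        return "AXFR"
    return "IXFR"
```

Under this reading, a slave that falls more than 1024 updates behind pays for a full zone transfer, whose cost grows with zone size, instead of an incremental one.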
What’s next
• The effect of hierarchy depth has been tested; hierarchy width is another important factor for performance and will be tested in the near future, when more resources are available.
• The tests were run on a LAN; behavior should differ when transfers cross a WAN.
Q & A