Apache Zookeeper By Mahadev Konar Wednesday, June 10, 2009 Santa Clara Marriott
What is ZooKeeper?
A highly available, scalable, distributed, configuration, consensus, group membership, leader election, naming, and coordination service
Why use ZooKeeper?
» Difficulty of implementing these kinds of services reliably
  › brittle in the presence of change
  › difficult to manage
  › different implementations lead to management complexity when the applications are deployed
What is ZooKeeper again?
» File API without partial reads/writes
» No renames
» Ordered updates and strong persistence guarantees
» Conditional updates (version)
» Watches for data changes
» Ephemeral nodes
» Generated file names
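The "generated file names" bullet refers to sequential znodes, where the service appends a monotonically increasing, zero-padded counter to the requested name. The naming scheme can be sketched as follows. `SequentialNamer` is a hypothetical in-memory stand-in, not part of any ZooKeeper binding, and the paths are made up for illustration; the 10-digit padding is assumed to match what the real service produces.

```python
from collections import defaultdict


class SequentialNamer:
    """Toy stand-in for the counter the service keeps per parent znode."""

    def __init__(self):
        # One counter per parent path; hypothetical, in-memory only.
        self._counters = defaultdict(int)

    def create_sequential(self, parent, prefix):
        # Append the parent's next counter value, zero-padded to 10 digits.
        n = self._counters[parent]
        self._counters[parent] += 1
        return f"{parent}/{prefix}{n:010d}"


namer = SequentialNamer()
first = namer.create_sequential("/locks", "read-")
second = namer.create_sequential("/locks", "read-")
print(first)
print(second)
```

Because the suffix is zero-padded, sorting the names as strings gives creation order, which is what queue and lock recipes rely on.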
Any Guarantees?
1. Clients will never detect old data.
2. Clients will get notified of a change to data they are watching within a bounded period of time.
3. All requests from a client will be processed in order.
4. All results received by a client will be consistent with results received by all other clients.
Data Model
» Hierarchical namespace
» Each znode has data and children
» Data is read and written in its entirety
[Diagram: example znode tree rooted at /, with nodes services, YaView, servers, stupidname, morestupidity, locks, read-1, apps, users]
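A minimal sketch of this data model, assuming a plain dictionary keyed by absolute path. The helper `get_children` and the example paths are hypothetical and only illustrate that a znode can hold data and have children at the same time.

```python
def get_children(tree, path):
    """Return the sorted names of the direct children of `path`."""
    prefix = path.rstrip("/") + "/"
    children = []
    for p in tree:
        if p != path and p.startswith(prefix):
            rest = p[len(prefix):]
            if "/" not in rest:  # skip grandchildren and deeper
                children.append(rest)
    return sorted(children)


# Hypothetical paths; each value is the znode's whole data blob.
tree = {
    "/": b"",
    "/services": b"",
    "/services/web": b"host-a:8080",
    "/apps": b"config-v1",
    "/users": b"",
}

print(get_children(tree, "/"))
print(get_children(tree, "/services"))
print(tree["/apps"])  # data is always read in its entirety
```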
ZooKeeper API
    String create(path, data, acl, flags)
    void delete(path, expectedVersion)
    Stat setData(path, data, expectedVersion)
    (data, Stat) getData(path, watch)
    Stat exists(path, watch)
    String[] getChildren(path, watch)
    void sync(path)
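The `expectedVersion` argument is what makes updates conditional: a write succeeds only if the znode is still at the version the caller last read. A rough sketch of that compare-and-set behaviour, using a hypothetical in-memory `ToyStore` rather than a real client, and assuming the convention that a version of -1 matches any version:

```python
class BadVersionError(Exception):
    """Raised when expected_version does not match the stored version."""


class ToyStore:
    """Hypothetical in-memory stand-in for the service (no ACLs, no watches)."""

    def __init__(self):
        self._nodes = {}  # path -> (data, version)

    def create(self, path, data):
        if path in self._nodes:
            raise KeyError(f"node exists: {path}")
        self._nodes[path] = (data, 0)
        return path

    def get_data(self, path):
        return self._nodes[path]

    def set_data(self, path, data, expected_version):
        _, version = self._nodes[path]
        # -1 is assumed to mean "any version".
        if expected_version not in (-1, version):
            raise BadVersionError(f"expected {expected_version}, found {version}")
        self._nodes[path] = (data, version + 1)
        return version + 1


store = ToyStore()
store.create("/config", b"v1")
data, version = store.get_data("/config")
store.set_data("/config", b"v2", version)      # succeeds; version becomes 1
try:
    store.set_data("/config", b"v3", version)  # stale version 0 is rejected
    stale_write_rejected = False
except BadVersionError:
    stale_write_rejected = True
print(store.get_data("/config"), stale_write_rejected)
```

A client that gets the rejection would typically re-read the data and retry with the new version.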
ZooKeeper Service
[Diagram: clients connected to servers, one of which is the leader]
» All servers store a copy of the data (in memory)
» A leader is elected at startup
» Followers service clients, all updates go through leader
» Update responses are sent when a majority of servers have persisted the change
Use cases inside of Yahoo!
» Leader Election
» Group Membership
» Work Queues
» Configuration Management
» Cluster Management
» Load Balancing
» Sharding
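The "majority of servers" rule is simple arithmetic. The helper names below are hypothetical and only illustrate how the quorum size and the number of tolerated server failures follow from the ensemble size.

```python
def quorum_size(ensemble_size):
    """Smallest number of servers that forms a strict majority."""
    return ensemble_size // 2 + 1


def tolerated_failures(ensemble_size):
    """Servers that can be down while a majority is still reachable."""
    return ensemble_size - quorum_size(ensemble_size)


def is_committed(acks, ensemble_size):
    """An update can be acknowledged once a majority has persisted it."""
    return acks >= quorum_size(ensemble_size)


for n in (3, 5, 6):
    print(n, quorum_size(n), tolerated_failures(n))
```

Note that six servers tolerate no more failures than five, which is why odd ensemble sizes are the usual choice.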
Use of ZooKeeper in HBase
» Leader Election
  › Ensure there is at most 1 active master at any time
» Configuration Management
  › Store the bootstrap location
» Group Membership
  › Discover tablet servers and finalize tablet server death
» To Be Done:
  › Store the schema information
  › Store access control lists
Leader Election
1 getdata("/servers/leader", true)
2 if successful follow the leader described in the data and exit
3 create("/servers/leader", hostname, EPHEMERAL)
4 if successful lead and exit
5 goto step 1
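The five steps above can be exercised without a running ensemble. The sketch below uses a hypothetical in-memory `ToyCoordinator` in place of the service: it models ephemeral nodes by remembering which session created them, and it leaves out watches entirely, so a follower here only learns about a leader's death by calling `elect` again.

```python
class ToyCoordinator:
    """Hypothetical in-memory stand-in; tracks the owner session of each node."""

    def __init__(self):
        self._nodes = {}  # path -> (data, owner_session)

    def get_data(self, path):
        entry = self._nodes.get(path)
        return None if entry is None else entry[0]

    def create_ephemeral(self, path, data, session):
        if path in self._nodes:
            return False  # another client created it first
        self._nodes[path] = (data, session)
        return True

    def close_session(self, session):
        # Ephemeral nodes disappear with the session that created them.
        self._nodes = {p: e for p, e in self._nodes.items() if e[1] != session}


def elect(coord, hostname, session, path="/servers/leader"):
    """Return the hostname of the current leader (possibly our own)."""
    while True:
        leader = coord.get_data(path)                        # step 1
        if leader is not None:
            return leader                                    # step 2: follow
        if coord.create_ephemeral(path, hostname, session):  # step 3
            return hostname                                  # step 4: lead
        # step 5: lost the race, go back to step 1


coord = ToyCoordinator()
r1 = elect(coord, "host-a", session=1)  # no leader yet, host-a leads
r2 = elect(coord, "host-b", session=2)  # host-b follows host-a
coord.close_session(1)                  # host-a's session ends
r3 = elect(coord, "host-b", session=2)  # host-b takes over
print(r1, r2, r3)
```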
Leader Election in Perl
    use Net::ZooKeeper qw(:node_flags :acls);

    my $zkh = Net::ZooKeeper->new('localhost:7000');
    my $req_path = "/app/leader";
    my $stat = $zkh->stat();
    my $watch = $zkh->watch();
    # get() returns the node data, or undef if the node does not exist
    my $path = $zkh->get($req_path, 'stat' => $stat, 'watch' => $watch);
    if (defined $path) {
        # someone else is the leader
        # parse the string path that contains the leader address
    } else {
        $path = $zkh->create($req_path, "hostname:info",
                             'flags' => ZOO_EPHEMERAL,
                             'acl' => ZOO_OPEN_ACL_UNSAFE);
        if (defined $path) {
            # we are the leader, continue leading
        } else {
            $path = $zkh->get($req_path, 'stat' => $stat, 'watch' => $watch);
            # someone else is the leader
            # parse the string path that contains the leader address
        }
    }
Leader Election in Python
    import zookeeper

    handle = zookeeper.init("localhost:2181", my_connection_watcher, 10000, 0)
    (data, stat) = zookeeper.get(handle, "/app/leader", True)
    if stat is None:
        path = zookeeper.create(handle, "/app/leader", "hostname:info",
                                [ZOO_OPEN_ACL_UNSAFE], zookeeper.EPHEMERAL)
        if path is None:
            (data, stat) = zookeeper.get(handle, "/app/leader", True)
            # someone else is the leader
            # parse the string path that contains the leader address
        else:
            # we are the leader, continue leading
            pass
    else:
        # someone else is the leader
        # parse the string path that contains the leader address
        pass
Performance Numbers.
Where are we?
» Multi Tenant (Quotas, connection management, chroot)
» Recipes
  › Reusable code libraries
» Bindings
  › Java, C, Perl, Python, REST
Where are we?
» BookKeeper (a contrib project)
  › System to reliably log streams of records
  › Ongoing work to use BookKeeper optionally as edits log in the NameNode (HADOOP-5189)
  › Using BookKeeper and ZooKeeper for a pub system
What do we do next?
» WAN: more testing
  › Cross colo quorum
  › Client and server in different colos
» Usability: timeouts from ZooKeeper clients are a headache
» Partitioned ZooKeeper servers
» Performance enhancements
» Use longs and not ints
Q&A
» Questions?
» Links: http://hadoop.apache.org/zookeeper/