VOLDEMORT: VOLatile Distribution of Electronic Media Over Rsync Transport
Steven Timm
LSCCW, May 22, 2003

Introduction:
* Disclaimer: any resemblance to characters in the Harry Potter books of J. K. Rowling is pure coincidence.
* Rsync is an open-source package that keeps directories on remote machines synchronized with each other.
* It is a common method at many installations for distributing volatile local files to machines that are already installed.
* It needs a set of supporting scripts to make it a useful tool.

VOLDEMORT overview
* Currently deployed on over 700 machines at Fermilab
* Works on RH Linux 6, 7, 9, Advanced Server, Sun, SGI
* Written in shell scripts and Perl
* Two major uses:
  - Keeping production computing farm installations up to date without reinstalling
  - Partitioning US/CMS computing dynamically

Major Goals of Voldemort
* Replace NIS with a system that puts passwd files locally on each node
* Have a unified structure to push new files out to existing nodes and install them on new nodes
* Have a single place where each volatile file is modified
* Keep the current capability to have special files for a single farm, subcluster, or node
* Production Farms plus US/CMS have at least 13 different hardware configurations

Components of Voldemort 0.6
* Voldemort-push: includes the rsync_push binary and scripts to clone slave servers. Installed on all servers.
* Voldemort: installed on all clients; includes pullrsync and a number of auxiliary scripts that are called by the pusher and puller.
* Tree file structure, set up in $VOLDEMORT_DIR/clusters
* Databases to describe the file structure
* Available as RPM or in Fermi ups/upd format

Features
* pullrsync is included in the 7.3 Farms workgroup post-install of Fermi Linux
* Scripts tested on all flavors of Linux
* Includes an option to sync out changes in the Fermi Linux comps file
* But not tied to Fermi Linux: it can and does work with other installation systems such as Rocks or System Imager

Why replace NIS?
* NIS was stable on the CDF farms (169 nodes, no timeouts for months), BUT:
* We had to have at least 64 NIS slave servers to accomplish this
* Pushing to all those slave servers is a network load in itself
* yppush doesn't handle a down node gracefully
* Initial configuration with ypinit -s is error-prone and can't be automated during install

Why replace NIS, cont'd
* A malformed map on one slave server can mess up several nodes
* NIS generates only a small amount of network traffic, but it is very sensitive to bigger network flows and is disrupted by them
* On our farms, we don't store any real passwords in NIS, and accounts change rarely
* This is an ideal situation for distributing files instead

Installer vs on-line changes
* Whenever we made a change to the farm, we had to make it in two places: on the nodes and in the installer. Often this has been forgotten.
* The method of making installer changes is not straightforward
* Need a system where any file that goes on the system is changed in only ONE place

Down nodes problem:
* Right now, if we put extra files on the system, we have to go back later and manually fix the nodes that were down.
* Need a system that will remember which nodes were down and keep retrying until it gets them all.
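
The "remember and retry" requirement above can be sketched in a few lines of Python. This is only an illustration of the idea, not how rsync_push is implemented: the function name `push_until_done`, the `push_one` callback, and the node names are all hypothetical.

```python
from typing import Callable, Iterable, Set


def push_until_done(nodes: Iterable[str],
                    push_one: Callable[[str], bool],
                    max_rounds: int = 3) -> Set[str]:
    """Push to every node, retrying failures; return nodes still pending."""
    pending = list(nodes)
    for _ in range(max_rounds):
        if not pending:
            break
        # Keep only the nodes whose push failed, so the next round retries them.
        pending = [node for node in pending if not push_one(node)]
    return set(pending)


# Hypothetical stand-in for a real push: fncdf002 is "down" on the first try.
attempts = {}


def fake_push(node: str) -> bool:
    attempts[node] = attempts.get(node, 0) + 1
    return not (node == "fncdf002" and attempts[node] < 2)


still_down = push_until_done(["fncdf001", "fncdf002"], fake_push)
print(sorted(still_down))
```

A real tool would presumably wait between rounds and keep the pending list on disk so that it survives a restart; both are left out here for brevity.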

Design goals of Voldemort
* Do not put any node-specific info into the Fermi Linux workgroup: we don't want our whole structure (or our account names, groups, or NFS servers) available to the world via anonymous FTP.
* Replace /export/linux/Workgroups/Farms/nodes with a new structure that is used both by online activities and the installer.
* Keep our current capacity to have node-specific, farm-specific, and subcluster-specific files

$VOLDEMORT_DIR/clusters/common/db/nodes.conf
* nodes.conf: database of nodes. Read by both rsync_push and pullrsync
* Example record: fncdf75:cdffarm1:Linux+2.4.18:i-acd:38400::N:2518 Reader
* Fields:
  - Node name
  - Cluster name
  - Flavor
  - Disk arrangement
  - Baud rate
  - APIC used in install
  - Node specific
  - Subclusters
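
As a rough illustration of the record layout, the Python sketch below splits one nodes.conf line into the eight fields listed on the slide. The class and attribute names are invented for this example, and it assumes that multiple subclusters are separated by whitespace in the last field, which the slide does not state.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class NodeRecord:
    # Field order follows the slide; the attribute names are illustrative.
    name: str
    cluster: str
    flavor: str
    disk_arrangement: str
    baud_rate: str
    apic: str
    node_specific: bool
    subclusters: List[str]


def parse_node_line(line: str) -> NodeRecord:
    """Split one colon-separated nodes.conf record into its eight fields."""
    fields = [field.strip() for field in line.strip().split(":")]
    if len(fields) != 8:
        raise ValueError(f"expected 8 fields, got {len(fields)}: {line!r}")
    name, cluster, flavor, disk, baud, apic, specific, subs = fields
    return NodeRecord(
        name=name,
        cluster=cluster,
        flavor=flavor,
        disk_arrangement=disk,
        baud_rate=baud,
        apic=apic,
        node_specific=(specific == "Y"),  # "N" is the default
        subclusters=subs.split(),  # assumption: whitespace-separated list
    )


record = parse_node_line(
    "fncdf75:cdffarm1:Linux+2.4.18:i-acd:38400::N:2518 Reader")
print(record.name, record.flavor, record.subclusters)
```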

$VOLDEMORT_DIR/clusters/common/db/files.conf
* files.conf: not fully populated yet
* Three fields:
  - Full pathname to the file (example: fnsfo/files/Linux+2.4.18/etc/passwd)
  - Files it depends on (example: common/templates/Linux+2.4.18/etc/passwd and fnsfo/templates/NULL/yppasswd)
  - Command used to make it (in this example, cat the two files above together)
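
The target / dependencies / command triple resembles a make rule. The Python sketch below shows one way such an entry could be evaluated for the passwd example: rebuild the target by concatenation when it is missing or older than a dependency. The on-disk syntax of files.conf is not shown on the slide, so nothing here parses it; the helper names and the file contents are made up for the demonstration.

```python
import os
import tempfile
from typing import List


def needs_rebuild(target: str, deps: List[str]) -> bool:
    """True if the target is missing or older than any of its dependencies."""
    if not os.path.exists(target):
        return True
    target_time = os.path.getmtime(target)
    return any(os.path.getmtime(dep) > target_time for dep in deps)


def build_by_concatenation(target: str, deps: List[str]) -> None:
    """Equivalent of 'cat dep1 dep2 > target'."""
    directory = os.path.dirname(target)
    if directory:
        os.makedirs(directory, exist_ok=True)
    with open(target, "wb") as out:
        for dep in deps:
            with open(dep, "rb") as src:
                out.write(src.read())


def demo() -> str:
    """Build the example passwd file inside a scratch directory."""
    with tempfile.TemporaryDirectory() as root:
        template = os.path.join(root, "common/templates/Linux+2.4.18/etc/passwd")
        local = os.path.join(root, "fnsfo/templates/NULL/yppasswd")
        target = os.path.join(root, "fnsfo/files/Linux+2.4.18/etc/passwd")
        # Made-up file contents, for illustration only.
        samples = ((template, "root:x:0:0::/root:/bin/sh\n"),
                   (local, "farmuser:x:500:500::/home/farmuser:/bin/sh\n"))
        for path, text in samples:
            os.makedirs(os.path.dirname(path), exist_ok=True)
            with open(path, "w") as handle:
                handle.write(text)
        if needs_rebuild(target, [template, local]):
            build_by_concatenation(target, [template, local])
        with open(target) as handle:
            return handle.read()


print(demo())
```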

$VOLDEMORT_DIR/clusters/fnpce/
* Prescripts: scripts that have to be executed before an RPM or file can be installed
* RPMS
* Files: single files that are pushed out to worker nodes
* Scripts: usually run only by the installer
* Tarballs: mainly for pushing out the /local/ups directory to worker nodes

$VOLDEMORT_DIR/clusters/fnpce/files
* Under each category, there is space for more than one flavor. Right now:
  - Linux+2.4.18 (731)
  - Linux+2.4 (711)
  - Linux+2.2 (612)
  - IRIX+6.5
* Can also define an arbitrary flavor "foo" as long as the database matches.

$VOLDEMORT_DIR/clusters/fnpce/files/Linux+2.4.18
* Each subdirectory of the files directory gets pushed out independently, governed by .pushdir files
* Four subdirectories (typically): /etc, /root, /usr/local, /var/adm
* Three types of files:
  - passwd, group, netgroup, auto.*, .k5login
  - Non-standard config files for RPMS in the Red Hat base
  - Hardware-specific or farm-specific files

$VOLDEMORT_DIR/clusters/fnpce/tarballs/Linux+2.4.18
* Currently one tarball
* Structure same as files (.pushdir governs)
  - /local/ups/localups.tar
* The tarball should be created so that it can be untarred in the directory it is pushed into.
* Had to add this option because pushing a ups/upd tree of 19K files (180 MB) was too slow.

$VOLDEMORT_DIR/clusters/fnpce/RPMS/Linux+2.4.18
* RPMS that go here are either farm-specific or hardware-specific.
* Anything for the whole farm should go into the Farms workgroup.

$VOLDEMORT_DIR/clusters/fnpce/[pre]scripts/Linux+2.4.18
* Scripts and prescripts are mainly executed during the install
* The installer calls /sbin/pullrsync -I, which forces running of all scripts
* Scripts should be smart enough to detect if the action has already been done
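
Because -I reruns every script, each script has to be safe to run more than once. The Python sketch below shows that check-before-act pattern for one common case, adding a line to a config file. The real scripts are written in shell and Perl; the helper `ensure_line` and the file name used here are hypothetical.

```python
import os
import tempfile


def ensure_line(path: str, line: str) -> bool:
    """Append `line` to `path` unless it is already there.

    Returns True if the file was changed, False if nothing needed doing.
    """
    content = ""
    if os.path.exists(path):
        with open(path) as handle:
            content = handle.read()
    if line in content.splitlines():
        return False  # the action was already done on an earlier run
    with open(path, "a") as handle:
        if content and not content.endswith("\n"):
            handle.write("\n")  # avoid gluing onto an unterminated last line
        handle.write(line + "\n")
    return True


with tempfile.TemporaryDirectory() as scratch:
    config = os.path.join(scratch, "example.conf")  # hypothetical file
    first_run = ensure_line(config, "example-setting=on")
    second_run = ensure_line(config, "example-setting=on")
    with open(config) as handle:
        final_lines = handle.read().splitlines()

print(first_run, second_run, final_lines)
```

Running the helper twice leaves the file with a single copy of the line, which is the property a forced rerun of all scripts relies on.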

Subclusters
* Subclusters can exist in any of five categories: files, tarballs, RPMS, scripts, prescripts
* Subcluster membership is determined by the database
* Convention: all hardware-specific files (ethernet, lm_sensors) go into a subcluster named after the motherboard type
* A node can be in more than one subcluster
* For files and tarballs, a .pushdir at the top level.

Node-specific files
* Can also have files specific to a single node
* Enabled by setting the node-specific field in the database to "Y" instead of the default "N"

rsync_push
* rsync_push reads through the database and pushes to every node that matches the command-line options it was called with.
* *IMPORTANT* Default is to push to everything! There is now an are-you-sure option that warns you what you are pushing.
* rsync_push -r lets you retry nodes that didn't push successfully the first time.
* Default transport is kerberized rsh. Others can be used as well.
* To push to a node, the host principal of the server must be in /root/.k5login of the client node.
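
For orientation, this Python sketch assembles the kind of rsync command line a push over an alternative remote shell involves. It only builds and prints the argument list; it does not reproduce what rsync_push actually runs. The options used (`-a`, `-e`, `--dry-run`) are standard rsync options, while the source path, the destination, and the use of the root account on the client are assumptions for the example.

```python
import shlex
from typing import List


def build_rsync_command(source_dir: str, node: str, dest_dir: str,
                        rsh: str = "/usr/krb5/bin/rsh",
                        dry_run: bool = False) -> List[str]:
    """Return an rsync argument list that copies source_dir to node:dest_dir."""
    command = ["rsync", "-a", "-e", rsh]  # -e selects the remote shell
    if dry_run:
        command.append("--dry-run")  # show what would change, copy nothing
    # A trailing slash makes rsync copy the directory's contents,
    # not the directory itself.
    command.append(source_dir.rstrip("/") + "/")
    # Assumption: the push logs in as root on the client node.
    command.append(f"root@{node}:{dest_dir}")
    return command


# Hypothetical source tree and target node.
example = build_rsync_command(
    "/voldemort/clusters/fnpce/files/Linux+2.4.18/etc",
    "fncdf75", "/etc", dry_run=True)
print(" ".join(shlex.quote(part) for part in example))
```

Swapping the `rsh` argument for another remote-shell path corresponds to choosing a different transport.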

rsync_push options 1
* -c  Push for a given cluster
* -f  Push for a given flavor
* -b  Push for a list of nodes
* -B  Push for a range of nodes
* -l <filename>  Push for all the nodes in <filename>
* If more than one is specified, we take the AND
* Example: rsync_push -c cdffarm -f Linux+2.4.18 -B fncdf "171 173-176"
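
The AND semantics of the selection options can be illustrated with a small Python sketch. It assumes that a range specification such as "171 173-176" expands to unpadded numbers appended to the node prefix; the function names and the sample node table are hypothetical and not taken from the rsync_push source.

```python
from typing import Dict, Iterable, List, Optional, Set


def expand_range(prefix: str, spec: str) -> Set[str]:
    """Expand a prefix plus a spec like '171 173-176' into node names."""
    names: Set[str] = set()
    for token in spec.split():
        if "-" in token:
            low, high = token.split("-", 1)
            numbers = range(int(low), int(high) + 1)  # inclusive range
        else:
            numbers = [int(token)]
        names.update(f"{prefix}{number}" for number in numbers)
    return names


def select_nodes(nodes: Iterable[Dict[str, str]],
                 cluster: Optional[str] = None,
                 flavor: Optional[str] = None,
                 names: Optional[Set[str]] = None) -> List[str]:
    """Return the nodes that satisfy every filter that was given (AND)."""
    selected = []
    for node in nodes:
        if cluster is not None and node["cluster"] != cluster:
            continue
        if flavor is not None and node["flavor"] != flavor:
            continue
        if names is not None and node["name"] not in names:
            continue
        selected.append(node["name"])
    return selected


# Hypothetical database contents.
database = [
    {"name": "fncdf171", "cluster": "cdffarm", "flavor": "Linux+2.4.18"},
    {"name": "fncdf172", "cluster": "cdffarm", "flavor": "Linux+2.4.18"},
    {"name": "fncdf174", "cluster": "cdffarm", "flavor": "Linux+2.4"},
    {"name": "fncdf175", "cluster": "cdffarm", "flavor": "Linux+2.4.18"},
]
wanted = expand_range("fncdf", "171 173-176")
chosen = select_nodes(database, cluster="cdffarm",
                      flavor="Linux+2.4.18", names=wanted)
print(chosen)
```

With no filters at all, `select_nodes` returns every node, mirroring the "default is to push to everything" behavior.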

rsync_push options 2
* -R  Don't push the RPMS
* -F  Don't push the files
* -S  Don't push the scripts
* -P  Don't push the prescripts
* -T  Don't push the tarballs
* -L  Don't push the Linux /etc/workgroup
* Default is to push everything
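
These flags are subtractive: everything is pushed unless a category is switched off. A minimal Python sketch of that logic follows. The flag letters come from the slide; the category labels, the function name, and the order of the result are illustrative and say nothing about the order in which rsync_push really pushes.

```python
from typing import Iterable, List

# Flag letters are from the slide; the category labels are illustrative.
SKIP_FLAGS = {
    "-R": "RPMS",
    "-F": "files",
    "-S": "scripts",
    "-P": "prescripts",
    "-T": "tarballs",
    "-L": "workgroup",
}


def categories_to_push(flags: Iterable[str]) -> List[str]:
    """Start from 'push everything' and drop each category that was skipped."""
    skipped = {SKIP_FLAGS[flag] for flag in flags if flag in SKIP_FLAGS}
    return [name for name in SKIP_FLAGS.values() if name not in skipped]


print(categories_to_push([]))
print(categories_to_push(["-R", "-T"]))
```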

rsync_push options 3
* -w  Specify the workgroup you are pushing (default is Farms)
* -e  Use an alternative rsh command instead of /usr/krb5/bin/rsh
* -q  Quiet: minimum or no output
* -v  Verbose: the more v's, the more verbose
* -i  Install mode: run new scripts and prescripts when they are pushed out
* -I  Install mode: run all scripts and prescripts when they are pushed out
* -C  Clear out the RPMS, scripts, and prescripts directory on the worker nodes

pullrsync
* Determines node ID and type either from a local config file or from a database read
* Runs only if the machine wasn't shut down cleanly, and during the install
* Options:
    -h  help
    -H <hostname>
    -c <cluster>
    -f <flavor>
    -M <module>
    -t <target server>
    -R -S -P -F -T -L -w -I -i -q -v  (as in rsync_push)

Future plans
* Version v0_6 is current; no known bugs right now.
* Next version needs a better and faster database
* Also need the ability to automatically distribute the push across slave servers
* Big task: integrating more closely with ROCKS and RH 9.0
* http://wwwoss.fnal.gov/scs/public/farms/doc/voldemort.html