Computer Center Farms Status Update
Vladimír Bahyl, IT/FIO/FS
Vladimir.Bahyl@cern.ch
Overview
- Current state of the Linux farm
- Server cluster
- Configuration management
  - CDB and PanGUIn
- Key tools:
  - CCConfig.pm – unique interface
  - Notification – triggers actions on clients
  - SPMA – Software Package Management Agent
- Conclusion + Resources
Current State
- 2 major farms
  - LXPLUS – interactive – 100 nodes
  - LXBATCH – batch – 1100 nodes
- Several smaller, special ones with the same core setup
  - Tape / Disk / Web Application / DB servers / Stagers
- Installing CERN Red Hat Linux 7.3.2
  - PXE boot over the network
  - KickStart – templates generated from the configuration database
    • Common parts: cluster settings, selected packages
    • Node-specific parts: partition table
  - Post-installation script
    • distribution of additional packages (using SPMA)
    • configuration deployment (with SUE)
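The KickStart step combines common parts (cluster settings, selected packages) with node-specific parts (partition table). A minimal Python sketch of that merge follows. The skeleton, the field names (`cluster`, `packages`, `partitions`, `mount`, `size_mb`) and the sample values are assumptions made for illustration; the actual CERN templates are not shown in the slides.

```python
from string import Template

# Hypothetical, heavily simplified kickstart skeleton (assumption, not the CERN template).
KICKSTART_TEMPLATE = Template("""\
# cluster: $cluster
install
$partitions
%packages
$packages
""")


def render_kickstart(common, node):
    """Merge common (cluster-wide) and node-specific settings into one kickstart text."""
    partitions = "\n".join(
        f"part {p['mount']} --size {p['size_mb']}" for p in node["partitions"]
    )
    packages = "\n".join(common["packages"])
    return KICKSTART_TEMPLATE.substitute(
        cluster=common["cluster"], partitions=partitions, packages=packages
    )


# Common part: shared by every node of the cluster.
common = {"cluster": "lxbatch", "packages": ["passwd", "patch"]}
# Node-specific part: the partition table (hypothetical sizes).
node = {"partitions": [{"mount": "/", "size_mb": 4096},
                       {"mount": "swap", "size_mb": 1024}]}

print(render_kickstart(common, node))
```

Keeping the partition table as the only per-node input means one template can serve a whole cluster, which is the split the slide describes.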
Current Picture
Server cluster
- Serves internal infrastructure
  - Configuration database
  - Contains package repository for SPMA
  - Notification centre
  - Hub for system accounting statistics
  - Storage of GPG keys
    • Used for distribution of sensitive information
- Repositories are:
  - Synchronized
  - Backed up
  - Load balanced
- Redundant hardware
Configuration Management
- CDB – configuration database
  - European DataGrid – WP4 – Configuration task solution
  - Set of templates in the Pan language
    • Hardware descriptions
    • Package set definitions
    • Software component settings
    • Functional definitions (cluster name, load balancing alias)
  - Pan compiler produces XML
    • The XML is the result transferred to the clients
  - All files (templates, XML) stored internally in a CVS structure
CDB – Template example

    template hardware_SEIL;
    include hardware_base;
    "/hardware/CPUs" = list(create("cpu_GenuineIntel_Pentium_III_Coppermine_999"),
                            create("cpu_GenuineIntel_Pentium_III_Coppermine_999"));
    "/hardware/harddisks" = nlist("hda", create("harddisk_WDC_WD200BB00CLB0"));
    "/hardware/ram" = create("ram_512");
    "/hardware/cards/nic" = list(create("nic_Intel_82557_rev_8"));

    structure template cpu_GenuineIntel_Pentium_III_Coppermine_999;
    include cpu_base;
    "vendor" = "GenuineIntel";
    "model" = "Pentium_III_Coppermine";
    "speed" = "999";

    structure template cpu_base;
    "vendor" = undef;
    "model" = undef;
    "speed" = undef;
PanGUIn
- Graphical user interface to CDB
- Written in Java
- Using SOAP to access CDB
[Diagram labels: client (x3), Server Cluster, Pan templates repository, Pan compiler, XML repository, HTTP]
PanGUIn – screenshot
Key tools: CCConfig.pm
- Unique interface
- Provides local (on the client) access to the configuration information
- Hides configuration data sources
- All scripts use it

    CCConfig::Cluster( [<machine>] )
    CCConfig::Type( [<machine>] )
    CCConfig::Contract( [<machine>] )
    CCConfig::CCDBName( [<machine>] )
    CCConfig::OpSys( [<machine>] )
    CCConfig::LSFCluster( [<machine>] )
    CCConfig::SUEVersion( [<machine>] )
    CCConfig::IPaddress( [<machine>] )
    CCConfig::HWaddress( [<machine>] )
    CCConfig::Location( [<machine>] )
    CCConfig::Gateway( [<machine>] )
    CCConfig::Netmask( [<machine>] )
    CCConfig::TimeServer( [<machine>] )
    CCConfig::NameServer( [<machine>] )
    CCConfig::CPUs( [<machine>] )
    CCConfig::Memory( [<machine>] )
    CCConfig::HardDisks( [<machine>] )    returns hash
    CCConfig::KSPartitionTable( [<machine>] )
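The idea of one interface that hides the configuration data sources can be sketched as a small facade. This is a Python illustration of the pattern only, not the Perl module itself. The class `ConfigFacade`, its method names, the two data sources and the node names are all hypothetical.

```python
class ConfigFacade:
    """Single access point that hides where configuration values come from."""

    def __init__(self, sources, local_machine):
        # sources: ordered list of dicts {machine: {key: value}}; the first hit wins.
        self._sources = sources
        self._local = local_machine

    def _lookup(self, key, machine=None):
        # The machine argument is optional; default to the local node.
        machine = machine or self._local
        for source in self._sources:
            value = source.get(machine, {}).get(key)
            if value is not None:
                return value
        raise KeyError(f"{key} not found for {machine}")

    def cluster(self, machine=None):
        return self._lookup("cluster", machine)

    def hard_disks(self, machine=None):
        # Returns a mapping, mirroring the "returns hash" note on the slide.
        return self._lookup("harddisks", machine)


# Two hypothetical data sources: a local cache and a central database dump.
local_cache = {"lxplus001": {"cluster": "lxplus"}}
central_db = {"lxplus001": {"harddisks": {"hda": "WDC_WD200BB"}},
              "lxbatch042": {"cluster": "lxbatch"}}

config = ConfigFacade([local_cache, central_db], local_machine="lxplus001")
print(config.cluster())
print(config.cluster("lxbatch042"))
print(config.hard_disks())
```

Callers never see which source answered, so a data source can be replaced without touching the scripts that use the interface.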
Key tools: Notification
- Enables execution of a predefined action
- Daemon running on each node subscribes to a server for a given set of actions
- From a central node, when triggered, a special tag is sent to each client
- Tag symbolizes an action
[Diagram labels: server, clients, Repository; arrows: Subscribe, Notify, Execute, Prepare/Load; tags: SPMA update, Configuration update]
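The subscribe/notify flow can be sketched in a few lines of Python. This is an in-process illustration under stated assumptions: the class names `NotificationServer` and `NodeDaemon`, the node names and the action body are hypothetical, and the slides do not describe the real network transport.

```python
from collections import defaultdict


class NotificationServer:
    """Keeps track of which clients subscribed to which tags."""

    def __init__(self):
        self._subscribers = defaultdict(list)

    def subscribe(self, tag, client):
        self._subscribers[tag].append(client)

    def notify(self, tag):
        # Send the tag to every subscribed client; return how many were reached.
        for client in self._subscribers[tag]:
            client.receive(tag)
        return len(self._subscribers[tag])


class NodeDaemon:
    """Per-node daemon that maps each tag to a predefined local action."""

    def __init__(self, name, actions):
        self.name = name
        self._actions = actions  # {tag: callable}
        self.log = []

    def register(self, server):
        # Subscribe for the set of actions this node knows how to run.
        for tag in self._actions:
            server.subscribe(tag, self)

    def receive(self, tag):
        # The tag stands for an action; run it and record the result.
        self.log.append(self._actions[tag]())


server = NotificationServer()
nodes = [NodeDaemon(f"node{i}", {"SPMAupdate": lambda: "ran SPMA update"})
         for i in range(3)]
for node in nodes:
    node.register(server)

reached = server.notify("SPMAupdate")
print(reached, nodes[0].log)
```

Because only a tag travels from the server, the set of things a client can be asked to do stays limited to the actions predefined on that client.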
Key tools: SPMA
- Software Package Management Agent
- European DataGrid – WP4 – Installation task product
- Takes over management of the RPM packages on the node right after the KickStart part
  - Finishes the installation of additional packages
  - Handles updates of all packages as well
- Set of packages specific to each service is defined in CDB
- Triggered by notification
- HTTP = transfer protocol
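One way to picture an agent that takes over package management is as a comparison between the package set defined centrally and the set installed on the node. The Python sketch below shows that comparison only. It is not the actual SPMA algorithm; the function name and the installed versions are hypothetical, and the slides do not state whether unlisted packages are removed.

```python
def plan_package_operations(desired, installed):
    """Compare desired and installed {name: version} maps.

    Returns three sorted lists: packages to install, packages to replace
    (version differs), and packages to remove (installed but not desired).
    """
    to_install = sorted(n for n in desired if n not in installed)
    to_replace = sorted(n for n in desired
                        if n in installed and installed[n] != desired[n])
    to_remove = sorted(n for n in installed if n not in desired)
    return to_install, to_replace, to_remove


# Desired set: as it might be defined centrally (versions taken from the slides).
desired = {"passwd": "0.67-1", "patch": "2.5.4-12", "pax": "3.0-1"}
# Installed set: hypothetical state of a node.
installed = {"passwd": "0.67-1", "patch": "2.5.4-10", "telnet": "0.17-20"}

print(plan_package_operations(desired, installed))
# (['pax'], ['patch'], ['telnet'])
```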
Software Templates

Core definition files
• Using functions

    "/software/packages"=pkg_add("parted-devel", "1.4.24-3", "i386");
    "/software/packages"=pkg_add("passwd", "0.67-1", "i386");
    "/software/packages"=pkg_add("patch", "2.5.4-12", "i386");
    "/software/packages"=pkg_add("patchutils", "0.2.11-2", "i386");
    "/software/packages"=pkg_add("pax", "3.0-1", "i386");
    "/software/packages"=pkg_add("pciutils", "2.1.9-2", "i386");

Combined files
• Release/CERN-CC/Interactive

    # CERN RH 7.3 release with security updates
    include pro_software_packages_cern_redhat7_3_release;
    # CERN CC packages RH73
    include pro_software_packages_cern_redhat7_3_cerncc_base;
    # Interactive packages – LXPLUS specific
    include pro_software_packages_cern_redhat7_3_interactive;

Cluster specific XML – used by SPMA

    <nlist name="passwd" derivation="pro_decl_funct" type="table">
      <nlist name="_30_2e67_2d1" derivation="pro_decl_funct, pro_sw_pkgs_cern_rh7_3" type="record">
        <string name="arch" derivation="pro_decl_funct, pro_sw_pkgs_cern_rh7_3">i386</string>
        <string name="rep" derivation="pro_decl_funct, pro_sw_pkgs_cern_rh7_3, rep_cern_cc_i386_rh73">CERN_CC</string>
      </nlist>
    </nlist>
    <nlist name="patch" derivation="pro_decl_funct" type="table">
      <nlist name="_32_2e5_2e4_2d12" derivation="pro_decl_funct, pro_sw_pkgs_cern_rh7_3" type="record">
        <string name="arch" derivation="pro_decl_funct, pro_sw_pkgs_cern_rh7_3">i386</string>
        <string name="rep" derivation="pro_decl_funct, pro_sw_pkgs_cern_rh7_3, rep_cern_cc_i386_rh73">CERN_CC</string>
      </nlist>
    </nlist>
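The record names in the XML (`_30_2e67_2d1`, `_32_2e5_2e4_2d12`) look like escaped forms of the versions passed to `pkg_add` (`0.67-1`, `2.5.4-12`). The Python sketch below reads such a fragment back into a package list. The escaping rule (an underscore followed by two hex digits encodes one character) is an assumption inferred from these two examples only. The wrapping `packages` element is also an assumption, added to give the fragment a single root, and the `derivation` attributes are omitted for brevity.

```python
import re
import xml.etree.ElementTree as ET

# Simplified copy of the slide's fragment; the root element is an assumption.
PROFILE_XML = """
<nlist name="packages">
  <nlist name="passwd" type="table">
    <nlist name="_30_2e67_2d1" type="record">
      <string name="arch">i386</string>
      <string name="rep">CERN_CC</string>
    </nlist>
  </nlist>
  <nlist name="patch" type="table">
    <nlist name="_32_2e5_2e4_2d12" type="record">
      <string name="arch">i386</string>
      <string name="rep">CERN_CC</string>
    </nlist>
  </nlist>
</nlist>
"""


def unescape(name):
    # Assumed rule: "_" followed by two hex digits encodes one character.
    return re.sub(r"_([0-9a-fA-F]{2})",
                  lambda m: chr(int(m.group(1), 16)), name)


def read_packages(xml_text):
    """Return a list of (name, version, arch, repository) tuples."""
    root = ET.fromstring(xml_text)
    packages = []
    for table in root.findall("nlist"):          # one table per package name
        for record in table.findall("nlist"):    # one record per version
            fields = {s.get("name"): s.text for s in record.findall("string")}
            packages.append((table.get("name"), unescape(record.get("name")),
                             fields["arch"], fields["rep"]))
    return packages


print(read_packages(PROFILE_XML))
```

Under that assumed rule, the two records decode to `0.67-1` and `2.5.4-12`, matching the `pkg_add` lines for `passwd` and `patch`.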
Conclusion
[Diagram labels: Fault Mgmt System, Monitoring System, Node, Installation System, Configuration System]
Resources
- FIO/FS section
  - http://cern.ch/it-div-fio-is
- CDB
  - Piotr Poznanski
  - http://cern.ch/hep-proj-grid-fabric-config
- SPMA
  - German Cancio Melia
  - http://cern.ch/wp4-install
Thank you
- E-mail: Vladimir.Bahyl@cern.ch
- http://cern.ch/vlado