EDG Testbed Status: Moving to Testbed Two

Outline
• Current production status.
• Testbed at RAL.
• Testbed two.
• Changes from testbed one.
• LCFG -> LCFGng.
• Software by node type.
• Status of integration of testbed two.
28th April 2003, Steve Traylen, s.traylen@rl.ac.uk, PPD

Current Application TB Status
• Recommended testbed is still RH 6.2, EDG 1.4.9 with LCFG.
• Eight UK sites currently contribute to the total of seventeen in DataGrid.
• Not changed this year since the BD-II was introduced. Now that the information system is reliable, the RB is again the limiting factor.

Production Testbed at RAL
• 1 CE, 1 SE (350 GB), 10 WNs, 1 UI.
• Top edgapp GIIS now at RAL.
• 1 CE as a gatekeeper into the Tier 1A system.
  – In use by ATLAS, BaBar, LHCb and DZero.

Running Production Jobs
[Figure: running production jobs, one month to 27th April]

Development Testbed at RAL
• CE, SE, WN, MON, RLS, VOMS and LCFGng exist already.
• UI exists (gppui06.gridpp.rl.ac.uk).
• RB, HLR and PA to be installed this week.
• Updates happen twice a day on average.
• Improvements are now arriving faster than additions of software.

Testbed Two
• Lots more node types.
• Everything is incompatible with testbed one, including schema, Globus, GridFTP, …
• Still a lot of testing to be done. Loose cannons are not yet loose.
• GDMP vanishes, which is good for integration of software into farms.

LCFG -> LCFGng
• LCFGng is definitely an improvement.
  – NIS clients can be configured.
  – DMA can be turned on.
  – Using autofs is now the default.
  – Node profile updates happen immediately and reliably.
  – "Reboot for ever" does not happen.
  – PXE support built in from the start: http://gpplcfg.gridpp.rl.ac.uk/install.cgi
  – LCFGng is generally more complete.
• WP4's instructions are better and time proof.
• LCFGng has a web interface.

LCFG -> LCFGng (2)
• Each node and object reports back its status.
• http://gpplcfg.gridpp.rl.ac.uk/status/
• Middleware configuration is completed by the developers and is a lot more ‘intelligent’.
• EDG profiles are modular and clearer than before.
• Hardware support is still limited, e.g. RAID, SCSI and e1000 all require special-case kernels.
• Post-install notes will be supplied but are shorter, e.g. gridmapdir is mounted, pool account lock files created, site GIIS configured to accept registrations from SE.

Compute Element Node
• Now uses Maui 3.2.6 for scheduling.
  – Information providers claimed to support this.
• MPICH is installed.
  – Needed for WP1's support of MPI jobs.
• Globus v2.2.4 supplied from VDT 1.1.8.
  – VDT: Virtual Data Toolkit from iVDGL.
• Gatekeeper, MDS and GridFTP server.
• R-GMA client.
  – Publishing CE information via GIN.

Computing Element (2)
• GridFTP logs published into R-GMA.
• MSA, Monitoring Sensor Agent.
• Application software.
• DGAS client.
• One CE per site is required.

Worker Node
• Application software.
• Globus clients, GridFTP.
• RFIO clients.
• MSA, Monitoring Sensor Agent.
• VOMS, R-GMA, RLS, Reptor, Netcost and SE clients.
• At least one per site required.

Storage Element
• Globus, GridFTP and MDS.
• Information providers publishing via MDS and GIN, R-GMA.
• GridFTP logs published into R-GMA.
• Replica Location -> Site Replica mapping.
• SE (apache) and SE web service (tomcat).
• One per site, possibly one per media, e.g. CASTOR, Atlas Data Store, disk?
• MSA, Monitoring Sensor Agent.

User Interface
• Clients for:
  – DataGrid job submission.
  – Globus job submission.
  – R-GMA.
  – SE.
  – VOMS.
  – RLS, Reptor, Optor.
  – Network cost client.
• Access required by all users of DataGrid.

Monitor Box
• Two distinct functions.
• R-GMA servlets (tomcat).
  – CE, SE and application producers register here.
• The fmonServer collects information from all the MSAs.
  – Sensor data such as lm_sensors, load, uptime and network I/O is collected on the MON box.
• MySQL.
• One per site required, unless tomcat is moved elsewhere.

Replica Location Service Node
• Tomcat.
• MySQL.
• R-GMA client, publishes service status.
• Replica catalogue.
• Metadata catalogue.
• Replaces the current replica catalogue.
• One node per VO until VOMS is integrated.

Information Catalogue Node
• R-GMA registry servlets.
• Tomcat.
• MySQL.
• One node required per testbed.

Network Monitor
• Iperf – network bandwidth.
• PingER – round-trip time (RTT).
• UDPmon – UDP throughput.
• rTPL – a combination of the above.
• Netagent – network traffic from interface to router.
• All published via apache and perl CGI.
• One per site.

VOMS and MyProxy Node
• VOMS
  – Apache (mod_perl).
  – MySQL.
  – One per VO required.
  – VOMS will not be in TB2.
• MyProxy
  – MyProxy server.
  – At least one per testbed.

WP1 Nodes
• DGAS (DataGrid accounting service): 3 nodes.
• Deployment commences this week.
• RB (Resource Broker)
  – At least one per testbed.
• HLR (Home Location Register node)
  – Stores the accounts of users and resources.
  – One per testbed (or site?).
• PA (Price Authority node)
  – Assigns prices to resources.
  – One per VO.

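The per-site, per-VO and per-testbed requirements listed on the node-type slides can be tallied with a short script. The following is a minimal Python sketch, not part of the EDG software: the `NODE_SCOPE` table and the `minimum_nodes` function are hypothetical names, the scopes are read off the slides, and the example figures (3 sites, 2 VOs) are made up for illustration. The UI is left out because the slides give no count for it, and VOMS because it will not be in TB2.

```python
# Scope of each node type as stated on the slides (an illustrative reading).
# "site" = one per site, "vo" = one per VO, "testbed" = one per testbed.
NODE_SCOPE = {
    "CE": "site",
    "WN": "site",                       # "at least one per site"
    "SE": "site",                       # possibly one per storage medium
    "MON": "site",                      # unless tomcat is moved elsewhere
    "NetworkMonitor": "site",
    "RLS": "vo",                        # until VOMS is integrated
    "PA": "vo",
    "InformationCatalogue": "testbed",
    "MyProxy": "testbed",               # "at least one per testbed"
    "RB": "testbed",                    # "at least one per testbed"
    "HLR": "testbed",                   # slides leave open "(or site?)"
}


def minimum_nodes(sites, vos):
    """Return the minimum number of nodes of each type for a testbed.

    sites and vos are hypothetical inputs: the number of participating
    sites and the number of supported virtual organisations.
    """
    per_scope = {"site": sites, "vo": vos, "testbed": 1}
    return {node: per_scope[scope] for node, scope in NODE_SCOPE.items()}


if __name__ == "__main__":
    counts = minimum_nodes(sites=3, vos=2)
    for node, count in counts.items():
        print(f"{node}: {count}")
    print("total:", sum(counts.values()))
```

These are lower bounds only: several slides say "at least one", and the HLR scope is still an open question, so a real deployment may need more nodes than this tally suggests.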
Status of Integration
• Lots of parallel changes, lots of new software.
  – Impossible to follow.
• Globus job submission is working, with some magic.
• Fabric Management and Network Monitoring complete.
• Information system: R-GMA was working well, but the introduction of GLUE has required a re-release.
• RLS service has been shown to work.
• Integration of the SE and SE with RLS is not completed.

Status of Integration (2)
• WP1 software is being introduced this week, and its success is critical to a release date.
• Job management interfaces to almost everything, though this is the final component.
• May is expected to consist of continuous bug fixes once all software has been deployed.

Testbed Two in the UK
• For a UK independent grid, the UK will need to support the gridpp VO.
  – RB, HLR(?), PA, RLS, VOMS.
• Once the RB is in place (IC), it makes sense to move interested sites to testbed two.
• What happens next?
  – LCG1, CrossGrid and EDG boundaries become blurred.