Panda-based Software Installation Tadashi Maeno (BNL)

Installation using Panda (1/3)
[Diagram. Labels: Panda server, Prod. Sys, Operator, installation job, pilot, Worker Nodes, site A, site B, $OSG_APP. Arrows: submit (https), job pull (https), install.]

Installation using Panda (2/3)
Ø Someone submits installation jobs to Panda through the usual HTTP I/F
  – The same I/F is used for production/analysis as well
    • Authentication
    • Scheduling (priority, retry, …)
Ø Pilots retrieve jobs
  – Each pilot knows which type of jobs it should retrieve
    • Production pilots run ATLAS TRF
    • Installation pilots run Installation TRF
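
The pilot-side job selection described above can be illustrated with a minimal sketch. It assumes a plain in-memory list of job dictionaries; the names `retrieve_job`, `TRF_BY_PILOT`, and the `"type"` key are hypothetical and are not part of the Panda API.

```python
# Hypothetical mapping from pilot type to the TRF it runs (names taken from the slide).
TRF_BY_PILOT = {
    "production": "ATLAS TRF",
    "installation": "Installation TRF",
}

def retrieve_job(queue, pilot_type):
    """Remove and return the first queued job matching the pilot type, or None."""
    for index, job in enumerate(queue):
        if job["type"] == pilot_type:
            return queue.pop(index)
    return None

if __name__ == "__main__":
    # Illustrative queue; real jobs live on the Panda server, not in a local list.
    queue = [
        {"id": 1, "type": "production"},
        {"id": 2, "type": "installation"},
    ]
    job = retrieve_job(queue, "installation")
    print(job["id"], TRF_BY_PILOT[job["type"]])
```

A production pilot polling the same queue would skip job 2 and receive job 1 instead, which is the separation the slide describes.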

Installation using Panda (3/3)
Ø Installation TRF
  1. Downloads pacman-latest.tgz from http://physics.bu.edu
  2. Sets up pacman
  3. Scans the destination dir to find setup.sh for the Athena runtime
  4. Installs ATLAS releases and/or production caches when setup.sh is missing
  5. Runs Kit-Validation
Ø Advantages
  – Automation
  – Scalability
  – Panda infrastructure such as monitoring
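
Steps 3 and 4 of the Installation TRF amount to an idempotency check: install only what has no setup.sh yet. The sketch below shows that check under one loud assumption, namely that each release lives in its own top-level subdirectory of the destination dir. That layout, the function names, and the `runtime` subdirectory are hypothetical; the real TRF may organise things differently.

```python
import os
import tempfile

def find_installed(dest_dir):
    """Return the top-level subdirectories of dest_dir that contain a setup.sh."""
    installed = set()
    for root, _dirs, files in os.walk(dest_dir):
        if "setup.sh" in files:
            relative = os.path.relpath(root, dest_dir)
            # Keep only the first path component (assumed to be the release name).
            installed.add(relative.split(os.sep)[0])
    return installed

def releases_to_install(dest_dir, requested):
    """Return the requested releases for which no setup.sh was found."""
    installed = find_installed(dest_dir)
    return [release for release in requested if release not in installed]

if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as dest:
        # Simulate an existing 12.0.6 installation (layout is an assumption).
        os.makedirs(os.path.join(dest, "12.0.6", "runtime"))
        open(os.path.join(dest, "12.0.6", "runtime", "setup.sh"), "w").close()
        print(releases_to_install(dest, ["12.0.6", "13.0.25"]))
```

Because already-installed releases are skipped, resubmitting the same installation job to a site is harmless, which fits with Panda's retry scheduling.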

How to submit jobs

    import userinterface.Client as Client
    from taskbuffer.JobSpec import JobSpec

    jobList = []
    for site in ['SLACXRD', 'AGLT2']:                # list of sites
        job = JobSpec()
        job.transformation = '…/installAtlasSW'      # TRF
        job.computingSite = site                     # site
        # release (-s) + package (-p) + caches (-c)
        job.jobParameters = "-s 12.0.6 -p 12.0.6 slc3+gcc -c AtlasProduction_12_0_6_3_i686_slc3_gcc323_opt,AtlasProduction_12_0_6_4_i686_slc3_gcc323_opt"
        …
        jobList.append(job)

    Client.submitJobs(jobList)                       # submit

Remaining Issues
Ø The installation pilot needs write permission on $OSG_APP
  – “Normal” pilots are mapped to usatlas1 because schedulers are running with the production role
  – Special scheduler running with the software role to map pilots to usatlas2?
  – gLExec?
Ø Integration with schedconfig DB
  – schedconfig records which releases are available at each site
  – An intelligent client is possible
    • Get a list of sites where a release is missing
    • Submit a batch of jobs to install the release
    • Update schedconfig when the installation succeeds
Ø Who is responsible for operations?
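
The "intelligent client" idea above can be sketched as follows. This is an illustration only: schedconfig is modelled as a plain dictionary mapping site name to a set of releases, and every function name here is hypothetical. No real schedconfig DB or Panda client call is used.

```python
def sites_missing_release(schedconfig, release):
    """Return the sites (sorted by name) that do not list the release."""
    return sorted(
        site for site, releases in schedconfig.items() if release not in releases
    )

def plan_installation(schedconfig, release):
    """Build one installation request per site that lacks the release."""
    return [
        {"computingSite": site, "release": release}
        for site in sites_missing_release(schedconfig, release)
    ]

def record_success(schedconfig, site, release):
    """Mark the release as available at the site after a successful installation."""
    schedconfig.setdefault(site, set()).add(release)

if __name__ == "__main__":
    # Stand-in for the schedconfig DB; contents are made up for the example.
    schedconfig = {
        "SLACXRD": {"12.0.6"},
        "AGLT2": {"12.0.6", "13.0.25"},
    }
    requests = plan_installation(schedconfig, "13.0.25")
    print(requests)
    for request in requests:
        # In a real client this would happen only after the job reports success.
        record_success(schedconfig, request["computingSite"], request["release"])
    print(plan_installation(schedconfig, "13.0.25"))
```

In a real client each request would be turned into a job and submitted through the same interface as any other Panda job, and schedconfig would be updated only once the job and its validation have succeeded.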

Test at SLAC
Ø Tried 13.0.25 as it is unused for production
Ø Required modifications at SLAC
  – Outbound HTTP connections
    • BU: to download pacman
    • CNAF: to download the KV cache
  – Gave temporary write permission on $OSG_APP/13.0.25 to Nurcan’s DN
Ø Submitted a job from BNL
  – Job=4531377
Ø Installation succeeded and KV passed
  – log

Conclusions
Ø Release installation using Panda is ready
Ø Successfully tested by installing 13.0.25 at SLAC
Ø A few issues remain
  – Permission of the installation pilot
  – Operator
  – Integration with schedconfig