DUNE Software and Computing News and Announcements
Tom Junk
DUNE Software and Computing General Meeting
February 2, 2016

New Web Sites
• dune-data.fnal.gov
  - Monte Carlo: Challenge 5.0 and future MC
    • MC samples and tiers
  - Data files from the 35-ton prototype
    • File list, automatically updated from the file transfer script
    • samweb usage tips, which tell you how to access files!
• dune-young.hep.net
  - Content copied from lbne-young.hep.net (still not up to date)
• lbne-dqm.fnal.gov
  - Online and nearline monitoring for 35-ton
02.16 Tom Junk | DUNE S&C General News

New Build Node dunebuild01.fnal.gov
• 16 cores! (AMD Opteron 6320)
• 32 GB of RAM, 5 GB of swap
• To be used for building code only (we'll watch for misuse)
• mrb i -j16 now gives you a big boost in speed
• dCache disks are not mounted
• /dune/data and /dune/data2, however, are still mounted.
• /build/<makeyourowndirectory> has 2.8 TB in it. Not clear how to use this effectively.
• Let Tom know if you need something different on it.
• 16 cores was chosen based on Lynn Garren's build speed test:
  https://indico.fnal.gov/contributionDisplay.py?contribId=9&confId=10257
  With builds using BlueArc (/dune/app), more than 16 cores gives diminishing returns in speed due to disk I/O bottlenecks.
• That, and the fact that machines with more than 16 cores are even less available than the one we got with 16 cores.

New Redmine Sites
• dunebsm: Exotic Physics with DUNE
• dunefgt: Fine-Grained Tracker
• dunelbl: Long-Baseline Physics WG
• dunendk: Nucleon Decay
• HighLAND: Analysis Tool
• WA105: Dual-Phase protoDUNE

CILogon Certificates
• Replacing OSG Grid certificates
  - DUNE VO user entries with OSG Grid certificates are now given entries for CILogon certificates.
• Current OSG Grid certificates remain valid until their expiration. No need to hurry and get a replacement CILogon certificate, but the next time it's refreshed there will be a new procedure.
• Eileen and Anne have contacted certificate users of the DocDBs and given instructions for obtaining and using CILogon certificates with the DocDBs.
• CILogon will replace KCA certificates too.
  - The jobsub client called kx509 to generate short-lived certificates using the user's Kerberos ticket.
  - Other uses, like SAM, required the user to execute kx509 or get-cert.sh (which calls kx509) to get a certificate.
  - Jobsub use of CILogon "to be transparent to the users"

AFS at Fermilab is being shut down Feb. 25, 2016
• Web sites at /afs/fnal.gov/files/expww are migrated to the NFS storage area /web/sites/. Available on FNALU and dunegpvm01 (but not other dunegpvms).
• Home areas in /afs/fnal.gov/home/room[1,2,3]/username are being replaced with other networked storage.
• I was never fond of our AFS home areas anyhow
  - Very small quotas in the home area: 500 MB (!)
  - The authentication token, which expires after 26 hours, has caused user confusion.
  - It has its own syntax for managing. Want to know your quota? fs lq.
  - Not available on grid workers (wouldn't want that anyhow for the replacement).
  - Backups in /afs/fnal.gov/files/backup/home
  - AFS@FNAL documentation (becoming irrelevant): https://computing.fnal.gov/unixatfermilab/html/afs.html

New Home Areas and Web sites
• Used to have personal "professional" web areas in ~/public_html/index.html, for example.
• Accessed via http://home.fnal.gov/~user/index.html
• Directory listings over http disabled without a special Service Desk request.
• Now there are NFS web areas dunegpvm*:/publicweb/<firstletter>/<youruserID>, where <firstletter> is the first letter of your user ID (= Kerberos principal).
• Backups in /publicweb/.snapshot in case you accidentally delete something.
• Home area snapshots and backups in the post-AFS era to be defined and documented.
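The /publicweb layout described on this slide can be sketched in a few lines of Python. The function name publicweb_dir and the sample user ID are hypothetical; only the /publicweb/<firstletter>/<youruserID> pattern comes from the slide.

```python
from pathlib import PurePosixPath


def publicweb_dir(user_id: str) -> PurePosixPath:
    """Build the NFS web area path for a user ID.

    Follows the /publicweb/<firstletter>/<youruserID> layout from the
    slide; the function itself is an illustrative helper, not a
    Fermilab tool.
    """
    if not user_id:
        raise ValueError("user ID must not be empty")
    return PurePosixPath("/publicweb") / user_id[0] / user_id


# "username" is a placeholder, not a real account.
print(publicweb_dir("username"))
```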

lbnegpvm*.fnal.gov → dunegpvm*.fnal.gov
• Users were in the lbne group. Active or recently active users were given new accounts in the dune group.
• New dunegpvm11 spun up with new group and new user list.
• No /lbne/data, /lbne/data2, /lbne/app mounts on the new dune machine. The same areas are mounted under /dune.
• Still have /pnfs/lbne mounted (needed as some files are accessible only that way). Same with /scratch/lbne.
• Current status: migrated lbnegpvm06 through lbnegpvm10 to dunegpvm machines. Gave back dunegpvm11. lbnegpvm01 through lbnegpvm05 (with dunegpvm convenience names) being converted as I write this. Finding missing things (like dCache mounts) and iterating with the Service Desk.

BlueArc Dismount on Grid Workers
• Affects us in particular!
  - /lbne/data, /lbne/data2 not mounted on dunegpvm6-10 machines, but still mounted on grid worker nodes.
  - /dune/data, /dune/data2 not mounted on grid worker nodes (!). These mount points were made after the decision to migrate away from BlueArc on the grid was taken.
  - Two ways to store your data:
    • ifdh cp it to dCache: /pnfs/dune/persistent/users and /pnfs/dune/scratch/users. Ask about tape-backed space! (We prefer SAM so the files won't get lost.)
    • ifdh cp the files to BlueArc (many people still do this). This too will be disabled! End of 2016 shutdown!
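As a minimal sketch of the first option, the helper below assembles an ifdh cp command line that targets one of the two dCache areas named on the slide. The per-user subdirectory under each area, the function name, and the sample file are assumptions for illustration. The code only builds the argument list and does not run ifdh.

```python
import posixpath
import shlex

# The two dCache areas listed on the slide.
DCACHE_AREAS = {
    "persistent": "/pnfs/dune/persistent/users",
    "scratch": "/pnfs/dune/scratch/users",
}


def ifdh_cp_command(local_file, username, area="scratch"):
    """Return the argv list for copying local_file into dCache.

    Assumption: each user writes into <area>/<username>/, which the
    slide does not state explicitly.
    """
    dest = posixpath.join(
        DCACHE_AREAS[area], username, posixpath.basename(local_file)
    )
    return ["ifdh", "cp", local_file, dest]


# Placeholder file and user name; prints the command instead of running it.
print(shlex.join(ifdh_cp_command("out/hist.root", "username")))
```

Returning a list rather than a single string keeps the command safe to hand to subprocess.run without shell quoting problems.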

Metadata Changes
• Existing data tiers:
  - raw
  - simulated
  - detector-simulated
  - full-reconstructed
• New data tier: sliced
The slicer/stitcher input source only works on raw data, because there is a limited number of data products it has to know how to slice and stitch.
A new problem: the slicer/stitcher reformats events based on a software trigger definition. Do we need to store which trigger definition was used in the metadata? Tack it on the end of the detector type string?
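One way to picture the "tack it on the end" option is a pair of helpers that append a trigger definition name to the detector type string and split it off again. This is only a sketch of the idea raised on the slide: the ":" separator, the function names, and the sample values "35t" and "trigA" are all made up for illustration and are not an agreed metadata convention.

```python
SEPARATOR = ":"  # hypothetical separator, not an agreed convention


def tag_detector_type(detector_type, trigger_def):
    """Append a trigger definition name to a detector type string."""
    if SEPARATOR in trigger_def:
        raise ValueError("trigger definition must not contain the separator")
    return f"{detector_type}{SEPARATOR}{trigger_def}"


def split_detector_type(tagged):
    """Split a tagged string back into (detector_type, trigger_def).

    Returns (tagged, None) when no trigger definition was appended.
    """
    base, sep, trigger_def = tagged.rpartition(SEPARATOR)
    if not sep:
        return tagged, None
    return base, trigger_def


tagged = tag_detector_type("35t", "trigA")  # made-up example values
print(tagged, split_detector_type(tagged))
```

A dedicated metadata field would avoid the parsing step entirely, which is the trade-off the slide's question is pointing at.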

A Good Run List Proposal
• So far only 35-ton has data and thus needs a good-run list.
• One person's bad data is another person's good data.
• Alex Himmel suggested it would make SAM dataset queries simpler if good-run status were part of the metadata.
• Can request a new good-run metadata field: an arbitrary string, so we can encode various kinds of goodness or badness.
• CDF had good-run lists that were distributed as ROOT trees and text files. It didn't make sense to limit public datasets to a particular good-run set because runs would be re-classified and it takes a long time to reprocess everything.
• Need curation of the good-run list. Who decides? Shift tool? Data Quality Team needed to make judgments.
• For 35-ton, we probably want analyzers to be tightly coupled to the data taking. Label special data runs for special analyses and record run numbers and ranges that are intended for subsequent analyses.
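To illustrate how an arbitrary-string field could carry several kinds of goodness or badness, here is a minimal sketch that packs quality flags into one comma-separated string and lets each analysis choose its own selection. The flag names, run numbers, and function names are invented for the example; the real field format would be whatever the collaboration agrees on.

```python
def encode_quality(flags):
    """Pack a collection of quality flags into one metadata string."""
    return ",".join(sorted(set(flags)))


def decode_quality(field):
    """Unpack a metadata string into a set of flags."""
    return {flag for flag in field.split(",") if flag}


def select_runs(run_quality, required=(), vetoed=()):
    """Return sorted run numbers that have every required flag and no vetoed flag."""
    required, vetoed = set(required), set(vetoed)
    selected = []
    for run, field in run_quality.items():
        flags = decode_quality(field)
        if required <= flags and not (vetoed & flags):
            selected.append(run)
    return sorted(selected)


# Made-up run numbers and flags, purely for illustration.
runs = {
    101: encode_quality(["hv_stable", "beam_off"]),
    102: encode_quality(["hv_stable", "noisy"]),
    103: encode_quality(["hv_stable"]),
}

# A physics analysis vetoes noisy runs; a noise study asks for them.
print(select_runs(runs, required=["hv_stable"], vetoed=["noisy"]))
print(select_runs(runs, required=["noisy"]))
```

Keeping the flags descriptive rather than a single good/bad bit lets the same metadata serve analyses with different definitions of "good".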

FIFE News
• Summer 2016 FIFE Workshop during the week of June 20
• Fermilab GPGrid new features: partitionable slots, priority queueing instead of quotas:
  https://fermipoint.fnal.gov/organization/cs/scd/_layouts/15/WopiFrame.aspx?sourcedoc=/organization/cs/scd/CS%20Liaison%20Meetings%20Library/CSLiaison_01_13_16.pdf&action=default
• Job Efficiency Links
  http://web1.fnal.gov/scoreboard/daily_reports/fife-efficiency.daily.latest
  http://web1.fnal.gov/scoreboard/weekly_reports/fife-efficiency.weekly.latest
  http://web1.fnal.gov/scoreboard/monthly_reports/fife-efficiency.monthly.latest

Job Resource Limits Enforced on FNAL GPGrid
• Last year the grid was more forgiving about going over
  - time limits (not CPU; wall-clock time is what counts)
  - virtual memory size
  - disk space used
• But now these limits are enforced. See the page
  https://cdcvs.fnal.gov/redmine/projects/dune/wiki/Submitting_Jobs_at_Fermilab
  for examples of how to ask for resources and links to more documentation.
• What happens if your job goes over the limit? It doesn't get killed, but rather gets Held. To find out what went wrong:
  jobsub_q --held --user=<username>
• You can use fifemon.fnal.gov to monitor how many jobs you have in each state.
• Policy may be different on non-FNAL OSG sites.
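For anyone who checks held jobs from a script, the query above could be wrapped roughly as follows. The flags are copied from the slide as written and have not been checked against the jobsub documentation; the function names are hypothetical, and the wrapper only runs the command when jobsub_q is actually found on the PATH.

```python
import shutil
import subprocess


def held_jobs_command(username):
    """Return the argv list for the held-jobs query shown on the slide."""
    return ["jobsub_q", "--held", f"--user={username}"]


def query_held_jobs(username):
    """Run the query and return its text output.

    Returns None when jobsub_q is not installed, for example on a
    laptop without the Fermilab client tools.
    """
    if shutil.which("jobsub_q") is None:
        return None
    result = subprocess.run(
        held_jobs_command(username),
        capture_output=True,
        text=True,
        check=False,
    )
    return result.stdout


# "username" is a placeholder account name.
print(held_jobs_command("username"))
```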

Very minor...
• Users in the LBNE VO are getting e-mails saying that their AUP (Acceptable Use Policy) signatures are expiring (1 year).
• Users can ignore these and use the DUNE VO instead.

/dune/app
Filled up briefly yesterday

Reminder: DAQ Workshop at CERN
Dates: Feb. 25-26 at CERN
https://indico.fnal.gov/conferenceDisplay.py?confId=11372
DAQ Hardware, Software, and Offline Computing Infrastructure
Ask Maxine (maxine@fnal.gov) about site access for non-CERN users.