CMS User Support and Beijing Site
Xiaomei Zhang
CMS IHEP Group Meeting, June 27, 2008

Outline
• Backup on important local CMS directories
• The status of SE and data transferring
• Diagnosis of Grid Analysis Job
• Discussion on the local batch submission
• Space Usage Monitoring in the Beijing site

Backup on important local CMS directories
• The backup system, AMANDA, will be ready soon
• The directories to be backed up
  – AFS home directory
  – CMS home directory
    • /home/cms/
  – CMS public software directory
    • /cmsd01/public/cms-software/
  – Any other important directories?
• Backup policy for these directories
  – AFS is backed up every week
  – Other directories: once a week or once a month?

The status of SE and data transferring
• The debug tests in PhEDEx are OK; we have established all links except the one to CNAF
• The production transfers are OK
  – Mengxw: 4 TB in one and a half days, with speeds of more than 50 MB/s
  – Our SE has difficulty supporting such speeds
• More resources should be added to
  – support debug tests
  – support production transfers
  – support T3 -> T2 transfers
  – support test jobs and user jobs returning their results to our SE
• We need to be patient until the new resources arrive
  – Lower the PhEDEx transfer speed and try to balance the transfer time
  – At the end of July, new resources will arrive, including two servers and 200 TB
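
The transfer figures on this slide can be cross-checked with a little arithmetic. The sketch below assumes decimal units (1 TB = 1,000,000 MB); the function names and the 20 MB/s throttled rate are hypothetical choices for illustration, not values from PhEDEx.

```python
def average_rate_mb_per_s(volume_tb: float, duration_days: float) -> float:
    """Average transfer rate in MB/s, assuming 1 TB = 1,000,000 MB."""
    megabytes = volume_tb * 1_000_000
    seconds = duration_days * 24 * 3600
    return megabytes / seconds


def transfer_days(volume_tb: float, rate_mb_per_s: float) -> float:
    """Days needed to move volume_tb at a constant rate in MB/s."""
    seconds = volume_tb * 1_000_000 / rate_mb_per_s
    return seconds / (24 * 3600)


# 4 TB in one and a half days, as quoted on the slide
print(f"{average_rate_mb_per_s(4, 1.5):.1f} MB/s")  # prints "30.9 MB/s"

# The same 4 TB at a hypothetical throttled rate of 20 MB/s
print(f"{transfer_days(4, 20):.1f} days")  # prints "2.3 days"
```

The average comes out near 31 MB/s, which suggests that the figure of more than 50 MB/s refers to peak rather than sustained speed. Lowering the rate trades a longer transfer time for a lighter load on the SE.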

Diagnosis of Grid Analysis Job (1)
• Refer to the CRAB FAQ
  – For CRAB > 2_2_0, a new version of the copy protocol (srmv2) is used by default
• Check your certificate
• Check your code locally
• Use the latest CRAB version
  – Otherwise you may not be able to get help from the CRAB group
  – Sometimes the newest version does not work, e.g. when the UI and CRAB versions do not match

Diagnosis of Grid Analysis Job (2)
• Check that the site is available
  – http://dashb-cms-sv.cern.ch/dashboard/request.py/siteview
  – Ask Yan Xiaofei and Chen Liang Cheng about the Beijing site
• Check that the dataset exists in DBS
• Check that the format of your output is correct
  – For output > 50 MB, choose copy_data=1
• Send your questions to the CRAB Feedback HyperNews list
  – The documentation is too brief
  – Not every bug report gets an answer
  – Stay in or join the CRAB group for a certain time
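
The 50 MB rule for copy_data can be checked locally before submitting. This is a minimal sketch: the helper names are hypothetical, and it assumes the threshold applies to the combined size of all output files in decimal megabytes, which the slide does not specify.

```python
import os

# 50 MB threshold from the slide; decimal megabytes are an assumption
COPY_DATA_THRESHOLD = 50 * 1000 * 1000


def total_size(paths):
    """Combined size in bytes of the given local output files."""
    return sum(os.path.getsize(p) for p in paths)


def suggest_copy_data(total_bytes, threshold=COPY_DATA_THRESHOLD):
    """Return 1 if the output should be copied to an SE (copy_data=1), else 0."""
    return 1 if total_bytes > threshold else 0


# Example with sizes taken from a local test run (values are illustrative)
print(suggest_copy_data(60_000_000))  # prints 1
print(suggest_copy_data(10_000_000))  # prints 0
```

Running the job locally on a small sample and passing the resulting files through `total_size` gives a rough estimate of what a grid job would produce.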

The advantages and disadvantages of grid jobs
• Advantages:
  – Use more resources besides the Beijing site
  – Able to do batch submission over datasets
• Disadvantages:
  – Much more complicated than the local farm
  – Not easy to diagnose errors (too many possibilities, elements, different times)
  – Not as stable as the local farm, e.g. frequent upgrades
  – CRAB has many bugs, and it is not easy to get full support
• Suggestions
  – When our SE has enough space
  – When the dataset is available at the Beijing site, it is much easier to run jobs on the local farm

Discussion on local batch submission
• Originally, CRAB was not able to support local schedulers such as PBS
• Recent development has focused on the CRAB server
  – CRAB server 1.0.0
  – Is the CRAB server only supported at T1 or certain other sites?
  – Is a standalone CRAB server being developed to support local schedulers?
• Do we need a batch submission tool in the future?

The Goals of Space Usage Monitoring
• Help administrators control the usage and distribution of T3 and T2 space
• Help users know the space status before making a subscription in PhEDEx and moving datasets between T3 and T2
• Help users and administrators know the subscription status in PhEDEx and clean up occupied space in time
  – This becomes more necessary as more and more storage resources are added
• Make it easy to find discrepancies between the site space information and the information from the CMS central publishing system
• Find a way to count the correct space information, to cope with the bug in the dCache NFS system
  – When the file size is > 2 GB, the space information cannot be displayed correctly
  – Only srmls works
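
The discrepancy check between site and central information can be sketched as a comparison of two size tables. The function name, the dictionary layout (dataset name mapped to size in bytes) and the sample dataset names below are all hypothetical; how the two tables are filled from the site and from the central system is left open.

```python
def find_discrepancies(site_bytes, central_bytes, tolerance=0):
    """Compare per-dataset sizes seen at the site with centrally published ones.

    Both arguments map a dataset name to a size in bytes.
    Returns a dict of dataset name -> short description of the mismatch.
    """
    report = {}
    for name in sorted(set(site_bytes) | set(central_bytes)):
        site = site_bytes.get(name)
        central = central_bytes.get(name)
        if site is None:
            report[name] = "missing at site"
        elif central is None:
            report[name] = "not known centrally"
        elif abs(site - central) > tolerance:
            report[name] = f"size differs by {site - central} bytes"
    return report


# Illustrative values only
site = {"/store/a": 100, "/store/b": 250}
central = {"/store/a": 100, "/store/b": 200, "/store/c": 50}
print(find_discrepancies(site, central))
```

Python integers do not overflow, so sizes above 2 GB are summed and compared without special handling.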

The Functions and Realization of the space monitoring system
• Functions:
  – Display T2 space usage
  – Display T3 space usage in cmsd01 and cmsd02
  – Display the PhEDEx subscription information
• Policies:
  – Automatically use system commands and SRM commands to count the amount of space used once or twice each day
    • du and df
    • srmls
  – The information is output in XML format and displayed in a web page
  – Collect information from the PhEDEx system
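
The du and df step can be approximated with the Python standard library. This is only a sketch of the idea, not the Perl implementation described in the talk; the function names are hypothetical, and `directory_usage` sums apparent file sizes, so it may differ slightly from what `du` reports.

```python
import os
import shutil


def directory_usage(path):
    """Sum of regular file sizes under path, in bytes (similar in spirit to du)."""
    total = 0
    for root, _dirs, files in os.walk(path):
        for name in files:
            full = os.path.join(root, name)
            if not os.path.islink(full):  # skip symlinks to avoid double counting
                total += os.path.getsize(full)
    return total


def filesystem_usage(path):
    """Total, used and free bytes of the filesystem holding path (similar to df)."""
    usage = shutil.disk_usage(path)
    return {"total": usage.total, "used": usage.used, "free": usage.free}


if __name__ == "__main__":
    # The current directory stands in for a T3 area such as one on cmsd01
    print(directory_usage("."), filesystem_usage("."))
```

A scheduler such as cron could run a script like this once or twice a day, as the policy on the slide describes.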

The first version of space monitoring
• The web server is located on cmsui01.ihep.ac.cn
• Languages used: Perl, XML, XSL
• The web page is:
  – http://cmsui01.ihep.ac.cn/space_mon/space_BJ.xml
• Details:
  – T2: name, directory, size, remark
  – T3: name, directory, size, remark
  – PhEDEx subscription: requestor, the LFN of the dataset or block, the size of the dataset, the time when the data was created, remark
• Guide and link
  – https://twiki.ihep.ac.cn/twiki/bin/view/CMS/Space_Monitoring
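
The T2 and T3 records listed under Details could be serialized to XML along the following lines. The element names (`space_report`, `area`), the `tier` attribute, the stylesheet file name `space.xsl` and the sample entry are all assumptions made for illustration; the actual schema of space_BJ.xml is not given in the slides, and the original tool is written in Perl.

```python
import xml.etree.ElementTree as ET

FIELDS = ("name", "directory", "size", "remark")


def build_space_report(entries, stylesheet="space.xsl"):
    """Serialize space entries to an XML string that references an XSL stylesheet.

    Each entry is a dict with a "tier" key plus the four fields from the slide.
    """
    root = ET.Element("space_report")
    for entry in entries:
        node = ET.SubElement(root, "area", tier=entry["tier"])
        for field in FIELDS:
            ET.SubElement(node, field).text = str(entry.get(field, ""))
    header = (
        '<?xml version="1.0"?>\n'
        f'<?xml-stylesheet type="text/xsl" href="{stylesheet}"?>\n'
    )
    return header + ET.tostring(root, encoding="unicode")


# Illustrative entry only; the directory and size are made up
print(build_space_report([
    {"tier": "T3", "name": "user_area", "directory": "/example/dir",
     "size": 1024, "remark": ""},
]))
```

A browser that loads such a file applies the referenced XSL stylesheet to render the records as a web page, which matches the Perl, XML and XSL combination named on the slide.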