ICAT Metadata management Tom Griffin STFC ISIS Facility



































ICAT – Metadata management • Tom Griffin, STFC ISIS Facility • SwissFEL ARAMIS Instrumentation Workshop, PSI, June 2012 • tom.griffin@stfc.ac.uk
Background • Pulsed Neutron & Muon source • 35 instruments • 2500 users / year • >800 experiments /year • Scientific Computing – Data analysis, data management, HPC etc
ICAT • Search for data in a meaningful way, e.g. taxonomy, sample, temperature, pressure etc. • Share data with colleagues • Access data anywhere via the web* • Annotate your data • Link to data from your publications
Example • ISIS Proposal: GEM (high intensity, high resolution neutron diffractometer), "H2-(zeolite) vibrational frequencies vs polarising potential of cations" • Proposals: Once awarded beamtime at ISIS, an entry will be created in ICAT that describes your proposed experiment. • Experiment: Data collected from your experiment will be indexed by ICAT (with additional experimental conditions) and made available to your experimental team. • Analyzed Data: You will have the capability to upload any desired analysed data and associate it with your experiments. • B-lactoglobulin protein interfacial structure • Publication: Associate publications to your experiment and even reference data from your publications.
Who uses it?
Basic Metadata Model • Investigators • Sample • Investigation • Dataset • Datafile • Parameter Type • Parameter
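The entities on this slide can be pictured as a small containment hierarchy. The sketch below is a simplified illustration, not the ICAT schema: the class and field names are hypothetical, investigators and samples are reduced to plain strings, and the nesting (an investigation holds datasets, a dataset holds datafiles, a datafile carries typed parameters) is stated here as an assumption about how the boxes relate.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical, simplified stand-ins for the entities named on the slide.
class MetadataModelSketch {
    static class ParameterType {
        final String name;   // e.g. "sample_temperature"
        final String units;  // e.g. "K" (assumed unit, for illustration only)
        ParameterType(String name, String units) { this.name = name; this.units = units; }
    }

    static class Parameter {
        final ParameterType type;
        final double value;
        Parameter(ParameterType type, double value) { this.type = type; this.value = value; }
    }

    static class Datafile {
        final String name;
        final List<Parameter> parameters = new ArrayList<>();
        Datafile(String name) { this.name = name; }
    }

    static class Dataset {
        final String name;
        final List<Datafile> datafiles = new ArrayList<>();
        Dataset(String name) { this.name = name; }
    }

    static class Investigation {
        final String title;
        final List<String> investigators = new ArrayList<>();
        final List<String> samples = new ArrayList<>();
        final List<Dataset> datasets = new ArrayList<>();
        Investigation(String title) { this.title = title; }
    }

    // Builds one made-up investigation with a single dataset and datafile.
    static Investigation example() {
        Investigation inv = new Investigation("Example investigation");
        inv.investigators.add("pi-user");      // hypothetical user id
        inv.samples.add("example sample");     // hypothetical sample name
        Dataset raw = new Dataset("raw");
        Datafile file = new Datafile("run0001.raw");  // hypothetical file name
        file.parameters.add(new Parameter(new ParameterType("sample_temperature", "K"), 100.0));
        raw.datafiles.add(file);
        inv.datasets.add(raw);
        return inv;
    }

    public static void main(String[] args) {
        Investigation inv = example();
        System.out.println(inv.title + ": " + inv.datasets.size() + " dataset(s)");
    }
}
```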
Where it fits in • Pre-experiment – Proposal – Safety Data – User office data – Apply permissions
Getting data into ICAT • During the experiment • ‘ICATIngest’ • Subscribes to file system events via ReadDirectoryChangesW() • Invokes ‘WriteRaw’ and ‘NxIngest’ • Caches groups of files for performance • Extracts ‘science’ metadata
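The "caches groups of files for performance" step might look roughly like the batching sketch below. This is not the ICATIngest implementation: the class name, the fixed batch size, and the file names are all hypothetical, and the file system watcher itself is left out so that only the grouping logic is shown.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Consumer;

// Hypothetical sketch: collect "file created" events and hand them on in groups.
class IngestBatcherSketch {
    private final int batchSize;
    private final Consumer<List<String>> sink;   // receives each completed group
    private final List<String> pending = new ArrayList<>();

    IngestBatcherSketch(int batchSize, Consumer<List<String>> sink) {
        if (batchSize < 1) throw new IllegalArgumentException("batchSize must be >= 1");
        this.batchSize = batchSize;
        this.sink = sink;
    }

    // Called once per file system event; flushes when the group is full.
    void onFileCreated(String path) {
        pending.add(path);
        if (pending.size() >= batchSize) flush();
    }

    // Pushes any cached files to the sink, e.g. at the end of a run.
    void flush() {
        if (pending.isEmpty()) return;
        sink.accept(new ArrayList<>(pending));   // copy so the sink owns its list
        pending.clear();
    }

    public static void main(String[] args) {
        IngestBatcherSketch batcher =
            new IngestBatcherSketch(2, batch -> System.out.println("ingest " + batch));
        batcher.onFileCreated("GEM00001.raw");   // hypothetical file names
        batcher.onFileCreated("GEM00001.nxs");
        batcher.onFileCreated("GEM00002.raw");
        batcher.flush();
    }
}
```

Grouping trades a little latency for fewer round trips to the catalogue; a real ingester would probably also flush on a timer so that a lone file is not held back indefinitely.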
Getting Data out - Web Access • Main point of access for ISIS data • HTTP file serving – pluggable architecture • Zipped or non-zipped • Basic ‘IDS’ available with ICAT 4.0 • 3 method interface for custom data layer (e.g. FUSE?)
TopCAT – basic web interface
Getting Data out - Mantid
ISIS – Volume and Rates

| | Files/15 minutes | Files/minute | Files/second |
|---|---|---|---|
| Peak | 10857 | 724 | 12.06 |
| Mean | 968 | 65 | 1.08 |
| Median | 263 | 18 | 0.29 |

• Peak: 4 hours averaging 6.4 files/sec (384/minute) • Peak of ~10% load on dual-core virtual server (limited by ISIS data archive)
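The per-minute and per-second figures follow from the 15-minute counts by simple division (15 minutes is 900 seconds). A small sketch of that arithmetic, with hypothetical class and method names:

```java
// Hypothetical helper: derive per-minute and per-second rates from 15-minute counts.
class IngestRatesSketch {
    static double perMinute(int filesPer15Min) {
        return filesPer15Min / 15.0;
    }

    static double perSecond(int filesPer15Min) {
        return filesPer15Min / (15.0 * 60.0);
    }

    public static void main(String[] args) {
        String[] labels = {"Peak", "Mean", "Median"};
        int[] counts = {10857, 968, 263};   // files per 15 minutes, from the slide
        for (int i = 0; i < labels.length; i++) {
            System.out.printf("%-6s %8.1f files/min %6.2f files/s%n",
                labels[i], perMinute(counts[i]), perSecond(counts[i]));
        }
    }
}
```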
Data Policy • Who can access what data when • ISIS policy published and implemented • ICAT is flexible • http://www.isis.stfc.ac.uk/useroffice/data-policy11204.html • tiny.cc/isisdp
Data Access - Authentication • Pluggable, stackable components • ISIS – Active Directory -> User database -> local database • Diamond • Simple interface – examples available • 4.2 will be Umbrella native (?)
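One way to read "pluggable, stackable" is as an ordered chain of authenticators in which the first one to recognise the credentials wins, as in the ISIS chain Active Directory -> User database -> local database. The sketch below illustrates that reading only. The `Authenticator` interface and everything else here are hypothetical and are not the ICAT plugin interface; the stub back ends compare plain strings purely for demonstration.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Optional;

// Hypothetical sketch of a stacked authentication chain (not the ICAT plugin API).
class AuthChainSketch {
    // A plugin returns a user id on success, or empty if it does not know the user.
    interface Authenticator {
        Optional<String> authenticate(String username, String password);
    }

    // Tries each plugin in order and returns the first successful result.
    static Optional<String> authenticate(List<Authenticator> chain, String user, String password) {
        for (Authenticator plugin : chain) {
            Optional<String> id = plugin.authenticate(user, password);
            if (id.isPresent()) return id;
        }
        return Optional.empty();
    }

    // Stub back end that knows exactly one user; stands in for a real directory.
    static Authenticator fixedUser(String source, String knownUser, String knownPassword) {
        return (u, p) -> u.equals(knownUser) && p.equals(knownPassword)
            ? Optional.of(source + "/" + u)
            : Optional.<String>empty();
    }

    // Two stubs standing in for the first two links of the chain on the slide.
    static List<Authenticator> exampleChain() {
        return Arrays.asList(
            fixedUser("ad", "staff", "pw1"),
            fixedUser("userdb", "visitor", "pw2"));
    }

    public static void main(String[] args) {
        List<Authenticator> chain = exampleChain();
        System.out.println(authenticate(chain, "visitor", "pw2"));
        System.out.println(authenticate(chain, "nobody", "x"));
    }
}
```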
Data Access - Permissions • ‘Rules’ based + explicit permissions • Admin when user in ‘ADMIN’ group • Read where user id = investigator id • Owner where user id = investigator id AND investigator type = PI • ADMIN where user id = instrument_sci_id and investigation.inst = inst.name
Data Access - Permissions • Read where investigation.type != commercial AND investigation.startdate < (now – 3 years) • Read where investigation.type = calibration
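The three "Read" rules from these two slides can be sketched as plain boolean predicates. This is an illustration only: the class and field names are hypothetical, the rules are assumed to be additive (any one rule that matches grants read access), and ICAT evaluates its rules inside the catalogue rather than in client code like this.

```java
import java.time.LocalDate;
import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;

// Hypothetical sketch of the "Read" rules; not how ICAT stores or evaluates rules.
class ReadRulesSketch {
    static class Investigation {
        final String type;                 // e.g. "experiment", "commercial", "calibration"
        final LocalDate startDate;
        final Set<String> investigatorIds;

        Investigation(String type, LocalDate startDate, String... investigatorIds) {
            this.type = type;
            this.startDate = startDate;
            this.investigatorIds = new HashSet<>(Arrays.asList(investigatorIds));
        }
    }

    static boolean canRead(String userId, Investigation inv, LocalDate today) {
        // Read where user id = investigator id
        boolean isInvestigator = inv.investigatorIds.contains(userId);
        // Read where type != commercial AND startdate < (now - 3 years)
        boolean embargoOver = !inv.type.equals("commercial")
            && inv.startDate.isBefore(today.minusYears(3));
        // Read where type = calibration
        boolean isCalibration = inv.type.equals("calibration");
        return isInvestigator || embargoOver || isCalibration;
    }

    public static void main(String[] args) {
        LocalDate today = LocalDate.of(2012, 6, 1);
        Investigation old = new Investigation("experiment", LocalDate.of(2008, 1, 1), "pi1");
        Investigation commercial = new Investigation("commercial", LocalDate.of(2005, 1, 1), "pi1");
        System.out.println(canRead("stranger", old, today));
        System.out.println(canRead("stranger", commercial, today));
    }
}
```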
Data Searching • Similar ‘query language’ to permissions • List<Object> results = icat.search(sessionId, "Dataset") • Dataset INCLUDE Datafile, DatasetParameter, DatafileParameter • Dataset.id [type.name = 'GS' OR type.name = 'GQ']
Data Searching • Datafile <-> DatafileParameter[type.name = 'sample_temperature'] AND DatafileParameter[value = 100] • Max, Min, Count, Ave, Sum • Between, Like, In • Paging • “3, 5 Dataset.id ORDER BY id”
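The paging example suggests that a query can be prefixed with two numbers. The sketch below builds such a string on the assumption that the two numbers are an offset and a maximum result count; the slide does not spell this out, so that reading should be checked against the ICAT documentation. The helper name is hypothetical, and the resulting string would be passed as the query argument of a search call such as the `icat.search` line shown earlier.

```java
// Hypothetical helper: prefix a query string with paging numbers.
// Assumes the prefix means "offset, count"; the slide does not confirm this.
class PagedQuerySketch {
    static String paged(int offset, int count, String query) {
        if (offset < 0 || count < 0) {
            throw new IllegalArgumentException("offset and count must be non-negative");
        }
        return offset + ", " + count + " " + query;
    }

    public static void main(String[] args) {
        // Reproduces the paging example from the slide.
        System.out.println(paged(3, 5, "Dataset.id ORDER BY id"));
    }
}
```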
DOIs – data citation • Issued and sometimes used
ICAT Project • Open source (BSD) • code.google.com/p/icatproject • www.icatproject.org • Fortnightly telephone meetings • Regular face-to-face (e.g. next Thursday)
Summary • Well used system at European P&N sources • Full tracking – not just RAW data • Flexible model • Flexible permissions • Rich search interface • Basic web GUI available • Rich API for custom clients
Questions...