ICAT Metadata management Tom Griffin STFC ISIS Facility

ICAT – Metadata management
Tom Griffin, STFC ISIS Facility
SwissFEL ARAMIS Instrumentation Workshop, PSI, June 2012
tom.griffin@stfc.ac.uk

Background
• Pulsed neutron & muon source
• 35 instruments
• 2500 users/year
• >800 experiments/year
• Scientific Computing – data analysis, data management, HPC etc.

ICAT
• Search for data in a meaningful way, e.g. taxonomy, sample, temperature, pressure etc.
• Share data with colleagues
• Access data anywhere via the web*
• Annotate your data
• Link to data from your publications

Example ISIS Proposal
GEM – High intensity, high resolution neutron diffractometer
Diagram labels: "H2-(zeolite) vibrational frequencies vs polarising potential of cations"; "B-lactoglobulin protein interfacial structure"
• Proposals: Once awarded beamtime at ISIS, an entry will be created in ICAT that describes your proposed experiment.
• Experiment: Data collected from your experiment will be indexed by ICAT (with additional experimental conditions) and made available to your experimental team.
• Analyzed Data: You will have the capability to upload any desired analysed data and associate it with your experiments.
• Publication: Associate publications to your experiment and even reference data from your publications.

Who uses it?

Basic Metadata Model
Entities in the diagram: Investigation, Investigators, Sample, Dataset, Datafile, Parameter, Parameter Type
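The entity names in the model can be read as a containment hierarchy. The following is a minimal Python sketch under stated assumptions: that an investigation holds investigators, samples and datasets, that a dataset holds datafiles, and that each parameter points at a parameter type. Those relationships, every field name, and the example values are illustrative guesses, not the ICAT schema itself.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class ParameterType:
    name: str            # e.g. "sample_temperature"
    units: str = ""      # hypothetical field, for illustration only


@dataclass
class Parameter:
    type: ParameterType
    value: float


@dataclass
class Datafile:
    name: str
    parameters: List[Parameter] = field(default_factory=list)


@dataclass
class Dataset:
    name: str
    datafiles: List[Datafile] = field(default_factory=list)


@dataclass
class Investigation:
    title: str
    investigators: List[str] = field(default_factory=list)
    samples: List[str] = field(default_factory=list)
    datasets: List[Dataset] = field(default_factory=list)


def count_datafiles(investigation: Investigation) -> int:
    """Total number of datafiles across all datasets of an investigation."""
    return sum(len(ds.datafiles) for ds in investigation.datasets)


# Hypothetical example data; the names and values are made up.
temperature = ParameterType(name="sample_temperature", units="K")
example = Investigation(
    title="Example experiment",
    investigators=["principal investigator", "co-investigator"],
    samples=["zeolite"],
    datasets=[
        Dataset(
            name="raw",
            datafiles=[
                Datafile("run_0001.raw", [Parameter(temperature, 100.0)]),
                Datafile("run_0002.raw"),
            ],
        )
    ],
)
print(count_datafiles(example))
```

Keeping parameter types separate from parameter values lets the same definition (name, units) be shared by many measurements, which is one plausible reason the model lists both.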

Where it fits in
• Pre-experiment
  – Proposal
  – Safety data
  – User office data
  – Apply permissions

Getting data into ICAT
• During the experiment
• ‘ICATIngest’
• Subscribes to file system events: ReadDirectoryChangesW()
• Invokes ‘WriteRaw’ and ‘NxIngest’
• Caches groups of files for performance
• Extracts ‘science’ metadata
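The "caches groups of files" step can be pictured as a small batching layer between the file-system events and the ingest call. The sketch below is a generic Python illustration, not ICATIngest's actual code: the class name `FileBatcher`, the batch size, and the file names are all hypothetical, and the events are simulated rather than read from ReadDirectoryChangesW().

```python
from typing import Callable, List


class FileBatcher:
    """Collects file paths from file-system events and passes them on in groups."""

    def __init__(self, ingest: Callable[[List[str]], None], batch_size: int = 3):
        self._ingest = ingest          # called once per group of files
        self._batch_size = batch_size
        self._pending: List[str] = []

    def on_file_created(self, path: str) -> None:
        """Event handler: queue the file and flush once a full group has built up."""
        self._pending.append(path)
        if len(self._pending) >= self._batch_size:
            self.flush()

    def flush(self) -> None:
        """Hand any queued files to the ingest callback as one group."""
        if self._pending:
            batch, self._pending = self._pending, []
            self._ingest(batch)


# Simulate five "file created" events, then flush the remainder at the end.
ingested: List[List[str]] = []
batcher = FileBatcher(ingested.append, batch_size=3)
for run in range(1, 6):
    batcher.on_file_created(f"RUN{run:05d}.raw")
batcher.flush()
print(ingested)
```

Grouping means the downstream catalogue sees one call per batch instead of one per file, which is the usual motivation for this kind of cache.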

Getting Data out - Web Access
• Main point of access for ISIS data
• HTTP file serving – pluggable architecture
• Zipped or non-zipped
• Basic ‘IDS’ available with ICAT 4.0
• 3-method interface for custom data layer (e.g. FUSE?)

TopCAT – basic web interface

Getting Data out - Mantid

ISIS – Volume and Rates

          Files/15 minutes   Files/minute   Files/second
Peak      10857              724            12.06
Mean      968                65             1.08
Median    263                18             0.29

• Peak: 4 hours averaging 6.4 files/sec (384/minute)
• Peak of ~10% load on dual-core virtual server (limited by ISIS data archive)
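The per-minute and per-second columns follow from the 15-minute counts by simple division. A short Python check of that arithmetic, using only the figures from the table (the variable names are illustrative):

```python
WINDOW_MINUTES = 15

# Files observed per 15-minute window, taken from the table.
files_per_window = {"Peak": 10857, "Mean": 968, "Median": 263}

rates = {}
for label, count in files_per_window.items():
    per_minute = count / WINDOW_MINUTES   # files per minute
    per_second = per_minute / 60          # files per second
    rates[label] = (per_minute, per_second)
    print(f"{label}: {per_minute:.0f} files/minute, {per_second:.2f} files/second")

# The sustained 4-hour peak is quoted per second; convert it to per minute.
sustained_per_minute = 6.4 * 60
print(f"Sustained peak: {sustained_per_minute:.0f} files/minute")
```

Rounded to the precision shown on the slide, the computed values agree with the table.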

Data Policy
• Who can access what data when
• ISIS policy published and implemented
• ICAT is flexible
• http://www.isis.stfc.ac.uk/useroffice/data-policy11204.html
• tiny.cc/isisdp

Data Access - Authentication
• Pluggable, stackable components
• ISIS – Active Directory -> User database -> local database
• Diamond
• Simple interface – examples available
• 4.2 will be Umbrella native (?)
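"Stackable" can be read as trying each authentication component in turn until one accepts the credentials. The Python sketch below illustrates that reading only; it is an assumption about the behaviour, not the ICAT plugin interface. The function names, the in-memory password tables, and the users are hypothetical stand-ins for Active Directory, the user database, and the local database.

```python
from typing import Callable, Dict, List, Optional

# An authenticator returns a user id on success, or None so the next one can try.
Authenticator = Callable[[str, str], Optional[str]]


def make_table_authenticator(source: str, table: Dict[str, str]) -> Authenticator:
    """Build a stand-in component backed by an in-memory username -> password table."""

    def authenticate(username: str, password: str) -> Optional[str]:
        if table.get(username) == password:
            return f"{source}/{username}"
        return None

    return authenticate


def login(chain: List[Authenticator], username: str, password: str) -> Optional[str]:
    """Try each component in order; the first one that accepts wins."""
    for authenticate in chain:
        user_id = authenticate(username, password)
        if user_id is not None:
            return user_id
    return None


# Hypothetical chain mirroring the order given for ISIS.
chain = [
    make_table_authenticator("active-directory", {"staff1": "pw-a"}),
    make_table_authenticator("user-database", {"visitor1": "pw-b"}),
    make_table_authenticator("local-database", {"service1": "pw-c"}),
]

print(login(chain, "visitor1", "pw-b"))
print(login(chain, "visitor1", "wrong"))
```

Because each component shares one small call signature, a facility could reorder or swap components without touching the login loop.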

Data Access - Permissions
• ‘Rules’ based + explicit permissions
• Admin when user in ‘ADMIN’ group
• Read where user id = investigator id
• Owner where user id = investigator id AND investigator type = PI
• ADMIN where user id = instrument_sci_id AND investigation.inst = inst.name

Data Access - Permissions
• Read where investigation.type != commercial AND investigation.startdate < (now – 3 years)
• Read where investigation.type = calibration
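The read rules listed on the permissions slides can be pictured as independent predicates, any one of which grants access. The Python sketch below evaluates three of them against in-memory records. It illustrates the idea rather than ICAT's rule engine: the class, field, and function names are hypothetical, and "3 years" is approximated as 3 × 365 days.

```python
from dataclasses import dataclass, field
from datetime import date, timedelta
from typing import Callable, List, Set


@dataclass
class Investigation:
    type: str                      # e.g. "experiment", "commercial", "calibration"
    start_date: date
    investigator_ids: Set[str] = field(default_factory=set)


# A rule answers: may this user read this investigation today?
Rule = Callable[[str, Investigation, date], bool]


def is_investigator(user_id: str, inv: Investigation, today: date) -> bool:
    # Read where user id = investigator id
    return user_id in inv.investigator_ids


def is_past_embargo(user_id: str, inv: Investigation, today: date) -> bool:
    # Read where type != commercial AND startdate < (now - 3 years)
    three_years = timedelta(days=3 * 365)   # approximation, ignores leap days
    return inv.type != "commercial" and inv.start_date < today - three_years


def is_calibration(user_id: str, inv: Investigation, today: date) -> bool:
    # Read where type = calibration
    return inv.type == "calibration"


READ_RULES: List[Rule] = [is_investigator, is_past_embargo, is_calibration]


def can_read(user_id: str, inv: Investigation, today: date) -> bool:
    """Access is granted if any single rule matches."""
    return any(rule(user_id, inv, today) for rule in READ_RULES)


today = date(2012, 6, 1)
recent = Investigation("experiment", date(2012, 1, 10), {"alice"})
old = Investigation("experiment", date(2008, 3, 1), {"alice"})
old_commercial = Investigation("commercial", date(2005, 3, 1), {"carol"})
calibration = Investigation("calibration", date(2012, 5, 1))

print(can_read("alice", recent, today), can_read("bob", recent, today))
```

Because rules are only ever additive here, a commercial investigation stays private to its investigators no matter how old it is, which matches the intent of the embargo rule.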

Data Searching
• Similar ‘query language’ to permissions
• List<Object> results = icat.search(sessionId, "Dataset")
• Dataset INCLUDE Datafile, DatasetParameter, DatafileParameter
• Dataset.id [type.name = 'GS' OR type.name = 'GQ']

Data Searching
• Datafile <-> DatafileParameter[type.name = 'sample_temperature'] AND DatafileParameter[value = 100]
• Max, Min, Count, Ave, Sum
• Between, Like, In
• Paging
• "3,5 Dataset.id ORDER BY id"
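The paging example can be unpacked with a small local emulation. The Python sketch below assumes the "3,5" prefix means "skip 3 results, then return at most 5"; that reading is an assumption here, and the helper names and dataset ids are hypothetical. No ICAT server is contacted: the query is only split apart and applied to an in-memory list.

```python
from typing import List, Tuple


def parse_paging_prefix(query: str) -> Tuple[int, int, str]:
    """Split 'offset,count rest-of-query' into (offset, count, rest)."""
    prefix, rest = query.split(" ", 1)
    offset_text, count_text = prefix.split(",")
    return int(offset_text), int(count_text), rest


def page(sorted_ids: List[int], offset: int, count: int) -> List[int]:
    """Return at most `count` ids, starting after the first `offset` ids."""
    return sorted_ids[offset:offset + count]


offset, count, rest = parse_paging_prefix("3,5 Dataset.id ORDER BY id")

# Stand-in for the ids an "ORDER BY id" search would return.
all_dataset_ids = list(range(101, 121))
print(rest)
print(page(all_dataset_ids, offset, count))
```

A client would typically keep the count fixed and advance the offset by that amount to walk through a long result set one page at a time.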

DOIs – data citation
• Issued and sometimes used

ICAT Project
• Open source (BSD)
• code.google.com/p/icatproject
• www.icatproject.org
• Fortnightly telephone meetings
• Regular face-to-face meetings (e.g. next Thursday)

Summary
• Well-used system at European P&N sources
• Full tracking – not just RAW data
• Flexible model
• Flexible permissions
• Rich search interface
• Basic web GUI available
• Rich API for custom clients

Questions...