iRODS Tutorial: Rules and Microservices

iRODS Rules and Microservices
I. Introduction to Rules and Microservices
II. Simple Rules and Database Queries
III. Complex Rules and Scheduling

iRODS Microservices Documentation
• Available from Amazon: The integrated Rule-Oriented Data System (iRODS) Micro-service Workbook, http://www.amazon.com/dp/1466469129
  – Contains example rules for running each microservice.
• On-line documentation: https://www.irods.org/doxygen/

I. Introduction to Rules and Microservices

Rules
• See https://www.irods.org/index.php/Changes_and_Improvements_to_the_Rule_Language_and_the_Rule_Engine
• Rules implement data policy by running (workflows of) microservices
  – Retention, distribution, arrangement
  – Authenticity, provenance, description
  – Integrity, replication, synchronization
  – Deletion, trash cans, versioning
  – Archiving, staging, caching
  – Authentication, authorization, redaction
  – Access, approval, IRB, audit trails, report generation
  – Assessment criteria, validation
  – Derived data product generation, format parsing
• Policy is the clear statement of how data will be managed over its life cycle.

Microservices
• C code
• The unit of work within iRODS
• Run by rules
• Composed into workflows by rules

Running Rules
• Rules triggered by events/policy points are contained in the (distributed) rule base:
  – iRODS/server/config/reConfigs/core.re
  – conditional execution: the first rule with a satisfied condition is executed; others are skipped
• Can be run with irule (manual execution)
• Delayed execution
  – iqstat
  – iqmod

irule – to run a rule manually
• Example rules to tweak and run are in the software distribution at iRODS/clients/icommands/test/rules3.0
• irule -F rule_file_name.r
• Some rules can only be run by admin users

An example – list all microservices
• Download listMS.r from /snicZone/home/pracerods/rules in iRODS
• listMS.r (lists all available microservices):

    ListAvailableMS {
      msiListEnabledMS(*KVPairs);
      writeKeyValPairs("stdout", *KVPairs, ": ");
    }
    INPUT null
    OUTPUT ruleExecOut

• Run it with irule:

    irule -F listMS.r

Format of a Rule

    Rule_name {
      microservice1(…, *A, …, *B);
      microservice2(*A, …);
    }
    INPUT *A="first_input", *B="second_input"
    OUTPUT ruleExecOut

OR

    Rule_name(*arg) {
      on(exp) {
        microservice1(…, *arg, *C);
        microservice2(*C, …);
      }
    }
    INPUT null
    OUTPUT ruleExecOut

• *A and *B are workflow variables.
• ruleExecOut accesses the internal ruleExecInfo (rei) structure managed by iRODS.
• A rule can take arguments.
• A rule can be executed conditionally.
• Use "null" if there are no input parameters.

Rules and Parameters
• Literals
  – constants: strings or numbers
  – a variable name not beginning with a special character (#, $ or *) is taken as string input
  – can only be used as input parameters (not output)
• Workflow variables
• Session state variables
• Persistent state variables

Workflow Variables (*variables)
• For example, in the following workflow chain:

    myRule {
      msiDataObjOpen(*file, *FD);
      msiDataObjRead(*FD, 10000, *BUF);
      writeLine("stdout", *BUF);
      …
    }
    INPUT *file="/compZone/home/leesa/hello"
    OUTPUT ruleExecOut

  ('stdout' is a structure managed by iRODS.)
• *file is an input parameter
• *FD is output from msiDataObjOpen and input to msiDataObjRead.
• *file, *FD, and *BUF are workflow variables

Session Variables ($variables)
• contain temporary information maintained during a server session
• contain information about the client-server connection, data objects, user information, resource information, etc.
• contain information that can be sent back to the client. Example: stdout, stderr
• persistent across rule executions in the same session, so can be used to pass information between rule executions
• pre-defined by iRODS, stored as a complex C structure (the rei structure). See https://www.irods.org/index.php/RuleExecInfo_Structure_%26_$-variables

Session Variables ($variables)
• $variables map to specific locations in the REI structure; the mapping is contained in server/config/reConfigs/core.dvm
• Example (mappings are not necessarily unique):

    $objPath||rei->doi->objPath
    $objPath||rei->doinp->objPath
    $dataType||rei->doi->dataType
    $userNameClient||rei->uoic->userName
    $collName||rei->collName
    $collParentName||rei->collParentName

• https://www.irods.org/index.php/Session_State_Variables
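To make the "mappings are not necessarily unique" remark concrete, here is a minimal Python sketch that reads lines in the `$variable||rei-path` form shown above and collects every path a variable can map to. The function name `parse_dvm` and the treatment of `#` lines as comments are assumptions made for illustration; this is not how the iRODS server itself loads core.dvm.

```python
from collections import defaultdict

# Sample lines copied from the slide, in "$variable||rei-path" form.
SAMPLE_DVM = """\
$objPath||rei->doi->objPath
$objPath||rei->doinp->objPath
$dataType||rei->doi->dataType
$userNameClient||rei->uoic->userName
"""

def parse_dvm(text):
    """Map each $variable name to the list of rei paths it may resolve to."""
    mapping = defaultdict(list)
    for line in text.splitlines():
        line = line.strip()
        # Assumption: blank lines and '#' comment lines carry no mapping.
        if not line or line.startswith("#"):
            continue
        name, _, path = line.partition("||")
        mapping[name.lstrip("$")].append(path)
    return dict(mapping)

dvm = parse_dvm(SAMPLE_DVM)
print(dvm["objPath"])
```

In this sketch `objPath` resolves to two candidate locations, which is the non-unique case the slide points out.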

Persistent State Variables (#variables)
• See the iRODS Primer
• "iquest" uses these field names: iquest attrs
• https://www.irods.org/index.php/Persistent_State_Information_Variables
• iRODS/lib/core/include/rodsGenQuery.h defines the attributes available via the General Query interface.
• Names begin with 'COL_' (column) for easy identification in the source code; the mapping to variables without COL_ is contained in iRODS/lib/core/include/rodsGenQueryNames.h

Persistent State Information
• ZONE_ID, ZONE_NAME, ZONE_TYPE, ZONE_CONNECTION, ZONE_COMMENT, ZONE_CREATE_TIME, ZONE_MODIFY_TIME
• USER_ID, USER_NAME, USER_TYPE, USER_ZONE, USER_DN, USER_INFO, USER_COMMENT, USER_CREATE_TIME, USER_MODIFY_TIME
• RESC_ID, RESC_NAME, RESC_ZONE_NAME, RESC_TYPE_NAME, RESC_CLASS_NAME, RESC_LOC, RESC_VAULT_PATH, RESC_FREE_SPACE_TIME, RESC_INFO, RESC_COMMENT, RESC_CREATE_TIME, RESC_MODIFY_TIME, RESC_STATUS
• DATA_ID, DATA_COLL_ID, DATA_NAME, DATA_REPL_NUM, DATA_VERSION, DATA_TYPE_NAME, DATA_SIZE, DATA_RESC_GROUP_NAME, DATA_RESC_NAME, DATA_PATH, DATA_OWNER_NAME, DATA_OWNER_ZONE, DATA_REPL_STATUS, DATA_CHECKSUM, DATA_EXPIRY, DATA_MAP_ID, DATA_COMMENTS, DATA_CREATE_TIME, DATA_MODIFY_TIME, DATA_ACCESS_TYPE, DATA_ACCESS_NAME, DATA_TOKEN_NAMESPACE, DATA_ACCESS_USER_ID, DATA_ACCESS_DATA_ID
• COLL_NAME, COLL_PARENT_NAME, COLL_OWNER_ZONE, COLL_MAP_ID, COLL_INHERITANCE, COLL_COMMENTS, COLL_CREATE_TIME, COLL_MODIFY_TIME, COLL_ACCESS_TYPE, COLL_ACCESS_NAME, COLL_TOKEN_NAMESPACE, COLL_ACCESS_USER_ID, COLL_ACCESS_COLL_ID
• META_DATA_ATTR_NAME, META_DATA_ATTR_VALUE, META_DATA_ATTR_UNITS, META_DATA_ATTR_ID, META_DATA_CREATE_TIME

Persistent State Information (continued)
• META_DATA_MODIFY_TIME, META_COLL_ATTR_NAME, META_COLL_ATTR_VALUE, META_COLL_ATTR_UNITS, META_COLL_ATTR_ID, META_NAMESPACE_COLL, META_NAMESPACE_DATA, META_NAMESPACE_RESC, META_NAMESPACE_USER, META_RESC_ATTR_NAME, META_RESC_ATTR_VALUE, META_RESC_ATTR_UNITS, META_RESC_ATTR_ID, META_USER_ATTR_NAME, META_USER_ATTR_VALUE, META_USER_ATTR_UNITS, META_USER_ATTR_ID
• RESC_GROUP_RESC_ID, RESC_GROUP_NAME
• USER_GROUP_ID, USER_GROUP_NAME
• RULE_EXEC_ID, RULE_EXEC_NAME, RULE_EXEC_REI_FILE_PATH, RULE_EXEC_USER_NAME, RULE_EXEC_ADDRESS, RULE_EXEC_TIME, RULE_EXEC_FREQUENCY, RULE_EXEC_PRIORITY, RULE_EXEC_ESTIMATED_EXE_TIME, RULE_EXEC_NOTIFICATION_ADDR, RULE_EXEC_LAST_EXE_TIME, RULE_EXEC_STATUS
• TOKEN_NAMESPACE, TOKEN_ID, TOKEN_NAME, TOKEN_VALUE2, TOKEN_VALUE3, TOKEN_COMMENT
• AUDIT_OBJ_ID, AUDIT_USER_ID, AUDIT_ACTION_ID, AUDIT_COMMENT, AUDIT_CREATE_TIME, AUDIT_MODIFY_TIME
• SL_HOST_NAME, SL_RESC_NAME, SL_CPU_USED, SL_MEM_USED, SL_SWAP_USED, SL_RUNQ_LOAD, SL_DISK_SPACE, SL_NET_INPUT, SL_NET_OUTPUT, SL_CREATE_TIME
• SLD_RESC_NAME, SLD_LOAD_FACTOR, SLD_CREATE_TIME

Rule Condition
• Boolean expression
• Examples
  1. Run if msiService succeeds:
       rule1 { on (msiService >= 0) { ... } }
  2. Run if resource is demoResc8:
       rule2 { on ($rescName == demoResc8) { … } }
  3. Run if the pathname begins with /x/y/z:
       Rule3 { on ($objPath like /x/y/z/*) { … } }
• The same rule can give different actions depending on which condition is met
• Many operators
  – ==, !=, >, <, >=, <=
  – %%, !! (and, or)
  – expr like reg-expr, expr not like reg-expr
  – expr ::= string

Policy Enforcement Points
Rule Triggers – see core.re
• These are locations within the iRODS framework where an event or state (of the environment) prompts a rule to execute
  – An action (e.g., an icommand) may involve multiple policy enforcement points
• Policy enforcement points
  – Pre-action policy (e.g., selection of storage location)
  – Execution/action policy (e.g., file deletion)
  – Post-action policy (e.g., create secondary data products)
• Actions (the triggered rules) and policy enforcement points are contained in iRODS/server/config/reConfigs/core.re

core.re
• Take a look at iRODS/server/config/reConfigs/core.re
• Contains rules corresponding to a number of events
  – Create, delete, modify users
  – Open, create, put, copy, replicate, delete files
  – Create, remove collections
  – Create/modify metadata
• Contains some rules to set general policy
  – ACL policy (strict or not)
  – Public user policy
  – Default resource policy
• An empty rule corresponds to a trigger point for which no action has been defined

Some Policy Enforcement Points
• ACTION: acCreateUser, acDeleteUser, acGetUserbyDN, acTrashPolicy, acAclPolicy, acSetCreateConditions, acDataDeletePolicy, acRenameLocalZone, acSetRescSchemeForCreate, acRescQuotaPolicy, acSetMultiReplPerResc, acSetNumThreads, acVacuum, acSetResourceList, acSetCopyNumber, acVerifyChecksum, acCreateUserZoneCollections, acDeleteUserZoneCollections, acPurgeFiles, acRegisterData, acGetIcatResults, acSetPublicUserPolicy, acCreateDefaultCollections, acDeleteDefaultCollections
• PRE-ACTION POLICY: acPreProcForCreateUser, acPreProcForDeleteUser, acPreProcForModifyUserGroup, acChkHostAccessControl, acPreProcForCollCreate, acPreProcForRmColl, acPreProcForModifyAVUMetadata, acPreProcForModifyCollMeta, acPreProcForModifyDataObjMeta, acPreProcForModifyAccessControl, acPreprocForDataObjOpen, acPreProcForObjRename, acPreProcForCreateResource, acPreProcForDeleteResource, acPreProcForModifyResourceGroup, acPreProcForCreateToken, acPreProcForDeleteToken, acNoChkFilePathPerm, acPreProcForGenQuery, acSetReServerNumProc, acSetVaultPathPolicy
• POST-ACTION POLICY: acPostProcForCreateUser, acPostProcForDeleteUser, acPostProcForModifyUserGroup, acPostProcForDelete, acPostProcForCollCreate, acPostProcForRmColl, acPostProcForModifyAVUMetadata, acPostProcForModifyCollMeta, acPostProcForModifyDataObjMeta, acPostProcForModifyAccessControl, acPostProcForOpen, acPostProcForObjRename, acPostProcForCreateResource, acPostProcForDeleteResource, acPostProcForModifyResourceGroup, acPostProcForCreateToken, acPostProcForDeleteToken, acPostProcForFilePathReg, acPostProcForGenQuery, acPostProcForPut, acPostProcForCopy, acPostProcForCreate

Out-of-the-Box Services
Microservices for…
• Queries on metadata catalog
• Interaction with web services
• Invocation of external applications
• Workflow constructs (loops, conditionals, exit)
• Remote and delayed execution control

Microservices
print_hello_arg, msiVacuum, msiQuota, msiGoodFailure, msiSetResource, msiCheckPermission, msiCheckOwner, msiCreateUser, msiCreateCollByAdmin, msiSendMail, recover_print_hello, msiCommit, msiRollback, msiDeleteCollByAdmin, msiDeleteUser, msiAddUserToGroup, msiSetDefaultResc, msiSetRescSortScheme, msiSysReplDataObj, msiStageDataObj, msiSetDataObjPreferredResc, msiSetDataObjAvoidResc, msiSortDataObj, msiSysChksumDataObj, msiSetDataTypeFromExt, msiSetNoDirectRescInp, msiSetNumThreads, msiDeleteDisallowed, msiOprDisallowed, msiDataObjCreate, msiDataObjOpen, msiDataObjClose, msiDataObjLseek, msiDataObjRead, msiDataObjWrite, msiDataObjUnlink, msiDataObjRepl, msiDataObjCopy, msiExtractNaraMetadata, msiSetMultiReplPerResc, msiAdmChangeCoreIRB, msiAdmShowDVM, msiAdmShowFNM, msiAdmAppendTopOfCoreIRB, msiAdmClearAppRuleStruct, msiAdmAddAppRuleStruct, msiGetObjType, msiAssociateKeyValuePairsToObj, msiExtractTemplateMDFromBuf, msiReadMDTemplateIntoTagStruct, msiDataObjPut, msiDataObjGet, msiDataObjChksum, msiDataObjPhymv, msiDataObjRename, msiDataObjTrim, msiCollCreate, msiRmColl, msiReplColl, msiCollRepl, msiPhyPathReg, msiObjStat, msiDataObjRsync, msiFreeBuffer, msiNoChkFilePathPerm, msiNoTrashCan, msiSetPublicUserOpr, whileExec, forExec, delayExec, remoteExec, forEachExec, msiSleep, writeString, writeLine, writeBytesBuf, writePosInt, writeKeyValPairs, msiGetDiffTime, msiGetSystemTime, msiHumanToSystemTime, msiStrToBytesBuf, msiApplyDCMetadataTemplate, msiListEnabledMS, msiSendStdoutAsEmail, msiPrintKeyValPair, msiGetValByKey, msiAddKeyVal, assign, ifExec, break, applyAllRules, msiExecStrCondQueryWithOptions, msiExecGenQuery, msiMakeGenQuery, msiGetMoreRows, msiAddSelectFieldToGenQuery, msiAddConditionToGenQuery, msiPrintGenQueryOutToBuffer, msiExecCmd, msiSetGraftPathScheme, msiSetRandomScheme, msiCheckHostAccessControl, msiGetIcatTime, msiGetTaggedValueFromString, msiXmsgServerConnect, msiXmsgCreateStream, msiCreateXmsgInp, msiSendXmsg, msiRcvXmsg, msiXmsgServerDisConnect, msiString2KeyValPair, msiStrArray2String, msiRdaToStdout

Microservices (continued)
msiRdaToDataObj, msiRdaNoResults, msiRdaCommit, msiAW1, msiRdaRollback, msiRenameLocalZone, msiRenameCollection, msiAclPolicy, msiRemoveKeyValuePairsFromObj, msiDataObjPutWithOptions, msiDataObjReplWithOptions, msiDataObjChksumWithOptions, msiDataObjGetWithOptions, msiSetReServerNumProc, msiGetStdoutInExecCmdOut, msiGetStderrInExecCmdOut, msiAddKeyValToMspStr, msiPrintGenQueryInp, msiTarFileExtract, msiTarFileCreate, msiPhyBundleColl, msiWriteRodsLog, msiServerMonPerf, msiFlushMonStat, msiDigestMonStat, msiSplitPath, msiGetSessionVarValue, msiAutoReplicateService, msiDataObjAutoMove, msiGetContInxFromGenQueryOut, msiSetACL, msiSetRescQuotaPolicy, msiPropertiesNew, msiPropertiesClear, msiPropertiesClone, msiPropertiesAdd, msiPropertiesRemove, msiPropertiesGet, msiPropertiesSet, msiPropertiesExists, msiPropertiesToString, msiPropertiesFromString, msiRecursiveCollCopy, msiGetDataObjACL, msiGetCollectionACL, msiGetDataObjAVUs, msiGetDataObjPSmeta, msiGetCollectionPSmeta, msiGetDataObjAIP, msiLoadMetadataFromDataObj, msiExportRecursiveCollMeta, msiCopyAVUMetadata, msiGetUserInfo, msiGetUserACL, msiCreateUserAccountsFromDataObj, msiLoadUserModsFromDataObj, msiDeleteUsersFromDataObj, msiLoadACLFromDataObj, msiGetAuditTrailInfoByUserID, msiGetAuditTrailInfoByObjectID, msiGetAuditTrailInfoByActionID, msiGetAuditTrailInfoByKeywords, msiGetAuditTrailInfoByTimeStamp, msiSetDataType, msiGuessDataType, msiMergeDataCopies, msiIsColl, msiIsData, msiGetCollectionContentsReport, msiGetCollectionSize, msiStructFileBundle, msiCollectionSpider, msiFlagDataObjwithAVU, msiFlagInfectedObjs

Microservice Modules
• A module must be compiled with the code, though pluggable modules (coming) will no longer require recompiling the whole code
• Consult the microservice book, The integrated Rule-Oriented Data System (iRODS) Micro-service Workbook, to see which module a microservice is contained in
• Enable that module: "Enabled… yes" in info.txt
• Example:

    > irule -F rulemsiCopyAVUMetadata.r
    ERROR: rcExecMyRule error. status = -1102000 NO_MICROSERVICE_FOUND_ERR
    Level 0: DEBUG: execMicroService3: no micro service found
    line 12, col 2
    msiFlagDataObjwithAVU(*Source, *Flag, *Status);

  – msiFlagDataObjwithAVU is contained in module ERA
  – enable module ERA
  – ./irodsctl istop
  – ./irodssetup

Creating New Microservices Modules
• Define Function Signature
• Register mService Mapping
• Create Internal Function
• Describe mService
• Important!! Implement recovery mService
Any function can be converted into a microservice, but it is important to implement recovery microservices.
This procedure will change with coming releases.

II. Simple Rules and Database Queries

Example Rules
• See iRODS/clients/icommands/test/rules3.0 in the iRODS distribution
• See /compZone/home/leesa/rules in the compZone data grid
• See /compZone/home/rods/rules in the compZone data grid
• See /snicZone/home/pracerods/rules in snicZone
• See http://www.renci.org/~leesa/rules/

Listing the Rule Base
showCore.r:

    showCoreRules {
    # Listing of the core.re file
    #
    # Input parameters:
    #   none
      msiAdmShowCoreRE();
    }
    INPUT null
    OUTPUT ruleExecOut

An admin user can execute the rule to show the rule base:

    irule -vF showCore.r

rulemsiExecCmd.r
Run an executable script

    myTestRule {
    #Input parameters are:
    #  Command to be executed - located in directory irods/server/bin/cmd
    #  Optional command argument string
    #  Optional host address for command execution
    #  Optional hint for remote data object path, command is executed on host
    #    where the file is stored
    #  Optional flag. If > 0, use the resolved physical data object path as first argument
    #Output parameter is:
    #  Structure holding status, stdout, and stderr from command execution
    #Output:
    #  Command result is: "Hello world written from irods"
    #
      msiExecCmd(*Cmd, *Arg, "null", *Result);
      msiGetStdoutInExecCmdOut(*Result, *Out);
      writeLine("stdout", "Command result is");
      writeLine("stdout", "*Out");
    }
    INPUT *Cmd="hello", *Arg="written"
    OUTPUT ruleExecOut

"hello" is an executable script in iRODS/server/bin/cmd.

Example Policy Implementation
Using "acPostProcForPut" to implement policy, depending on resource
Data coming in to a target iRODS resource triggers a script that takes some desired action and triggers a message to the admin (unix) user.

    acPostProcForPut {
      on($rescName like "uncResc") {
        writeLine("serverLog", "USER, OBJPATH, and FILEPATH: $userNameClient, $objPath and $filePath");
        msiExecCmd("resource-trigger.sh", "$rescName $objPath $userNameClient", "null", *Out);
        msiSendMail("leesa@renci.org", "resource $rescName", "User $userNameClient just ingested file $objPath into $rescName.");
      }
    }

• acPostProcForPut is contained in iRODS/server/config/reConfigs/core.re
• resource-trigger.sh is contained in /server/bin/cmd and must be executable!!

Example script resource-trigger.sh

    > more resource-trigger.sh
    #!/bin/sh
    # echo "execCmdRule: "$execCmdRule
    rescName=$1
    objPath=$2
    userNameClient=$3
    echo "User $userNameClient just ingested file $objPath into $rescName" > /tmp/resource.out

Example Policy Implementation
Using acPostProcForPut to implement policy: inputs to a specific resource
Data coming in to a target iRODS collection triggers a script that takes some desired action (sending data to a remote ftp site).

    acPostProcForPut {
      on($objPath like "/snicZone/home/prace-class/*") {
        writeLine("serverLog", "$userNameClient sending $objPath to NCDC.");
        msiSplitPath($filePath, *fileDir, *fileName);
        msiExecCmd("outgoing.sh", "*fileDir *fileName", "null", *Out);
        msiSendMail("leesa@renci.org", "send to NCDC", "User $userNameClient sent $objPath to NCDC.");
      }
    }

• acPostProcForPut is contained in iRODS/server/config/reConfigs/core.re
• acPostProcForPut is the same rule in both examples! Just using different conditions.
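The "same rule, different conditions" idea can be modelled outside iRODS. The Python sketch below assumes both acPostProcForPut definitions sit in one rule base, in the order shown, and that `like` behaves as a `*` wildcard match; the action labels and the `select_action` helper are hypothetical stand-ins, not iRODS APIs. It only illustrates first-match selection: the first definition whose condition holds is the one that runs.

```python
from fnmatch import fnmatchcase

# Each entry: (session attribute to test, wildcard pattern, action label).
# The labels are hypothetical stand-ins for the two rule bodies.
RULES = [
    ("rescName", "uncResc", "run resource-trigger.sh"),
    ("objPath", "/snicZone/home/prace-class/*", "run outgoing.sh"),
]

def select_action(session, rules=RULES):
    """Return the action of the first rule whose condition matches, else None."""
    for attr, pattern, action in rules:
        if fnmatchcase(session.get(attr, ""), pattern):
            return action  # later definitions are skipped
    return None

put_event = {"rescName": "demoResc", "objPath": "/snicZone/home/prace-class/data.nc"}
print(select_action(put_event))
```

A put into `uncResc` would select the first action even if the object path also matched the second pattern, because evaluation stops at the first satisfied condition.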

Example script outgoing.sh

    > more outgoing.sh
    #!/bin/sh
    HOST=ftp.****.***
    USER=anonymous
    PASSWD=leesa@renci.org
    srcDir=$1
    srcFile=$2
    echo $srcDir
    echo $srcFile
    #echo "Place holder for outgoing script. Dir: $srcDir, File: $srcFile" > /tmp/test.out

iquest – for constrained querying of the iCAT
• "iquest attrs" shows the attributes you can query on
• Query the iCAT of remote zoneA: iquest -z zoneA …
Example: User rods#compZone, logged into compZone, gives the command

    > iquest -z snicZone "SELECT COLL_NAME, DATA_NAME WHERE COLL_NAME like '/snicZone/home/rods%'"
    Zone is snicZone
    COLL_NAME = /snicZone/home/rods#compZone
    DATA_NAME = hello
    ------------------------------

The response comes from the snicZone iCAT, not the user's home compZone iCAT.
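When iquest output is post-processed in a script, the `NAME = value` lines and dashed separators shown above are easy to turn into records. The following Python sketch assumes the output looks exactly like the sample on this slide; `parse_iquest` is a hypothetical helper, not part of the icommands.

```python
# Sample output copied from the slide.
SAMPLE_OUTPUT = """\
Zone is snicZone
COLL_NAME = /snicZone/home/rods#compZone
DATA_NAME = hello
------------------------------
"""

def parse_iquest(output):
    """Split iquest-style output into one dict per dashed-line-delimited record."""
    records, current = [], {}
    for line in output.splitlines():
        line = line.strip()
        if line.startswith("---"):
            # A dashed line closes the current record.
            if current:
                records.append(current)
                current = {}
        elif " = " in line:
            key, _, value = line.partition(" = ")
            current[key] = value
        # Other lines (e.g. "Zone is snicZone") are ignored.
    if current:
        records.append(current)
    return records

rows = parse_iquest(SAMPLE_OUTPUT)
print(rows)
```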

More on iquest
Set SQL logging to see the actual SQL queries generated using iquest
  – Edit scripts/perl/irodsctl.pl and uncomment the line $spLogSql = "1";
  – ./irodsctl irestart
  – Queries are logged into the iRODS/server/log files

Specific Queries
More on iquest
• iadmin asq – add SQL query
Example: show all data files with metadata, along with the metadata

    > iadmin asq 'select object_id, data_name, meta_attr_value from r_objt_metamap, r_meta_main, r_data_main where r_meta_main.meta_id = r_objt_metamap.meta_id and object_id = data_id;' file-metadata

Here file-metadata is the alias to the query. Now any user can make this query using iquest:

    > iquest --sql file-metadata

Data grid administrators: beware what you enable with specific queries; the iCAT contains passwords, ACLs, etc. Control carefully the access you give.
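To see what the specific query above returns, the join can be tried against a toy database. This Python sketch uses an in-memory SQLite database with only the columns named in the query; the real iCAT tables have many more columns, and the sample rows here are invented for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Toy schema: only the columns referenced by the specific query (an assumption).
conn.executescript("""
CREATE TABLE r_data_main (data_id INTEGER, data_name TEXT);
CREATE TABLE r_meta_main (meta_id INTEGER, meta_attr_value TEXT);
CREATE TABLE r_objt_metamap (object_id INTEGER, meta_id INTEGER);
INSERT INTO r_data_main VALUES (1, 'hello'), (2, 'no-metadata');
INSERT INTO r_meta_main VALUES (10, 'INLS');
INSERT INTO r_objt_metamap VALUES (1, 10);
""")

# Same select/from/where as the query registered with 'iadmin asq'.
rows = conn.execute("""
SELECT object_id, data_name, meta_attr_value
FROM r_objt_metamap, r_meta_main, r_data_main
WHERE r_meta_main.meta_id = r_objt_metamap.meta_id
  AND object_id = data_id
""").fetchall()
print(rows)
```

Only the data object that has an entry in r_objt_metamap appears in the result; objects without metadata are filtered out by the join.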

GenQuery
• The primary query interface in iRODS
• msiMakeGenQuery constructs a SQL string that is passed to the iCAT by a subsequent call to msiExecGenQuery
• Sets up a data structure, so use msiGetValByKey to get values out of the call to msiExecGenQuery
• Many iRODS commands (e.g., ils) use genQuery behind the scenes
• Generates SQL to perform the query subject to access control
• Simplified SQL-like interface: the user doesn't need to know the iCAT schema to make queries
• Queries on the same attributes as iquest

GenQuery Example
test0.r – list all data objects with the name "hello"

    mygenQuery {
    # Input:
    #   string containing attribute list to select on
    #   string containing condition
    # Output:
    #   GenQInp: an internal data structure
    #
      msiMakeGenQuery("COLL_NAME", *Condition, *GenQInp);
      msiExecGenQuery(*GenQInp, *GenQOut);
      foreach(*GenQOut) {
        msiGetValByKey(*GenQOut, "COLL_NAME", *outcoll);
        writeLine("stdout", "*outcoll/hello");
      }
    }
    INPUT *Condition="DATA_NAME = 'hello'"
    OUTPUT ruleExecOut

GenQuery Example

    irule -F test0.r
    /compZone/home/leesa/hello
    /compZone/home/outgoing/hello
    /compZone/home/rods/hello
    /compZone/trash/home/rods/hints/linux-hints/ischia2-hints/hello

This rule was run as user rods (admin user) on compZone. GenQuery will not allow all users to make all queries. ACL permissions are respected with GenQuery.
Examples: see /compZone/home/rods/rules/genQuery or /snicZone/home/pracerods/rules/genQuery
Try tweaking and running some examples.

Database Resources
• DataBase Resource (DBR): an external database (or similar tabular information) that can be queried and updated via SQL statements (or other, for non-SQL)
• Database object (DBO): an interface to the DBR, typically a query that returns results
• DBOs typically contain SQL
• Query results can be stored to an iRODS data object, a DBO Results file (DBOR).
• idbo command – to run the DBO query
• access controls imposed by iRODS ACLs
• https://www.irods.org/index.php/Database_Resources and https://www.irods.org/index.php/Database_Resource_Administration

Setting Up a DBR
• See https://www.irods.org/index.php/Database_Resource_Administration
1. Install a non-IES server (an additional resource) on a machine that has access to the remote DB. Example: install on iren.renci.org.
2. On iren.renci.org, edit server/config/dbr.config and add a line containing:
     DBR-name DBMS-username DBMS-password DB-type (postgres, oracle, mysql)
   Example: leesabase leesapw postgresql
3. On iren.renci.org, put a .odbc.ini file into $HOME. (Can copy – mostly – from $HOME/.odbc.ini on your data grid server.)

    [IRODS_DBR_leesabase]
    Driver=/home/leesa/tutorial-testing/pgsql/libodbcpsql.so.2.0.0
    Debug=0
    CommLog=0
    Servername=informatics.renci.org
    Database=leesabase
    ReadOnly=no
    Ksqo=0
    Port=5432

Setting Up a DBR (continued)
4. Edit iRODS/config.mk:
   – uncomment DBR=1
   – define POSTGRES_HOME: POSTGRES_HOME=/home/leesa/tutorial-testing/pgsql
5. Make sure the LD_LIBRARY_PATH includes the postgres libraries:
     setenv LD_LIBRARY_PATH /home/leesa/tutorial-testing/pgsql/lib
6. Do a new "make" of the code in iRODS/ and then ./irodsctl irestart

Setting Up a DBR (continued)
7. Go over to the main data grid (ischia2.renci.org) and use iadmin to define the DBR in the ICAT. See 'iadmin h mkresc'.
     mkresc Name Type Class Host [Path] (make Resource)
   Example: mkresc leesabase database postgresql iren.renci.org /home/leesa/tutorial-testing/iRODS3.1/Vault
8. Give ownership of the DBR to the appropriate user(s), for write/query permission to the DBR: 'ichmod -R own User-name DBR-name'
   Example: ichmod -R own rods leesabase
9. See iRODS/server/icat/dbr, containing example DBOs. Put a query into a file and ingest it into iRODS:
     iput lt.pg

Setting Up a DBR (continued)
10. Use isysmeta to declare this as a DBO type:
      isysmeta mod lt.pg datatype 'database object'
11. Use idbo to execute a query (contained in a DBO) on the DBR (leesabase in this case), running as a user that has 'own' permission on the DBR:
      idbo exec dbrname dboname
    Example: idbo exec leesabase lt.pg
Note: when giving this command, you either need to be in the collection that contains the DBO file, or you need to give the full path name to it.

Using a DBR
• idbo to execute a query contained in a DBO
• Store results into a DBO Result file (DBOR) by using idbo as an interactive shell:

    >idbo> output lt.out
    'exec' results will be stored in /compZone/home/rods/dbr-examples/lt.out
    idbo>exec leesabase lt.pg
    Output written to /compZone/home/rods/dbr-examples/lt.out
    idbo>quit

• Examples: in /compZone/home/rods/dbr-examples
NB: For this tutorial the DBR hasn't been set up on snicZone, so DBOs must be run on compZone.

III. Complex Rules and Scheduling

Delayed Execution
• Example

    myTestRule {
      delay("<PLUSET>1m</PLUSET>") {
        writeLine("stdout", "Writing message with a delay.");
        msiSendStdoutAsEmail(*Mailto, "Sending email");
      }
    }
    INPUT *Mailto="leesa@renci.org"
    OUTPUT ruleExecOut

• Queue management:
  – iqstat
  – iqdel
  – iqmod

Periodic Execution Example

    myTestRule {
    # Input parameters are:
    #   Source collection path
    #   Target collection path
    #   Optional target resource
    #   Optional synchronization mode: IRODS_TO_IRODS
    # Output parameter is:
    #   Status of the operation
    # Output from running the example is:
    #   Synchronized collection 1 with collection 2
    #
      delay("<PLUSET>5m</PLUSET><EF>1h</EF>") {
        msiCollRsync(*srcColl, *destColl, *Resource, "IRODS_TO_IRODS", *Status);
        writeLine("stdout", "Synchronized collection *srcColl with collection *destColl");
      }
    }
    INPUT *srcColl="/compZone/home/leesa/tutorials", *destColl="/compZone/home/leesa/tutorials2", *Resource="demoResc"
    OUTPUT ruleExecOut
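The delay condition strings used in the delayed and periodic examples are small tagged expressions. As a rough illustration, the Python sketch below converts simple `<TAG>number+unit</TAG>` entries into seconds. It assumes only the units s, m, h and d and ignores richer forms that the real rule engine may accept; `parse_delay` is a hypothetical helper, not an iRODS function.

```python
import re

# Assumed unit table for this sketch: seconds, minutes, hours, days.
UNIT_SECONDS = {"s": 1, "m": 60, "h": 3600, "d": 86400}

def parse_delay(condition):
    """Return {tag: seconds} for each well-formed <TAG>number+unit</TAG> entry."""
    result = {}
    pattern = r"<(\w+)>\s*(\d+)\s*([smhd])\s*</\1>"
    for tag, amount, unit in re.findall(pattern, condition):
        result[tag] = int(amount) * UNIT_SECONDS[unit]
    return result

print(parse_delay("<PLUSET>5m</PLUSET><EF>1h</EF>"))
```

Read this way, the periodic example starts 300 seconds after submission and repeats every 3600 seconds. A tag that is missing its opening `<` is simply not recognised by this sketch.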

More Interesting Rules
Policy to control where data is stored upon ingestion (courtesy of iPlant)
• Data ingested into TACC collections will be forced to resource “corral”:
    acSetRescSchemeForCreate {
      ON($objPath like "/tacc/Collections/*") {
        msiSetDefaultResc("corral", "forced");
      }
    }
  (This forces the physical location and has no effect on the logical location.)
• All other data goes into the iplant resource group, randomly distributed among the resources there:
    acSetRescSchemeForCreate {
      msiSetDefaultResc("iplantRG", "preferred");
      msiSetRescSortScheme("random");
    }
  (This means that users do not need to specify a default resource.)
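The two acSetRescSchemeForCreate rules rely on the rule base being evaluated in order, with the first rule whose condition holds being the one that runs. The Python sketch below imitates that selection for the two rules on this slide. It is an illustration under assumptions: select_resource and RULES are hypothetical names, and fnmatchcase only approximates the rule language's like operator (it also gives special meaning to ? and [ ]).

```python
from fnmatch import fnmatchcase

# Ordered rule table: (path pattern or None for "no condition", (resource, mode)).
# The order mirrors the slide: the TACC rule comes before the catch-all rule.
RULES = [
    ("/tacc/Collections/*", ("corral", "forced")),
    (None, ("iplantRG", "preferred")),
]


def select_resource(obj_path, rules=RULES):
    """Return the (resource, mode) pair of the first rule whose condition matches."""
    for pattern, action in rules:
        if pattern is None or fnmatchcase(obj_path, pattern):
            return action  # first match wins; later rules are skipped
    raise LookupError(f"no rule matched {obj_path!r}")


if __name__ == "__main__":
    print(select_resource("/tacc/Collections/run1/data.dat"))
    print(select_resource("/iplant/home/leesa/data.dat"))
```

Swapping the two entries in RULES would send every path to the catch-all rule, which is why the more specific rule has to come first.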

Interesting Rules
Rule to load metadata on a file (courtesy of Reagan Moore)
    myRulemsiAddKeyVal {
      # Input parameters
      #   Key-value buffer (may be empty)
      #   Key
      #   Value
      msiAddKeyVal(*Keyval, *Key, *Val);
      msiAssociateKeyValuePairsToObj(*Keyval, *FilePath, "-d");
      msiGetDataObjPSmeta(*FilePath, *Buf);
      writeBytesBuf("stdout", *Buf);
    }
    INPUT *FilePath="/lifelibZone/home/rwmoore/foo1", *Key="School", *Val="INLS"
    OUTPUT ruleExecOut

Interesting Rules
Set default resources and enforce size limits on IDRISZone1 (courtesy of Anges Ansari)
    acSetRescSchemeForCreate {
      ON($objPath like "/IDRISZone1/*") {
        if (double($dataSize) <= 104857600) {
          msiSetDefaultResc("Resc1", "null");
        } else {
          msiWriteRodsLog("File size not allowed", "status");
          fail;
        }
      }
      ON($objPath like "/ccin2p3/*") {
        msiSetDefaultResc("lyon2", "null");
      }
    }
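The threshold 104857600 in the IDRISZone1 rule is 100 × 1024 × 1024 bytes, that is, 100 MiB. The short Python sketch below spells out that arithmetic and the same less-than-or-equal comparison. MAX_BYTES and size_allowed are illustrative names that do not appear in the rule.

```python
# 100 MiB expressed in bytes; equals the literal 104857600 used in the rule.
MAX_BYTES = 100 * 1024 * 1024


def size_allowed(data_size):
    """Mirror the rule's test: double($dataSize) <= 104857600."""
    return float(data_size) <= MAX_BYTES


if __name__ == "__main__":
    print(MAX_BYTES)
    print(size_allowed("104857600"))  # exactly at the limit: allowed
    print(size_allowed(104857601))    # one byte over: rejected
```

As in the rule, the size is converted to a floating-point number before the comparison, so a value passed as a string is handled the same way as a numeric one.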

Complex Rule – Verify integrity and number of replicas (courtesy of Reagan Moore)
    schedulerReplicas {
      # This rule requires iRODS version 3.1 (msiCloseGenQuery mods)
      # The replicas for each file are updated to the most recent version
      # Each file is checked to verify existence of all required replicas and validity of checksums
      # As replicas are created, the algorithm round robins through available storage vaults
      # Checks that the number of storage resources used within a collection is greater than or
      # equal to the number of desired replicas.
      # This uses a just in time scheduler that slows down the processing rate
      # to complete the task within the specified number of seconds (*Delt)
      # Checks a TEST_DATA_ID parameter associated with the collection
      # to enable restarts after system interrupts
      # Writes a log file stored as Check-Timestamp in directory *Coll/log
      # Get current time, Timestamp is YYYY-MM-DD.hh:mm:ss
      msiGetSystemTime(*TimeS, "unix");
      msiGetSystemTime(*TimeH, "human");
      *NumBadFiles = 0;
      *NumRepCreated = 0;
      *NumFiles = 0;
      *Runsize = double(0);
      *Sleeptime = 0;
      *colldataID = "0";

Verify integrity and number of replicas (continued)
      # This is used to round robin through available storage resources
      *Jround = 0;
      # Check whether a collection was defined
      msiIsColl(*Coll, *Result, *Status);
      if(*Result == 0 || *Status < 0) {
        writeLine("stdout", "Input path *Coll is not a collection");
        fail;
      }
      #====== create a collection for log files if it does not exist =====
      *LPath = "*Coll/log";
      msiIsColl(*LPath, *Result, *Status);
      if(*Result == 0 || *Status < 0) {
        msiCollCreate(*LPath, "0", *Status);
        if(*Status < 0) {
          writeLine("stdout", "Could not create log collection");
          fail;
        }
      }

Verify integrity and number of replicas (continued)
      # create file into which results will be written
      *Lfile = "*LPath/Check-*TimeH";
      *Dfile = "destRescName=*Res++++forceFlag=";
      msiDataObjCreate(*Lfile, *Dfile, *L_FD);
      # check whether the attribute TEST_DATA_ID has been set from a prior execution
      *Val = "0";
      msiExecStrCondQuery("SELECT COUNT(META_COLL_ATTR_NAME) where COLL_NAME = '*Coll' and META_COLL_ATTR_NAME = 'TEST_DATA_ID'", *GenQOut2);
      foreach (*GenQOut2) {
        msiGetValByKey(*GenQOut2, "META_COLL_ATTR_NAME", *Val);
      }
      if(int(*Val) == 0) {
        *Str1 = "TEST_DATA_ID=0";
        msiString2KeyValPair(*Str1, *kvp);
        msiAssociateKeyValuePairsToObj(*kvp, *Coll, "-C");
        writeLine("*Lfile", "added TEST_DATA_ID attribute to collection *Coll");
      }

Verify integrity and number of replicas (continued)
      # on a restart TEST_DATA_ID will be greater than 0
      msiMakeGenQuery("META_COLL_ATTR_VALUE", "COLL_NAME = '*Coll' and META_COLL_ATTR_NAME = 'TEST_DATA_ID'", *GenQInp2);
      msiExecGenQuery(*GenQInp2, *GenQOut2);
      foreach(*GenQOut2) {
        msiGetValByKey(*GenQOut2, "META_COLL_ATTR_VALUE", *colldataID);
      }
      # *colldataID is the string identifier of the last file that has been checked
      msiCloseGenQuery(*GenQInp2, *GenQOut2);
      msiMakeGenQuery("count(DATA_NAME), sum(DATA_SIZE)", "COLL_NAME = '*Coll' and DATA_ID > '*colldataID'", *GenQInp2);
      # this counts all files that have not yet been checked including replicas
      msiExecGenQuery(*GenQInp2, *GenQOut2);
      foreach(*GenQOut2) {
        msiGetValByKey(*GenQOut2, "DATA_NAME", *num);
        msiGetValByKey(*GenQOut2, "DATA_SIZE", *sizetotal);
      }
and on for many more pages…
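The rule's header comments describe round-robin placement of new replicas across the available storage resources, together with a check that there are at least as many resources as desired replicas. The part of the rule that does this is not shown on these slides, so the Python sketch below is only one plausible reading of that description. pick_resources and its parameters are hypothetical, and jround merely echoes the rule's *Jround counter.

```python
def pick_resources(resources, num_replicas, jround):
    """Choose num_replicas distinct resources, starting at position jround.

    Returns the chosen resources and the counter value to use for the next file.
    This is an assumed interpretation, not code taken from the rule itself.
    """
    if len(resources) < num_replicas:
        # Corresponds to the stated check: resources >= desired replicas.
        raise ValueError("fewer storage resources than desired replicas")
    chosen = [resources[(jround + i) % len(resources)] for i in range(num_replicas)]
    next_jround = (jround + num_replicas) % len(resources)
    return chosen, next_jround


if __name__ == "__main__":
    vaults = ["resc1", "resc2", "resc3"]  # illustrative resource names
    jround = 0
    for name in ["fileA", "fileB", "fileC"]:
        chosen, jround = pick_resources(vaults, 2, jround)
        print(name, chosen)
```

Carrying the counter from one file to the next is what spreads replicas evenly over the resources instead of always filling the first ones.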