Developer APIs to Condor + A Tutorial on Condor’s Web Service Interface
Computer Sciences Department
University of Wisconsin-Madison
condor-admin@cs.wisc.edu
http://www.cs.wisc.edu/condor

Interfacing Applications w/ Condor
› Suppose you have an application which needs a lot of compute cycles
› You want this application to utilize a pool of machines
› How can this be done?

Some Condor APIs
› Command Line tools
  - condor_submit, condor_q, etc.
› DRMAA
› Condor GAHP
› JSDL
› RDBMS
› Condor Perl Module
› SOAP

Command Line Tools
› Don’t underestimate them!
› Your program can create a submit file on disk and simply invoke condor_submit:
    system("echo universe=VANILLA > /tmp/condor.sub");
    system("echo executable=myprog >> /tmp/condor.sub");
    ...
    system("echo queue >> /tmp/condor.sub");
    system("condor_submit /tmp/condor.sub");
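
The same approach can be sketched in Java, the language used for the later examples in this deck. This is a minimal sketch, not Condor-provided code: the class name SubmitFileSketch and its methods are hypothetical, and it assumes condor_submit is on the PATH when submit() is called.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper: builds a submit description and hands it to condor_submit.
final class SubmitFileSketch {

    /** Renders "key = value" lines followed by the closing "queue" statement. */
    static String render(Map<String, String> settings) {
        StringBuilder sb = new StringBuilder();
        for (Map.Entry<String, String> e : settings.entrySet()) {
            sb.append(e.getKey()).append(" = ").append(e.getValue()).append('\n');
        }
        return sb.append("queue\n").toString();
    }

    /** Writes the description to a temporary file and runs condor_submit on it. */
    static int submit(Map<String, String> settings) throws IOException, InterruptedException {
        Path file = Files.createTempFile("condor", ".sub");
        Files.write(file, render(settings).getBytes(StandardCharsets.UTF_8));
        Process p = new ProcessBuilder("condor_submit", file.toString())
                .inheritIO()   // show condor_submit's own messages
                .start();
        return p.waitFor();    // exit code of condor_submit
    }

    public static void main(String[] args) throws Exception {
        Map<String, String> settings = new LinkedHashMap<>();
        settings.put("universe", "vanilla");
        settings.put("executable", "myprog");
        System.out.print(render(settings));
        // On a machine with Condor installed, submit for real:
        // System.exit(submit(settings));
    }
}
```

Keeping render() separate from submit() means the generated description can be inspected or logged before anything is handed to Condor.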

Command Line Tools
› Your program can create a submit file and give it to condor_submit through stdin:
  PERL:
    open(SUBMIT, "|condor_submit");
    print SUBMIT "universe=VANILLA\n";
    ...
  C/C++:
    FILE *s = popen("condor_submit", "w");
    fputs("universe=VANILLA\n", s);
    ...

Command Line Tools
› Using the +Attribute with condor_submit:
    universe = VANILLA
    executable = /bin/hostname
    output = job.out
    log = job.log
    +webuser = "zmiller"
    queue

Command Line Tools
› Use -constraint and -format with condor_q:
    % condor_q -constraint 'webuser=="zmiller"'
    -- Submitter: bio.cs.wisc.edu : <128.105.147.96:37866> : bio.cs.wisc.edu
     ID        OWNER    SUBMITTED    RUN_TIME  ST PRI SIZE CMD
     213503.0  zmiller  10/11 06:00  0+00:00   I  0   0.0  hostname

    % condor_q -constraint 'webuser=="zmiller"' -format "%i\t" ClusterId -format "%s\n" Cmd
    213503  /bin/hostname
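
A program that consumes this output might look like the following Java sketch. The class CondorQSketch and its methods are hypothetical names; the sketch assumes condor_q is on the PATH when query() is called, and it does not escape quotes inside the user name.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStreamReader;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical helper: runs condor_q with -constraint/-format and parses the result.
final class CondorQSketch {

    /** Builds the same argument list as the shell command on the slide. */
    static List<String> command(String user) {
        return Arrays.asList("condor_q",
                "-constraint", "webuser==\"" + user + "\"",
                "-format", "%i\\t", "ClusterId",   // literal backslash-t, as the shell passes it
                "-format", "%s\\n", "Cmd");
    }

    /** Parses "cluster<TAB>cmd" lines into {cluster, cmd} pairs, skipping blank lines. */
    static List<String[]> parse(String output) {
        List<String[]> rows = new ArrayList<>();
        for (String line : output.split("\n")) {
            if (line.trim().isEmpty()) continue;
            String[] parts = line.split("\t", 2);
            if (parts.length == 2) rows.add(parts);
        }
        return rows;
    }

    /** Runs condor_q and returns the parsed rows. */
    static List<String[]> query(String user) throws IOException, InterruptedException {
        Process p = new ProcessBuilder(command(user)).redirectErrorStream(true).start();
        StringBuilder out = new StringBuilder();
        try (BufferedReader r = new BufferedReader(
                new InputStreamReader(p.getInputStream(), StandardCharsets.UTF_8))) {
            String line;
            while ((line = r.readLine()) != null) out.append(line).append('\n');
        }
        p.waitFor();
        return parse(out.toString());
    }

    public static void main(String[] args) {
        // Parse the sample output from the slide instead of contacting a real pool.
        for (String[] row : parse("213503\t/bin/hostname\n")) {
            System.out.println("cluster=" + row[0] + " cmd=" + row[1]);
        }
    }
}
```

Choosing a tab as the field separator in -format keeps the parsing trivial, since a command path may itself contain spaces.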

Command Line Tools
› condor_wait will watch a job log file and wait for a certain job (or all jobs) to complete:
    system("condor_wait job.log");
› can specify a timeout
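
Waiting with a timeout could be sketched in Java as follows. CondorWaitSketch is a hypothetical name. The sketch assumes condor_wait accepts the timeout as "-wait <seconds>" and exits with status 0 only when the jobs finished; check both against the condor_wait manual page for your release.

```java
import java.io.IOException;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper around condor_wait.
final class CondorWaitSketch {

    /** Builds the condor_wait command line; a timeout of 0 or less means "wait forever". */
    static List<String> command(String logFile, int timeoutSeconds) {
        List<String> cmd = new ArrayList<>();
        cmd.add("condor_wait");
        if (timeoutSeconds > 0) {
            cmd.add("-wait");                              // assumed timeout option
            cmd.add(Integer.toString(timeoutSeconds));
        }
        cmd.add(logFile);
        return cmd;
    }

    /** Returns true if condor_wait exited with status 0 (assumed to mean "jobs done"). */
    static boolean waitForJobs(String logFile, int timeoutSeconds)
            throws IOException, InterruptedException {
        Process p = new ProcessBuilder(command(logFile, timeoutSeconds)).inheritIO().start();
        return p.waitFor() == 0;
    }

    public static void main(String[] args) {
        // Only prints the command; call waitForJobs() on a machine with Condor installed.
        System.out.println(String.join(" ", command("job.log", 60)));
    }
}
```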

Command Line Tools
› condor_q and condor_status have an -xml option
› So it is relatively simple to build on top of Condor’s command line tools alone, and they can be accessed from many different languages (C, Perl, Python, PHP, etc.).
› However…

DRMAA
› DRMAA is a GGF-standardized job submission API
› Has C (and now Java) bindings
› Is not Condor-specific: your app could submit to any job scheduler with minimal changes (probably just linking in a different library)
› SourceForge project: http://sourceforge.net/projects/condor-ext

DRMAA
› Easy to use, but
› Unfortunately, the DRMAA API does not support some very important features, such as:
  - Two-phase commit
  - Fault tolerance
  - Transactions

Condor GAHP
› The Condor GAHP is a relatively low-level protocol based on simple ASCII messages through stdin and stdout
› Supports a rich feature set including two-phase commits, transactions, and optional asynchronous notification of events
› Is available in Condor 6.7.X

Example: GAHP, cont
    R: $GahpVersion: 1.0.0 Nov 26 2001 NCSA CoG Gahpd $
    S: GRAM_PING 100 vulture.cs.wisc.edu/fork
    R: E
    S: RESULTS
    R: E
    S: COMMANDS
    R: S COMMANDS GRAM_JOB_CANCEL GRAM_JOB_REQUEST GRAM_JOB_SIGNAL GRAM_JOB_STATUS GRAM_PING INITIALIZE_FROM_FILE QUIT RESULTS VERSION
    S: VERSION
    R: S $GahpVersion: 1.0.0 Nov 26 2001 NCSA CoG Gahpd $
    S: INITIALIZE_FROM_FILE /tmp/grid_proxy_554523.txt
    R: S
    S: GRAM_PING 100 vulture.cs.wisc.edu/fork
    R: S
    S: RESULTS
    R: S 0
    S: RESULTS
    R: S 1
    R: 100 0
    S: QUIT
    R: S
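
A client reading such replies needs to separate the status letter from whatever follows it. The Java sketch below does only that; GahpReplySketch is a hypothetical name, and reading "S" as success and "E" as error is an interpretation of the transcript above (GRAM_PING is refused until INITIALIZE_FROM_FILE has succeeded), not a statement of the full GAHP specification. Result lines such as "100 0", which follow a "S 1" reply to RESULTS, are not status replies and are rejected by this parser.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical parser for one GAHP status reply line (the text after "R: ").
final class GahpReplySketch {
    final boolean success;      // true for "S", false for "E"
    final List<String> fields;  // whatever follows the status letter

    private GahpReplySketch(boolean success, List<String> fields) {
        this.success = success;
        this.fields = fields;
    }

    static GahpReplySketch parse(String line) {
        String[] tokens = line.trim().split("\\s+");
        if (!tokens[0].equals("S") && !tokens[0].equals("E")) {
            throw new IllegalArgumentException("not a status reply: " + line);
        }
        return new GahpReplySketch(tokens[0].equals("S"),
                Arrays.asList(tokens).subList(1, tokens.length));
    }

    public static void main(String[] args) {
        for (String line : new String[] { "E", "S 0", "S 1" }) {
            GahpReplySketch r = parse(line);
            System.out.println(line + " -> success=" + r.success + " fields=" + r.fields);
        }
    }
}
```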

JSDL and Condor
› GridSAM: open source web service for job submission and monitoring
› Condor plugin for GridSAM enables JSDL submissions to Condor.

RDBMS: Quill
› Job ClassAds information mirrored into an RDBMS
› Both active jobs and historical jobs
› Benefits BOTH scalability and accessibility
[Diagram labels: Master, Startd, …, Schedd; Job Queue log; Quill; RDBMS; Queue + History Tables]

Condor Perl Module
› Perl module to parse the "job log file"
› Recommended instead of polling w/ condor_q
› Call-back event model
› (Note: job log can be written in XML)

Web Service Interface
› Simple Object Access Protocol
  - Mechanism for doing RPC using XML (typically over HTTP or HTTPS)
  - A World Wide Web Consortium (W3C) standard
› SOAP Toolkit: Transform a WSDL to a client library

Benefits of a Condor SOAP API
› Condor becomes a service
  - Can be accessed with standard web service tools
› Condor accessible from platforms where its command-line tools are not supported
› Talk to Condor with your favorite language and SOAP toolkit

Condor SOAP API functionality
› Submit jobs
› Retrieve job output
› Remove/hold/release jobs
› Query machine status
› Query job status

Getting machine status via SOAP
[Diagram: your program, through its SOAP library, calls queryStartdAds() on the condor_collector using SOAP over HTTP; the condor_collector returns the machine list]

Let’s get some details…

The API
› Core API, described with WSDL, is designed to be as flexible as possible
  - File transfer is done in chunks
  - Transactions are explicit
› Wrapper libraries aim to make common tasks as simple as possible
  - Currently in Java and C#
  - Expose an object-oriented interface

Things we will cover
› Condor setup
› Necessary tools
› Job Submission
› Job Querying
› Job Retrieval
› Authentication with SSL and X.509
  - An important addition in late 6.7

Condor setup
› Start with a working condor_config
› The SOAP interface is off by default
  - Turn it on by adding ENABLE_SOAP=TRUE
› Access to the SOAP interface is denied by default
  - Set ALLOW_SOAP and DENY_SOAP; they work like ALLOW_READ/WRITE/…
  - See section 3.7.4 of the v6.7 manual for a description
  - Example: ALLOW_SOAP=*/*.cs.wisc.edu

Necessary tools
› You need a SOAP toolkit
  - Apache Axis (Java) - http://ws.apache.org/axis/
  - Microsoft .Net - http://microsoft.com/net/
  - gSOAP (C/C++) - http://gsoap2.sf.net/
  - ZSI (Python) - http://pywebsvcs.sf.net/
  - SOAP::Lite (Perl) - http://soaplite.com/
  (All our examples are in Java using Apache Axis)
› You need Condor’s WSDL files
  - Find them in lib/webservice/ in your Condor release
› Put the two together to generate a client library
  - $ java org.apache.axis.wsdl.WSDL2Java condorSchedd.wsdl
› Compile that client library
  - $ javac condor/*.java

Helpful tools
› The core API has some complex spots
› A wrapper library is available in Java and C#
  - Makes the API a bit easier to use (e.g. simpler file transfer & job ad submission)
  - Makes the API more OO, no need to remember and pass around transaction ids
› We are going to use the Java wrapper library for our examples
  - You can download it from http://www.cs.wisc.edu/condor/birdbath.jar
  - Will be included in Condor release

Submitting a job
› The CLI way…
  cp.sub (explicit bits):
    universe = vanilla
    executable = /bin/cp
    arguments = cp.sub cp.worked
    should_transfer_files = yes
    transfer_input_files = cp.sub
    when_to_transfer_output = on_exit
    queue 1
  Implicit bits:
    clusterid = X
    procid = Y
    owner = matt
    requirements = Z
  $ condor_submit cp.sub

Submitting a job
• The SOAP way…
  1. Begin transaction
  2. Create cluster
  3. Create job
  4. Send files
  5. Describe job
  6. Commit transaction
• Repeat from step 3 to submit multiple jobs in a single cluster
• Repeat from step 2 to submit multiple clusters

Submission from Java
    Schedd schedd = new Schedd("http://…");
    Transaction xact = schedd.createTransaction();   // 1. Begin transaction
    xact.begin(30);
    int cluster = xact.createCluster();               // 2. Create cluster
    int job = xact.createJob(cluster);                // 3. Create job
    File[] files = { new File("cp.sub") };            // 4 & 5. Send files & describe job
    xact.submit(cluster, job, "owner", UniverseType.VANILLA,
                "/bin/cp", "cp.sub cp.worked", "requirements", null, files);
    xact.commit();                                    // 6. Commit transaction

Submission from Java
    Schedd schedd = new Schedd("http://…");          // Schedd’s location
    Transaction xact = schedd.createTransaction();
    xact.begin(30);                                   // Max time between calls (seconds)
    int cluster = xact.createCluster();
    int job = xact.createJob(cluster);
    File[] files = { new File("cp.sub") };
    xact.submit(cluster, job,
                "owner",                  // Job owner, e.g. "matt"
                UniverseType.VANILLA,
                "/bin/cp", "cp.sub cp.worked",
                "requirements",           // Requirements, e.g. "OpSys==\"Linux\""
                null,                     // Extra attributes, e.g. Out="stdout.txt" or Err="stderr.txt"
                files);
    xact.commit();

Querying jobs
› The CLI way…
    $ condor_q
    -- Submitter: localhost : <127.0.0.1:1234> : localhost
     ID   OWNER  SUBMITTED    RUN_TIME    ST PRI SIZE CMD
     1.0  matt   10/27 14:45  0+02:46:42  C  0   1.8  sleep 10000
    …
    42 jobs; 1 idle, 1 running, 1 held, 1 unexpanded

Querying jobs
› The SOAP way from Java…
    String[] statusName = { "", "Idle", "Running", "Removed", "Completed", "Held" };
    int cluster = 1;
    int job = 0;
    Schedd schedd = new Schedd("http://…");
    ClassAd ad = new ClassAd(schedd.getJobAd(cluster, job));
    int status = Integer.valueOf(ad.get("JobStatus"));
    System.out.println("Job is " + statusName[status]);
› Also, getJobAds given a constraint, e.g. "Owner==\"matt\""

Retrieving a job
› The CLI way…
› Well, if you are submitting to a local Schedd, the Schedd will have all of a job’s output written back for you
› If you are doing remote submission you need condor_transfer_data, which takes a constraint and transfers all files in spool directories of matching jobs

Retrieving a job
› The SOAP way in Java…
    int cluster = 1;
    int job = 0;
    Schedd schedd = new Schedd("http://…");
    Transaction xact = schedd.createTransaction();
    xact.begin(30);
    FileInfo[] files = xact.listSpool(cluster, job);   // Discover available files
    for (FileInfo file : files) {
        xact.getFile(cluster, job,
                     file.getName(), file.getSize(),   // Remote file
                     new File(file.getName()));        // Local file
    }
    xact.commit();

Authentication for SOAP
› Authentication is done via mutual SSL authentication
  - Both the client and server have certificates and identify themselves
› Possible in late-late 6.7 (available by 6.8)
› It is not always necessary, e.g. in some controlled environments (a portal) where the submitting component is trusted
› A necessity in an open environment: remember that the submit call takes the job’s owner as a parameter
  - Imagine what happens if anyone can submit to a Schedd running as root…

Authentication setup
› Create and sign some certificates
› Use OpenSSL to create a CA
  - CA.sh -newca
› Create a server cert and password-less key
  - CA.sh -newreq && CA.sh -sign
  - mv newcert.pem server-cert.pem
  - openssl rsa -in newreq.pem -out server-key.pem
› Create a client cert and key
  - CA.sh -newreq && CA.sh -sign && mv newcert.pem client-cert.pem && mv newreq.pem client-key.pem

Authentication config
› Config options…
  - ENABLE_SOAP_SSL is FALSE by default
  - <SUBSYS>_SOAP_SSL_PORT
    • Set this to a different port for each SUBSYS you want to talk to over SSL; the default is a random port
    • Example: SCHEDD_SOAP_SSL_PORT=1980
  - SOAP_SSL_SERVER_KEYFILE is required and has no default
    • The file containing the server’s certificate AND private key, i.e. "keyfile" after cat server-cert.pem server-key.pem > keyfile

Authentication config
› Config options continue…
  - SOAP_SSL_CA_FILE is required
    • The file containing public CA certificates used in signing client certificates, e.g. demoCA/cacert.pem
› All options except SOAP_SSL_PORT have an optional SUBSYS_* version
  - For instance, turn on SSL for everyone except the Collector with
    • ENABLE_SOAP_SSL=TRUE
    • COLLECTOR_ENABLE_SOAP_SSL=FALSE

One last bit of config
› The certificates we generated have a principal name, which is not standard across many authentication mechanisms
› Condor maps authenticated names (here, principal names) to canonical names that are authentication method independent
› This is done through mapfiles, given by SEC_CANONICAL_MAPFILE and SEC_USER_MAPFILE
› Canonical map: SSL .*emailAddress=(.*)@cs.wisc.edu.* \1
› User map: (.*) \1
› "SSL" is the authentication method, ".*emailAddress….*" is a pattern to match against authenticated names, and "\1" is the canonical name, in this case the username on the email in the principal
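
To see what the canonical map pattern extracts, the following Java sketch applies an equivalent regular expression to a made-up principal name. It only illustrates the regex. It is not how Condor evaluates mapfiles, and Condor's regex dialect may differ from java.util.regex. The class name and the sample principal are hypothetical, and the dots in the domain are escaped here, which the slide's pattern does not do.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical illustration of the canonical-map pattern from the slide.
final class CanonicalMapSketch {

    // Same shape as ".*emailAddress=(.*)@cs.wisc.edu.*", with the domain dots escaped.
    private static final Pattern SSL_PRINCIPAL =
            Pattern.compile(".*emailAddress=(.*)@cs\\.wisc\\.edu.*");

    /** Returns capture group 1 (the "\1" of the mapfile), or null if there is no match. */
    static String canonicalName(String principal) {
        Matcher m = SSL_PRINCIPAL.matcher(principal);
        return m.matches() ? m.group(1) : null;
    }

    public static void main(String[] args) {
        // Made-up principal name in the style OpenSSL prints for a certificate subject.
        String principal = "/C=US/O=Example/CN=Matt/emailAddress=matt@cs.wisc.edu";
        System.out.println(canonicalName(principal));
    }
}
```

A principal whose email address is outside cs.wisc.edu does not match, so canonicalName() returns null for it.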

HTTPS with Java
› Setup keys…
  - keytool -import -keystore truststore -trustcacerts -file demoCA/cacert.pem
  - openssl pkcs12 -export -inkey client-key.pem -in client-cert.pem -out keystore
› All the previous code stays the same, just set some properties
  - javax.net.ssl.trustStore, javax.net.ssl.keyStore, javax.net.ssl.keyStoreType, javax.net.ssl.keyStorePassword
  - Example: java -Djavax.net.ssl.trustStore=truststore -Djavax.net.ssl.keyStore=keystore -Djavax.net.ssl.keyStoreType=PKCS12 -Djavax.net.ssl.keyStorePassword=pass Example https://…
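
The same JSSE properties can also be set from inside the program instead of on the command line, as in this minimal sketch. SoapSslPropertiesSketch is a hypothetical name; the file names and password are the ones from the example above. The properties must be set before the first HTTPS connection is opened, because the default SSL context is typically initialized only once.

```java
// Hypothetical helper: sets the standard JSSE system properties programmatically.
final class SoapSslPropertiesSketch {

    static void configure(String trustStore, String keyStore, String keyStorePassword) {
        System.setProperty("javax.net.ssl.trustStore", trustStore);   // CA certificate(s)
        System.setProperty("javax.net.ssl.keyStore", keyStore);       // client cert + key
        System.setProperty("javax.net.ssl.keyStoreType", "PKCS12");   // format written by openssl pkcs12
        System.setProperty("javax.net.ssl.keyStorePassword", keyStorePassword);
    }

    public static void main(String[] args) {
        configure("truststore", "keystore", "pass");
        System.out.println(System.getProperty("javax.net.ssl.keyStoreType"));
        // ...then create the Schedd with an https:// URL as in the earlier slides.
    }
}
```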

Questions?