Pseudo dynamic DAG control Version 1

- Slides: 18
Outline
• Goal
• Solution
• Restrictions
• Example
• Case Study
Goal
• The user should be able to redirect control in the workflow depending on the outcome of any job. The exit value set by the executable of the user’s job determines whether the job fails or succeeds.
• A failed job should therefore not stop the overall operation, and no real computational activity may be started in the branch proved to be false.
• The solution should be a clean “user level” one that requires no change in the P-GRADE Portal middleware.
The Solution
The solution is a suggested job template in which the frame of the job is standardized.
Solution details
• Enveloping the executable of the original job in a standard wrapper program which terminates as TRUE. The wrapper program is written in C and can be downloaded from http://www.sztaki.hu/~ghermann/Szemelyes/PseudoDynamicDAGControl/wrapper.exe
• Adding standard logical input/output channels to the wrapped job to control the flow
Restriction of the solution
• The solution handles only internal (programmed) job failures.
• Failures due to the environment (resource, authentication and communication problems) are recognized by DAGMAN and can be handled by the Rescue feature of the P-GRADE Portal.
I/O convention for Job Wrapper
Extension of an original job
Original job (original input files in, original output files out):
    Executable();
Modified job (adds the EXECUTABLE_INPUT and LOG_INPUT input ports and the TRUE_OUTPUT and FALSE_OUTPUT output ports):
    If(LOG_INPUT.value) LOG_INPUT.value = Executable().exit;
    TRUE_OUTPUT.value = LOG_INPUT.value;
    FALSE_OUTPUT.value = !LOG_INPUT.value;
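The modified-job logic can be modeled in a few lines. The following is only a sketch of the decision rule, not the code of wrapper.exe: it assumes that an exit value of zero counts as TRUE, and it reads the negated value in the last assignment as the updated LOG_INPUT value. The function name `wrapper_tokens` and the callable standing in for the user executable are hypothetical.

```python
from typing import Callable, Tuple


def wrapper_tokens(log_input: bool, run_executable: Callable[[], int]) -> Tuple[bool, bool]:
    """Return (TRUE_OUTPUT, FALSE_OUTPUT) for one wrapped job.

    run_executable is a hypothetical stand-in for the user executable and
    returns its exit value.
    """
    permitted = log_input
    if permitted:
        # Assumption: exit value 0 means TRUE (success), anything else FALSE.
        permitted = run_executable() == 0
    return permitted, not permitted


# LOG_INPUT is FALSE: the executable is skipped.
print(wrapper_tokens(False, lambda: 0))  # (False, True)
# LOG_INPUT is TRUE and the executable succeeds.
print(wrapper_tokens(True, lambda: 0))   # (True, False)
# LOG_INPUT is TRUE and the executable fails.
print(wrapper_tokens(True, lambda: 3))   # (False, True)
```

Under this reading exactly one of the two output tokens is TRUE for every wrapped job.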
Animation of wrapper job operation
[Figure: the three possible states (I, II, III) of a wrapped job, showing the InputData, LOG_INPUT, execute, Fake Output gen., OutputData, T_OUTPUT and F_OUTPUT elements]
• A token with value TRUE or FALSE arrives on LOG_INPUT.
• Logical input TRUE on LOG_INPUT triggers the execution of the program of the user.
• A FALSE value on LOG_INPUT activates the subsequent jobs connected to the F(ALSE)_OUTPUT.
• “execute” may return a false or a true exit value.
• A zero (true) exit value on “execute” activates the subsequent jobs connected to the T(RUE)_OUTPUT.
• A non-zero (false) exit value on “execute” activates the subsequent jobs connected to the F(ALSE)_OUTPUT.
• Real output data will be forwarded only if the user job “execute” succeeds.
• In the other cases pro forma (fake) output will be generated to “cheat” the DAGMAN.
RULES FOR EXTENDED JOBS
• The Job Executable is a special wrapper program (wrapper.exe).
• The genuine (user) executable returns the exit value.
• Two additional input Ports and two additional output Ports are introduced, each with a standard Internal File Name: the genuine executable is associated as “EXECUTABLE_INPUT”, the file delivering the execution permission is “LOG_INPUT”, and the files delivering the propagated permissions to the subsequent jobs in the proper direction are named “TRUE_OUTPUT” and “FALSE_OUTPUT”.
• The logical input and output ports accept special files with content {TRUE|FALSE}.
• The Internal File Names of the output files which may be produced by the user executable must be listed after the genuine arguments, separated by the keyword -outputs. This list is needed because if LOG_INPUT delivers a FALSE value or the user job fails, the wrapper must create pro forma (fake) output data files in place of those the user’s executable did not produce. Without these files DAGMAN would abort the job while attempting to copy the non-existent files to the subsequent jobs.
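The argument and file conventions in these rules can be illustrated with a small sketch. It is not the wrapper's actual implementation: the helper names are hypothetical, the example arguments and file names are invented, and it assumes that an empty file is an acceptable pro forma output.

```python
import os
import tempfile
from typing import List, Tuple

SEPARATOR = "-outputs"  # keyword named in the rules


def split_arguments(argv: List[str]) -> Tuple[List[str], List[str]]:
    """Split wrapper arguments into (genuine arguments, output file names)."""
    if SEPARATOR not in argv:
        return list(argv), []
    cut = argv.index(SEPARATOR)
    return argv[:cut], argv[cut + 1:]


def read_token(path: str) -> bool:
    """Read a logical port file whose content must be TRUE or FALSE."""
    with open(path) as handle:
        text = handle.read().strip()
    if text not in ("TRUE", "FALSE"):
        raise ValueError(f"unexpected token content: {text!r}")
    return text == "TRUE"


def write_token(path: str, value: bool) -> None:
    """Write a logical port file."""
    with open(path, "w") as handle:
        handle.write("TRUE" if value else "FALSE")


def create_fake_outputs(directory: str, names: List[str]) -> List[str]:
    """Create an empty placeholder (assumption) for every missing output."""
    created = []
    for name in names:
        path = os.path.join(directory, name)
        if not os.path.exists(path):
            open(path, "w").close()
            created.append(name)
    return created


# Demo with invented arguments: LOG_INPUT is FALSE, so nothing runs and
# every listed output file is replaced by a placeholder.
with tempfile.TemporaryDirectory() as workdir:
    genuine, outputs = split_arguments(["3", "fast", "-outputs", "result.dat", "log.txt"])
    write_token(os.path.join(workdir, "LOG_INPUT"), False)
    faked: List[str] = []
    if not read_token(os.path.join(workdir, "LOG_INPUT")):
        faked = create_fake_outputs(workdir, outputs)
    print(genuine, outputs, faked)
```

Creating placeholders only for files that do not yet exist also covers the case where the user job ran but failed before writing all of its outputs.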
EXAMPLE: IF(C1) E1 ELSE IF(C2) E2 ELSE E3
Overview
EXAMPLE: IF(C1) E1 ELSE IF(C2) E2 ELSE E3
Details
• new LOG_INPUT port (value: TRUE, FALSE)
• new TRUE_OUTPUT port (value: TRUE, FALSE)
• new FALSE_OUTPUT branch (value: TRUE, FALSE)
• new EXECUTABLE_INPUT port to upload the genuine executable
• original input data port
• original output data port
• Job executable is the standard “wrapper.exe”
• Each Internal File Name of files which can be produced by the genuine user executable must be listed after the separator attribute -outputs
EXAMPLE: IF(C1) E1 ELSE IF(C2) E2 ELSE E3
Environment
A LOG_INPUT port not connected to any (logical) output port must be associated with a file containing the ASCII string “TRUE”.
Ports shown: LOG_INPUT, EXECUTABLE_INPUT, TRUE_OUTPUT, FALSE_OUTPUT
II Part (A case study)
The case study is a simple IF THEN ELSE type workflow containing three jobs. The tested application can be downloaded from: http://www.sztaki.hu/~ghermann/Szemelyes/PseudoDynamicDAGControl/TestProgram/SZTAKI_hermann_IF_THEN_ELSE_fork_seegrid.tar.gz
II Part (Case study)
• Input port definition to upload the executable “exitWithArg.exe”
• The test job IFargEq0 is the wrapper of the executable “exitWithArg.exe”, which exits with the same value that has been defined in its Attributes; i.e. we expect that the workflow will execute the job FALSEBR (connected to the FALSE_OUTPUT port)
• The first wrapper job must run unconditionally, therefore its port gets a file containing “TRUE” as LOG_INPUT
• Port to define the user executable “multiply.exe”
• The job “TRUEBR”, connected to the TRUE_OUTPUT port of the job “IFargEq0”, will not execute its user program “multiply.exe” defined at the port: 1
• The job “FALSEBR”, connected to the FALSE_OUTPUT port of IFargEq0, will run in our experiment, executing the user program “CopyAndTime” defined at the port: 1
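The expected behaviour of the three-job workflow can be traced with a short simulation. This is a sketch under assumptions, not a reproduction of the test program: the argument values 1 and 0 are hypothetical, the branch programs are assumed to succeed when they run, and a zero exit value is taken as TRUE. The job names are the ones used in the case study; the function names are invented.

```python
from typing import Dict, List


def run_wrapped_job(name: str, log_input: bool, user_exit_value: int,
                    executed: List[str]) -> Dict[str, bool]:
    """Model one wrapped job and record whether its user program ran."""
    ran_ok = False
    if log_input:
        executed.append(name)  # the user program really runs
        ran_ok = user_exit_value == 0
    return {"TRUE_OUTPUT": ran_ok, "FALSE_OUTPUT": not ran_ok}


def case_study(argument: int) -> List[str]:
    """Return the jobs whose user program runs for a given exit argument."""
    executed: List[str] = []
    # The first job gets TRUE on LOG_INPUT from a file, so it always runs.
    first = run_wrapped_job("IFargEq0", True, argument, executed)
    # Assumption: both branch programs exit with 0 when they are run.
    run_wrapped_job("TRUEBR", first["TRUE_OUTPUT"], 0, executed)
    run_wrapped_job("FALSEBR", first["FALSE_OUTPUT"], 0, executed)
    return executed


print(case_study(1))  # ['IFargEq0', 'FALSEBR']
print(case_study(0))  # ['IFargEq0', 'TRUEBR']
```

With a non-zero argument only FALSEBR runs its user program, which matches the outcome expected in the experiment.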
Result of the case study
Job IFArgEq0 output listing
• Message of the embedded user program “ExitWithArg”
• As this program has no “real” data output, the warning can be ignored
• The wrapper reports its decision, which determines the activation of subsequent jobs
Job TRUEBR output listing
As the preceding wrapper job produced the value “FALSE” on the TRUE_OUTPUT port, the user executable of this job will not be executed
Job FALSEBR output listing
• Message of the embedded user program “CopyAndTime”
• The wrapper reports its decision, which determines the activation of subsequent jobs