Condor NT: Condor ported to Win32

Condor NT
Condor ported to Win32

Todd Tannenbaum
Computer Sciences Department
University of Wisconsin-Madison
tannenba@cs.wisc.edu
http://www.cs.wisc.edu/condor

Overview
› Intro to Condor NT
› What does Condor NT do?
› How does Condor NT differ from Condor for Unix?
› What are the current limitations of Condor NT?
› Future Work

Intro to Condor NT
› First pre-release at Condor ver 6.1.8
› “Deep port” of Condor
› Daemons run as a system Service under the LocalSystem account
› Shares as much source code with Condor for Unix as possible

Condor NT Downloads

What can it do?
› Almost everything Condor for Unix can…
  • Submit, run, manage queues of jobs
    • Jobs run “in the background”
  • Nearly all Condor tools included
  • ClassAds
    • Full complement of attributes (load average, RAM, benchmarks, free swap, key/mouse idle times, image size, CPU usage, etc.)
  • Everything needed for a Central Manager

What can it do? (cont.)
› Support for SMP machines
› Several security mechanisms (more later…)
› Suspend, continue, soft-kill (WM_CLOSE), hard-kill jobs
› Correctly manage multi-process jobs
› Send email notifications
› Yada, yada, …

What’s missing?
› Only VANILLA universe included
  • No STANDARD, PVM, GLOBUS, SCHEDULER universe
  • Note: MPI being done on both Unix and Win32
› Ability to run the job as the submitting user
› Ability to access shared volumes as the submitting user
› So: who does the job run as, how does the job get its files?

Job Start on Condor NT
› On execute machine, Condor creates
  • New temporary user account
  • New temporary working directory
  • New temporary, non-visible desktop
› Permissions (ACLs) set
› Files transferred by Condor
› Job spawned
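
The job-start sequence above can be approximated in a few lines of Python. This is only a sketch under stated assumptions, not Condor's implementation: the function name `start_job` is hypothetical, `chmod` is a rough stand-in for setting Windows ACLs, and the temporary user account and non-visible desktop are not modelled at all.

```python
import shutil
import subprocess
import tempfile
from pathlib import Path


def start_job(input_files, argv):
    """Run argv inside a fresh scratch directory seeded with input_files.

    Hypothetical helper for illustration. Returns (exit_code, scratch_dir);
    the caller collects output and removes scratch_dir afterwards.
    """
    # New temporary working directory for this job only.
    scratch = Path(tempfile.mkdtemp(prefix="job_"))

    # Restrict access to the current user. On Windows the real system
    # would set ACLs; chmod is only an approximation of that step.
    scratch.chmod(0o700)

    # Transfer the input files into the scratch directory.
    for src in input_files:
        shutil.copy2(src, scratch / Path(src).name)

    # Spawn the job with the scratch directory as its working directory.
    result = subprocess.run(argv, cwd=scratch)
    return result.returncode, scratch
```

Because the job only ever sees its own scratch directory, it does not need a shared filesystem, which is the property the later slides rely on.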

While Job is Running…
› Condor watches the job and updates dynamic attributes about the job in the job ClassAd
  • Disk usage, CPU usage, …
› Enforces the machine owner’s policy
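
As an illustration of one such dynamic attribute, the sketch below recomputes the disk usage of a job's working directory and stores it in a plain dictionary standing in for the job ClassAd. The function names and the `disk_usage_kib` key are assumptions made for this example; they are not Condor's attribute names or API.

```python
import os


def disk_usage_kib(directory):
    """Total size of regular files under directory, in KiB (rounded up)."""
    total = 0
    for root, _dirs, files in os.walk(directory):
        for name in files:
            path = os.path.join(root, name)
            try:
                if not os.path.islink(path):
                    total += os.path.getsize(path)
            except OSError:
                # The job may delete a file between listing and stat.
                continue
    return (total + 1023) // 1024


def refresh_job_ad(job_ad, directory):
    """Update the illustrative disk-usage entry of a dict-based job ad."""
    job_ad["disk_usage_kib"] = disk_usage_kib(directory)
    return job_ad
```

A monitor would call `refresh_job_ad` periodically while the job runs; CPU usage would need a platform-specific query and is left out here.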

On Job Vacate/Exit…
› Condor conditionally transfers any output files back to the submit machine
  • Can be told filenames, or automatically send back files which have changed
  • File transfers are atomic
› Cleanup
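
One way to picture "send back files which have changed" is to snapshot the working directory before the job starts and compare afterwards. The sketch below does that and copies each changed file via a temporary name followed by a rename. All names are hypothetical, and the per-file rename is an assumption about what "atomic" could look like; the slide may well mean that the whole set of files arrives all-or-nothing, which this sketch does not guarantee.

```python
import os
import shutil
import tempfile
from pathlib import Path


def snapshot(directory):
    """Map relative path -> (mtime_ns, size) for every file in directory."""
    base = Path(directory)
    snap = {}
    for p in base.rglob("*"):
        if p.is_file():
            st = p.stat()
            snap[p.relative_to(base)] = (st.st_mtime_ns, st.st_size)
    return snap


def changed_files(before, after):
    """Paths that are new, or whose (mtime, size) differs from before."""
    return sorted(p for p, sig in after.items() if before.get(p) != sig)


def atomic_copy(src, dst):
    """Copy src to dst so that dst is never observed half-written."""
    dst = Path(dst)
    dst.parent.mkdir(parents=True, exist_ok=True)
    # Write under a temporary name in the destination directory first.
    fd, tmp_name = tempfile.mkstemp(dir=dst.parent, prefix=".partial_")
    os.close(fd)
    try:
        shutil.copyfile(src, tmp_name)
        os.replace(tmp_name, dst)  # rename is atomic within one filesystem
    except BaseException:
        if os.path.exists(tmp_name):
            os.unlink(tmp_name)
        raise


def transfer_output(scratch, before, dest):
    """Copy every file changed since `before` from scratch to dest."""
    sent = changed_files(before, snapshot(scratch))
    for rel in sent:
        atomic_copy(Path(scratch) / rel, Path(dest) / rel)
    return sent
```

Comparing modification time and size is cheap but can miss a rewrite that keeps both identical; hashing file contents would be a stricter alternative.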

Some points on shared (network) filesystem access
› On Condor Unix, VANILLA requires a shared filesystem
  • Not true on Condor NT
› Condor NT can access a shared filesystem
  • … but only as user “Guest”, or only if the share password is provided by the job

Difficulties of running as the user
› Forwarding credentials problem
  • Windows NTLM in NT 4.0 can impersonate the peer on a socket, but only one “jump”
  [Diagram: A → B → C]
› On Windows NT, cannot just setuid()

Current Work To Do
› Improve situation for access to shared filesystem
  • As user “condor”, or
  • As user who submitted the job
› Run jobs as the submitting user
  • On NT 4.0: store the password, forward it encrypted
  • On Windows 2000: same or PKI

Current Work To Do (cont.)
› Windows 2000 support
  • Current release mostly works on Win2k…
  • Take advantage of Win2k enhancements
› Add in Scheduler Universe
  • And therefore DAGMan support
› Add in the MPI Universe

Future Work
› Add remaining missing Condor Universes
  • STANDARD
    • Requires addition of process checkpoint and/or remote system call
  • GLOBUS
    • Requires Globus Toolkit client libs on Win32
  • PVM

Thank You!