ETICS 2 JRA1 Planning
Becky Gietzel
Computer Sciences Department
University of Wisconsin-Madison
bgietzel@cs.wisc.edu
JRA1 Goals
- Cross-site migration improvements (now -> month 3)
- Web Service Interface specification (month 3)
- IPv6 compliance: analysis of what compliance really means (month 6)
- Virtualization (month 9)
- Co-scheduling (month 12)
www.cs.wisc.edu/~bgietzel
Additional JRA1 tasks
- Infrastructure maintenance and improvements to support the goals, including monitoring, service availability, user support, and scalability
- Improved tools for monitoring job status
- Better configuration management (production, test, dev), with clear direction on who makes changes to these machines and when
Cross-site Migration
- Basic features work at present
- JRA1 works to make migration more robust against network and other external failures
- Better error reporting (and training) so that users can understand the status of migrated jobs
- Condor/Metronome version compatibility is not clear; JRA1 creates a chart of feature availability by version
Web Service Interface
- JRA1 + SA1 involvement
- JRA1 produces a timeline for implementing the desired features
- Deciding which features to support, and their relative importance, could present a challenge
IPv6 compliance
- Condor natively uses IPv4 at present
- Other applications running on top of or alongside Condor can use IPv6 networking if the underlying hardware supports it
- Metronome provides communication parameters so that co-scheduled jobs can use IPv6 between nodes for testing
- Is this sufficient for applications?
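An application running alongside Condor can check at run time whether the node it landed on offers IPv6 at all. The following is a minimal sketch using only the Python standard library; the function names `ipv6_loopback_usable` and `pick_family` are illustrative and are not part of Condor or Metronome.

```python
import socket


def ipv6_loopback_usable() -> bool:
    """Return True if a TCP socket can be bound to the IPv6 loopback (::1)."""
    # socket.has_ipv6 only says the Python build supports IPv6;
    # the bind below checks that the host actually has it enabled.
    if not socket.has_ipv6:
        return False
    try:
        with socket.socket(socket.AF_INET6, socket.SOCK_STREAM) as sock:
            sock.bind(("::1", 0))  # port 0 lets the OS choose a free port
        return True
    except OSError:
        return False


def pick_family() -> socket.AddressFamily:
    """Prefer IPv6 when the host supports it, otherwise fall back to IPv4."""
    return socket.AF_INET6 if ipv6_loopback_usable() else socket.AF_INET


if __name__ == "__main__":
    print("address family to use:", pick_family().name)
```

A loopback check only shows that the local stack is IPv6-capable; whether two execute nodes can reach each other over IPv6 still depends on the network between them.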
Virtualization
- Set up the required infrastructure
- JRA1 + SA1 collaboratively gather user and infrastructure requirements
- Automated testing as root becomes much simpler
- Challenges: technical details of resource provisioning
Co-scheduling
- Currently, users may define multiple platforms, and the resources are claimed simultaneously to run co-scheduled jobs
- Improved communication tools between execute nodes are desired
- Scheduling policy could be improved
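The idea of claiming resources simultaneously can be illustrated as an all-or-nothing claim: either every requested platform gets a machine, or nothing is claimed. This is only a sketch of the concept, not Metronome's scheduler; the function `claim_all_or_nothing`, the platform names, and the host names are all hypothetical.

```python
from collections import Counter
from typing import Dict, List, Optional, Tuple


def claim_all_or_nothing(
    free: Dict[str, List[str]], wanted: List[str]
) -> Optional[List[Tuple[str, str]]]:
    """Claim one free host per requested platform, or claim nothing.

    `free` maps a platform name to its currently unclaimed hosts.
    `wanted` lists the platforms the co-scheduled job needs
    (a platform may appear more than once).
    """
    needed = Counter(wanted)
    # First pass: verify that every platform can be satisfied.
    for platform, count in needed.items():
        if len(free.get(platform, [])) < count:
            return None  # leave the pool untouched
    # Second pass: all requests can be met, so take the hosts.
    return [(platform, free[platform].pop(0)) for platform in wanted]


if __name__ == "__main__":
    pool = {"linux_x86": ["host-a", "host-b"], "win_x86": ["host-c"]}
    print(claim_all_or_nothing(pool, ["linux_x86", "win_x86"]))
    print(claim_all_or_nothing(pool, ["linux_x86", "win_x86"]))
```

Checking availability before taking any host means a job never holds part of its resources while waiting for the rest.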
For more information on Condor or Metronome
- Metronome documentation is available on the NMI site, nmi.cs.wisc.edu
- Condor documentation is at www.cs.wisc.edu/condor