VLab: Collaborative Grid Services and Portals to Support Computational Material Science
Mehmet Nacar, Mehmet Aktas, Marlon Pierce, Zhenyu Lu, Gordon Erlebacher, Dan Kigelman, Evan F. Bollig, Cesar De Silva, Benny Sowell, and David A. Yuen
GCE'05, 18 November 2005, Seattle, WA
Introduction
• Grid and Web Service-based system for enabling distributed and collaborative computational chemistry and material science applications for the study of planetary materials.
• The requirements of the Virtual Laboratory for Earth and Planetary Materials (VLab) include
  – job preparation and submission,
  – job monitoring,
  – data storage and analysis,
  – distributed collaboration.
• These components are divided into client entry (input file creation, visualization of data, task requests) and backend services (storage, analysis, computation).
• Clients and services communicate through NaradaBrokering.
• We describe two aspects of VLab in this paper:
  1. Data entry and submission,
  2. Visualization web client/service.
Research Issues
• VLab presents several interesting problems. Driving Grid research issues include the following:
  – Persistently managing user inputs as archived, hierarchical project metadata.
  – Simplifying complicated, multi-staged job submissions and monitoring using Grid portal technology.
  – Integrating VLab applications with Grid messaging infrastructure to virtualize resource usage, provide fault tolerance, and enable collaboration.
Managing Plane Wave Self Consistent Field (PWSCF) Calculations
• User session archive management is required, since VLab submissions may involve dozens or hundreds of simulation runs per user.
• We also anticipate the addition of more codes from the Quantum Espresso suite and the need to couple these into workflows.
PWSCF Input Forms
JSF Grid Beans
• We use one bean that acts as a bean factory for GenericGridTask beans.
• We create and manage multiple beans for each task.
  – For example, a user may submit the same job four times in one session.
  – Similarly, we can create multiple task graph clones.
• Beans have listeners and maintain state.
  – Unsubmitted, active, suspended, and resumed beans are "live".
    • Stored in the live repository
  – Failed, canceled, completed, and unknown beans are "dead".
    • Stored in the archive (WS-Context or other)
• JSF grid beans can be easily serialized to XML.
  – Castor, XMLBeans
  – Marshal and un-marshal user input for persistent storage in XML storage services
    • OGSA-DAI, GPIR, WS-Context
• Data model classes handle monitoring of submitted jobs. These classes are displayed through the JSF dataTable element.
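The live/dead split described on this slide can be sketched in plain Java. This is only an illustration under stated assumptions: `TaskState`, `TaskBean`, and `TaskBeanFactory` are hypothetical names, the in-memory map and list stand in for the live repository and the archive, and nothing here is taken from the actual VLab code.

```java
import java.util.ArrayList;
import java.util.EnumSet;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;

/** Task states named on the slide, split into "live" and "dead". */
enum TaskState {
    UNSUBMITTED, ACTIVE, SUSPENDED, RESUMED,   // live
    FAILED, CANCELED, COMPLETED, UNKNOWN;      // dead

    private static final Set<TaskState> LIVE =
            EnumSet.of(UNSUBMITTED, ACTIVE, SUSPENDED, RESUMED);

    boolean isLive() {
        return LIVE.contains(this);
    }
}

/** Hypothetical stand-in for a GenericGridTask bean; only id and state are modeled. */
class TaskBean {
    final String id;
    TaskState state = TaskState.UNSUBMITTED;

    TaskBean(String id) {
        this.id = id;
    }
}

/** Hypothetical factory that also moves beans from the live repository to the archive. */
class TaskBeanFactory {
    private final Map<String, TaskBean> liveRepository = new LinkedHashMap<>();
    // Assumption: a plain list stands in for WS-Context or another archive service.
    private final List<TaskBean> archive = new ArrayList<>();
    private int counter = 0;

    /** Creates a new bean in the UNSUBMITTED state and registers it as live. */
    TaskBean create() {
        TaskBean bean = new TaskBean("task-" + (++counter));
        liveRepository.put(bean.id, bean);
        return bean;
    }

    /** Listener callback: a bean that reaches a dead state is archived. */
    void onStateChange(TaskBean bean, TaskState newState) {
        bean.state = newState;
        if (!newState.isLive() && liveRepository.remove(bean.id) != null) {
            archive.add(bean);
        }
    }

    int liveCount() {
        return liveRepository.size();
    }

    int archivedCount() {
        return archive.size();
    }
}
```

Calling `create()` several times in one session yields independent beans for the same kind of task, which mirrors the "submit the job four times" case.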
Task Management Class Structure
Constructing Task Graphs
Managing Grid Tasks and TaskGraphs
• The Task Manager handles independent user requests, or tasks, from the portlet client in Grid services.
  – The request-generating objects are simply Java Bean class instances that wrap common Grid actions (launching remote commands, transferring data, performing remote file operations) using Java CoG Kit classes.
• The TaskGraph Manager handles multiple-step task submission.
  – The TaskGraph Manager coordinates user requests with TaskGraph backing beans.
  – Each TaskGraph bean is itself composed of GenericGridBean implementation instances (for file transfer, job submission, etc.).
  – Dependencies are expressed using JSF tag library extensions, so that the JSF application developer can encode the composite task graph workflow out of reusable tags.
  – The TaskGraph Manager submits and monitors TaskGraphs through an action method when the user launches the job.
TaskGraph Submission Form
Corresponding JSF snippets

    <o:taskGraph id="myGraph" method="#{taskgraph.test}">
      <o:task id="task1" method="task.create" type="FileTransfer" />
      <o:task id="task2" method="task.create" type="JobSubmit" />
      <o:task id="task3" method="task.create" type="FileTransfer" />
      <o:taskAdd name="task1" method="taskgraph.add" />
      <o:taskAdd name="task2" depends="task1" method="taskgraph.add" />
      <o:taskAdd name="task3" depends="task2" method="taskgraph.add" />
    </o:taskGraph>

    <h:panelGrid columns="3">
      <h:outputText value="Hostname (*) "/>
      <h:inputText value="#{task.hostname}"/>
    </h:panelGrid>
    <h:panelGrid columns="3">
      <h:outputText value="Provider (*) "/>
      <h:inputText value="#{task.provider}"/>
    </h:panelGrid>
    <h:panelGrid columns="2">
      <h:commandButton id="submit" value="Submit" action="#{taskgraph.submitAction}"/>
      <h:commandButton value="Clear" type="Reset"/>
    </h:panelGrid>
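The `depends` attributes in the snippet imply an execution order for the tasks. The sketch below shows one way such an order could be derived, using a depth-first topological sort. It is an assumption-laden illustration: `TaskGraphSketch` and its methods are hypothetical names, and the slides do not say how the TaskGraph Manager actually resolves dependencies.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

/** Hypothetical in-memory task graph: task name -> names of tasks it depends on. */
class TaskGraphSketch {
    private final Map<String, List<String>> dependsOn = new LinkedHashMap<>();

    /** Registers a task, mirroring a taskAdd tag with an optional depends attribute. */
    void add(String name, String... depends) {
        dependsOn.put(name, Arrays.asList(depends));
    }

    /** Returns the task names so that every task appears after its dependencies. */
    List<String> executionOrder() {
        Set<String> done = new LinkedHashSet<>();
        Set<String> inProgress = new LinkedHashSet<>();
        for (String name : dependsOn.keySet()) {
            visit(name, done, inProgress);
        }
        return new ArrayList<>(done);
    }

    // Depth-first visit: dependencies are emitted before the task itself.
    private void visit(String name, Set<String> done, Set<String> inProgress) {
        if (done.contains(name)) {
            return;
        }
        if (!dependsOn.containsKey(name)) {
            throw new IllegalStateException("Unknown task: " + name);
        }
        if (!inProgress.add(name)) {
            throw new IllegalStateException("Dependency cycle at task: " + name);
        }
        for (String dependency : dependsOn.get(name)) {
            visit(dependency, done, inProgress);
        }
        inProgress.remove(name);
        done.add(name);
    }
}
```

For the three tasks in the snippet (file transfer, job submission, file transfer), the chain of `depends` values forces a strictly sequential order. A graph with independent branches would leave room for running tasks in parallel, which this sketch does not attempt.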
Task Monitoring with JSF Data Model
Corresponding Java class

    public class Job {
      private String jobId;
      private String status;
      private String submitDate;
      private String finishDate;
      // JSF EL reads these properties through getters such as getJobId(); not shown on the slide.
    }

Corresponding JSF snippet

    <h:dataTable value="#{jobData.jobs}" var="job">
      <h:column>
        <f:facet name="header">
          <h:outputText style="font-weight: bold" value="Job ID" />
        </f:facet>
        <h:outputText value="#{job.jobId}"/>
      </h:column>
      <h:column>
        <f:facet name="header">
          <h:outputText style="font-weight: bold" value="Submit Date" />
        </f:facet>
        <h:outputText value="#{job.submitDate}"/>
      </h:column>
      <h:column>
        <f:facet name="header">
          <h:outputText style="font-weight: bold" value="Finish Date" />
        </f:facet>
        <h:outputText value="#{job.finishDate}"/>
      </h:column>
      <h:column>
        <f:facet name="header">
          <h:outputText style="font-weight: bold" value="Status" />
        </f:facet>
        <h:outputText value="#{job.status}"/>
      </h:column>
    </h:dataTable>
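JSF value expressions resolve bean properties through JavaBean getters, so a data table bound to a list of jobs needs a `Job` class with accessors and a backing bean that exposes the list. The following is a minimal sketch with no JSF dependency. The constructor, the `setStatus` method, and the `JobData` class are assumptions added for illustration, and the classes are package-private here only to keep the example in one file.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

/** JavaBean form of the Job class; a real JSF managed bean would normally be public. */
class Job {
    private final String jobId;
    private String status;
    private final String submitDate;
    private final String finishDate;

    Job(String jobId, String status, String submitDate, String finishDate) {
        this.jobId = jobId;
        this.status = status;
        this.submitDate = submitDate;
        this.finishDate = finishDate;
    }

    // An expression such as #{job.status} is resolved by calling getStatus().
    public String getJobId() { return jobId; }
    public String getStatus() { return status; }
    public String getSubmitDate() { return submitDate; }
    public String getFinishDate() { return finishDate; }

    /** Monitoring code would update the status as the remote job progresses. */
    public void setStatus(String status) { this.status = status; }
}

/** Hypothetical backing bean that exposes the job list to the table. */
class JobData {
    private final List<Job> jobs = new ArrayList<>();

    public void add(Job job) {
        jobs.add(job);
    }

    /** Read-only view, so the table cannot modify the monitored list. */
    public List<Job> getJobs() {
        return Collections.unmodifiableList(jobs);
    }
}
```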
NaradaBrokering
• Publish/subscribe paradigm.
• Reliable, robust, and flexible messaging.
• The middleware infrastructure is designed around a scalable distributed network of cooperating message routers and processors.
• NaradaBrokering supports
  – High-performance collaborative environments
  – Core Web and Grid capabilities
Current Pub/Sub Capabilities
• Multiple transport support
• Subscription formats
• Messaging-related compliance
Current Grid/Web Application Support
• Reliable delivery
• Ordered delivery
• Recovery and replay
• Security
• Message payload options
• Grid application support
• Web Services
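In a publish/subscribe system, every subscriber to a topic receives each message published to that topic. The sketch below shows that delivery rule with a single in-memory broker. It does not use or imitate the NaradaBrokering API: `TopicBrokerSketch` is a hypothetical class, and it leaves out transports, reliability, ordering across brokers, and security.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Consumer;

/** Hypothetical single-process broker: topic name -> registered message handlers. */
class TopicBrokerSketch {
    private final Map<String, List<Consumer<String>>> subscribers = new HashMap<>();

    /** Registers a handler that is called for every message on the topic. */
    void subscribe(String topic, Consumer<String> handler) {
        subscribers.computeIfAbsent(topic, t -> new ArrayList<>()).add(handler);
    }

    /** Delivers the payload to all current subscribers and returns how many received it. */
    int publish(String topic, String payload) {
        List<Consumer<String>> handlers =
                subscribers.getOrDefault(topic, new ArrayList<>());
        for (Consumer<String> handler : handlers) {
            handler.accept(payload);
        }
        return handlers.size();
    }
}
```

The publisher never refers to a specific subscriber, only to a topic name. That decoupling is what allows clients and services to be added or replaced without changing the sender.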
Collaborative Web Services (1)
Collaborative Web Services (2)
• A network of three NaradaBrokering brokers is deployed across Indiana University, Florida State University, and the University of Minnesota.
• Attached to the network are two wavelet services with their service adaptors, two schedulers, and several client adaptors.
• Clients can access the wavelet service via a wavelet applet using standard browsers.
Visualization
• In our wavelet application, the applet displays the coefficients as spheres centered at the location occupied by the center of the 3D wavelet.
• High-performance graphics are obtained through the use of JOGL, an OpenGL API for Java.
Conclusion and Future Work
• We have addressed three important components of this framework: data entry, job submission, and backend services. Our work is guided by the principles of ease of use, fault tolerance, collaboration, and persistent records.
• Our future work will focus on improving the flexibility of the collaborative environment.
• This includes using a data flow model to formulate task execution and providing a directory service for searching the available services.
• The Web Services Resource Framework, and particularly the WS-Notification specification family, may be used to replace our pre-Web Service topic system.
• Support for WS-Notification is currently being developed in NaradaBrokering. When this is available, we will evaluate its use in our system.
Conclusion
• Although we include at least two entities of each type in our distributed system (2+ schedulers, 2+ wavelet services, etc.), there is as yet no attempt to take network and server load into account when choosing which units will perform the actual work. The current approach is first come, first chosen. We will evaluate existing work and enhance our schedulers and services to activate themselves based on a more realistic measure of instantaneous or extended load.
• Collaboration is a natural attribute of our system. Two user tasks that subscribe to identical topics automatically receive the same information. We will investigate approaches to achieve this collaboration at the visual level (shared user interfaces and displays), with the possibility of multiple users controlling the input. Much work has been done in this area, albeit (to the authors' knowledge) not within the context of publish/subscribe middleware.
• Complex workflows are important within VLab. Recent research has shown how to implement strategies for specifying workflows across multiple services. This work will be integrated within our system to properly link input, job submission, analysis, feedback to the user, and finally, automatic (or semi-automatic) decisions regarding the next set of simulations to submit.
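The difference between the current first-come, first-chosen selection and a load-aware alternative can be illustrated in a few lines of Java. This is a sketch of the general idea only: `ServiceInstance`, `ServiceSelector`, and the single numeric `load` field are hypothetical, and the slides do not specify which load measure or selection policy VLab will adopt.

```java
import java.util.Comparator;
import java.util.List;
import java.util.Optional;

/** Hypothetical description of a service or scheduler that answered a request. */
class ServiceInstance {
    final String name;
    final double load; // assumption: a load metric where lower means less busy

    ServiceInstance(String name, double load) {
        this.name = name;
        this.load = load;
    }
}

/** Two selection policies over the list of responders, in arrival order. */
class ServiceSelector {
    /** Current behavior as described: whichever unit responded first is chosen. */
    static Optional<ServiceInstance> firstCome(List<ServiceInstance> responders) {
        return responders.stream().findFirst();
    }

    /** Load-aware alternative: choose the responder reporting the lowest load. */
    static Optional<ServiceInstance> leastLoaded(List<ServiceInstance> responders) {
        return responders.stream().min(Comparator.comparingDouble(s -> s.load));
    }
}
```

With an empty responder list both methods return an empty `Optional`, so the caller has to decide whether to retry or report that no service is available.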