SimpleGrid Toolkit: Enabling Efficient Learning and Development

SimpleGrid Toolkit: Enabling Efficient Learning and Development of TeraGrid Science Gateway
Shaowen Wang 1,2, Yan Liu 1,2, Nancy Wilkins-Diehr 3, Stuart Martin 4,5
1. CyberInfrastructure and Geospatial Information Laboratory (CIGI), Department of Geography
2. National Center for Supercomputing Applications (NCSA), University of Illinois at Urbana-Champaign
3. San Diego Supercomputer Center (SDSC), University of California at San Diego
4. Argonne National Laboratory
5. University of Chicago
November 11, 2007

Purpose
- Simplify the learning of science gateways
- Expedite the prototyping process of developing science gateways

Background
- Grid computing
- Science and engineering gateway
- Problem solving environments (PSE)

Related Work
- Active area
  - Evidenced by TeraGrid Science Gateway activities
- Examples
  - GridPort
  - GridSphere, Vine
  - OGCE

The State of the Art
- Evolving and sophisticated web portal technologies
  - GridSphere
  - Liferay
  - Sakai
  - Jetspeed
- Missing simple, robust, and reusable interfaces between applications and portals
  - Significant gap between Grid technologies and application problem solving environments
- Grid middleware complexity
  - Grid technologies focus on enabling resource sharing and federation
  - The development of problem solving environments requires extensible, programmable, reusable, application-oriented software components that support customizable access to Grid and VO capabilities

SimpleGrid Motivation
- Grid and web portal technologies are complex and still rapidly evolving
- An effort to close the gap between Grid computing and scientific applications

SimpleGrid – Component-Based Design

Architecture – External Interfaces

Architecture – Internal Interactions

Efficient Learning and Development
- Three-stage learning
  - Command-line
  - Grid-enabled Java application development
  - Portlet development
- Simple installation and deployment
  - Java, Ant, Tomcat, GridSphere
  - Globus Toolkit 4.0+ only for command-line stage
- Reusable components for development
  - SimpleGrid APIs
  - JSP and Velocity templates
- Development environment setup
  - Manual for SimpleGrid setup in Eclipse

From Individual to Community
- TeraGrid command-line tools for individual use
- SimpleGrid APIs to automate the access to cyberinfrastructure resources
- SimpleGrid portlets to enable community access to scientific problem solving capabilities as deployable components in science gateway portals

SimpleGrid APIs
- SimpleCred: Grid proxy management
- SimpleTran: Data transfer to/from Grids
- SimpleRun: Grid job management
- SimpleViz: Visualization component
- SimpleInfo: Grid information provider
  - Under development
  - Current Grid information is provided statically through a configuration file

SimpleCred
- Fetch Grid credentials
  - Local proxy loading or instantiation
  - Remote proxy instantiation through MyProxy
- Automatic credential renewal
  - Simple interface for Grid proxy renewal, i.e., SimpleCred.get()
- Grid community user support
  - A global SimpleCred instance can be stored in the portal as a shared object for users using the same community account
- Programming interface
  - load(), logon(), get()
- Portlet interface
  - Grid credentials can be managed explicitly through a UserPortlet interface
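
The renew-on-get behaviour described on this slide can be illustrated with a minimal sketch. It is written in Python for brevity, although the toolkit itself is Java, and it is not the SimpleCred implementation: the names Proxy and CredentialCache, the fetch callback, and the renewal margin are all assumptions made for illustration. The lock reflects the idea of one shared instance serving several users of a community account.

```python
import threading
import time
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class Proxy:
    """Hypothetical stand-in for a Grid proxy credential."""
    subject: str
    expires_at: float  # seconds on the clock's timeline


class CredentialCache:
    """Hypothetical cache that renews a proxy shortly before it expires."""

    def __init__(self, fetch: Callable[[], Proxy], margin: float = 300.0,
                 clock: Callable[[], float] = time.time):
        self._fetch = fetch    # assumed: a local proxy load or a MyProxy logon
        self._margin = margin  # renew this many seconds before expiry
        self._clock = clock
        self._proxy: Optional[Proxy] = None
        self._lock = threading.Lock()  # one instance may be shared by many users

    def get(self) -> Proxy:
        with self._lock:
            remaining = (self._proxy.expires_at - self._clock()
                         if self._proxy is not None else 0.0)
            # Fetch on first use, or when the proxy is inside the renewal margin.
            if self._proxy is None or remaining <= self._margin:
                self._proxy = self._fetch()
            return self._proxy
```

Callers only ever ask for the current credential, so renewal stays invisible to application code.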

SimpleTran
- A wrapper of GridFTP
- Threaded implementation
  - Allow responsive interactions between portal and client browser
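
As a rough illustration of why a threaded transfer keeps the portal responsive, the sketch below hands each copy to a worker thread and returns at once. It is an assumption-laden stand-in, not SimpleTran: BackgroundTransfer is a hypothetical name, and a local file copy takes the place of a GridFTP call.

```python
import shutil
from concurrent.futures import Future, ThreadPoolExecutor
from typing import Callable


class BackgroundTransfer:
    """Hypothetical wrapper that runs copies on worker threads."""

    def __init__(self, copy: Callable[[str, str], object] = shutil.copyfile,
                 workers: int = 2):
        self._copy = copy  # local copy used here in place of a GridFTP client
        self._pool = ThreadPoolExecutor(max_workers=workers)

    def start(self, source: str, destination: str) -> Future:
        # Returns immediately; the caller polls Future.done() between requests.
        return self._pool.submit(self._copy, source, destination)

    def close(self) -> None:
        # Waits for transfers that are still running, then frees the threads.
        self._pool.shutdown(wait=True)
```

A portlet could call start() when the user submits a form and check done() on later page refreshes instead of blocking the request.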

SimpleRun
- A wrapper of GRAM and WS-GRAM
  - Support both GT2 and GT4 job submission
  - User selectable
- Depends on SimpleTran to transfer datasets
- Programming interface
  - execute()
  - getStatus()
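
The idea of one programming interface over user-selectable submission backends can be sketched as follows. This is a hypothetical illustration in Python: JobRunner, JobStatus, the backend names, and the callbacks are assumptions, and no real GRAM or WS-GRAM call is made.

```python
from enum import Enum
from typing import Callable, Dict, Optional


class JobStatus(Enum):
    PENDING = "pending"
    ACTIVE = "active"
    DONE = "done"
    FAILED = "failed"


class JobRunner:
    """Hypothetical dispatcher over named submission backends."""

    def __init__(self, backends: Dict[str, Callable[[str], str]],
                 poll: Callable[[str], JobStatus]):
        self._backends = backends  # assumed keys such as "gt2" and "gt4"
        self._poll = poll          # maps a job id to its current status
        self._job_id: Optional[str] = None

    def execute(self, command: str, backend: str) -> str:
        if backend not in self._backends:
            raise ValueError(f"unknown backend: {backend}")
        # Each backend takes a command line and returns a job identifier.
        self._job_id = self._backends[backend](command)
        return self._job_id

    def get_status(self) -> JobStatus:
        if self._job_id is None:
            raise RuntimeError("execute() has not been called")
        return self._poll(self._job_id)
```

Keeping the backend choice as a plain argument means application code does not change when a site moves from one submission service to another.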

SimpleViz
- Visualization mechanisms
  - JFreeChart
  - Google map
  - ParaView (under development)
- Threaded implementation
- Portlet interface
  - Google map-based JavaScript library

Portlet Components and Interfaces
- UserPortlet
  - User information and Grid credential management
  - Interface: JSP
  - Portlet: GridSphere ActionPortlet
- DMSPortlet
  - A typical scientific computational analysis process
  - Interface: Velocity
  - Portlet: VelocityPortlet
- Container
  - GridSphere
http://www.collab-ogce.org/ogce2/velocity-portlets.html

Case Study
- Two-dimensional spatial interpolation in Geographic Information Systems
- Nearest-neighbor search procedure
  - Computationally intensive for large spatial datasets and/or high-resolution interpolation
- A fast two-dimensional spatial interpolation algorithm called DMS (Dynamically Memorized Search)
  - Parameter-sweeping application for sensitivity analysis
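
To see why the nearest-neighbor search becomes expensive, consider a brute-force baseline in which every output cell scans every sample point. The sketch below is that naive baseline only, not the DMS algorithm, whose details are not given here; the function names and the (x, y, value) sample layout are assumptions.

```python
import math
from typing import List, Sequence, Tuple

Sample = Tuple[float, float, float]  # (x, y, value)


def nearest_value(samples: Sequence[Sample], x: float, y: float) -> float:
    """Value of the sample closest to (x, y), found by a linear scan."""
    best = min(samples, key=lambda s: math.hypot(s[0] - x, s[1] - y))
    return best[2]


def interpolate_grid(samples: Sequence[Sample], x0: float, y0: float,
                     cell: float, cols: int, rows: int) -> List[List[float]]:
    """Fills a rows x cols raster whose lower-left node is (x0, y0).

    The work grows with rows * cols * len(samples), so halving the cell
    size roughly quadruples the number of searches.
    """
    return [[nearest_value(samples, x0 + c * cell, y0 + r * cell)
             for c in range(cols)]
            for r in range(rows)]
```

A faster search strategy reduces the per-cell scan, which is where the cost concentrates for large datasets or fine output resolutions.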

TeraGrid-Based DMS Analysis
- Request an individual or community account on TeraGrid
- Install DMS executables on three TeraGrid sites
- Prepare a dataset on a local machine
- Transfer a specified dataset to a TeraGrid site (e.g., NCSA)
- Submit a Grid job to the specified TeraGrid site with a parameter value
- The submitted job is scheduled to be executed on one compute node on the specified TeraGrid cluster
- When the job is finished, the analysis result is written into the data directory of the DMS installation on the TeraGrid cluster
- Transfer the result back to the local machine
- Visualize the result using the DMS visualization tool
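
The transfer, submit, wait, and retrieve steps above can be sketched as a small driver that is repeated for each parameter value. Everything here is hypothetical: the function names and the four callbacks stand in for whatever transfer and job-management calls a gateway actually uses, and account setup, installation, and visualization are left out.

```python
import time
from typing import Callable, Dict, Iterable


def run_remote_analysis(dataset: str, parameter: float, *,
                        upload: Callable[[str], str],
                        submit: Callable[[str, float], str],
                        is_finished: Callable[[str], bool],
                        download: Callable[[str], str],
                        poll_seconds: float = 5.0,
                        sleep: Callable[[float], None] = time.sleep) -> str:
    """Uploads the dataset, submits one job, waits, and fetches the result."""
    remote_path = upload(dataset)            # local machine -> remote site
    job_id = submit(remote_path, parameter)  # one job per parameter value
    while not is_finished(job_id):           # poll until the job completes
        sleep(poll_seconds)
    return download(job_id)                  # remote site -> local machine


def parameter_sweep(dataset: str, parameters: Iterable[float],
                    **steps) -> Dict[float, str]:
    """Repeats the analysis for each parameter value and collects the results."""
    return {p: run_remote_analysis(dataset, p, **steps) for p in parameters}
```

This sketch uploads the dataset once per parameter value for simplicity; a real sweep would more likely transfer it once and reuse the remote copy.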

DMS Analysis Portlet

Case Study Summary
- 16 participants
  - Various levels of software development experience and Grid computing knowledge
- Within 2.5 hours, all participants, including those with minimal Java programming knowledge
  - Mastered the SimpleGrid APIs for the DMS analysis
  - Successfully set up a portlet for the analysis in a GridSphere portal server

Concluding Discussion
- The SimpleGrid toolkit
  - Makes an abstraction of generic Grid middleware services
    - Enables science gateway developers to concentrate on developing PSE by working on reusable and extensible software components
  - Hides the complexity of evolving web portal technologies by tailoring to application requirements for developing PSE
    - Service-oriented architecture
    - Component-based framework
- Simplifies science gateway development
- Helps overcome the learning curve of science gateway technologies

Ongoing Work
- APIs
  - Grid-based visualization
  - SimpleInfo
  - Workflow
- Automation tools
  - Enable automatic application integration as science gateway portal components (portlets)
  - User interface definition and generation
  - Workflow code stubs and Grid-related server-side code skeletons

Acknowledgements
- CyberInfrastructure and Geospatial Information Laboratory (CIGI)
- National Center for Supercomputing Applications (NCSA)
- NSF TeraGrid

Demo