GRACE - CERN 2004

GRACE
Jan Fiete Grosse-Oetringhaus
CERN IT/EGE
29.11.04

Grid Search and Categorization Engine
Image by Hector Garcia Puigcerver

GRACE Workflow
  • CERN's Tasks
      • Content Source Integration
      • Grid
      • Testing

Content Sources Integration
  • Content Source
      • Input: Search Query
      • Output: Search Results
          • HTML output
          • OAI (Open Archives Initiative) compliant output
  • Personalized configuration file for each Content Source (SPEC file)
  • Integration Steps
      • Submit the search
      • Parse the result
      • Retrieve associated documents

Step 1: Submit the Search
  • Goal: Submit Search Query
  • Input: Query in GRACE format
  • Go to content source, find search fields
  • Add field to SPEC file

      <get-param name='p'>
        <paramval name='/query/Quick-Search'/>
      </get-param>
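A minimal sketch of how a SPEC entry like the one above could be turned into a GET request. The slides do not show the GRACE query format or how GRACE itself reads the SPEC file, so the query document, the base URL, and the function `build_search_url` are all hypothetical and only illustrate the mapping idea.

```python
import xml.etree.ElementTree as ET
from urllib.parse import urlencode

# SPEC fragment as shown on the slide.
SPEC_FRAGMENT = """
<get-param name='p'>
  <paramval name='/query/Quick-Search'/>
</get-param>
"""

# Hypothetical GRACE-format query; the real schema is not given in the slides.
GRACE_QUERY = "<query><Quick-Search>higgs boson</Quick-Search></query>"


def build_search_url(base_url, spec_xml, query_xml):
    """Map one <get-param> entry onto a GET parameter of the content source."""
    spec = ET.fromstring(spec_xml)
    query = ET.fromstring(query_xml)

    # '/query/Quick-Search' -> ['query', 'Quick-Search']
    parts = spec.find("paramval").get("name").strip("/").split("/")
    if len(parts) < 2 or parts[0] != query.tag:
        raise ValueError("paramval path does not match the query document")

    # Look up the value below the <query> root (relative path).
    node = query.find("/".join(parts[1:]))
    value = node.text if node is not None and node.text else ""

    # The get-param name becomes the GET parameter of the search form.
    return base_url + "?" + urlencode({spec.get("name"): value})


if __name__ == "__main__":
    # example.org is a placeholder, not a real content source.
    print(build_search_url("http://example.org/search", SPEC_FRAGMENT, GRACE_QUERY))
```

The point of the design is that each content source only needs its own field mapping; the query itself stays in one common format.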

Step 2: Parse the Result
  • Goal: Produce result sets interpretable by GRACE
  • Input: Search Result in HTML format
  • Identify Fields: Title, Author, Abstract, …
  • Produce XPath Expressions, e.g. /root/html/body/div/a
  • Produce XSL (eXtensible Stylesheet Language) transformation code
  • Produce code for retrieval of associated documents
  • Output: XML result sets
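The field extraction can be sketched as follows. This is not the XSL code used in GRACE: it swaps the XSL transformation for plain Python with the limited XPath subset of the standard library, and the sample page, the field paths, and the output element names (`results`, `result`, `document`) are assumptions made for illustration.

```python
import xml.etree.ElementTree as ET

# Hypothetical, already well-formed result page wrapped in <root>,
# mirroring the slide's example path /root/html/body/div/a.
RESULT_PAGE = """
<root><html><body>
  <div><a href="/record/1">Grid Search</a><span class="author">A. Author</span></div>
  <div><a href="/record/2">Text Categorization</a><span class="author">B. Writer</span></div>
</body></html></root>
"""

# One relative XPath expression per identified field (assumed layout).
FIELD_PATHS = {"title": "a", "author": "span[@class='author']"}


def to_result_set(page_xml):
    """Turn a result page into a small XML result set."""
    root = ET.fromstring(page_xml)
    results = ET.Element("results")
    # ElementTree paths are relative to the element, so the leading
    # /root of the slide's example is implied by starting at the root.
    for hit in root.findall("html/body/div"):
        result = ET.SubElement(results, "result")
        for field, path in FIELD_PATHS.items():
            node = hit.find(path)
            ET.SubElement(result, field).text = node.text if node is not None else ""
        # The link target is what a later step would use to retrieve
        # the associated document.
        link = hit.find("a")
        ET.SubElement(result, "document").text = link.get("href") if link is not None else ""
    return results


if __name__ == "__main__":
    print(ET.tostring(to_result_set(RESULT_PAGE), encoding="unicode"))
```

Real result pages are rarely well-formed XML, so a cleaning step would be needed before expressions like these can be applied.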

Step 3: Test your file
  • Test application (seaLion): part of GRACE application
      • Submits search using a given SPEC file
      • Returns GRACE result set
      • Provides debug output
  • CSTest script
      • Uses seaLion
      • Validates results
      • Batch testing

Results
  • 16 Content Sources integrated
  • Input for Deliverable 6.1
      • HowTo: Configuration of Content Sources for Integration with GRACE
          • Workflow of Integration
          • Status, common problems and risks
          • Usable by content providers who want to integrate their content source into GRACE
      • TestKit
          • Test application & scripts
  • HowTo & TestKit available on GRACE website

Grid Integration
  • Two Grid components:
      • Text Normalizing
      • Categorizing
  • Components provided by partners, CERN responsible for integration

First approach
  • "One for all" (model M1)
      • Parallel execution of simultaneous searches
      • O(hours) for complete process

M1 Performance (figure)

How to parallelize?

Parallelized Model
  • Split text normalization

Parallelized Model
  • Split outside the Grid
  • Launch N jobs
      • Perform text normalization
      • Store results in the Grid (using Replica Manager)
  • Monitor Status
  • Launch Categorization job
      • Pick up documents from the Grid and merge them
      • Perform Categorization
  • Get result from Categorization job
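The job flow above can be sketched locally as follows. Nothing here talks to a Grid: a thread pool stands in for the N Grid jobs, a dictionary stands in for files stored through the Replica Manager, and the normalization and categorization functions are trivial placeholders for the partner components. All function names are hypothetical.

```python
from concurrent.futures import ThreadPoolExecutor
from math import ceil


def split(documents, n_jobs):
    """Split the corpus into at most n_jobs contiguous chunks (outside the Grid)."""
    if not documents:
        return []
    n_jobs = max(1, min(n_jobs, len(documents)))
    size = ceil(len(documents) / n_jobs)
    return [documents[i:i + size] for i in range(0, len(documents), size)]


def normalize_job(chunk):
    """Placeholder for one text normalization job."""
    return [" ".join(doc.lower().split()) for doc in chunk]


def categorize_job(documents, keywords):
    """Placeholder for the categorization job: count documents per keyword."""
    return {kw: sum(kw in doc.split() for doc in documents) for kw in keywords}


def run_parallelized_model(documents, n_jobs, keywords):
    chunks = split(documents, n_jobs)
    storage = {}  # stand-in for results stored in the Grid
    with ThreadPoolExecutor(max_workers=max(1, len(chunks))) as pool:
        # Launch one normalization job per chunk.
        futures = {pool.submit(normalize_job, c): i for i, c in enumerate(chunks)}
        # "Monitor status": wait for each job and store its output.
        for future, index in futures.items():
            storage[index] = future.result()
    # The categorization job picks up all stored parts and merges them.
    merged = [doc for index in sorted(storage) for doc in storage[index]]
    return categorize_job(merged, keywords)


if __name__ == "__main__":
    docs = ["Grid  Computing at CERN", "Text   categorization", "GRID search engine"]
    print(run_parallelized_model(docs, n_jobs=2, keywords=["grid", "text"]))
```

Only the normalization step is split; categorization runs once on the merged output, which matches the order of the bullets above.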

Simulation
  • Simulate parallelized model including
      • Submission time
      • Grid overhead
      • Application performance
  • Interesting values
      • User (UI) Waiting Time
      • Spent Computing Time
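A toy version of such a simulation, to make the two values concrete. The slides do not give the model that was used, so the following is an assumption: jobs are submitted one after another from the UI, each submission takes `t_submit`, each job pays a fixed `t_overhead` on the Grid, and the total `work` is divided evenly among `k` jobs. All names and numbers are illustrative.

```python
def simulate(k, work, t_submit, t_overhead):
    """Return (user waiting time, spent computing time) for k parallel jobs.

    Assumed model, not taken from the slides:
      - job i is submitted at (i + 1) * t_submit,
      - every job runs for t_overhead + work / k,
      - the user waits until the last job has finished,
      - spent computing time is the sum of all job run times.
    """
    if k < 1:
        raise ValueError("k must be at least 1")
    finish_times = []
    computing = 0.0
    for i in range(k):
        submitted = (i + 1) * t_submit
        run_time = t_overhead + work / k
        finish_times.append(submitted + run_time)
        computing += run_time
    return max(finish_times), computing


if __name__ == "__main__":
    # One hour of work, 10 s per submission, 60 s Grid overhead per job.
    for k in (1, 4, 16, 64):
        waiting, computing = simulate(k, work=3600, t_submit=10, t_overhead=60)
        print(k, waiting, computing)
```

Even in this simple model the trade-off is visible: more jobs shorten the wait up to a point, while the spent computing time grows with every additional job.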

Simulation (figure)

Conclusions from Simulation
  • Derived rules for splitting parameters
      • Minimize user waiting time (Kopt)
      • Save "unnecessary" resources by splitting less than the optimal value; therefore let the user wait 20% more (unnoticeable) (Keff)
  • Calculated formulas for splitting parameters
  • Implemented in Java class for GRACE application
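The two splitting parameters can be illustrated with a small calculation. The actual formulas are not shown in the slides; this sketch assumes a waiting time of T(K) = K * t_submit + t_overhead + work / K, for which the continuous minimum lies at K = sqrt(work / t_submit). Keff is then taken as the smallest K whose waiting time stays within 20% of the minimum. The model, function names, and numbers are assumptions.

```python
import math


def waiting_time(k, work, t_submit, t_overhead):
    """Assumed user waiting time for k jobs (not the formula from the slides)."""
    return k * t_submit + t_overhead + work / k


def k_opt(work, t_submit, t_overhead):
    """Integer K that minimizes the assumed waiting time."""
    k_star = math.sqrt(work / t_submit)  # continuous minimum of T(K)
    candidates = {max(1, math.floor(k_star)), max(1, math.ceil(k_star))}
    # T(K) is convex, so the best integer is one of the two neighbours.
    return min(candidates, key=lambda k: (waiting_time(k, work, t_submit, t_overhead), k))


def k_eff(work, t_submit, t_overhead, tolerance=0.20):
    """Smallest K whose waiting time is at most (1 + tolerance) times the minimum."""
    best = k_opt(work, t_submit, t_overhead)
    limit = (1 + tolerance) * waiting_time(best, work, t_submit, t_overhead)
    for k in range(1, best + 1):
        if waiting_time(k, work, t_submit, t_overhead) <= limit:
            return k
    return best


if __name__ == "__main__":
    # One hour of work, 10 s per submission, 60 s Grid overhead.
    print(k_opt(3600, 10, 60), k_eff(3600, 10, 60))
```

With these example numbers the minimum is at 19 jobs, while 10 jobs already stay within the 20% margin, which is the kind of resource saving the second rule describes.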

Measured Results (figures)

Results
  • JDLs for Grid Jobs created for both models
      • GRACE can run on the Grid
  • Input for Deliverable 6.1
      • Description of Grid Jobs
  • Parallelized Job Model
      • Used in Grid Tests

Grid Tests
  • Test plan for both models and comparison
  • Creation of input corpus
  • Creation of test scripts for semi-automatic testing
  • Creation of scripts for validation of output and parsing of logging
  • General tests started 20.10.04
  • Main test period from 05.11. to 25.11.04
  • Tests performed in GILDA testbed
  • Submitted more than 1000 jobs
  • Made about 1 million Java API calls

Results (figure)

Comparison (figure)

Results
  • Input for Deliverable 7.2
      • Validation of the suitability of GRACE for the Grid
      • Performance testing of the Application
      • Validation of the parallelized model
  • Validation of simulated results
  • Intensive use of GILDA
      • Feedback to GILDA
      • Feedback to EGEE
          • New requirements list

What else…

gContainer
  • SSL Web service container following WSRF standard
  • Based upon WSRF::Lite
  • Service discovery
  • Load management
  • Factory service
      • Can start and manage arbitrary services
  • Hosted services
      • Grid Access Service
      • API Service for Communication with ROOT

Grid Access Service (GAS)
  • The Grid Access Service represents the user entry point to a set of core services
  • Composed of different modules
  (Diagram labels: client, GAS, File Catalogue, Metadata, WMS)

Trips
  • CERN School of Computing, Vico Equense
      • Grid Computing
      • Physics Computing
      • Software Techniques
  • GRACE General Meeting, Brussels
      • Project Meeting
      • Workshop at Global Grid Forum
  • EGEE JRA1 Design Team Meeting, Padova
      • Presenting the Grid Access Service

Thanks…
  … for your attention
  … this very nice time at CERN!