
Enabling Grids for E-sciencE
Concepts of grid computing
Mike Mineter, mjm@nesc.ac.uk
www.eu-egee.org
INFSO-RI-508833

Acknowledgements
• This talk was prepared by Mike Mineter of NeSC and includes slides from previous tutorials and talks delivered by:
  – Dave Berry, Richard Hopkins, Guy Warner (National e-Science Centre)
  – the EDG training team
  – Ian Foster, Argonne National Laboratory
  – Jeffrey Grethe, SDSC
  – EGEE colleagues
  – Mark Baker, The Distributed Systems Group, University of Portsmouth, http://dsg.port.ac.uk/mab
• Sources and data in slides from talks at the 3rd EGEE conference by
  – Kyriakos Baxevanidis, Deputy Head, Unit of Research Infrastructures, European Commission, DG INFSO
  – Dr Spyros Konidaris, European Commission – DG INFSO
Concepts of Grid Computing. Grid Technologies for Digital Libraries, Athens

Goals of this module
• To introduce the concepts of grid computing, assuming no previous knowledge

Contents
• “The Grid” vision
• What is “a grid”?
• Drivers of grid computing
• Current status of grids
• The basis: authentication, authorisation, security

The Grid Metaphor
[Figure: GRID MIDDLEWARE at the centre, linking mobile access, workstations, supercomputers and PC clusters, data storage, sensors, experiments, and visualisation over the Internet and other networks]

The grid vision
• The grid vision is of “virtual computing” (+ information services to locate computation, storage resources)
  – Compare the web: “virtual documents” (+ search engine to locate them)
• MOTIVATION: collaboration through sharing resources (and expertise) to expand horizons of
  – Research
  – Commerce – engineering, … “the knowledge economy”
  – Public service – health, environment, …

Contents
• “The Grid” vision
• What is “a grid”?

“A grid”
• The initial vision: “The Grid”
• The present reality: many “grids”
• Each grid is an infrastructure enabling one or more “virtual organisations” (VOs) to share computing resources
• What’s a VO?
  – People in different organisations seeking to cooperate and share resources across their organisational boundaries
• Why establish a grid?
  – Share data
  – Pool computers
  – Collaborate
[Figure: a VO spanning Institute A, Institute B, Institute C, and Institute D]

The Single Computer
• The Operating System enables easy use of
  – Input devices
  – Processor
  – Disks
  – Display
  – Any other attached devices
[Figure: layers, top to bottom: Application Software; Operating System; Disks, Processor, Memory, …]

Resources on a Local Area Network
User just perceives “shared resources”, with no regard to location in the organisation:
  - Authenticated by username / password
  - Authorised to use own files, …
[Figure: layers, top to bottom: Application Software; Middleware for sharing computers, servers, printers, …; Operating System on each computer; Resources connected by a LAN]

Resources on a grid
[Figure: layers, top to bottom: Application Software; Interface between application and grid; Grid Middleware: “collective services”; Grid Middleware on each resource; Operating System on each resource; Resources connected by internet]

A grid
• Grid middleware runs on each shared resource
  – Data storage
  – (Usually) batch jobs on pools of processors
• Users join VOs
• Virtual organisation negotiates with sites to agree access to resources
• Distributed services (both people and middleware) enable the grid, allow single sign-on
[Figure: shared resources connected via the INTERNET]

What characterises a grid?
• Co-ordinated resource sharing
• No centralised point of control
• Different administrative domains
• Standard, open, general-purpose protocols and interfaces
• Delivering non-trivial qualities of service
• Co-ordinated to deliver combined services, greater than the sum of the individual components
• http://www.gridtoday.com/02/0722/100136.html

The components of a Grid
• Resources
  – networking, computers, storage, data, instruments, …
• Grid Middleware
  – the “operating system of the grid”
• Operations infrastructure
  – Run enabling services (people + software)
• Virtual Organisation management
  – Procedures for gaining access to resources

Key concepts
• Virtual organisation: people and resources collaborating across administrative and organisational boundaries
• Single sign-on
  – I connect to one machine; some sort of “digital credential” is passed on to any other resource I use. It is the basis of:
    § Authentication: how do I identify myself to a resource without a username/password for each resource I use?
    § Authorisation: what can I do? Determined by
      • My membership of a VO
      • VO negotiations with resource providers
• Grid middleware runs on each resource
• User just perceives “shared resources” with no concern for location or owning organisation
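The authorisation rule above (VO membership plus what the VO has negotiated with each resource provider) can be sketched as a small data structure. This is a minimal illustration only: the class, the user names, the resource names, and the action names are all hypothetical and do not correspond to any real grid middleware.

```python
from dataclasses import dataclass, field


@dataclass
class VirtualOrganisation:
    """A VO: its members plus the access it has negotiated with resource providers."""
    name: str
    members: set = field(default_factory=set)
    # Hypothetical record of agreements: resource name -> set of allowed actions
    agreements: dict = field(default_factory=dict)

    def is_authorised(self, user: str, resource: str, action: str) -> bool:
        """A user may act only if they belong to the VO and the VO has agreed that action."""
        if user not in self.members:
            return False
        return action in self.agreements.get(resource, set())


# Illustrative data only; none of these names come from the slides.
vo = VirtualOrganisation(
    name="example-vo",
    members={"alice", "bob"},
    agreements={
        "cluster.site-a.example": {"submit_job"},
        "storage.site-b.example": {"read", "write"},
    },
)

print(vo.is_authorised("alice", "cluster.site-a.example", "submit_job"))
print(vo.is_authorised("carol", "cluster.site-a.example", "submit_job"))
```

The point of the design is that a resource provider never needs a per-user entry: it deals with the VO, and the VO vouches for its members.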

Contents
• “The Grid” vision
• What is “a grid”?
• Drivers of grid computing

The first driver: e-Science
• What is e-Science? Collaborative science that is made possible by the sharing across the Internet of resources (data, instruments, computation, people’s expertise, ...)
  – Often very compute intensive
  – Often very data intensive (both creating new data and accessing very large data collections): data deluges from new technologies
  – Crosses organisational boundaries
• Examples…

Astronomy
Number and sizes of data sets as of mid-2002, grouped by wavelength
• 12-waveband coverage of large areas of the sky
• Total about 200 TB of data
• Doubling every 12 months
• Largest catalogues near 1 billion objects
Data and images courtesy Alex Szalay, Johns Hopkins University
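To make the doubling claim concrete, the following sketch extrapolates from the two figures on the slide (about 200 TB, doubling every 12 months). It is a naive projection that assumes the doubling rate stays constant, which the slide does not promise; the function name is illustrative.

```python
def projected_size_tb(initial_tb: float, years: float, doubling_time_years: float = 1.0) -> float:
    """Exponential growth: the volume doubles once per doubling_time_years."""
    return initial_tb * 2 ** (years / doubling_time_years)


# Starting point taken from the slide: about 200 TB in mid-2002.
for year in range(0, 6):
    print(f"mid-{2002 + year}: about {projected_size_tb(200, year):,.0f} TB")
```

Under that assumption the archive would pass 1 PB (1,000 TB) within three years, since 200 TB × 2³ = 1,600 TB.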

Earth Observation
ESA missions:
• Hundreds of gigabytes of data per day
Grid contribution to EO:
• Enhance the ability to access high-level products
• Allow reprocessing of large historical archives
• Improve complex Earth science applications (data fusion, data mining, modelling, …)
Federico Carminati, EU review presentation, 1 March 2002. Derived from: L. Fusco, June 2001

Large Hadron Collider at CERN
• Data challenge:
  – 10 petabytes/year of data!
  – 20 million CDs each year!
• Simulation, reconstruction, analysis:
  – LHC data handling requires computing power equivalent to ~100,000 of today's fastest PC processors!
• Operational challenges
  – Reliable and scalable through a project lifetime of decades
[Figure: aerial view of the LHC site, labelled with Mont Blanc (4810 m) and downtown Geneva]

BLAST gridification
[Figure: BLAST gridification. Labels: input file of sequences (Seq 1, Seq 2, …, Seq n); UI; several computing elements, each running BLAST against a database (DB); RESULT]
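The figure's labels suggest a split, search, and merge pattern: the input sequences are divided among computing elements, each runs BLAST on its share, and the outputs are combined into one result. The sketch below shows only the splitting and merging steps, under the assumption that the input is FASTA-like text with `>` header lines. The function names are hypothetical, and the search step is a stand-in that returns sequence lengths; it is not BLAST and not the project's actual tooling.

```python
def parse_sequences(text):
    """Split FASTA-like text into (header, sequence) records."""
    records = []
    header, lines = None, []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            if header is not None:
                records.append((header, "".join(lines)))
            header, lines = line[1:].strip(), []
        elif header is not None:
            lines.append(line)
    if header is not None:
        records.append((header, "".join(lines)))
    return records


def split_into_jobs(records, n_jobs):
    """Distribute records round-robin over at most n_jobs non-empty chunks."""
    if n_jobs < 1:
        raise ValueError("n_jobs must be at least 1")
    chunks = [records[i::n_jobs] for i in range(n_jobs)]
    return [chunk for chunk in chunks if chunk]


def stand_in_search(chunk):
    """Placeholder for the per-job search: reports each sequence's length."""
    return [(header, len(sequence)) for header, sequence in chunk]


def merge_results(per_job_results):
    """Concatenate per-job outputs and sort by header for a stable final result."""
    return sorted(result for job in per_job_results for result in job)


sample = ">Seq1\nACGT\nAC\n>Seq2\nGGTT\n>Seq3\nTTTTA\n"
records = parse_sequences(sample)
jobs = split_into_jobs(records, 2)
print(merge_results([stand_in_search(job) for job in jobs]))
```

Because each chunk is searched independently, the jobs can run on different computing elements with no communication between them, which is what makes this kind of workload a natural fit for a grid.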

DAME: Grid-based tools and infrastructure for aero-engine diagnosis and prognosis
• “A significant factor in the success of the Rolls-Royce campaign to power the Boeing 7E7 with the Trent 1000 was the emphasis on the new aftermarket support service for the engines provided via DS&S. Boeing personnel were shown DAME as an example of the new ways of gathering and processing the large amounts of data that could be retrieved from an advanced aircraft such as the 7E7, and they were very impressed”, DS&S 2004
[Figure: engine flight data flowing over the grid between London Airport, New York Airport, the airline office, the Diagnostics Centre, the Maintenance Centre, an American data center, and a European data center; other labels: XTO, Engine Model, Case Based Reasoning]
Companies: Rolls-Royce, DS&S, Cybula
Universities: York, Leeds, Sheffield, Oxford

Academic drivers: not only e-science!!
The impact of grids when they support…
• Curation, discovery, reuse of knowledge
• e-Research
• e-Science

Academic drivers
• E-Learning
• Digital Libraries
• E-Research
• Centrality of curation, preservation
• Under-recognised by many researchers
• Virtual Digital Data Libraries needed for research as well as learning
• AAA Services
• e-Infrastructure
Derived from a slide by the UK’s JISC

Political drivers
• Entering the “knowledge society” from the “industrial society”
  – industrial society: also enabled by communications infrastructure
• Lisbon strategy: Research and Innovation will be the most important factors in determining Europe’s success through the next decades
• THE GOAL: UNLEASH CREATIVITY by investment in
  – Human skills
  – Infrastructures
• Growth of e-infrastructure
  – phase 1: mainly academia, some in industry: “an elite, privileged to do this job”
  – phase 2: ordinary people doing distributed work; SMEs adopt, adapt and use
  – phase 3: the next generations
    § will transform e-infrastructure and its uses
    § We don’t know how others will use what we devise

Empowerment in the Information Age
Dr Spyros Konidaris, European Commission – DG INFSO, 3rd EGEE Conference, 23rd November 2004
[Figure: empowerment plotted over time from the Industrial Age to the Information Age; recoverable labels: 1781, 2000, Tool, Infrastructure, (Networks) + “GRIDS”]

Commercial drivers
• European organisations could save €4.5 billion by adopting basic Grid technologies
• Industry analysts Gartner and Giga both estimate that standardisation and consolidation can save between 8.5% and 20% of IT budgets
• www.oracle.com/technology/tech/grid

Contents
• “The Grid” vision
• What is “a grid”?
• Drivers of grid computing
• Some examples
• Current status of grids

If “The Grid” vision leads us here… then where are we now?

Grid projects
Many Grid development efforts, all over the world

• UK – OGSA-DAI, RealityGrid, GeoDise, Comb-e-Chem, DiscoveryNet, DAME, AstroGrid, GridPP, MyGrid, GOLD, eDiamond, Integrative Biology, …
• Netherlands – VLAM, PolderGrid
• Germany – UNICORE, Grid proposal
• France – Grid funding approved
• Italy – INFN Grid
• Eire – Grid proposals
• Switzerland – Network/Grid proposal
• Hungary – DemoGrid, Grid proposal
• Norway, Sweden – NorduGrid

• NASA Information Power Grid
• DOE Science Grid
• NSF National Virtual Observatory
• NSF GriPhyN
• DOE Particle Physics Data Grid
• NSF TeraGrid
• DOE ASCI Grid
• DOE Earth Systems Grid
• DARPA CoABS Grid
• NEESGrid
• DOH BIRN
• NSF iVDGL

• DataGrid (CERN, ...)
• EuroGrid (Unicore)
• DataTag (CERN, …)
• Astrophysical Virtual Observatory
• GRIP (Globus/Unicore)
• GRIA (Industrial applications)
• GridLab (Cactus Toolkit)
• CrossGrid (Infrastructure Components)
• EGSO (Solar Physics)

Grids: where are we now?
• Many key concepts identified and known
• Many grid projects have tested, and benefit from, these
• Major efforts now on establishing:
  – Standards (a slow process) (e.g. Global Grid Forum, http://www.gridforum.org/)
  – Production Grids for multiple VOs
    § “Production” = reliable, sustainable, with commitments to quality of service
      • In Europe, EGEE
      • In UK, National Grid Service
      • In US, TeraGrid
    § One stack of middleware that serves many research (and other!) communities
    § Operational procedures and services (people!, policy, ...)
  – New user communities
• … whilst research & development continues

The vision of 2001: convergence of Web Services and Grids
[Figure: two strands built on the INTERNET converge on the Open Grid Services Architecture via OGSI. Web developments: World-wide web, Web services. “Big Science” research: high-end computing, high-throughput computing, massively parallel computing, Grid prototypes]

Grid security and trust -1
• Providers of resources (computers, databases, ...) need risks to be controlled: they are asked to trust users they do not know
  – They trust a VO
  – The VO trusts its users
• Users need
  – single sign-on: to be able to log on to a machine that can pass the user’s identity to other resources
  – To trust owners of the resources they are using
• Build middleware on a layer providing:
  – Authentication: who wants to use/provide a resource
  – Authorisation: what the user is allowed to do
  – Security: reduce vulnerability, e.g. from outside the firewall
  – Non-repudiation: knowing who did what
• Digital credentials and the “Grid Security Infrastructure” middleware are the basis of production grids

Grid security and trust -2
• Currently, achieved by certification:
  – User’s identity has to be certified by one of the national Certification Authorities (CAs)
    § mutually recognised: http://www.gridpma.org/; for the EU, go via here to http://marianne.in2p3.fr/datagrid/ca/catable-ca.html to find your CA
      • E.g. in the UK go to http://www.grid-support.ac.uk/ca/ralist.htm
  – Resources are also certified by CAs
• User
  – User joins a VO
  – Digital certificate is the basis of AA
  – Identity is passed to other resources you use, where it is mapped to a local account
    § the mapping is maintained by the VO
• Common agreed policies establish rights for a Virtual Organisation to use resources
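The step in which a certified identity is mapped to a local account can be pictured as a lookup table keyed on the certificate's subject name. The sketch below is an assumption-laden illustration: the subject strings, account names, and function names are invented for the example and do not reproduce any particular middleware's file format or API.

```python
from typing import Optional

# Hypothetical mapping from certificate subject to local account name.
ACCOUNT_MAP = {
    "/C=UK/O=ExampleOrg/OU=SiteA/CN=alice example": "vo_user01",
    "/C=UK/O=ExampleOrg/OU=SiteB/CN=bob example": "vo_user02",
}


def local_account_for(subject: str, account_map: dict) -> Optional[str]:
    """Return the local account for a certificate subject, or None if unmapped."""
    return account_map.get(subject.strip())


def run_as_local_user(subject: str, account_map: dict) -> str:
    """Refuse access to identities with no mapping; otherwise name the account used."""
    account = local_account_for(subject, account_map)
    if account is None:
        raise PermissionError(f"no local account mapped for {subject!r}")
    return f"running as {account}"


print(run_as_local_user("/C=UK/O=ExampleOrg/OU=SiteA/CN=alice example", ACCOUNT_MAP))
```

The user never needs to know the local account name: they present one certificate everywhere, and each resource resolves it locally.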

Grid security and trust -3
• Certification and GSI provide
  – Authentication
    § Resource can trust user
    § User can trust the resource provider
    § … so long as certificates are protected: they are your grid identity
  – A basis for Authorisation
    § so a VO can manage access to resources
    § Resource providers trust the VO
    § The VO trusts the user
  – Mechanism for checking message integrity
    § Messages are passed between machines
    § Public/private key pairs protect message integrity as well as authentication
      • Not (usually) encrypted, but message integrity is checked
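The last point, that a message can travel unencrypted while its integrity is still checked with a key pair, can be illustrated with a toy digital signature. The sketch uses unpadded textbook RSA over a deliberately tiny modulus built from two Mersenne primes; it is an illustration of the principle only, not how GSI is implemented, and it is not secure. The function names are hypothetical. The modular inverse via `pow` needs Python 3.8 or later.

```python
import hashlib

# Toy key pair from two known Mersenne primes. Far too small for real use.
P = 2**61 - 1
Q = 2**89 - 1
N = P * Q                              # public modulus
E = 65537                              # public exponent
D = pow(E, -1, (P - 1) * (Q - 1))      # private exponent (modular inverse)


def digest(message: bytes) -> int:
    """Hash the message and reduce it into the range of the modulus."""
    return int.from_bytes(hashlib.sha256(message).digest(), "big") % N


def sign(message: bytes) -> int:
    """Sender: transform the digest with the private key."""
    return pow(digest(message), D, N)


def verify(message: bytes, signature: int) -> bool:
    """Receiver: undo the transform with the public key and compare digests."""
    return pow(signature, E, N) == digest(message)


message = b"job 42: status done"       # sent in the clear, not encrypted
signature = sign(message)
print(verify(message, signature))
print(verify(b"job 42: status failed", signature))
```

Anyone can read the message, but only the holder of the private key could have produced a signature that verifies, so any change to the message in transit is detected.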

Summary of grid computing concepts
• Flexible collaboration across multiple administrative domains
  – sharing data, computers, instruments, application software, ...
• Single sign-on to resources in multiple organisations
• Need for people-services (credential authorities, VO managers, SLAs) as well as middleware services
• Drives are towards
  – Production services (reliable, sustainable, … against which research projects, etc. can plan with confidence)
    § In Europe, EGEE
    § In UK, National Grid Service
  – Standards
  – Empowering new user communities