Jožef Stefan Institute: Agile Computing Federation and Slovenian plans for EINFRA-1


Jožef Stefan Institute
Agile Computing Federation and Slovenian plans for EINFRA-1
Jan Jona Javoršek
Jožef Stefan Institute
jona.javorsek@ijs.si
SLING – Slovenian Initiative for National Grid
http://www.ijs.si/ http://www.sling.si/

Slovenian Point of View
● Strong HEP base, some large international projects
● NREN support, NGI, EGI membership, official government support, equipment funding promises
● Increasingly integrated ARC-using HTC users and centres
● ARC-using users from other domains (KT, biomedical, mathematics, statistics, applied linguistics)

Slovenian Point of View
● Blocked PRACE, politicking
● Cloud.eu
● Centre integration / funding pressures (no central management)
● NREN: public cloud expectations
● eduGAIN integration
● FUD regarding lack of EGI planning, envisioned grid future, CERN standing, distributed computing future
● Huge demands for future projects (Belle II)
● Expectations of uninterrupted functionality

Agile Computing Federation
Goals:
● Reuse of existing technologies
● Reuse of existing infrastructure: institutional, national, international
● No interruption of existing services
● Unified user access regardless of tech
● Coordination with ongoing efforts

Similar infrastructures
Available resources for researchers:
● National and international grids
● HPC centres
● Private and public clouds

Computing Centre Dilemma
Different situations vs. size:
● Multiple independent infrastructures
● Multiple centres in the same institutions, with different networks
● Multiple infrastructures in the same centres and network
● Resource reuse / reallocation / pain

Confusing access paths
● HPC centres: application policies and local committees
● Grid: PKI, certificates and VOs
● Cloud: different APIs, PR confusion, vendor/site isolation

Parallel efforts
NREN / large institute can have:
● Custom services
● Internal cloud
● HPC centre
● 1 or more grid sites
● Public/private cloud
One agile setup required!

Required components
● Provisioning system
● Local resource management
● User management, authorization
● Runtime envs / modules / VM images
● Information system, brokering
● Service management, check-pointing, transfer
● Data management

Choices...
CVMFS, Salt, CERN Agile model, Keystone, Ceph, Globus, NorduGrid ARC, gLite, Cinder, PKI, VOMS, dCache, Torque, OpenMP, SLURM, OpenStack, gFTP, Glance, OpenNebula, oVirt, Puppet, science portals, VRC

Architectural Alternatives
● Grid/Cloud integration
● Unified Grid/Cloud access
● Virtualized Grid Services
● Integrated hybrid Grid/Cloud

Goals
● No interruption in existing services: grid, HPC, cloud
● Soft transitions:
  – gLite → ARC
  – oVirt → OpenStack
  – PBS → SLURM
● Unification of stack:
  – for admins (agile stack, monitoring, repos)
  – for users (auth/z, RTE/images, projects)
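
A "soft" PBS → SLURM transition usually means existing job scripts keep working while users migrate. The following is only a minimal sketch of that idea, not something from the talk: it rewrites a handful of common `#PBS` directives into their usual `#SBATCH` counterparts and leaves everything else untouched. The function names and the choice of directives are assumptions for illustration, and treating a PBS queue as a SLURM partition is an approximation that depends on the site configuration.

```python
import re

# Assumed, deliberately small mapping of common PBS/Torque flags to SLURM options.
# Real job scripts use many more directives; anything unknown is passed through.
_SIMPLE_FLAGS = {
    "-N": "--job-name",
    "-q": "--partition",  # approximation: PBS queue ~ SLURM partition
    "-o": "--output",
    "-e": "--error",
}


def translate_directive(line: str) -> str:
    """Translate one '#PBS <flag> <value>' line to '#SBATCH ...' if the mapping is known."""
    match = re.match(r"^#PBS\s+(-\w)\s+(\S+)\s*$", line)
    if not match:
        return line  # not a PBS directive: keep as-is
    flag, value = match.groups()
    if flag in _SIMPLE_FLAGS:
        return f"#SBATCH {_SIMPLE_FLAGS[flag]}={value}"
    if flag == "-l" and value.startswith("walltime="):
        return f"#SBATCH --time={value[len('walltime='):]}"
    return line  # unknown directive: keep it so nothing is silently dropped


def translate_script(text: str) -> str:
    """Apply translate_directive to every line of a job script."""
    return "\n".join(translate_directive(line) for line in text.splitlines())
```

Directives the sketch does not understand (for example `-l nodes=...`) are kept verbatim, so a human can review them instead of getting a silently wrong resource request.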

User Satisfaction
● Single point of access: (eduGAIN + VOMS + Keystone?)
● Single interface (Horizon + ARC?)
● Flexible runtime environment or image repository (ARC RTE + Glance: VM vs RTE, OpenMP, OpenCL)
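
One way to read "single point of access" is that identities arriving through different channels are normalized into one internal record before any authorization decision. The sketch below assumes exactly that and is not the design from the talk: `Identity`, `from_voms_fqan`, `from_edugain` and the entitlement prefix `urn:example:federation:` are hypothetical names. Only the general VOMS FQAN shape (`/vo/group/Role=...`) and the `eduPersonPrincipalName` / `eduPersonEntitlement` attribute names are taken from common practice.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Identity:
    """Hypothetical normalized identity shared by grid and cloud front ends."""
    user: str
    project: str
    role: str
    source: str


def from_voms_fqan(subject_dn: str, fqan: str) -> Identity:
    """Map a VOMS FQAN such as '/atlas/Role=production' to an Identity."""
    groups = []
    role = "member"  # assumed default when no explicit role is present
    for part in (p for p in fqan.split("/") if p):
        if part.startswith("Role="):
            value = part[len("Role="):]
            if value and value != "NULL":
                role = value
        elif part.startswith("Capability="):
            continue  # capabilities are ignored in this sketch
        else:
            groups.append(part)
    return Identity(subject_dn, "/".join(groups), role, "voms")


# Hypothetical entitlement format: urn:example:federation:<project>:<role>
ENTITLEMENT_PREFIX = "urn:example:federation:"


def from_edugain(attrs: dict) -> Identity:
    """Map federated login attributes to an Identity, or refuse access."""
    user = attrs["eduPersonPrincipalName"]
    for entitlement in attrs.get("eduPersonEntitlement", []):
        if entitlement.startswith(ENTITLEMENT_PREFIX):
            project, _, role = entitlement[len(ENTITLEMENT_PREFIX):].partition(":")
            return Identity(user, project, role or "member", "edugain")
    raise PermissionError(f"no federation entitlement for {user}")
```

Downstream components (for example a Keystone project/role assignment) would then only need to understand `Identity`, regardless of how the user authenticated.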

Necessary steps
● Provisioning, abstraction and virtualization of resources and services
● Hybrid resource manager
● Customization of infosys: predefined instance flavors, service registration, quotas
● Storage and data management abstractions
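
To make "hybrid resource manager", "predefined instance flavors" and "quotas" more concrete, here is a toy sketch under stated assumptions: the flavor sizes, the `HybridManager` class and the backend labels `batch` and `cloud` are invented for illustration. The request kinds (native job, VM job, VM service) follow the labels on the final "Possibilities" slide.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Flavor:
    """Predefined instance size; the values below are illustrative only."""
    name: str
    cores: int
    memory_gb: int


FLAVORS = {
    f.name: f
    for f in (Flavor("small", 1, 2), Flavor("medium", 4, 8), Flavor("large", 16, 64))
}

KINDS = ("native_job", "vm_job", "vm_service")


@dataclass
class HybridManager:
    """Toy dispatcher: one per-project core quota shared by batch and cloud."""
    quota_cores: dict                                # project -> max cores
    used_cores: dict = field(default_factory=dict)   # project -> cores in use

    def submit(self, project: str, flavor_name: str, kind: str) -> str:
        if kind not in KINDS:
            raise ValueError(f"unknown request kind: {kind}")
        flavor = FLAVORS[flavor_name]
        used = self.used_cores.get(project, 0)
        if used + flavor.cores > self.quota_cores.get(project, 0):
            raise RuntimeError(f"quota exceeded for project {project}")
        self.used_cores[project] = used + flavor.cores
        # Native jobs go to the batch system, VM requests to the cloud layer.
        return "batch" if kind == "native_job" else "cloud"

    def release(self, project: str, flavor_name: str) -> None:
        cores = FLAVORS[flavor_name].cores
        self.used_cores[project] = max(0, self.used_cores.get(project, 0) - cores)
```

The point of the sketch is that accounting happens once, in front of both backends, so a project cannot exceed its allocation by splitting work between grid jobs and cloud instances.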

Hard Problems
● Data management (hard)
● Workflow management (harder)
● Task management (hardest)

Micro and Macro Climate
● H2020 ARC development project (with 2 Nordic partners, ATLAS interest, interest for EU-T0)
● H2020 DIRAC development project (Belle II involvement)
● National projects (ongoing HTC Puppet – HTC enabling grid-compatible provisioning and agile enhancement)
● National commitment: equipment, technical staff, operation costs, international project co-funding
● Continued EGI, HTC, Géant, WLCG
● No clear HPC / PRACE / HTC COE / Cloud

Involvements
● CERN: ATLAS
● Belle II, Pierre Auger
● Support existing groups (KT, biomed, HPC/fluid dynamics)
● Interest (NG, Finland, Hungary, SE EU)
● Close/Opportunistic: OpenArchive, Clarin, Dariah...

Projection
● 1Y: Experimental setup: large institute (multiple sites)
● 2Y: National federation: NREN + NGI + HPC centres
● 3Y: International federation: Project, other groups / projects
● 4Y: "hard" problems

Questions?
Jan Jona Javoršek
Jožef Stefan Institute
jona.javorsek@ijs.si
SLING – Slovenian Initiative for National Grid
http://www.ijs.si/ http://www.sling.si/

Additional Slides

Grid and Cloud integration
● VM managers = LRMS
● Virtualized WN
● Virtualized storage
● Example: WNoDeS

Unified Grid/Cloud access
● One client used to submit both jobs and services
● Virtualized WN
● Adapted LRMS (grid & cloud)
● Shared Cloud Storage
● Examples: Swarm, XtremWeb

Virtualized Grid Services
● Virtualization of all grid services
● Changes in infosys and registration services
● Cloud instances part of grid
● Example: none

Integrated hybrid Grid/Cloud
● Transparent access to resources: grid and cloud
● LRMS for grid and cloud
● Monitoring for both resources
● Optionally virtual WN

Integrated hybrid Grid/Cloud

Objective?

Possibilities
[Diagram labels: eduGAIN, VOMS, Keystone; Authorization: Role; Discovery / Provisioning; Computing + Data; Native Job: CL / MPI; Slurm; VM Job; VM service; Compute; Quota / Accounting]