Multi-Programming and Scheduling Design for Applications of Interactive Simulation
Jean-Louis Roch et al.
http://moais.imag.fr
Evaluation seminar, research theme Num B "Grids and high-performance computing", March 27-28, 2008
[Image: sculpture (head), Louvre, Musée de l'Homme. Artist: anonymous. Origin: Rapa Nui (Easter Island). Date: between the 11th and the 15th century. Height: 1.70 m]

Staff and Skills
- 1/1/2005: Creation of MOAIS team
- 1/1/2006: Creation of INRIA team-project MOAIS
- Members: Vincent Danjean, Pierre-François Dutot, Thierry Gautier, Guillaume Huard, Grégory Mounié, Bruno Raffin, Jean-Louis Roch, Denis Trystram, Frédéric Wagner
- Positions: [MdC 9/2005] [MdC 9/2006] [CR] [MdC] [CR INRIA] [MdC, Team leader] [Prof] [MdC 9/2006]
- Skills:
  - Parallel algorithms & programming
  - Scheduling
  - Interactive applications
- 1 invited professor: Alfredo Goldman [USP Sao Paulo]
- 19 PhD students, 1 engineer
- 14 PhDs defended since 2005

Evolution of parallel programming
- Parallelism everywhere
- Distributed, heterogeneous
- MPSoC, grids, cluster, SMP, multi-core, GPU
- MPI, OpenMP, Cuda [NVidia], MapReduce [Google], TBB [Intel], … SPIRIT, Cilk++ [Cilk Arts], Fortress [Sun]

MOAIS objective
- End-to-end parallel programming solutions for high-performance interactive computing with provable performance.
  - Optimization: QAP/Nugent on Grid'5000 [PRISM, GSCOP, DOLPHIN]
  - Computational steering, VR: INRIA Grimage platform [MOAIS, PERCEPTION, EVASION]
  - Embedded: streaming on MPSoCs [ST]
- Performance is multi-objective
[Figure: input / output diagram of the application domains]

Approach
- To mutually adapt application and scheduling
  - Proactive/static to the platform: the devices evolve gradually
  - Online/dynamic to the execution context: data and resources
  - Tolerant to data variations, failures, perturbation from other applications, …
- From algorithms to applications
  - Scheduling and parallel programming schemes
  - Programming interfaces and tools
  - Target applications: batch scheduling, combinatorial optimization, computational steering, stream encoding

Overview: research directions
[Figure: layered diagram, from the interactive application (top) down to the architecture (bottom)]
- Interactive application
  - 4. Interactivity
  - 3. Adaptive algorithms
  - 2. Interfaces for coordination
  - 1. Scheduling
- Architecture
- Performance: adaptive control of execution
  - model: abstract representation
  - algorithm: scheduling, fault tolerance

Research directions and achievements for 2005-2007
1. Scheduling
2. Interfaces for coordination
3. Adaptive algorithms
4. Interactive applications

1. Scheduling
- Objective A: modeling of scheduling problems for adaptive applications
  - Adaptable parallelism degree for efficient coarse-grain scheduling
  - Parallel task models: moldable tasks, divisible load
- Some results:
  - Comparisons and coupling models: [IJFCS 06]
  - Off-line: improvement of performance ratio:
    - 3/2-approximation [SIAM J. Comp 07] instead of 2 [Turek & al.] by strip-packing
    - (3+ 5) for moldable tasks on a grid of clusters [Europar'06]
  - On-line: decrease of control overhead: "work-first principle" [Cilk]
    - Extension to general distributed data-flow computations [ICTTA'06, ICCS'07]
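To make the moldable-task model above more concrete, here is a minimal C++ sketch. It is not taken from the MOAIS publications: the `MoldableTask` type and the function names are hypothetical, and the execution-time table is assumed to be given. It shows the basic query a scheduler can ask of a moldable task, namely the smallest number of processors that meets a target time, together with the resulting work (processors times time).

```cpp
#include <cstddef>
#include <vector>

// Hypothetical moldable task: times[p - 1] is its execution time when
// it is allotted p processors (p = 1 .. times.size()).
struct MoldableTask {
    std::vector<double> times;
};

// Smallest allotment whose execution time meets the deadline.
// Returns 0 when no allotment in the table is fast enough.
std::size_t min_allotment(const MoldableTask& task, double deadline) {
    for (std::size_t p = 1; p <= task.times.size(); ++p) {
        if (task.times[p - 1] <= deadline) return p;
    }
    return 0;
}

// Work of the task on p processors: the area p * time(p).
// Precondition: 1 <= p <= task.times.size().
double work(const MoldableTask& task, std::size_t p) {
    return static_cast<double>(p) * task.times[p - 1];
}
```

For a task with times {8.0, 4.5, 3.5, 3.0}, a target of 4.0 needs 3 processors, and the work grows from 8.0 on one processor to 10.5 on three: shorter execution is paid for with extra work.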

1. Scheduling
- Objective B: design of multi-objective scheduling with provable guarantees
  - Simultaneous approximation for each objective
- Approximated solutions of Pareto optimal solutions:
  - Makespan/Reliability [SPAA 07]
  - Makespan/Memory [IPDPS 08]
- Generic -Relaxation scheme [Shmoys & al.]:
  - Makespan/Minsum [WEA 05]
  - To include a smart algorithm inside a recursive doubling (e.g. for Makespan) (e.g. for Minsum)
  - For moldable tasks: yields a bi-approximation with arbitrary ratio between Cmax and Minsum [WEA 05]
[Figure: timeline t = 0, 2, 4, 8, 16 showing batches of doubling length]
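The recursive-doubling idea above can be sketched as follows. This is a simplified illustration under stated assumptions, not the algorithm of [WEA 05]: tasks are sequential (not moldable), machines are identical, and a greedy shortest-first packing stands in for the "smart" inner algorithm. The function name `doubling_schedule` is hypothetical. Batches end at times 2, 4, 8, 16, …, and each batch packs as many remaining tasks as fit within its length.

```cpp
#include <algorithm>
#include <cstddef>
#include <numeric>
#include <vector>

// Illustrative only: returns, for each task, the end time of the batch in
// which it was placed (an upper bound on its completion time).
// Assumes machines >= 1 and strictly positive, finite durations.
std::vector<double> doubling_schedule(const std::vector<double>& durations,
                                      std::size_t machines) {
    const std::size_t n = durations.size();
    std::vector<std::size_t> order(n);
    std::iota(order.begin(), order.end(), 0);
    // Shortest tasks first: a simple stand-in for the inner algorithm.
    std::sort(order.begin(), order.end(), [&](std::size_t a, std::size_t b) {
        return durations[a] < durations[b];
    });

    std::vector<double> batch_end(n, 0.0);
    std::vector<bool> placed(n, false);
    std::size_t remaining = n;
    double start = 0.0, end = 2.0;  // batches [0,2), [2,4), [4,8), [8,16), ...

    while (remaining > 0) {
        const double length = end - start;
        std::vector<double> load(machines, 0.0);  // per-machine load in this batch
        for (std::size_t i : order) {
            if (placed[i]) continue;
            auto least = std::min_element(load.begin(), load.end());
            if (*least + durations[i] <= length) {  // fits on least-loaded machine
                *least += durations[i];
                placed[i] = true;
                batch_end[i] = end;
                --remaining;
            }
        }
        start = end;
        end *= 2.0;  // next batch boundary doubles
    }
    return batch_end;
}
```

Short tasks finish in early batches, which keeps the sum of completion times low, while the geometric growth of the batches bounds the total length by a constant factor of the last batch. The sketch itself makes no claim about approximation ratios.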

2. Interfaces for coordination
- Objective: provably efficient control at runtime of the coupling of components with various synchronization constraints
- Kaapi middleware

    // Task that adds two shared values and writes the result.
    struct sum {
      void operator()(Shared_r<int> a,
                      Shared_r<int> b,
                      Shared_w<int> r)
      { r.write(a.read() + b.read()); }
    };

    // Recursive Fibonacci expressed as a graph of forked tasks.
    struct fib {
      void operator()(int n, Shared_w<int> r)
      { if (n < 2) r.write(n);
        else
        { // The slide text reads "int r1, r2"; Shared<int> is assumed here
          // so that the forked tasks can write to and read from them.
          Shared<int> r1, r2;
          Fork<fib>()(n - 1, r1);
          Fork<fib>()(n - 2, r2);
          Fork<sum>()(r1, r2, r);
        }
      }
    };

[Figure: local stack runtime; distributed nested macro-dataflow graph]
- Provable performances:
  - Efficient local serialization: "work-first principle", zero-copy
  - Scheduling: coarse-grain graph partitioning [J. CLSS'07, ICCS 07] + work-stealing
  - Fault-tolerance protocols, from scheduling properties: coordinated protocol [ICTTA'06] + original TIC protocol [EIT 05, TDSC 08]
- Positioning:
  - Multi-processors/multi-core architectures: Intel TBB, Cilk++
  - Grid / global platforms: tolerate failure and falsification: Satin (FT)

2. Kaapi: Support and transfer
- Distributed implementation of CAPE-Open standard for process engineering computations [IFP]
- Cluster implementation of compliant runtime RSI/Indiss-RT
- Quadratic assignment [ANR CHOC]
- Finite element computations [ANR DISCOGRID]
- Cryptographic S-Box selection [ANR SAFESCALE]
- Probabilistic inference engine [ProBayes]

3. Adaptive algorithms
- Objective: to design and analyze algorithms that may obliviously adapt their execution under the control of the scheduling
[Figure: choice between a sequential algorithm, parallel algo 1 (P=2), parallel algo 2 (P=100), …, parallel algo k (P=+∞). Which one to select? Overheads: synchronization, redundancy, communication]

3. Adaptive algorithms
- Heterogeneous resources, variable speeds: work-stealing to obliviously self-tune granularity
  - Minimize both the work Wp and the depth Dp
- But: work Wp increases when depth Dp decreases: multi-objective problem
- Adaptive recursive coupling of algorithms [Europar'06, PASCO'07, PDP'08]
  - Relaxation: sequential / parallel work-stealing
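The coupling of a sequential algorithm with parallel work-stealing can be pictured with a minimal C++ sketch. It is a single-threaded model with no synchronization, and the names `WorkRange`, `extract_front` and `steal_back_half` are hypothetical rather than part of Kaapi. The owner of a range keeps running the sequential algorithm on small chunks taken from the front, and parallelism is created only when an idle processor steals the back half of what remains, so the grain adapts to the processors actually available.

```cpp
#include <cstddef>

// Hypothetical half-open interval [begin, end) of remaining iterations.
struct WorkRange {
    std::size_t begin;
    std::size_t end;
    std::size_t size() const { return end - begin; }
};

// Owner side: take up to `chunk` iterations from the front; the owner runs
// them with the sequential algorithm, paying no parallel overhead.
WorkRange extract_front(WorkRange& r, std::size_t chunk) {
    const std::size_t n = chunk < r.size() ? chunk : r.size();
    WorkRange out{r.begin, r.begin + n};
    r.begin += n;
    return out;
}

// Thief side: an idle processor takes the back half of the remaining work.
// The stolen range is empty when fewer than two iterations remain.
WorkRange steal_back_half(WorkRange& r) {
    const std::size_t n = r.size() / 2;
    WorkRange out{r.end - n, r.end};
    r.end -= n;
    return out;
}
```

Starting from [0, 100), the owner extracts [0, 10); a thief then steals [55, 100) and the owner continues on [10, 55). With no thief, the whole range is processed sequentially. A real implementation would need atomic or lock-protected updates of the range bounds.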

3. Adaptive algorithms
- Cache & processor oblivious stream computations [PDP'07]
- AWS: adaptive work-stealing for MPSoCs
- Use case: HDTV on MPSoCs [ST Microelectronics film grain tech.]
- Near optimal experimental results [PDP 08]
[Figure: MPSoC bridge = AWS scheduling, between the application description (potential parallelism, AWS api) and the architecture description (SPIRIT / IP-XACT simulator). Film-grain application (streaming), e.g. HD-TV noise reduction]

3. Adaptive algorithms
- Adaptive 3D-vision [VR'07]
- Adaptive heterogeneous coupling with Kaapi: CPU+GPU [EGPV'07]
[Figure: level of detail and time (ms) on 1 to 16 CPUs; realtime constraint: 30 frames per second; maximum precision]

4. Interactivity
- Motivation: parallelism for interactive applications
- Challenging application: multi-cameras, multi-CPUs, multi-GPUs, multi-display
- Grimage platform [2004…]: 30-node cluster, 15 cameras, 16 projectors
- Positioning: other platforms:
  - [Blue-C, ETH Zurich, 2005…], [Tele-Immersion@UCBerkeley 2005]
- Specificity: collaboration with
  - EVASION (realtime physics simulation)
  - PERCEPTION (computer vision)

4. Interactivity
- Middleware dedicated to interactive applications
  - Distributed components, moldable
  - Parallel code coupling
  - Static coarse-grain mapping
[Figure: HDTV player on a 12-Mpixel display wall (16 projectors) [CPUs + GPUs]]

Summary of 2005-2007
[Figure: MPSoC, grids, cluster, SMP, multi-core, GPU; AWS; multi-objective; adaptive; performance]
- Applications are time-consuming but essential to validate the scientific approach

Some facts
- Publications
  - 127 in 3 years, 19 rank #1
    - 17 Int. Journal (SIAM J. Comp, IEEE TC, TPDS, TDSC, EJOR, FCS, …)
    - 59 Int. Conf (SPAA, IPDPS, CCGrid, VR, VIS, Europar, ICCS, Siggraph…)
- Contracts
  - 2 ARC, 5 ANRs
  - 1 pole MINALOGIC
  - 2 Europe, 1 Ass. team K€
  - Industry partners: STM, IFP, CEA, Bull, C-S, DCN
- Software:
  - Kaapi, FlowVR, Taktuk, AWS

Highlights
- Plugtest Nqueens challenge
  - Dec. 2006: special jury prize
  - Nov. 2007: 1st prize
    - Nqueens(23) in 2107 s with 3654 cores
- SIGGRAPH Aug. 2007
  - Emerging Technologies demo, ~4000 visitors
- Valorization: start-up (Sep. 2007)
  - co-founded by former PhD C. Menier [joined MOAIS / PERCEPTION]
  - transfer: parallel 3D modeling

Research directions 2008-2012 [1/3]
To push the interactions to large scale
- Heterogeneous computing
  - Complex memory hierarchy
- Provable performances vs adversary
  - Game theory

Research directions 2008-2012 [2/3]
- Scheduling: multi-objective
  - Large systems, many users, various objectives: equity / fairness
  - Extra global objective to non-cooperative strategies
- Coordination interface => runtime for HIPC on demand
  - Work-stealing based runtime extended to complex memory hierarchy
  - Dependable computing on global computing platforms

Research directions 2008-2012 [3/3]
- Adaptive algorithms
  - Large data sets, out-of-core issues
  - Framework / high level library
- High performance interactive computing
  - Interactive resolution of complex problems (scheduling)
  - Grimage: explore new 3D interactions [PERCEPTION]
  - Parallelism for adaptive interactive performance [EVASION, ALCOVE]: Kaapi partitioning + work-stealing to balance load between heterogeneous resources (CPUs / GPUs)

Summary
"To provide parallel programming schemes, interfaces and tools for high performance interactive computing that achieve provable performance on distributed parallel architectures, from multi-processor systems-on-chip to lightweight grids and global computing platforms."
[Image: SIGGRAPH'07 demo [MOAIS - PERCEPTION - EVASION]]

Former members
- 13 PhDs defended in 2005-2007. Now:
  - 2 at INRIA: Alcove, Cepage
  - 7 in university: Reims, IKI Iran, Luxembourg, Vannes, Damascus, Warsaw, Colima
  - 1 in postdoc: Iowa SU
  - 1 start-up co-founder: 4DViews
  - 2 in industry: IFP, Amadeus
- 2 postdocs. Now: Univ. Paris 6, Petrobras
- 1 long-term visit: Axel Krings, Idaho State Univ
- 3 engineers. Now: INRIA/PARIS, industry