IBMs X 10 Presentation by Isaac Dooley CS
IBM’s X 10 Presentation by Isaac Dooley CS 498 LVK Spring 2006
Sources: • Report on the Experimental Language X 10, Vijay Saraswat • http: //domino. research. ibm. com/comm/research_projects. nsf/pages/x 10. index. html • X 10 Tutorial v 10. pdf
Goal of X 10 • “The fundamental goal of X 10 is to enable scalable, high-performance, high -productivity transformational programming for high-end computers -for traditional numerical computation workloads… as well as commercial server workloads. ”
Description • “X 10 is a type-safe, modern, parallel, distributed object-oriented language intended to be very easily accessible to Java(TM) programmers. It is targeted to future low-end and high-end systems with nodes that are built out of multi-core SMP chips with nonuniform memory hierarchies, and interconnected in scalable cluster configurations. …
Description • … A member of the Partitioned Global Address Space (PGAS) family of languages, X 10 highlights the explicit reification of locality in the form of places; lightweight activities embodied in async, future, foreach, and ateach constructs; constructs for termination detection (finish) and phased computation (clocks); the use of lock-free synchronization (atomic blocks); and the manipulation of global arrays and data structures. ”
Memory Model • Not Distributed Memory, Not Shared Memory • “Fragmented Memory Model” • Partitioned Global Address Space(GAS) – Like UPC, Co-Array Fortran, Titanium • Globally Asynchronous, Locally Synchronous (GALS)
Language • Extended Subset of Java – Without Java Concurrency(threads, monitors) – With Distributed Arrays – With new built-in types – With Places, Activites, Clocks
Language • Places – Host some data – Runs some activities(lightweight threads) – Must spawn activities to access data at other places – Correspond to an Address Space (SMP node) • Immutable data • Freely copied between places on access
Language • Clocks – Used to order activities • Distributed Arrays – Support Collectives
Language • Atomic Sections – atomic S – Excecute locally, only accessing local data • Asynchronous Activities – async (P) S – future (P) E
Regions • Distributions are maps from a region to a subset of places • A region [0: 200, 1: 100] specifies a collection of 2 -D points • Points are used as array indices • Operations on regions are provided – Union, disjunction, set difference
Deadlock, Data Races • X 10 guarantee −Any program written with async, finish, atomic, foreach, ateach, and clock parallel constructs will never deadlock • Unrestricted use of future and force may lead to deadlock, but Restricted use of future and force in X 10 can preserve guaranteed freedom from deadlocks • To eliminate Data Races, atomic methods and blocks should be used
Example of “future” public class Tut. Future 1 { static int fib(final int n) { if ( n <= 0 ) return 0; else if ( n == 1 ) return 1; else { future<int> fn_1 = future { fib(n-1) }; fn_2 = future { fib(n-2) }; return fn_1. force() + fn_2. force(); } } // fib() public static void main(String[] args) { System. out. println("fib(10) = " + fib(10)); } } future<int> Example of recursive divide-and- conquer parallelism --- calls to fib(n-1) and fib(n-2) execute in parallel
Example of “async” final int n=100; … finish { async for (int i=1 ; i<=n ; i+=2 ) odd. Sum. val += i; for (int j=2 ; j<=n ; j+=2 ) even. Sum. val += j; } // Both odd. Sum and even. Sum have been computed now System. out. println("odd. Sum = " + odd. Sum. val + " ; even. Sum = " + even. Sum. val); • Parent activity creates new child to execute “for (int i=1 ; i<=n ; i+=2 ) odd. Sum. val += i” • An async statement returns immediately • Parent execution proceeds immediately to next statement • Any access to parent’s local data must be through final variables
Example of “atomic” finish { async for (int i=1 ; i<=n ; i+=2 ) { double r = 1. 0 d / i ; atomic r. Sum += r; } for (int j=2 ; j<=n ; j+=2 ) { double r = 1. 0 d / j ; atomic r. Sum += r; } } An atomic statement/method is conceptually executed in a single step, while other activities are suspended − Note: programmer does not manage any locks explicitly An atomic section may not include − Blocking operations − Creation of activities
Example of “atomic” public class Tut. Atomic 2 { const int a = new boxed. Int(100); const int b = new boxed. Int(100); public static atomic void incr_a() { a. val++ ; b. val-- ; } a. val-- ; b. val++ ; } public static atomic void decr_a() { public static void main(String args[]) { int sum; finish { async for (int i=1 ; i<=10 ; i++ ) incr_a(); for (int i=1 ; i<=10 ; i++ ) decr_a(); } atomic sum = a. val + b. val; System. out. println("a+b = " + sum); } } Console output: a+b = 200
Code for Jacobi 2 -D public class Jacobi { const region R = [0: N+1, 0: N+1]; const region RInner= [1: N, 1: N]; const distribution D = distribution. factory. block(R); const distribution DInner = D | RInner; const distribution DBoundary = D - RInner; double[D] B = new double[D] (point p[i, j]) {return DBoundary. contains(p) ? (N-1)/2 : N*(i-1)+(j-1); }; // Exploded Variable Declaration, implicitly defined for i, j public boolean run() { int iters = 0; double err; while(true) { double[. ] Temp = new double[DInner] (point [i, j]) {return (read(i+1, j)+read(i-1, j)+read(i, j+1)+read(i, j-1))/4. 0; }; if((err=((B | DInner) - Temp). abs(). sum()) < epsilon) break; B. update(Temp); iters++; }} public double read(final int i, final int j) { return future(D[i, j]) B[i, j]. force(); } public static void main(String args[]) { boolean b= (new Jacobi()). run(); System. out. println("++++++ " + (b? "Test succeeded. " : "Test failed. ")); System. exit(b? 0: 1); }
- Slides: 17