A Type and Effect System for Deterministic Parallel Java

A Type and Effect System for Deterministic Parallel Java
Robert Bocchino, et al.
Universal Parallel Computing Research Center, University of Illinois

Presented by Thawan Kooburat, Computer Science Department, University of Wisconsin - Madison
*Based on the OOPSLA 2009 conference presentation by Robert Bocchino

Outline
  • Prologue
  • Introduction
  • Deterministic Parallel Java (DPJ)
  • Usage Patterns
  • Implementation
  • Evaluation

Prologue
  • ParallelArray (Java 7, java.util.concurrent)
      - Slices data up into blocks
      - Performs an operation on all of the data concurrently

Prologue
  • ParallelArray of distinct objects
      - The operations can still access global variables
      - The framework cannot prevent programmers from writing code that breaks the intended semantics
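The hazard on this slide can be sketched in plain Java. This is only an illustration: it does not use ParallelArray itself but substitutes a parallel IntStream from java.util.stream as a stand-in, and the names PrologueSketch, squareAll, and racySum are hypothetical.

```java
import java.util.stream.IntStream;

class PrologueSketch {
    // Safe: iteration i writes only out[i], so iterations touch distinct locations.
    static int[] squareAll(int[] in) {
        int[] out = new int[in.length];
        IntStream.range(0, in.length).parallel().forEach(i -> out[i] = in[i] * in[i]);
        return out;
    }

    // Unsafe: every iteration updates the same shared cell without synchronization.
    // The library accepts this code, but the result may differ from run to run.
    static int racySum(int[] in) {
        int[] total = new int[1];   // stands in for a global variable
        IntStream.range(0, in.length).parallel().forEach(i -> total[0] += in[i]);
        return total[0];
    }

    public static void main(String[] args) {
        int[] data = IntStream.rangeClosed(1, 1000).toArray();
        System.out.println(squareAll(data)[9]);  // deterministic
        System.out.println(racySum(data));       // not guaranteed to be the true sum
    }
}
```

Nothing in the Java type system distinguishes the two methods, which is the gap DPJ's effect system is meant to close.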

Introduction
  • Deterministic execution
      - The same input always produces the same output
      - Many computational algorithms are deterministic
  • Many programs use parallel execution to gain performance, but parallelism is not part of their specification

Deterministic-by-default
  • Deterministic execution is guaranteed by default
  • Nondeterministic behavior must be explicitly requested
      - foreach: iterating over independent objects
      - foreach_nd: iterating over overlapping objects
  R. Bocchino, V. Adve, S. Adve, and M. Snir, “Parallel Programming Must Be Deterministic by Default”

Benefits
  • Can reason sequentially
  • No hidden parallel bugs
  • Testing based on input
      - No need to test all interleaving combinations
  • Parallelize incrementally
  • Easier to compose

Deterministic Parallel Java (DPJ)
  • Based on the Java language
  • Fork/join parallelism
      - cobegin
      - foreach
  • Type and effect system
      - Exposes noninterference (soundness)
      - Field-level granularity
      - Differentiates between readers and writers
  • Guarantees deterministic execution at compile time

Regions and Effects
  • Regions
      - Divide memory locations into regions
      - Can be formed into a tree
      - Classes can be parameterized by region

    class TreeNode<region P> {
        region L, R, V;
        int value in V;
        TreeNode<L> left = new TreeNode<L>();
        TreeNode<R> right = new TreeNode<R>();
    }

Regions
  (Diagram: a tree of regions with nodes such as Root:L:R, Root:R:L, and Root:R:R)

Effects
  • Effects
      - Read or write operations on data
      - The programmer specifies an effect summary for each method

    class TreeNode<region P> {
        ...
        TreeNode<L> left = new TreeNode<L>();
        TreeNode<R> right = new TreeNode<R>();
        void updateChildren() writes L, R {   /* method effect summary */
            cobegin {
                left.data = 0;    /* writes L */
                right.data = 1;   /* writes R */
            }
        }
    }

  • The compiler infers the effect of each statement (writes L, writes R) from the types
  • The two statements in the cobegin body are noninterfering
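For comparison, here is a rough plain-Java analogue of updateChildren() written against the fork/join framework. It is a sketch only: the class names CobeginSketch and Node are hypothetical, and plain Java performs none of the region or effect checking that DPJ adds, so the noninterference of the two branches is merely asserted in comments.

```java
import java.util.concurrent.ForkJoinPool;
import java.util.concurrent.RecursiveAction;

class CobeginSketch {
    static class Node {
        int data;
        Node left, right;
    }

    // Runs the two writes as parallel tasks, like the two statements of a cobegin.
    static void updateChildren(Node n) {
        RecursiveAction writeLeft = new RecursiveAction() {
            @Override protected void compute() { n.left.data = 0; }   // "writes L"
        };
        RecursiveAction writeRight = new RecursiveAction() {
            @Override protected void compute() { n.right.data = 1; }  // "writes R"
        };
        // invokeAll forks both tasks and returns once both have completed.
        ForkJoinPool.commonPool().invoke(new RecursiveAction() {
            @Override protected void compute() { invokeAll(writeLeft, writeRight); }
        });
    }

    public static void main(String[] args) {
        Node n = new Node();
        n.left = new Node();
        n.right = new Node();
        updateChildren(n);
        System.out.println(n.left.data + " " + n.right.data);
    }
}
```

If left and right referred to the same object, this Java code would still compile and would race; in DPJ the distinct region parameters L and R are what rule that out.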

Usage Patterns
  • Region path lists (RPLs)
      - Updating a nested data structure at field granularity
  • Index-parameterized arrays
      - Updating an array of objects
  • Subarrays
      - Partitioning an array for the divide-and-conquer pattern
  • Commutativity
      - Declaring an effect summary based on an operation's semantics

Region Path Lists (RPLs)

    class Tree<region P> {
        region L, R;
        int value in P;
        Tree<P:L> left = new Tree<P:L>();
        Tree<P:R> right = new Tree<P:R>();
    }

  Region of each field for a given binding of P:

    P         value     left        right
    Root      Root      Root:L      Root:R
    Root:L    Root:L    Root:L:L    Root:L:R
    Root:R    Root:R    Root:R:L    Root:R:R

Region Path Lists (RPLs)

    class Tree<region P> {
        ...
        int increment() writes P:* {    /* method effect summary */
            value++;                    /* writes P */
            cobegin {
                /* writes P:L:* (effect inferred from type and summary) */
                if (left != null) left.increment();
                /* writes P:R:* */
                if (right != null) right.increment();
            }
        }
    }

  • Inclusion (method effect): P:L:* ⊆ P:* and P:R:* ⊆ P:*
  • Disjointness (cobegin body): P:L:* ∩ P:R:* = ∅
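The recursive pattern on this slide can be approximated in plain Java with a RecursiveAction per subtree. This is a sketch under the assumption that the structure really is a tree (no shared subtrees); DPJ verifies that through the RPLs, while Java does not. The names TreeIncrementSketch, Increment, and sum are hypothetical.

```java
import java.util.concurrent.ForkJoinPool;
import java.util.concurrent.RecursiveAction;

class TreeIncrementSketch {
    static class Tree {
        int value;
        Tree left, right;
        Tree(int value, Tree left, Tree right) {
            this.value = value;
            this.left = left;
            this.right = right;
        }
    }

    static class Increment extends RecursiveAction {
        private final Tree node;
        Increment(Tree node) { this.node = node; }

        @Override protected void compute() {
            node.value++;                                   // "writes P"
            Increment l = node.left == null ? null : new Increment(node.left);    // "writes P:L:*"
            Increment r = node.right == null ? null : new Increment(node.right);  // "writes P:R:*"
            if (l != null && r != null) invokeAll(l, r);    // both subtrees in parallel
            else if (l != null) l.invoke();
            else if (r != null) r.invoke();
        }
    }

    static void increment(Tree root) {
        ForkJoinPool.commonPool().invoke(new Increment(root));
    }

    // Sequential helper used to inspect the result.
    static int sum(Tree t) {
        return t == null ? 0 : t.value + sum(t.left) + sum(t.right);
    }

    public static void main(String[] args) {
        Tree root = new Tree(1, new Tree(2, null, null), new Tree(3, null, null));
        increment(root);
        System.out.println(sum(root));
    }
}
```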

Index-parameterized Array
  • Enforces disjointness of an array's elements (references)
  • Syntax: C<[i]>[]#i
  (Diagram: arrays a[i], b[i], and c[i] with indices 1 to 4; the elements of c have types C<[1]>, C<[2]>, C<[3]>, C<[4]>)

Index-parameterized Array

    class Body<region P> {
        double mass in P:M;
        double force in P:F;
        Body<*> link in Link;
        void computeForce() reads Link, *:M writes P:F { ... }
    }

    final Body<[_]>[]<[_]> bodies = new Body<[_]>[N]<[_]>;
    foreach (int i in 0, N) {
        /* writes [i] */
        bodies[i] = new Body<[i]>();
    }
    foreach (int i in 0, N) {
        /* reads [i], Link, *:M writes [i]:F */
        bodies[i].computeForce();
    }

  • Objects are parameterized by index region
  • The write to each element is distinct
  • The operations in the foreach block are noninterfering
  • The reads (Link, *:M) do not interfere with the writes ([i]:F)
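The access pattern of the second foreach loop can be sketched in plain Java as follows. The force formula is a toy placeholder, not the Barnes-Hut computation, and the names BodiesSketch and computeForces are hypothetical; a parallel IntStream stands in for DPJ's foreach.

```java
import java.util.stream.IntStream;

class BodiesSketch {
    static class Body {
        final double mass;   // read by every iteration ("reads *:M")
        double force;        // written only by the owning iteration ("writes [i]:F")
        Body(double mass) { this.mass = mass; }
    }

    static void computeForces(Body[] bodies) {
        IntStream.range(0, bodies.length).parallel().forEach(i -> {
            double f = 0.0;
            for (int j = 0; j < bodies.length; j++) {
                if (j != i) f += bodies[i].mass * bodies[j].mass;  // toy formula
            }
            bodies[i].force = f;   // the only write, and only to element i
        });
    }

    public static void main(String[] args) {
        Body[] bodies = { new Body(1.0), new Body(2.0), new Body(3.0) };
        computeForces(bodies);
        for (Body b : bodies) System.out.println(b.force);
    }
}
```

Each iteration reads the mass fields of all bodies but writes only its own force field, so the result does not depend on how the iterations are scheduled. In Java this property holds by convention; in DPJ the index-parameterized types let the compiler check it.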

Subarray
  • Mechanisms (DPJ libraries)
      - DPJArray: wrapper class for a Java array
      - DPJPartition: collection of disjoint DPJArrays
  • Divide-and-conquer usage pattern
      - Initialize an array using DPJArray
      - Recursively partition the original array using DPJPartition
          - Each partition is a disjoint subset of the original array
      - Creates a tree of partitions based on a flat array

Subarray

    static <region R> void quicksort(DPJArray<R> A) writes R:* {
        int p = quicksortPartition(A);
        /* Chop array into two disjoint pieces */
        final DPJPartition<R> segs = new DPJPartition<R>(A, p, OPEN);
        cobegin {
            /* writes segs:[0]:* */
            quicksort(segs.get(0));
            /* writes segs:[1]:* */
            quicksort(segs.get(1));
        }
    }

  • The local variable segs is used to represent regions
  (Diagram: a DPJArray in region R is split at index p by a DPJPartition into segs[0] and segs[1])
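A plain-Java counterpart of this quicksort can be sketched with index ranges standing in for DPJArray and DPJPartition. The disjointness of the two halves is then a property of the index arithmetic rather than something the compiler checks. The class QuicksortSketch and its members are hypothetical names, and the Lomuto partition scheme is an arbitrary choice for the sketch.

```java
import java.util.Arrays;
import java.util.concurrent.ForkJoinPool;
import java.util.concurrent.RecursiveAction;

class QuicksortSketch {
    static class Sort extends RecursiveAction {
        private final int[] a;
        private final int lo, hi;   // sorts the half-open range a[lo..hi)
        Sort(int[] a, int lo, int hi) { this.a = a; this.lo = lo; this.hi = hi; }

        @Override protected void compute() {
            if (hi - lo < 2) return;
            int p = partition(a, lo, hi);
            // The two ranges exclude index p and do not overlap.
            invokeAll(new Sort(a, lo, p), new Sort(a, p + 1, hi));
        }
    }

    // Lomuto partition: moves the pivot to its final index and returns that index.
    static int partition(int[] a, int lo, int hi) {
        int pivot = a[hi - 1];
        int store = lo;
        for (int i = lo; i < hi - 1; i++) {
            if (a[i] < pivot) { swap(a, i, store); store++; }
        }
        swap(a, store, hi - 1);
        return store;
    }

    static void swap(int[] a, int i, int j) {
        int t = a[i]; a[i] = a[j]; a[j] = t;
    }

    static void quicksort(int[] a) {
        ForkJoinPool.commonPool().invoke(new Sort(a, 0, a.length));
    }

    public static void main(String[] args) {
        int[] data = {5, 3, 9, 1, 5, 0, -2};
        quicksort(data);
        System.out.println(Arrays.toString(data));
    }
}
```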

Commutativity
  • Method annotation

    interface Set<type T, region R> {
        commutative void add(T e) writes R;
    }

  • Allows programmers to override the effect system
      - The compiler will not check inside the method
  • This allows

    cobegin { add(e1); add(e2); }

      - Any order of the operations is equivalent
      - The operation is atomic

Commutativity
  • Method invocation

    foreach (int i in 0, n) {
        /* invokes Set.add with writes R */
        set.add(A[i]);
    }

    cobegin {
        /* invokes Set.add with writes R */
        set.add(A);
        /* invokes Set.size with reads R */
        set.size();
    }
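The first loop above relies on add being both atomic and order-insensitive. A plain-Java sketch of the same idea uses a concurrent set, where those two properties are supplied by the library rather than declared with a commutative annotation. The class CommutativeAddSketch and the method addAll are hypothetical names.

```java
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;
import java.util.stream.IntStream;

class CommutativeAddSketch {
    // Inserts every element of values into a shared set from parallel iterations.
    static Set<Integer> addAll(int[] values) {
        Set<Integer> set = ConcurrentHashMap.newKeySet();   // thread-safe, atomic add
        IntStream.range(0, values.length).parallel().forEach(i -> set.add(values[i]));
        return set;
    }

    public static void main(String[] args) {
        Set<Integer> s = addAll(new int[]{3, 1, 3, 2, 1});
        System.out.println(s.size());
    }
}
```

The insertions may happen in any order, but the final contents of the set are the same for every schedule, which is the sense in which the adds commute.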

Implementation
  • Extends Sun's javac compiler
      - Converts DPJ into normal Java source
  • Compiles to the ForkJoinTask framework (Java 7)
      - Similar to Cilk
      - DPJ translates foreach and cobegin to tasks
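To give a feel for what "translating foreach to tasks" could look like, here is a hand-written sketch that splits an index range recursively into fork/join tasks. It is not the DPJ compiler's actual output; the class ForeachLoweringSketch, the method parallelFor, and the CUTOFF value are all assumptions made for illustration.

```java
import java.util.concurrent.ForkJoinPool;
import java.util.concurrent.RecursiveAction;
import java.util.function.IntConsumer;

class ForeachLoweringSketch {
    static final int CUTOFF = 4;   // hypothetical sequential threshold

    static class RangeTask extends RecursiveAction {
        private final int lo, hi;  // half-open range [lo, hi)
        private final IntConsumer body;
        RangeTask(int lo, int hi, IntConsumer body) {
            this.lo = lo; this.hi = hi; this.body = body;
        }

        @Override protected void compute() {
            if (hi - lo <= CUTOFF) {
                for (int i = lo; i < hi; i++) body.accept(i);   // small range: run in place
                return;
            }
            int mid = lo + (hi - lo) / 2;
            invokeAll(new RangeTask(lo, mid, body), new RangeTask(mid, hi, body));
        }
    }

    // Plays the role of: foreach (int i in lo, hi) { body }
    static void parallelFor(int lo, int hi, IntConsumer body) {
        ForkJoinPool.commonPool().invoke(new RangeTask(lo, hi, body));
    }

    public static void main(String[] args) {
        int[] out = new int[10];
        parallelFor(0, out.length, i -> out[i] = 2 * i);
        System.out.println(out[9]);
    }
}
```

The sketch is only safe when the loop body's iterations are noninterfering, which is exactly what DPJ's type and effect system establishes before any such translation happens.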

Evaluation
  • Benchmarks

Performance
  (Figure: speedup versus number of cores, both axes running from 4 to 24, for Barnes-Hut, Mergesort, IDEA, K-Means, Collision Tree, and Monte Carlo)

Q/A
