Fence Scoping Changhui Lin Vijay Nagarajan Rajiv Gupta
- Slides: 30
Fence Scoping Changhui Lin†, Vijay Nagarajan*, Rajiv Gupta† † University of California, Riverside * University of Edinburgh
Reordering in Uniprocessors
- Memory operations are reordered to improve performance
  - Hardware (e.g., store buffer, reorder buffer)
  - Compiler (e.g., code motion, caching a value in a register)
- Example: a1: St x; a2: Ld y may be performed as a2, then a1
- No harm as long as dependences are respected
Reordering in Multiprocessors: counter-intuitive program behavior
- Initially x = y = 0
    P1                P2
    a1: x = 1;        b1: Ry = y;
    a2: y = 1;        b2: Rx = x;
- Intuitively, y = 1 implies x = 1, so Ry = 1 implies Rx = 1
- Outcomes shown: (Rx=0, Ry=0), (Rx=1, Ry=1), and (Rx=0, Ry=1); the last one contradicts the intuition and can only arise when operations are reordered
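The outcome argument on this slide can be checked by brute force. The sketch below is not from the slides: it enumerates every interleaving of P1 and P2 that keeps each thread's own order, then repeats the enumeration with P1's two stores swapped to mimic a reordering. The names `run`, `outcomes`, `in_order_outcomes`, and `reordered_outcomes` are made up for this illustration.

```cpp
#include <array>
#include <cassert>
#include <set>
#include <utility>

// a1: x = 1, a2: y = 1, b1: Ry = y, b2: Rx = x (names follow the slide).
enum Op { A1, A2, B1, B2 };

// Execute one global order of the four operations and return (Rx, Ry).
std::pair<int, int> run(const std::array<Op, 4>& order) {
    int x = 0, y = 0, rx = 0, ry = 0;
    for (Op op : order) {
        switch (op) {
            case A1: x = 1; break;
            case A2: y = 1; break;
            case B1: ry = y; break;
            case B2: rx = x; break;
        }
    }
    return {rx, ry};
}

// Collect the outcomes of every merge of P1's sequence with P2's sequence.
std::set<std::pair<int, int>> outcomes(std::array<Op, 2> p1,
                                       std::array<Op, 2> p2) {
    std::set<std::pair<int, int>> result;
    // mask picks which two of the four slots are taken by P1.
    for (int mask = 0; mask < 16; ++mask) {
        int bits = 0;
        for (int i = 0; i < 4; ++i) bits += (mask >> i) & 1;
        if (bits != 2) continue;
        std::array<Op, 4> order{};
        int i1 = 0, i2 = 0;
        for (int slot = 0; slot < 4; ++slot)
            order[slot] = ((mask >> slot) & 1) ? p1[i1++] : p2[i2++];
        result.insert(run(order));
    }
    return result;
}

// Program order respected in both threads.
std::set<std::pair<int, int>> in_order_outcomes() {
    return outcomes({A1, A2}, {B1, B2});
}

// P1's stores performed in the opposite order (a2 before a1).
std::set<std::pair<int, int>> reordered_outcomes() {
    return outcomes({A2, A1}, {B1, B2});
}
```

Calling `in_order_outcomes()` never yields the pair (Rx=0, Ry=1), while `reordered_outcomes()` does, which is the counter-intuitive case on the slide.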
Reordering in Multiprocessors: counter-intuitive program behavior
- Initially p = NULL, flag = false
    P1                    P2
    p = new A(…)          if (flag)
    flag = true;            a = p->var;
- flag is supposed to be set after p is allocated
Fence Instructions
- Memory consistency models
  - Specify what reordering is allowed
  - e.g., SC, TSO (x86, SPARC), RMO (ARM, PowerPC)
- Fence instructions (fences / memory barriers)
  - Selectively override the default relaxed memory order
  - Order memory operations before and after the fence
    P1
    p = new A(…)
    FENCE
    flag = true;
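For readers who want to try the publish-via-flag pattern, here is a minimal C++ sketch. It is an analogue, not the slide's code: `std::atomic_thread_fence` stands in for the slide's generic FENCE, the struct `A` and the function `publish_and_read` are hypothetical, and the acquire fence on the reader side is something the C++ memory model requires but the slide does not show.

```cpp
#include <atomic>
#include <cassert>
#include <thread>

// Hypothetical payload type standing in for the slide's class A.
struct A { int var; };

// Runs P1 (producer) and P2 (consumer) once and returns what P2 read.
int publish_and_read() {
    A* p = nullptr;                    // plain pointer, published via flag
    std::atomic<bool> flag{false};
    int seen = -1;

    std::thread producer([&] {
        p = new A{42};
        // Plays the role of FENCE: keeps the write to p before the flag store.
        std::atomic_thread_fence(std::memory_order_release);
        flag.store(true, std::memory_order_relaxed);
    });

    std::thread consumer([&] {
        while (!flag.load(std::memory_order_relaxed)) {
            std::this_thread::yield();
        }
        // Reader-side counterpart required by the C++ memory model.
        std::atomic_thread_fence(std::memory_order_acquire);
        seen = p->var;
    });

    producer.join();
    consumer.join();
    delete p;
    return seen;
}
```

With both fences in place the consumer can only observe the fully constructed object, so `publish_and_read()` returns 42. Without them, the C++ model would treat the access to `p` as a data race.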
Fence Instructions (continued)
- Inevitable: needed to build concurrent implementations (e.g., mutual exclusion, queues) [Attiya et al., POPL'11]
- Expensive: Cilk-5's THE protocol spends 50% of its time executing a memory fence [Frigo et al., PLDI'98]
Motivation
- [Figure labels: Control Data Access; Process Data; Concurrent algorithm]
- Not all memory orderings enforced by fences are necessary
- Fences are usually used to order some specific memory operations
- Programmers know best how a fence is used, and that knowledge can be conveyed to the hardware
Scoped Fence (S-Fence)
- An S-Fence only orders memory operations in its scope
  - Scope definition (class scope, set scope)
- Bridges the gap between programmers' intention and hardware execution
  - Programmers specify the scope
  - Scope information is conveyed to hardware, imposing fewer ordering constraints
- Lightweight hardware and compiler support
Scoped Fence (S-Fence): programming support
    S-FENCE                            global scope
    S-FENCE[class]                     class scope
    S-FENCE[set, {var1, var2, …}]      set scope
Work-Stealing Queue
- Chase-Lev lock-free concurrent work-stealing queue
- Algorithm:
     1  void put (TASK task){
     2    tail = TAIL;
     3    wsq[tail] = task;
     4    FENCE  // store-store
     5    TAIL = tail+1;
     6  }
     7  TASK take ( ){
     8    tail = TAIL - 1;
     9    TAIL = tail;
    10    FENCE  // store-load
    11    head = HEAD;
    12    if (tail<head){
    13      TAIL = head;
    14      return EMPTY;
    15    }
        ……
    24    return task;
    25  }
    26  TASK steal ( ){
    27    head = HEAD;
    28    tail = TAIL;
        ……
    35    return task;
    36  }
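The fence placement in put and take can be tried out with standard C++ fences. The sketch below makes several assumptions and is not the full Chase-Lev algorithm: the lines the slide elides are not reconstructed, steal is omitted, and take simply returns the last task, which is only safe while no thief runs concurrently. `Task`, `EMPTY`, `kCap`, and the choice of a release fence for store-store and a seq_cst fence for store-load are choices made for this illustration.

```cpp
#include <atomic>
#include <cassert>
#include <cstddef>

using Task = int;              // hypothetical task type
constexpr Task EMPTY = -1;     // hypothetical "queue is empty" marker

class WorkStealingQueue {
    static constexpr std::size_t kCap = 64;   // fixed capacity, no resizing
    std::atomic<Task> wsq[kCap];
    std::atomic<long> HEAD{0};
    std::atomic<long> TAIL{0};

public:
    void put(Task task) {
        long tail = TAIL.load(std::memory_order_relaxed);
        wsq[tail % kCap].store(task, std::memory_order_relaxed);
        // store-store: the task must be visible before the new TAIL.
        std::atomic_thread_fence(std::memory_order_release);
        TAIL.store(tail + 1, std::memory_order_relaxed);
    }

    Task take() {
        long tail = TAIL.load(std::memory_order_relaxed) - 1;
        TAIL.store(tail, std::memory_order_relaxed);
        // store-load: the TAIL update must be ordered before reading HEAD.
        std::atomic_thread_fence(std::memory_order_seq_cst);
        long head = HEAD.load(std::memory_order_relaxed);
        if (tail < head) {
            TAIL.store(head, std::memory_order_relaxed);
            return EMPTY;
        }
        // The slide elides the rest of take(); this owner-only shortcut
        // skips the handling of a concurrent thief on the last element.
        return wsq[tail % kCap].load(std::memory_order_relaxed);
    }
};
```

Used from a single owner thread, `put(7); put(9);` followed by three calls to `take()` yields 9, 7, and then `EMPTY`.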
Parallel Spanning Tree
(a) Worker loop
    1  task = wsq.take();                     ①
    2  for (each neighbor task' of task)
    3    if (task' is not processed){
    4      process(task');                    ②
    5      wsq.put(task');                    ③
    6    }
(b) Memory operations behind each call
    ① take (lines 8-11 of the queue):
         8  tail = TAIL - 1;
         9  TAIL = tail;
        10  FENCE
        11  head = HEAD;
        ……
    ② process:
        color[task'] = label;
        parent[task'] = task;
    ③ put (lines 2-5 of the queue):
         2  tail = TAIL;
         3  wsq[tail] = task';
         4  FENCE
         5  TAIL = tail + 1;
Class Scope
- S-FENCE[class]: class scope
- Uses classes in OO languages to illustrate the concept
- Constrains a fence to the class of the object where it is used (encapsulation)
- Intuition: member functions operate on the data members of the class
Class Scope: S-FENCE[class]
    class A {
      B b;
      int m1, m2;
      void funcA() {
        m1 = val1;
        b.funcB();
        S-FENCE1[class]
        m2 = val2;
      }
    }

    class B {
      int n1, n2;
      void funcB() {
        n1 = val3;
        S-FENCE2[class]
        n2 = val4;
      }
    }
- S-FENCE1 orders accesses to: m1, m2, n1, n2
- S-FENCE2 orders accesses to: n1, n2
Class Scope Semantics
- More details in the paper
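Parallel Spanning Tree (with class-scoped fences)
(a) Worker loop
    1  task = wsq.take();                     ①
    2  for (each neighbor task' of task)
    3    if (task' is not processed){
    4      process(task');                    ②
    5      wsq.put(task');                    ③
    6    }
(b) Memory operations behind each call; each FENCE is replaced by S-FENCE[class]
    ① take (lines 8-11 of the queue):
         8  tail = TAIL - 1;
         9  TAIL = tail;
        10  S-FENCE[class]
        11  head = HEAD;
        ……
    ② process:
        color[task'] = label;
        parent[task'] = task;
    ③ put (lines 2-5 of the queue):
         2  tail = TAIL;
         3  wsq[tail] = task';
         4  S-FENCE[class]
         5  TAIL = tail + 1;
- color[] and parent[] are not members of the queue class, so the class-scoped fences do not have to order accesses to them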
Compiler Support
- ISA extension
  - class-fence
  - fs_start: start of a fence scope
  - fs_end: end of a fence scope
- Use fs_start and fs_end to bracket functions that contain fences
  - This informs the hardware so that it can mark memory operations properly
Hardware Support
- Fence Scope Bits (FSB)
  - Each entry of the ROB and the store buffer is associated with FSB
  - Flag whether a memory operation is in the scope of some fence
- Decoding: memory operations in the scope are marked via FSB
- Fence issue: check the FSB entry for the current scope
- [Figure: Reorder Buffer and Store Buffer entries, each extended with Fence Scope Bits]
Hardware Support: Setting Fence Bits
- FSS: stack to record scope
- [Figure: instruction stream with an outer scope a and an inner scope b, next to an FSB table with columns 0 1 2 3]
    fs_start a
      I0
      I1
      fs_start b
        I2
        I3
        I4
      fs_end b
      I5
      I6
    fs_end a
    I7
Hardware Support: Setting Fence Bits (continued)
- FSS: stack to record scope
- Issue a fence by checking the FSB on the current scope
- [Figure: same instruction stream (fs_start a, I0, I1, fs_start b, I2, I3, I4, fs_end b, I5, I6, fs_end a, I7) with outer and inner scopes and FSB columns 0 1 2 3]
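One way to picture the FSS/FSB bookkeeping is a small software model. This is a sketch under stated assumptions, not the hardware design from the paper: every scope gets a fresh bit that is never recycled, a memory operation sets the bits of all scopes currently on the stack, a fence checks only the bit of the innermost open scope, and I0 to I7 are all treated as memory operations. `ScopeTracker`, `decode_mem_op`, `fence_waits_for`, and `slide_example` are hypothetical names.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

struct ScopeTracker {
    std::vector<int> fss;   // fence scope stack: FSB bit index per open scope
    int next_bit = 0;       // simplification: bits are never recycled

    void fs_start() { fss.push_back(next_bit++); }
    void fs_end()   { fss.pop_back(); }

    // FSB value attached to a memory operation decoded right now:
    // one bit for every enclosing scope.
    std::uint32_t decode_mem_op() const {
        std::uint32_t fsb = 0;
        for (int bit : fss) fsb |= (1u << bit);
        return fsb;
    }

    // Would a fence issued right now have to wait for an earlier
    // operation carrying op_fsb? With no open scope the fence is global.
    bool fence_waits_for(std::uint32_t op_fsb) const {
        if (fss.empty()) return true;
        return (op_fsb & (1u << fss.back())) != 0;
    }
};

// Replays the instruction stream from the slide and returns the FSB
// values of I0..I7 (scope a gets bit 0, scope b gets bit 1).
std::vector<std::uint32_t> slide_example() {
    ScopeTracker t;
    std::vector<std::uint32_t> fsb;
    t.fs_start();                                                  // fs_start a
    for (int i = 0; i < 2; ++i) fsb.push_back(t.decode_mem_op());  // I0, I1
    t.fs_start();                                                  // fs_start b
    for (int i = 0; i < 3; ++i) fsb.push_back(t.decode_mem_op());  // I2..I4
    t.fs_end();                                                    // fs_end b
    for (int i = 0; i < 2; ++i) fsb.push_back(t.decode_mem_op());  // I5, I6
    t.fs_end();                                                    // fs_end a
    fsb.push_back(t.decode_mem_op());                              // I7
    return fsb;
}
```

In this model a fence issued inside scope b skips operations that belong only to scope a, while a fence issued in scope a after b has closed still waits for b's operations. That matches the class-scope example where the outer fence covers m1, m2, n1, n2 and the inner fence covers only n1, n2.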
Why Does S-Fence Perform Better?
- Example instruction sequence (St A is a cache miss):
    St A
    St X
    FENCE
    Ld Y
    St B
- [Figure: timelines of ROB and store buffer (SB) contents for a traditional fence and a scoped fence. With the traditional fence, the fence stalls until the store buffer is drained and is only then issued; the scoped-fence timeline is shown for comparison.]
Set Scope: Dekker algorithm
- Initially flag1 = flag2 = 0
    P1                        P2
    m1 = …                    m2 = …
    flag1 = 1;                flag2 = 1;
    FENCE                     FENCE
    if (flag2 == 0)           if (flag1 == 0)
      critical section          critical section
Set Scope: Dekker algorithm with set-scoped fences
- Initially flag1 = flag2 = 0
    P1                                 P2
    m1 = …                             m2 = …
    flag1 = 1;                         flag2 = 1;
    S-FENCE[set, {flag1, flag2}]       S-FENCE …
    if (flag2 == 0)                    if (flag1 == 0)
      critical section                   critical section
Set Scope
- S-FENCE[set, {var1, var2, …}]: set scope
  - Only orders memory accesses to {var1, var2, …}
- Compiler and hardware support
  - Flag memory accesses to the specified variables
  - Set fence scope bits in hardware for flagged memory accesses
- For simplicity, memory accesses to different sets are not differentiated
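The effect of flagging only the accesses in the set can be illustrated with a toy stall model. Everything numeric here is invented for illustration: the latencies (200 cycles for a missing store to m1, 10 cycles for flag1) are not measurements from the evaluation, and the model assumes pending stores complete independently, as under a relaxed model such as RMO. `PendingStore`, `fence_stall`, and `dekker_p1_buffer` are hypothetical names.

```cpp
#include <algorithm>
#include <cassert>
#include <set>
#include <string>
#include <vector>

// One store still waiting in the store buffer (latencies are made up).
struct PendingStore {
    std::string var;
    int cycles_left;
};

// Cycles a fence must wait before it can issue.
// scope == nullptr models a traditional fence that waits for everything;
// otherwise only stores to variables in *scope are flagged and waited for.
int fence_stall(const std::vector<PendingStore>& store_buffer,
                const std::set<std::string>* scope) {
    int stall = 0;
    for (const PendingStore& s : store_buffer) {
        bool flagged = (scope == nullptr) || scope->count(s.var) > 0;
        if (flagged) stall = std::max(stall, s.cycles_left);
    }
    return stall;
}

// P1 of the Dekker example just before its fence: the store to m1
// is assumed to miss in the cache, the store to flag1 is assumed to hit.
std::vector<PendingStore> dekker_p1_buffer() {
    return {{"m1", 200}, {"flag1", 10}};
}
```

With the scope {flag1, flag2}, the fence in this toy model waits 10 cycles instead of 200, because the slow store to m1 is not flagged.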
Experimental Evaluation
- Cycle-accurate simulation (SESC)
  - Integrates scoped fence logic
  - RMO memory model
- Benchmarks
  - pst: parallel spanning tree (work-stealing queue, class scope)
  - ptc: parallel transitive closure (work-stealing queue, class scope)
  - barnes: from SPLASH-2 (fences inserted for SC, set scope)
  - radiosity: from SPLASH-2 (fences inserted for SC, set scope)
Experimental Evaluation: traditional fence (T) vs. scoped fence (S)
- [Chart annotations] Fence stall reduced:
  - class scope: ~13%, ~50%
  - set scope: ~40-50%
Conclusion
- Introduce the concept of fence scope
- Propose class scope and set scope
- OpenCL 2.0 (sub-group, work-group, device, system)
- Lightweight compiler and hardware support
- No change in inter-processor communication
- Fence scope should be implemented in some form!