Parallel Architectures: Memory Consistency + Synchronization (cslab@ntua, 2018-2019)

Sources/Bibliography
• "Parallel Computer Architecture: A Hardware/Software Approach", D. E. Culler, J. P. Singh, Morgan Kaufmann Publishers, Inc., 1999
• "Transactional Memory", D. Wood, Lecture Notes in ACACES 2009
• Onur Mutlu, "Cache Coherence", Computer Architecture, Lecture 28, Carnegie Mellon University, 2015 (slides & video)
  – http://www.ece.cmu.edu/~ece447/s15/lib/exe/fetch.php?media=onur-447-spring15-lecture28-memory-consistency-and-cache-coherence-afterlecture.pdf
  – https://www.youtube.com/watch?v=JfjT1a0vi4E&t=4106s

Example 2

Producer posting Item x:
          Load Rtail, (tail)
          Store (Rtail), x
          Rtail=Rtail+1
          Store (tail), Rtail

Consumer:
          Load Rhead, (head)
    spin: Load Rtail, (tail)
          if Rhead==Rtail goto spin
          Load R, (Rhead)
          Rhead=Rhead+1
          Store (head), Rhead
          consume(R)

The program is written under the assumption that instructions execute in order.

Example 2 (2)

Producer posting Item x:
          Load Rtail, (tail)
          Store (Rtail), x              (1)
          Rtail=Rtail+1
          Store (tail), Rtail           (2)

The tail pointer can be updated before x is written!

Consumer:
          Load Rhead, (head)
    spin: Load Rtail, (tail)            (3)
          if Rhead==Rtail goto spin
          Load R, (Rhead)               (4)
          Rhead=Rhead+1
          Store (head), Rhead
          consume(R)

§ The programmer assumes that if 3 happens after 2, then 4 happens after 1.
§ Problematic sequences:
  o 2, 3, 4, 1
  o 4, 1, 2, 3
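The two problematic sequences can be checked mechanically. The sketch below is illustrative and not part of the original slides. It takes 1 and 2 to be the producer's two stores (item, then tail pointer) and 3 and 4 to be the consumer's two loads (tail pointer, then item), and the helper names are made up for this example. It enumerates every global order of the four operations and keeps those that break the programmer's assumption.

```python
from itertools import permutations

def respects_program_order(seq):
    # Producer issues 1 before 2; consumer issues 3 before 4.
    return seq.index(1) < seq.index(2) and seq.index(3) < seq.index(4)

def breaks_assumption(seq):
    # Assumption: if 3 happens after 2, then 4 happens after 1.
    # It is broken when 3 follows 2 but 4 still precedes 1.
    return seq.index(3) > seq.index(2) and seq.index(4) < seq.index(1)

# Every global order of the four operations that breaks the assumption.
bad = [s for s in permutations((1, 2, 3, 4)) if breaks_assumption(s)]

for seq in bad:
    print(seq, "keeps program order:", respects_program_order(seq))
```

Every order in `bad` reorders at least one thread's own operations, which is why the bug can only appear on hardware that does not preserve program order.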

Sequential Consistency
§ "A multiprocessor is sequentially consistent if the result of any execution is the same as if the operations of all the processors were executed in some sequential order, and the operations of each individual processor appear in this sequence in the order specified by its program." [Lamport, 1979]
§ SC = an arbitrary interleaving of the (in-order) memory references of the sequential programs running on the processors
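As a rough illustration of the definition, the sketch below enumerates every interleaving of two tiny programs that keeps each program's own order, and collects the results that sequential consistency allows. The two programs (each stores to one variable, then loads the other) and all names are assumptions chosen for this example; they do not come from the slides.

```python
def interleavings(a, b):
    """Yield every merge of lists a and b that keeps each list's internal order."""
    if not a or not b:
        yield list(a) + list(b)
        return
    for rest in interleavings(a[1:], b):
        yield [a[0]] + rest
    for rest in interleavings(a, b[1:]):
        yield [b[0]] + rest

def run(schedule):
    """Execute one global order of operations against a single shared memory."""
    mem = {"x": 0, "y": 0}
    regs = {}
    for kind, target, arg in schedule:
        if kind == "store":
            mem[target] = arg        # store: mem[target] = value
        else:
            regs[target] = mem[arg]  # load: register = mem[address]
    return regs["r1"], regs["r2"]

# Hypothetical programs: P0 does x=1; r1=y   and   P1 does y=1; r2=x
P0 = [("store", "x", 1), ("load", "r1", "y")]
P1 = [("store", "y", 1), ("load", "r2", "x")]

outcomes = {run(s) for s in interleavings(P0, P1)}
print(sorted(outcomes))
```

Under these assumptions the pair (r1, r2) = (0, 0) never appears, because no interleaving places both loads before both stores.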

Example: Relaxed Consistency with Fences

Producer posting Item x:
          Load Rtail, (tail)
          Store (Rtail), x
          FenceSS                       <- guarantees that the tail pointer is not updated before x is written
          Rtail=Rtail+1
          Store (tail), Rtail

Consumer:
          Load Rhead, (head)
    spin: Load Rtail, (tail)
          if Rhead==Rtail goto spin
          FenceLL                       <- guarantees that R is not loaded before x is written
          Load R, (Rhead)
          Rhead=Rhead+1
          Store (head), Rhead
          consume(R)
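A small model can show why both fences are needed. The sketch below is an assumption-laden simplification, not the slides' method: it labels the producer's item store 1 and tail store 2, the consumer's tail load 3 and item load 4, lets a relaxed machine observe the four operations in any order, and treats each fence as simply forbidding its thread's second operation from overtaking the first. The function name `bad_orders` is made up for this example.

```python
from itertools import permutations

def bad_orders(fence_ss, fence_ll):
    """Global orders where the consumer sees the new tail (3 after 2)
    but reads the item before it was written (4 before 1)."""
    result = []
    for seq in permutations((1, 2, 3, 4)):
        pos = {op: i for i, op in enumerate(seq)}
        if fence_ss and pos[1] > pos[2]:
            continue  # FenceSS: the tail store may not overtake the item store
        if fence_ll and pos[3] > pos[4]:
            continue  # FenceLL: the item load may not overtake the tail load
        if pos[2] < pos[3] and pos[4] < pos[1]:
            result.append(seq)
    return result

for ss in (False, True):
    for ll in (False, True):
        print("FenceSS" if ss else "no SS  ", "FenceLL" if ll else "no LL  ", bad_orders(ss, ll))
```

In this model a single fence still leaves one failing order, and only the combination of FenceSS and FenceLL removes them all.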

Weak Ordering vs Release Consistency
[Figure: Weak Ordering (WO) vs Release Consistency (RC)]

Example: SPARC V9 memory fences
§ #LoadLoad
§ #StoreLoad
§ #LoadStore
§ #StoreStore
§ Logically OR-ed combinations are possible
§ #XY = "All X operations that appear before the memory fence in program order complete before any Y operations that follow after the memory fence in program order."
§ (+) Flexibility to exploit each relaxed consistency model as fully as possible for maximum performance
§ (-) Hard to program, plus portability issues across different models
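The #XY rule and the OR-ed combinations can be sketched as a bit mask. The Python names below are invented for illustration, and the bit values follow the SPARC V9 MEMBAR mmask encoding as best understood here (#LoadLoad = 0x1, #StoreLoad = 0x2, #LoadStore = 0x4, #StoreStore = 0x8); treat them as an assumption and check the architecture manual before relying on them.

```python
from enum import IntFlag

class Membar(IntFlag):
    # Assumed mmask bit assignments; verify against the SPARC V9 manual.
    LOAD_LOAD = 0x1
    STORE_LOAD = 0x2
    LOAD_STORE = 0x4
    STORE_STORE = 0x8

# Maps (kind of earlier operation, kind of later operation) to the #XY bit.
_PAIR_BIT = {
    ("load", "load"): Membar.LOAD_LOAD,
    ("store", "load"): Membar.STORE_LOAD,
    ("load", "store"): Membar.LOAD_STORE,
    ("store", "store"): Membar.STORE_STORE,
}

def orders(mask, before, after):
    """True if a fence with this mask orders an earlier `before` op against a later `after` op."""
    return bool(mask & _PAIR_BIT[(before, after)])

producer_fence = Membar.STORE_STORE  # plays the role of a store-store fence
consumer_fence = Membar.LOAD_LOAD    # plays the role of a load-load fence
full_fence = (Membar.LOAD_LOAD | Membar.STORE_LOAD
              | Membar.LOAD_STORE | Membar.STORE_STORE)

print(orders(producer_fence, "store", "store"), orders(producer_fence, "store", "load"))
```

Choosing the narrowest mask that covers the needed pairs is what gives the flexibility noted above, at the cost of having to reason about each pair explicitly.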

Example: Mutual Exclusion

[Figure: Thread 1 and Thread 2 each hold a pointer xdatap to the same data word in memory]

Both threads execute:
          ld  xdata, (xdatap)
          add xdata, 1
          sd  xdata, (xdatap)

What is needed for this code to execute correctly?
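To see what goes wrong without mutual exclusion, the sketch below runs the three-instruction increment from two threads under every possible interleaving and records the final value of the shared word. It is a minimal model written for this example (initial value 0, one private register per thread, each instruction atomic), not code from the slides.

```python
from itertools import combinations

STEPS = ("ld", "add", "sd")  # the three instructions each thread executes

def final_values():
    """Return the set of final values of the shared word over all interleavings."""
    results = set()
    n = len(STEPS)
    # Choose which of the 2n time slots belong to thread 0; the rest go to thread 1.
    for slots in combinations(range(2 * n), n):
        data = 0       # the shared word, assumed to start at 0
        reg = [0, 0]   # private xdata register of each thread
        pc = [0, 0]    # next instruction index of each thread
        for t in range(2 * n):
            tid = 0 if t in slots else 1
            step = STEPS[pc[tid]]
            if step == "ld":
                reg[tid] = data
            elif step == "add":
                reg[tid] += 1
            else:  # sd
                data = reg[tid]
            pc[tid] += 1
        results.add(data)
    return results

print(final_values())
```

The result 1 is a lost update: both threads loaded the old value before either stored. Making the load/add/store sequence a critical section (or a single atomic operation) leaves only the result 2.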

Mutual Exclusion with Load/Store (1)
§ Uses 2 shared variables.

Process 1:
          c1 = 1;
       L: if c2 = 1 then go to L;
          <critical section>
          c1 = 0;

Process 2:
          c2 = 1;
       L: if c1 = 1 then go to L;
          <critical section>
          c2 = 0;

Problem? → Deadlock! If both processes set their variable before either one tests the other's, each spins forever.
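The deadlock can be found by exhaustively exploring the states of the two-variable protocol. The sketch below is a simplified model made up for this example: each process runs the protocol once, every statement is one atomic step, and memory is sequentially consistent. Program counters 0 to 3 stand for "set my variable", "test the other's variable", "in the critical section", and "finished".

```python
def explore():
    """Explore all reachable states; return (stuck states, whether both were ever in the CS)."""
    start = ((0, 0), (0, 0))  # (program counters, (c1, c2))
    seen = {start}
    todo = [start]
    stuck = []
    both_in_cs = False
    while todo:
        pcs, c = todo.pop()
        if pcs == (2, 2):
            both_in_cs = True
        successors = []
        for i in (0, 1):
            j = 1 - i
            new_pcs, new_c = list(pcs), list(c)
            if pcs[i] == 0:        # c_i = 1
                new_c[i] = 1
                new_pcs[i] = 1
            elif pcs[i] == 1:      # L: if c_j = 1 then go to L
                if c[j] == 1:
                    continue       # spinning: no change of state
                new_pcs[i] = 2
            elif pcs[i] == 2:      # leave critical section: c_i = 0
                new_c[i] = 0
                new_pcs[i] = 3
            else:
                continue           # finished
            successors.append((tuple(new_pcs), tuple(new_c)))
        if not successors and pcs != (3, 3):
            stuck.append((pcs, c))  # nobody can make progress
        for s in successors:
            if s not in seen:
                seen.add(s)
                todo.append(s)
    return stuck, both_in_cs

print(explore())
```

In this model the protocol does keep the processes out of the critical section at the same time, but it reaches a state where both variables are 1 and both processes are stuck at the test.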

Peterson (1981) vs Dekker (1966)

Peterson's:
    flag[0]=true;                          // "I want to enter."
    turn=1;                                // "You can enter next."
    while(flag[1]==true&&turn==1){ }       // "If you want to enter and it's your turn I'll wait."
    // CS                                  // Else: Enter CS!
    flag[0]=false;                         // "I don't want to enter any more."

Dekker's:
    flag[0]=true;                          // "I want to enter."
    while(flag[1]==true){                  // "If you want to enter
      if(turn!=0){                         //  and if it's your turn
        flag[0]=false;                     //  I don't want to enter any more."
        while(turn!=0){ }                  // "If it's your turn I'll wait."
        flag[0]=true;                      // "I want to enter."
      }
    }
    // CS                                  // Enter CS!
    turn=1;                                // "You can enter next."
    flag[0]=false;                         // "I don't want to enter any more."

Source: http://cs.stackexchange.com/questions/12621/contrasting-peterson-s-and-dekker-s-algorithms
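Peterson's algorithm is small enough to check exhaustively. The sketch below is a model built for this example rather than anything from the slides. It assumes sequentially consistent memory, treats each statement (including the combined test of flag and turn) as one atomic step, and lets each process run the protocol once. Program counters 0 to 4 stand for "set flag", "set turn", "wait", "in the critical section", and "finished".

```python
def check_peterson():
    """Return (mutual exclusion violated?, some state with no progress possible?)."""
    start = ((0, 0), (False, False), 0)  # (program counters, flags, turn)
    seen = {start}
    todo = [start]
    violation = False
    stuck = False
    while todo:
        pcs, flag, turn = todo.pop()
        if pcs == (3, 3):
            violation = True             # both processes in the critical section
        moved = False
        for i in (0, 1):
            j = 1 - i
            pc, f, t = list(pcs), list(flag), turn
            if pcs[i] == 0:              # flag[i] = true
                f[i] = True
                pc[i] = 1
            elif pcs[i] == 1:            # turn = j
                t = j
                pc[i] = 2
            elif pcs[i] == 2:            # while (flag[j] && turn == j) { }
                if flag[j] and turn == j:
                    continue             # busy-waiting: no change of state
                pc[i] = 3
            elif pcs[i] == 3:            # leave CS: flag[i] = false
                f[i] = False
                pc[i] = 4
            else:
                continue                 # finished
            moved = True
            nxt = (tuple(pc), tuple(f), t)
            if nxt not in seen:
                seen.add(nxt)
                todo.append(nxt)
        if not moved and pcs != (4, 4):
            stuck = True
    return violation, stuck

print(check_peterson())
```

Under these assumptions the search finds neither a mutual-exclusion violation nor a stuck state. The guarantee depends on the stores and loads being observed in program order, so on a relaxed-consistency machine the algorithm needs fences to remain correct.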