An On-the-Fly Mark and Sweep Garbage Collector Based on Sliding Views
Hezi Azatchi - IBM
Yossi Levanoni - Microsoft
Harel Paz - Technion
Erez Petrank
GC via Sliding Views

Garbage Collection
- Today's advanced environments: multiprocessors + large memories.
- Dealing with multiprocessors: stop the world, parallel collection, concurrent collection, on-the-fly collection.

Informal pause times and throughput loss:

  Approach                 Pause time   Throughput loss
  Parallel collection      300 ms       (not given)
  Concurrent collection    30 ms        10%
  On-the-fly collection    3 ms         10%

This Talk
- A new on-the-fly mark and sweep collector: a synergy of snapshot collection and sliding views.
- Implementation and measurements on the Jikes RVM:
  1. Pause times < 2 ms.
  2. Throughput loss 10%.

The Mark-Sweep Algorithm [McCarthy 1960]
- Traverse and mark live objects, starting from the roots (globals).
- White (unmarked) objects may be reclaimed.

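The two phases above can be sketched in C as follows. This is a minimal stop-the-world illustration, not the collector from the talk: the `Object` layout, the fixed field count, and the names `mark_from`, `sweep`, and `collect` are all assumptions made for the example.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define MAX_FIELDS 2

/* Hypothetical object layout: a mark bit, an "in use" flag, and pointer fields. */
typedef struct Object {
    bool marked;
    bool allocated;
    struct Object *fields[MAX_FIELDS];
} Object;

/* Mark phase: traverse from one root with an explicit stack.
 * Objects are marked when pushed, so each object is pushed at most once. */
static void mark_from(Object *root, Object **stack, size_t capacity) {
    size_t top = 0;
    if (root == NULL || root->marked) return;
    root->marked = true;
    stack[top++] = root;
    while (top > 0) {
        Object *obj = stack[--top];
        for (size_t i = 0; i < MAX_FIELDS; i++) {
            Object *child = obj->fields[i];
            if (child != NULL && !child->marked) {
                child->marked = true;
                assert(top < capacity);
                stack[top++] = child;
            }
        }
    }
}

/* Sweep phase: reclaim unmarked (white) objects and clear the marks of
 * survivors. Returns the number of objects reclaimed. */
static size_t sweep(Object *heap, size_t n) {
    size_t reclaimed = 0;
    for (size_t i = 0; i < n; i++) {
        if (!heap[i].allocated) continue;
        if (heap[i].marked) {
            heap[i].marked = false;
        } else {
            heap[i].allocated = false;
            reclaimed++;
        }
    }
    return reclaimed;
}

/* One full collection; the stack must have room for n entries. */
static size_t collect(Object *heap, size_t n, Object **roots, size_t nroots,
                      Object **stack) {
    for (size_t r = 0; r < nroots; r++) {
        mark_from(roots[r], stack, n);
    }
    return sweep(heap, n);
}
```

Because marking happens at push time, cycles in the heap graph terminate and a stack of `n` entries is always sufficient.
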
Base: a snapshot collection
- A naive collector:
  - Stop program threads.
  - Create a snapshot (replica) of the heap.
  - Program threads resume.
  - Trace the replica concurrently with the program.
  - Objects identified as unreachable in the replica may be collected.
- Problem: taking a replica of the heap is not realistic.
- [Furusou et al. 91]: use a copy-on-write barrier.
  - No need to copy unless an area is written.
  - Use virtual pages.

Some inefficiencies
- Copying a page requires synchronization; efficiency depends on the system.
- Triggering and copying apply to all fields, although only pointers are interesting.
- Programs work at the object level, while this mechanism works at the page level: it is a waste to copy a full page.

Synergy with recently developed techniques
- Goal: copy the pointers of each modified object prior to its first modification.
- The write barrier of the Levanoni-Petrank reference counting collector provides exactly this:
  - Use a dirty bit per object.
  - Before a pointer is first modified, save the object's pointer values locally.
- This can be done concurrently by a multithreaded program with no synchronization!

The write barrier (simplified)

    Update(Object **slot, Object *new) {
        Object *old = *slot;
        if (!IsDirty(slot)) {
            log(slot, old);
            SetDirty(slot);
        }
        *slot = new;
    }

Observation: if two threads
  1. invoke the write barrier in parallel, and
  2. both log an old value,
then both record the same old value.

The "real" write barrier:
- Works at the object level.
- Uses an optimistic initial "if".

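An object-level variant with an optimistic first check might look like the sketch below. It is a single-threaded model for illustration only: the `Object`, `LogEntry`, and `ThreadLog` types, the fixed slot count, and the second re-check of the dirty flag are assumptions, and a real concurrent barrier would additionally need care about memory ordering.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define NUM_SLOTS 2
#define LOG_CAPACITY 16

/* Hypothetical object layout: one dirty flag per object plus pointer slots. */
typedef struct Object {
    bool dirty;
    struct Object *slots[NUM_SLOTS];
} Object;

/* One log record: the object and the pointer values it held before
 * its first modification. */
typedef struct LogEntry {
    Object *obj;
    Object *old_slots[NUM_SLOTS];
} LogEntry;

/* Thread-local buffer, so logging itself needs no synchronization. */
typedef struct ThreadLog {
    LogEntry entries[LOG_CAPACITY];
    size_t count;
} ThreadLog;

static void update(ThreadLog *buf, Object *obj, size_t index, Object *new_value) {
    assert(index < NUM_SLOTS);
    if (!obj->dirty) {                          /* optimistic check: fast path skips logging */
        assert(buf->count < LOG_CAPACITY);
        LogEntry *entry = &buf->entries[buf->count];
        entry->obj = obj;
        for (size_t i = 0; i < NUM_SLOTS; i++) {
            entry->old_slots[i] = obj->slots[i]; /* copy all pointers of the object */
        }
        if (!obj->dirty) {                      /* re-check before committing the record */
            buf->count++;
            obj->dirty = true;
        }
    }
    obj->slots[index] = new_value;
}
```

Only the first update of an object pays for the copy; later updates of the same object see the dirty flag and go straight to the store.
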
Concurrent (intermediate) Algorithm:
- Stop all threads.
- Scan roots (locals).
- Initiate write barrier usage.
- Resume threads.
- Trace from roots.
  - Whenever a dirty object is discovered, use buffers to obtain its pointers.
- Stop write barrier usage.
- Sweep to reclaim unmarked objects.
- Clear all buffers and dirty bits.

Next goal: stop one thread at a time.

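The tracing step, in which dirty objects are read through the buffers rather than from the live heap, could be sketched as below. The layout is an assumption for the example: the dirty flag is represented as a pointer to the object's log record (`logged`, NULL when clean), and `snapshot_slots` and `trace_snapshot` are hypothetical names.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define NUM_SLOTS 2

typedef struct Object Object;

/* Pointer values an object held before its first modification. */
typedef struct LogEntry {
    Object *old_slots[NUM_SLOTS];
} LogEntry;

struct Object {
    bool marked;
    LogEntry *logged;            /* assumed encoding: NULL means "not dirty" */
    Object *slots[NUM_SLOTS];
};

/* Pointers the object held at snapshot time: the logged copy if the
 * object was modified since, otherwise its current slots. */
static Object *const *snapshot_slots(const Object *obj) {
    return obj->logged != NULL ? obj->logged->old_slots : obj->slots;
}

/* Trace from the roots over the snapshot view; returns how many
 * objects were marked. The stack needs room for every heap object. */
static size_t trace_snapshot(Object **roots, size_t nroots,
                             Object **stack, size_t capacity) {
    size_t top = 0, marked = 0;
    for (size_t r = 0; r < nroots; r++) {
        if (roots[r] != NULL && !roots[r]->marked) {
            roots[r]->marked = true;
            assert(top < capacity);
            stack[top++] = roots[r];
        }
    }
    while (top > 0) {
        Object *obj = stack[--top];
        marked++;
        Object *const *slots = snapshot_slots(obj);
        for (size_t i = 0; i < NUM_SLOTS; i++) {
            Object *child = slots[i];
            if (child != NULL && !child->marked) {
                child->marked = true;
                assert(top < capacity);
                stack[top++] = child;
            }
        }
    }
    return marked;
}
```

With this view, an object that was reachable when the roots were scanned stays marked even if the program has since overwritten the only pointer to it.
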
The Sliding Views "Framework"
- Avoid simultaneous halting. Instead, stop one thread at a time.
- The view of the heap is a "sliding view".
- There is a time interval in which all objects are read (but not one single point in time).

Danger in Sliding Views
- Program does:
  1. P1 <- O
  2. P2 <- O
  3. P1 <- NULL
- The sliding view reads P2 (NULL) before the store P2 <- O, and reads P1 (NULL) after the store P1 <- NULL.
- Problem: reachability of O is not noticed!
- Solution: "snooping". If a pointer to O is stored while the sliding view is taken, do not reclaim O.

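The scenario and its fix can be replayed in a few lines of C. This is a single-threaded sketch under assumed names: `Mutator`, `store`, and `is_protected` are hypothetical, and the snoop set is a small fixed array rather than a real per-thread buffer.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define SNOOP_CAPACITY 8

typedef struct Object {
    int id;
} Object;

/* Hypothetical per-thread state: a snoop flag and the objects snooped so far. */
typedef struct Mutator {
    bool snoop;                       /* set while the sliding view is being taken */
    Object *snooped[SNOOP_CAPACITY];
    size_t snooped_count;
} Mutator;

/* Pointer store with snooping: while the flag is on, every non-null
 * pointer that is stored is also recorded. */
static void store(Mutator *m, Object **slot, Object *new_value) {
    *slot = new_value;
    if (m->snoop && new_value != NULL) {
        assert(m->snooped_count < SNOOP_CAPACITY);
        m->snooped[m->snooped_count++] = new_value;
    }
}

/* Snooped objects are treated as roots, so they must not be reclaimed. */
static bool is_protected(const Mutator *m, const Object *obj) {
    for (size_t i = 0; i < m->snooped_count; i++) {
        if (m->snooped[i] == obj) return true;
    }
    return false;
}
```

Replaying the slide: with `snoop` on, the view reads P2 as NULL, the program stores O into P2 and NULL into P1, and the view then reads P1 as NULL. Both reads miss O, yet `is_protected` reports O because the store into P2 was snooped.
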
The Sliding Views Algorithm:
- Initiate snooping and write barrier usage.
- For each thread:
  - Stop the thread and scan its roots (locals).
- Stop snooping.
- Trace from roots and snooped objects.
  - Whenever a dirty object is discovered, use buffers to obtain its actual values.
- Stop write barrier usage.
- Sweep to reclaim unmarked objects.
- Clear all buffers and dirty bits.

Optimizing the write barrier
- We only need to store:
  1. non-null pointer values of the object,
  2. while tracing is on,
  3. for objects that have not been traced,
  4. the object once.
- Implication of 3: new objects are never stored.
- The slow path of the write barrier is seldom taken (~ 1/300).

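The four conditions translate into a cheap filter in front of the logging code. The sketch below is illustrative only: the `Barrier` and `Object` types are hypothetical, old values are kept as a flat list, and it assumes that objects created during a collection start out flagged as traced, which is one way to get the "new objects are never stored" behavior.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define NUM_SLOTS 2
#define LOG_CAPACITY 16

/* Hypothetical object layout with a "traced" (mark) flag and a dirty flag. */
typedef struct Object {
    bool traced;
    bool dirty;
    struct Object *slots[NUM_SLOTS];
} Object;

typedef struct Barrier {
    bool tracing;                     /* condition 2: is tracing on? */
    Object *old_values[LOG_CAPACITY]; /* saved non-null old pointers */
    size_t count;
    size_t slow_path_taken;           /* how often logging actually happened */
} Barrier;

static void update(Barrier *b, Object *obj, size_t index, Object *new_value) {
    assert(index < NUM_SLOTS);
    /* Slow path only if tracing is on (2), the object has not been
     * traced (3), and it has not been stored before (4). */
    if (b->tracing && !obj->traced && !obj->dirty) {
        for (size_t i = 0; i < NUM_SLOTS; i++) {
            if (obj->slots[i] != NULL) {          /* condition 1: non-null only */
                assert(b->count < LOG_CAPACITY);
                b->old_values[b->count++] = obj->slots[i];
            }
        }
        obj->dirty = true;
        b->slow_path_taken++;
    }
    obj->slots[index] = new_value;
}
```

Counting `slow_path_taken` against the total number of calls gives the kind of slow-path fraction quoted on the slide.
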
Write Barrier Statistics

  Benchmark      Long path fraction
  SPECjbb2000    1 / 299
  Compress       1 / 894
  Jess           1 / 13,210
  Db             1 / 305
  Javac          1 / 160
  Mpegaudio      1 / 64,099
  jack           1 / 16,572
  mtrt 2         1 / 4116

Performance Measurements
- Implementation for Java on the Jikes Research JVM.
- Compared collectors:
  - Jikes parallel collector (Parallel)
  - Jikes concurrent RC (Jikes concurrent)
- Benchmarks:
  - Server benchmark: SPECjbb2000 (business-like transactions in a large firm)
  - Client benchmarks: SPECjvm98 (mostly single-threaded client benchmarks)

[Figure: Pause Times vs. Jikes parallel]
[Figure: Pause Times vs. Jikes Concurrent]
[Figure: SPECjbb2000 Throughput (Jikes parallel)]
[Figure: SPECjvm98 Throughput (Jikes parallel)]
[Figure: SPECjbb2000 Throughput]
[Figure: SPECjvm98 Throughput]
[Figure: SPECjbb2000 Throughput]

Most Related Collector
- Vast literature on on-the-fly mark & sweep collectors.
- The state-of-the-art collector is by Doligez-Leroy-Gonthier [POPL 93-94].
- Implemented for Java by IBM research: Domani-Kolodner-Petrank [PLDI 2000], Domani et al. [ISMM 2000].
- Our new collector is the only alternative for tracing on-the-fly.

Comparison
- No available research implementation for Java.
- Some thoughts on locality. A difference in the write barrier on pointer modification:
  - [DLG]: mark the ex-referenced object.
  - [This work]: copy (seldom) parent pointers, check (frequently) parent mark bits.
[Figure: parent object p with objects o1 and o2]

Related Work
- Snapshot tracing: Demers et al. (1990), Furusou et al. (1991).
- On-the-fly tracing: Dijkstra et al. (1976), Steele (1976), Lamport (1976), Kung & Song (1977), Gries (1977), Ben-Ari (1982, 1984), Huelsbergen et al. (1993, 1998), Doligez-Gonthier-Leroy (1993-4), Domani-Kolodner-Petrank (2000).
- The RC sliding views algorithm: Levanoni & Petrank [OOPSLA 01].
- Generational extension of sliding views: Azatchi & Petrank [Compiler Construction 2003].

Conclusions
- A new non-intrusive, efficient mark & sweep garbage collector suitable for multiprocessors.
- An implementation on Jikes and measurements on a multiprocessor.
- Low pause times (1 ms), small throughput penalty (10%).