Rarely Copying Garbage Collection
Yoshinori Kobayashi, Toshio Endo, Kenjiro Taura, Akinori Yonezawa
University of Tokyo
PLDI 2002 Student Research Forum, June 17, Berlin

Contents
• Design Goals
• Mark&Sweep vs. Copying Collector
• Conservative Garbage Collector
• Bartlett's Mostly Copying Collector
• Rarely-Copying Collector
• Experiments and Discussion

Design Goals
• Design Goals
  – Fast Allocation
  – Conservative GC
  – Fast GC
    • To achieve fast GC, the collector uses a hybrid strategy (Mark&Sweep and Copying)
• Target
  – Allocation-intensive programs
  – Programs written in a type-accurate language and partially written in C

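Fast Allocation: Linear Allocation

    void* allocate(size_t size){
        prev = free;            /* start of the new object */
        free = free + size;     /* bump the free pointer past it */
        if(free < limit){
            return prev;        /* the request fits below limit */
        }else{
            ・・・
        }
    }

(Figure: the heap, with the free pointer, the size of the allocation request, and limit.)
There must be a large contiguous free area.
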
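A minimal runnable sketch of linear (bump-pointer) allocation, assuming a small static arena. The names `arena_words`, `free_ptr`, `limit_ptr`, `ARENA_SIZE`, and `linear_allocate` are illustrative and do not come from the slides. Returning NULL stands in for the slow path that the slide elides, and the sketch checks for space before bumping the pointer so that a failed request leaves the allocator unchanged.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define ARENA_SIZE 1024u  /* hypothetical arena size in bytes */

/* Backing store; uint64_t elements give 8-byte alignment. */
static uint64_t arena_words[ARENA_SIZE / 8];
static unsigned char *free_ptr = (unsigned char *)arena_words;
static unsigned char *limit_ptr = (unsigned char *)arena_words + ARENA_SIZE;

/* Round a request up to a multiple of 8 bytes. */
static size_t align_up(size_t size) {
    return (size + 7u) & ~(size_t)7u;
}

/* Bump-pointer allocation: constant time, no search. */
void *linear_allocate(size_t size) {
    assert(free_ptr <= limit_ptr);  /* allocator invariant */
    size_t need = align_up(size);
    /* need < size detects overflow in align_up. */
    if (need < size || need > (size_t)(limit_ptr - free_ptr)) {
        return NULL;  /* assumed slow path: a real collector would run GC here */
    }
    void *prev = free_ptr;
    free_ptr += need;
    return prev;
}
```

The fast path is one comparison and one addition, which is why it depends on a single large free area rather than many small holes.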
cf. Allocation From Freelist
• The allocator must search the freelist for an area large enough to satisfy each allocation request.

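For comparison, a minimal sketch of a first-fit freelist search. The `FreeChunk` type and `freelist_allocate` function are hypothetical, and first-fit is only one possible policy; the slides do not say which one is meant. The sketch hands out whole chunks and does not split them.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical freelist node describing one free chunk. */
typedef struct FreeChunk {
    size_t size;             /* usable bytes in this chunk */
    struct FreeChunk *next;  /* next free chunk, or NULL */
} FreeChunk;

/* First-fit: unlink and return the first chunk with at least `size`
   bytes, or NULL if no chunk is large enough. The loop is the search
   cost that linear allocation avoids. */
FreeChunk *freelist_allocate(FreeChunk **head, size_t size) {
    assert(head != NULL);
    for (FreeChunk **link = head; *link != NULL; link = &(*link)->next) {
        FreeChunk *chunk = *link;
        if (chunk->size >= size) {
            *link = chunk->next;  /* unlink the chunk from the list */
            chunk->next = NULL;
            return chunk;
        }
    }
    return NULL;
}
```

In the worst case every request walks the whole list, so allocation time grows with fragmentation.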
Copying GC makes a large contiguous free area
(Figure: live objects are copied between From_space and To_space.)

Mark&Sweep GC causes fragmentation
Performance tradeoff between Mark&Sweep and Copying
General tendency

                  Allocation                    Garbage Collection
  Mark & Sweep    Freelist (Slower)             Faster
  Copying GC      Linear Allocation (Faster)    Slower

• It strongly depends on the amount of live objects.

Cost of GC and Allocation
(Figure: cost of mark&sweep GC and cost of copy GC plotted against L (live objects); one region is labeled "Copy GC is better" and the other "Mark-Sweep is better".)

Performance tradeoff
Conservative Garbage Collector
• Ambiguous pointer
  – A word for which the garbage collector cannot tell whether it is a pointer or not.
  – It appears in programs written in C.
• The conservative collector considers objects pointed to by ambiguous pointers live.
• The collector cannot update ambiguous pointers
  – We cannot copy objects pointed to by ambiguous pointers

Why Conservative GC?
• Our GC's target
  – Programs written in a type-accurate language and partially written in C
• Developing programs in a given language very often requires programmers to integrate libraries written in other languages.
• Programmers don't like complex native interfaces.
• A simple interface like the one below needs ambiguous pointers.

One approach – Prohibit “Ambiguous Pointers”
An example: JNI

    JNIEXPORT jint JNICALL Java_Foo_sumArray(JNIEnv* env, jobject obj, jintArray arr){
        jint* body = (*env)->GetIntArrayElements(env, arr, 0);
        (some work using body)
        (*env)->ReleaseIntArrayElements(env, arr, body, 0);
        ・・・
    }

○ Fully Type-Accurate GC
× Complex Native Interface

Another approach – Permit the existence of “Ambiguous Pointers”

    int X_sum_Array(int* body){
        (some work using body)
        ・・・
    }

○ Simple Native Interface
× Fully Type-Accurate GC is impossible

Which interface would you like to use?

Existing Work: Mostly-Copying Collector (Copy GC + Conservative GC)
A copying garbage collector that works in the presence of ambiguous pointers
• The root set consists of ambiguous pointers
• There is no ambiguous pointer in the heap
The whole heap consists of fixed-size blocks (pages)
• Pages pointed to by ambiguous pointers: Mark & Sweep
• Pages pointed to only by exact pointers: Copying GC

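The page classification can be sketched as a conservative scan of the root set. This is an illustration under stated assumptions, not Bartlett's implementation: the `Heap` descriptor, the `pinned` array, `PAGE_SIZE`, and `pin_pages_from_roots` are hypothetical names, and the only pointer test used is an address-range check.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define PAGE_SIZE 4096u  /* hypothetical page size in bytes */

/* Hypothetical heap descriptor. */
typedef struct {
    uintptr_t base;     /* address of the first page */
    size_t page_count;  /* number of pages in the heap */
    bool *pinned;       /* pinned[i]: page i is hit by an ambiguous root */
} Heap;

/* Treat every root word as a possible pointer. A word whose value falls
   inside the heap pins its page, so that page must not be copied.
   Returns the number of pages newly pinned by this scan. */
size_t pin_pages_from_roots(Heap *heap, const uintptr_t *roots, size_t n) {
    assert(heap != NULL && heap->pinned != NULL);
    size_t newly_pinned = 0;
    uintptr_t heap_bytes = (uintptr_t)heap->page_count * PAGE_SIZE;
    for (size_t i = 0; i < n; i++) {
        uintptr_t word = roots[i];
        if (word < heap->base || word - heap->base >= heap_bytes) {
            continue;  /* cannot be a heap pointer */
        }
        size_t page = (size_t)((word - heap->base) / PAGE_SIZE);
        if (!heap->pinned[page]) {
            heap->pinned[page] = true;
            newly_pinned++;
        }
    }
    return newly_pinned;
}
```

Pages left unpinned after the scan are reachable only through exact pointers, so a collector is free to move the objects they contain.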
Mostly-Copying Collector
(Figure: the root set pointing into heap pages; objects are shown as live or dead.)

Why Rarely Copying Collector? (1)
• Mostly-Copying collector
  – It copies live objects regardless of the amount of live objects.
  – It copies too many objects.
  – GC is too expensive.

Why Rarely Copying Collector? (2)
Our approach: Mark & Selective Copy
• The whole heap consists of fixed-size blocks (pages).
• Mark phase: it takes statistics.
• It chooses the better strategy for each page
  – Copying: pages in which the amount of live objects is small
  – Sweep: pages in which the amount of live objects is large
• Our focus is making the total performance better than that of a collector with a single strategy.
• Mark&Copy or Mark&Sweep for each page

Rarely-Copying Collector

    Rcgc() {
        Mark_with_profiling();
        Sweep(pages_to_sweep);
        Copy(pages_to_copy);
    }

Profiling: for each page, keep track of
• the size of live objects in the page
• the pointers which point to objects inside the page

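One way the per-page profiling could be kept is sketched below. The `PageProfile` and `ProfiledHeap` types and both functions are hypothetical. The sketch only counts incoming pointers, whereas a collector that later copies a page would presumably also need to record where those pointers are stored so it can update them. It also assumes that an object does not span pages.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define PAGE_SIZE 4096u  /* hypothetical page size in bytes */

/* Hypothetical per-page statistics gathered during the mark phase. */
typedef struct {
    size_t live_bytes;     /* total size of live objects in the page */
    size_t pointer_count;  /* pointers seen that target the page */
} PageProfile;

typedef struct {
    uintptr_t base;        /* address of the first page */
    size_t page_count;     /* number of pages in the heap */
    PageProfile *profile;  /* one entry per page */
} ProfiledHeap;

/* Map a heap address to its page index. */
static size_t page_of(const ProfiledHeap *heap, uintptr_t addr) {
    assert(addr >= heap->base);
    size_t page = (size_t)((addr - heap->base) / PAGE_SIZE);
    assert(page < heap->page_count);
    return page;
}

/* Call once when an object is first marked live. */
void profile_live_object(ProfiledHeap *heap, uintptr_t obj, size_t size) {
    heap->profile[page_of(heap, obj)].live_bytes += size;
}

/* Call for every pointer traced during marking. */
void profile_pointer(ProfiledHeap *heap, uintptr_t target) {
    heap->profile[page_of(heap, target)].pointer_count++;
}
```

Both updates are a division and an increment, so the extra work per marked object stays constant.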
Copy or Sweep? ~for each page~
Is an object in the page pointed to by ambiguous pointers?
  Yes → sweep
  No  → amount of live objects > Lth?
          Yes → sweep
          No  → # of pointers > Pth?
                  Yes → sweep
                  No  → copy
Lth, Pth: the thresholds given by the user

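The per-page decision can be written as a small function. This sketch follows one reading of the flowchart, in which any "Yes" answer leads to sweep and only a page that passes all three tests is copied. The `PageStats` and `Strategy` types and the `choose_strategy` name are hypothetical; `lth` and `pth` correspond to the user-given thresholds Lth and Pth.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

typedef enum { STRATEGY_SWEEP, STRATEGY_COPY } Strategy;

/* Hypothetical summary of one page after the mark phase. */
typedef struct {
    bool has_ambiguous_pointer;  /* some ambiguous pointer targets the page */
    size_t live_bytes;           /* amount of live objects in the page */
    size_t pointer_count;        /* pointers to objects inside the page */
} PageStats;

/* Decide how to reclaim one page, given thresholds lth and pth. */
Strategy choose_strategy(const PageStats *page, size_t lth, size_t pth) {
    assert(page != NULL);
    if (page->has_ambiguous_pointer) {
        return STRATEGY_SWEEP;  /* objects in the page cannot be moved */
    }
    if (page->live_bytes > lth) {
        return STRATEGY_SWEEP;  /* too much live data to copy */
    }
    if (page->pointer_count > pth) {
        return STRATEGY_SWEEP;  /* too many pointers to update */
    }
    return STRATEGY_COPY;
}
```

For example, with `lth = 1024` and `pth = 32`, a page holding 256 live bytes and 4 incoming pointers is copied, while the same page with an ambiguous pointer into it is swept.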
Experiments
We will measure the costs of
• Allocation
• Garbage collection
while changing the thresholds.
The experiments are now in progress. The preliminary results will be available in a month.

Expected performance
(Figure: GC time and allocation time for Mark-Sweep, Copy GC, and Rarely-Copying; legend labels: Profiling overhead, Mark, Sweep, Copy.)

Discussion
• Preliminary results show that this technique reduces ... by ** % and improves program performance by ??? % over Boehm's collector.

Summary
• We are implementing Rarely-Copying GC
  – Fast Allocation
  – Conservative Collector
  – Select Copy or Sweep
• Experiments are still in progress.
• Preliminary results will be available in a month.

End