Previous Section Next Section Table of Contents Glossary Index

Chapter 17. Implementation Details of Clozure CL

17.4. GC details

The GC uses a Mark/Compact algorithm; its execution time is essentially a factor of the amount of live data in the heap. (The somewhat better-known Mark/Sweep algorithms don't compact the live data but instead traverse the garbage to rebuild free-lists; their execution time is therefore a factor of the total heap size.)

As mentioned in Section 17.3, “Heap Allocation”, two auxiliary data structures (proportional to the size of the lisp heap) are maintained. These are

  1. the markbits bitvector, which contains a bit for every doublenode in the dynamic heap (plus a few extra words for alignment and so that sub-bitvectors can start on word boundaries.)

  2. the relocation table, which contains a native word for every 32 or 64 doublenodes in the dynamic heap, plus an extra word used to keep track of the end of the heap.

The total GC space overhead is therefore on the order of 3% (2/64 or 1/32).

The general algorithm proceeds as follows:

17.4.1. Mark phase

Each doublenode in the dynamic heap has a corresponding bit in the markbits vector. (For any doublenode in the heap, the index of its mark bit is determined by subtracting the address of the start of the heap from the address of the object and dividing the result by 8 or 16.) The GC knows the markbit index of the free pointer, so determining that the markbit index of a doubleword address is between the start of the heap and the free pointer can be done with a single unsigned comparison.

The markbits of all doublenodes in the dynamic heap are zeroed before the mark phase begins. An object is marked if the markbits of all of its constituent doublewords are set and unmarked otherwise; setting an object's markbits involves setting the corresponding markbits of all constituent doublenodes in the object.

The mark phase traverses each root. If the tag of the value of the root indicates that it's a non-immediate node whose address lies in the lisp heap, then:

  1. If the object is already marked, do nothing.

  2. Set the object's markbit(s).

  3. If the object is an ivector, do nothing further.

  4. If the object is a cons cell, recursively mark its car and cdr.

  5. Otherwise, the object is a gvector. Recursively mark its elements.

Marking an object thus involves ensuring that its mark bits are set and then recursively marking any pointers contained within the object if the object was originally unmarked. If this recursive step was implemented in the obvious manner, marking an object would take stack space proportional to the length of the pointer chain from some root to that object. Rather than storing that pointer chain implicitly on the stack (in a series of recursive calls to the mark subroutine), the Clozure CL marker uses mixture of recursion and a technique called link inversion to store the pointer chain in the objects themselves. (Recursion tends to be simpler and faster; if a recursive step notes that stack space is becoming limited, the link-inversion technique is used.)

Certain types of objects are treated a little specially:

  1. To support a feature called GCTWA [1] , the vector that contains the internal symbols of the current package is marked on entry to the mark phase, but the symbols themselves are not marked at this time. Near the end of the mark phase, symbols referenced from this vector which are not otherwise marked are marked if and only if they're somehow distinguishable from newly created symbols (by virtue of their having function bindings, value bindings, plists, or other attributes.)

  2. Pools have their first element set to NIL before any other elements are marked.

  3. All hash tables have certain fields (used to cache previous results) invalidated.

  4. Weak Hash Tables and other weak objects are put on a linkedlist as they're encountered; their contents are only retained if there are other (non-weak) references to them.

At the end of the mark phase, the markbits of all objects that are transitively reachable from the roots are set and all other markbits are clear.

17.4.2. Relocation phase

The forwarding address of a doublenode in the dynamic heap is (<its current address> - (size_of_doublenode * <the number of unmarked markbits that precede it>)) or alternately (<the base of the heap> + (size_of_doublenode * <the number of marked markbits that precede it >)). Rather than count the number of preceding markbits each time, the relocation table is used to precompute an approximation of the forwarding addresses for all doublewords. Given this approximate address and a pointer into the markbits vector, it's relatively easy to compute the exact forwarding address.

The relocation table contains the forwarding addresses of each pagelet, where a pagelet is 256 bytes (or 32 doublenodes). The forwarding address of the first pagelet is the base of the heap. The forwarding address of the second pagelet is the sum of the forwarding address of the first and 8 bytes for each mark bit set in the first 32-bit word in the markbits table. The last entry in the relocation table contains the forwarding address that the freepointer would have, e.g., the new value of the freepointer after compaction.

In many programs, old objects rarely become garbage and new objects often do. When building the relocation table, the relocation phase notes the address of the first unmarked object in the dynamic heap. Only the area of the heap between the first unmarked object and the freepointer needs to be compacted; only pointers to this area will need to be forwarded (the forwarding address of all other pointers to the dynamic heap is the address of that pointer.) Often, the first unmarked object is much nearer the free pointer than it is to the base of the heap.

17.4.3. Forwarding phase

The forwarding phase traverses all roots and the "old" part of the dynamic heap (the part between the base of the heap and the first unmarked object.) All references to objects whose address is between the first unmarked object and the free pointer are updated to point to the address the object will have after compaction by using the relocation table and the markbits vector and interpolating.

The relocation table entry for the pagelet nearest the object is found. If the pagelet's address is less than the object's address, the number of set markbits that precede the object on the pagelet is used to determine the object's address; otherwise, the number of set markbits that follow the object on the pagelet is used.

Since forwarding views the heap as a set of doublewords, locatives are (mostly) treated like any other pointers. (The basic difference is that locatives may appear to be tagged as fixnums, in which case they're treated as word-aligned pointers into the object.)

If the forward phase changes the address of any hash table key in a hash table that hashes by address (e.g., an EQ hash table), it sets a bit in the hash table's header. The hash table code will rehash the hash table's contents if it tries to do a lookup on a key in such a table.

Profiling reveals that about half of the total time spent in the GC is spent in the subroutine which determines a pointer's forwarding address. Exploiting GCC-specific idioms, hand-coding the routine, and inlining calls to it could all be expected to improve GC performance.

17.4.4. Compact phase

The compact phase compacts the area between the first unmarked object and the freepointer so that it contains only marked objects. While doing so, it forwards any pointers it finds in the objects it copies.

When the compact phase is finished, so is the GC (more or less): the free pointer and some other data structures are updated and control returns to the exception handler that invoked the GC. If sufficient memory has been freed to satisfy any allocation request that may have triggered the GC, the exception handler returns; otherwise, a "seriously low on memory" condition is signaled, possibly after releasing a small emergency pool of memory.

[1] I believe that the acronym comes from MACLISP, where it stood for "Garbage Collection of Truly Worthless Atoms".

Previous Section Next Section Table of Contents Glossary Index