\chapter{Garbage collection} \label{chap-garbage-collection} \section{Introduction} Contrary to traditional operating systems such as \unix{}, a Lisp operating system will need a global \emph{tracing garbage collector}. Traditional operating systems get away with a simpler technique, because the file system in such an operating system can not contain cycles. With this restriction, the simpler \texttt{reference counting} mechanism is sufficient. Furthermore, although reference counting is usually slower than a tracing garbage collector, the additional overhead of reference counters is of no importance when used for a file system in secondary memory. There is a rich literature on automatic memory management. (see e.g., \cite{Jones:2011:GCH:2025255}) For \sysname{}, we plan to have a two-level memory management technique. The low level consists of a relatively small local heap for each thread, and a per-thread garbage collector that manages that heap. The higher-level consists of a global heap that contains long-lived objects and objects that are shared between several threads. \section{Per-thread garbage collector} Each thread has a local heap, roughly the size of the cache, say around 4MiB. The thread-local heap is managed entirely by the thread itself, so that the garbage collector for it is executed by the thread itself. Experiments show that we will be able to run the thread-local garbage collector in a few milliseconds, which is good enough for most applications. We will use a sliding garbage collector in order to maintain allocation order. This way, we have a precise measure of the relative age of the objects, so that we can promote only the oldest objects when required. There can be no references between an object in one local heap to an object in another local heap. And there can be no references from the global heap to a local heap. Whenever a reference is about to be created from an object in the global heap to an object in the local heap, this attempt is caught by a \emph{write barrier} on the global heap. As a result if this write barrier being tripped, the object in the local heap being referred to (and its transitive closure) will migrate to the global heap, thereby preserving the general invariant. \section{Global garbage collector} In addition to the thread-local heaps, there is a global heap. The garbage collector for this heap will use a combination of the traditional \emph{mark-and-sweep} collector and an ordinary memory allocator, similar to the one used by the \clanguage{} functions \texttt{malloc} and \texttt{free}. Recall that that a heap-allocated object is either a \texttt{cons} cell or a \emph{general instance}. A \texttt{cons} cell is represented as two machine words. A general instance is represented as a \emph{header} consisting of two machine words, and a \emph{rack} which is a vector of words with a contents that depends on the exact type of the object. In both cases, then, a reference to a heap-allocated object is a reference to a double word. Given this representation, we separate the headers from the racks, so that \texttt{cons} cells and headers of general instances are allocated from a separate part of the global heap. Since this part of the global heap consists of only two-word objects, it can be managed very efficiently with a \emph{mark-and-sweep} garbage collector, using a simple free list. The advantage of this technique is that an object is never moved as a result of a garbage collection. Therefore, any reference to an object that is shared between several threads, remains valid after a garbage collection of the global heap. The marking phase is done by first requesting each thread to do a garbage collection and to mark any object in the global heap that is referred to by local objects. When all threads have responded, a global collection is started. The global collection is done concurrently with thread activity. For that reason, objects allocated in the global heap during this phase are marked as being live. The global collection traces the global heap starting with objects marked by the mutator threads. This tracing uses a standard three-color algorithm. Write operations to the global heap are caught by a write barrier. When tracing in the global heap is finished, the part of the global heap that contains two-word headers and \texttt{cons} cells is scanned and unmarked cells are collected into a free list. If an unmarked cell is a \texttt{cons} cell, then no further action is needed. If an unmarked cell is a header object, then the corresponding rack is returned to the rack part of the global heap.