How Clozure CL implements closures

I was lurking on the #lisp irc channel on freenode yesterday, and the matter of how Clozure CL implements closures came up in passing. If you use M-. to see the source of init-nclosure, you’ll see a fairly impenatrable fragment of assembly code. I thought I would try to explain it, at least for the x86 ports.

To understand how closures are implemented, it is necessary to know how functions are implemented.

A memory allocated object that isn’t a cons cell is a thing called a uvector. A uvector consists of a header word and a number of data words.

Functions are uvectors, but they are a little bit funky. Let’s look at an example.

We’re going to run the lisp under gdb so that we can examine how this function is represented in memory:

$ gdb dx86cl64
[set up how gdb deals with signals, define some utility commands]
(gdb) source /usr/local/src/ccl/lisp-kernel/darwinx8664/.gdbinit
(gdb) run
Starting program: /usr/local/src/ccl/dx86cl64
Reading symbols for shared libraries +. done
Welcome to Clozure Common Lisp Version 1.5-dev-r13385  (DarwinX8664)!
? (defun 2-to-the (x)
    (expt 2 x))
? #'2-to-the
[find out the address of the function]
#<Compiled-function 2-TO-THE #x302000A53B8F>
[now we hit C-c]
? ^C
Program received signal SIGINT, Interrupt.
0x00007fff813569ee in __semwait_signal ()
1: x/i $pc  0x7fff813569ee <__semwait_signal+10>: jae 0x7fff813569f5 <__semwait_signal+17>

We want to see the function in memory. To avoid talking about tagging, we just observe that memory-allocated objects are always dnode (two machine words) aligned. Thus we just mask off the low 4 bits.

(gdb) x/14g 0x302000A53B80
0x302000a53b80:	0x0000000000000d95	0x4c00000000000009

That first word is the header word. Note the #x95 in the low 8 bits. That’s the subtag. In this case, it’s x8664::subtag-function. The upper 56 bits are the element count. In this case, the function contains 13 (#xd) 64-bit words.

I said earlier that functions were kind of funky. The first byte of executable code is, of course, the #x4c at address #x302000A53B8F, which is the tagged address of the function object. There are 7 bytes between the start of the first uvector element and the start of the machine code. We store the number of words of code and other immediate data in the lower 32 bits of the first word. (This enables the GC to know what part of the uvector that it shouldn’t scan.)

In our example, we see that the function contains 9 words of code and other immediate data. We’ll skip over them.

0x302000a53b90:	0xf983fffffff92d8d	0x56e5894855257508
0x302000a53ba0:	0x4800000010c7c748	0x00000010b9f8758b
0x302000a53bb0:	0xc9000000419d8b49	0x00000007900a63ff
0x302000a53bc0:	0x906690666600c2cd	0x00000000000000f2

The #xf2 in the last word is an internal marker called x8664::function-boundary-marker. It too is used by the GC.

The rest of the data is the function’s gc-able “constants”.

0x302000a53bd0:	0x000030004001364e	0x0000302000a53c63
0x302000a53be0:	0x0000302000a577ee	0x0000000004000800

We can use a custom gdb command to look at them.

[pl stands for "print lisp"]
(gdb) pl 0x000030004001364e
$1 = 0x38e00 "EXPT"

The symbol EXPT.

(gdb) pl 0x0000302000a53c63
$2 = 0x38e00 "(PC-SOURCE-MAP #<4-element vector subtag = #xE7 @#x0000302000A53C2D ((UNSIGNED-BYTE 8))> ...

A list of function metadata.

(gdb) pl 0x0000302000a577ee
$3 = 0x38e00 "2-TO-THE"

The function’s name.

The last word is a fixnum, and it is the function’s “lfun-bits”. The bit $lfbits-trampoline-bit will be set if the function is a closure trampoline. There are many more $lfbits-xxx things.

There are some internal functions that we can use to look at a function’s constants and lfun bits from lisp. %NTH-IMMEDIATE will show a function’s contants, indexed from 0.

? (%nth-immediate #'2-to-the 0)
? (%nth-immediate #'2-to-the 1)
(FUNCTION-SYMBOL-MAP (#(X) . #(63 17 44)))
? (%nth-immediate #'2-to-the 2)

One last thing to know about functions: the last three arguments to the function are passed in the registers arg_x, arg_y, and arg_z. (On x8632, we only have arg_y and arg_z). So, a function with only one argument will receive that argument in arg_z. If there are more arguments than arg registers, the arguments are passed on the stack.

OK, so that’s how functions work. If anyone is still reading, let’s look at closures now.

? (let ((n 0))
    (defun next-val ()
      (incf n))
    (defun set-val (x)
      (setq n x)))
? #'next-val

So, that’s different. Before we saw output like:

#<Compiled-function 2-TO-THE #x302000A53B8F>

Let’s look at that function’s constants:

? (%nth-immediate #'next-val 0)
#<Compiled-function NEXT-VAL (Non-Global)  #x302000EECCFF>

What’s this? The function next-val is actually a little trampoline function that sets up and calls the actual closure function.

DISASSEMBLE tries to be friendly and shows you the disassembly of the closure function rather than the trampoline, but we can directly call an internal function to see the trampoline itself:

? (x86-xdisassemble #'next-val)
[0]     (leaq (@ (:^ L0) (% rip)) (% fn))
[7]     (jmpq (@ .SPCALL-CLOSURE))

Finally, we can start to see what init-nclosure does. The code fragment that is init-nclosure generates (at run-time) the little bit of code shown above. The closed-over values are found in the trampoline function’s constants.

? (%nth-immediate #'next-val 1)
#<VALUE-CELL 0 #x302000EEA46D>

And I mentioned that $lfbits-trampoline-bit would be on in the function’s lfun-bits:

? (logbitp $lfbits-trampoline-bit (%nth-immediate #'next-val 2))

When the trampoline is actually called, it jumps to the .SPcall-closure subprimitive in the lisp kernel (see ccl:lisp-kernel;x86-spentry64.s), which prepends the constants in the trampoline representing the closed-over values to the closure function’s arglist, and calls it.

A closure function thus sees any closed-over values as regular arguments.

? (disassemble #'next-val)
[0]     (leaq (@ (:^ L0) (% rip)) (% fn))
[7]     (cmpl ($ 8) (% nargs))
[10]    (jne L93)

%nargs is the register that contains the (fixnum) count of arguments passed to the function. If it’s not fixnum 1, then error. Note that next-val’s lambda list is nil—the argument it is expecting is the closed over value of n.

Anyway, sorry to pontificate at such length about a topic of limited interest, but there it is.

[updated to correct an error explaining tagging]

One thought on “How Clozure CL implements closures”

Leave a Reply

Your email address will not be published. Required fields are marked *