Not every foreign function is so marvelously easy to use as the ones we saw in the last section. Some functions require you to allocate a C struct, fill it with your own information, and pass in a pointer to that struct. Some of them require you to allocate an empty struct that they will fill in so that you can read the information out of it.
There are generally two ways to allocate foreign data. The first way is to allocate it on the stack; the RLET macro is one way to do this. This is analogous to using automatic variables in C. In the jargon of Common Lisp, data allocated this way is said to have dynamic extent.
The other way is to heap-allocate the foreign data. This is analogous to calling malloc in C. Again in the jargon of Common Lisp, heap-allocated data is said to have indefinite extent. If a function heap-allocates some data, that data remains valid even after the function itself exits. This is useful for data which may need to be passed between multiple C calls or multiple threads. Also, some data may be too large to copy multiple times or may be too large to allocate on the stack.
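The difference between the two extents can be sketched on the C side. This is only an illustration; the function names sum_on_stack and make_on_heap are made up for this example and are not part of ptrtest.c or CCL.

```c
#include <stdlib.h>

/* Dynamic extent: `local` is an automatic variable. Its storage is on the
   stack and is gone as soon as this function returns. */
int sum_on_stack(void)
{
    int local[3] = {1, 2, 3};
    return local[0] + local[1] + local[2];
}

/* Indefinite extent: the block comes from malloc, so it stays valid after
   this function returns. The caller owns it and must free() it.
   (Hypothetical helper; returns NULL if allocation fails.) */
int *make_on_heap(unsigned int count)
{
    int *block = malloc(count * sizeof *block);
    if (block == NULL)
        return NULL;
    for (unsigned int i = 0; i < count; i++)
        block[i] = (int)i + 1;
    return block;
}
```

A caller of make_on_heap can hand the returned pointer to other functions or threads, which is exactly what a stack array cannot safely offer once its function has returned.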
The big disadvantage to allocating data on the heap is that it must be explicitly deallocated—you need to "free" it when you're done with it. Normal Lisp objects, even those with indefinite extent, are deallocated by the garbage collector when it can prove that they're no longer referenced. Foreign data, though, is outside the GC's ken: it has no way to know whether a blob of foreign data is still referenced by foreign code or not. It is thus up to the programmer to manage it manually, just as one does in C with malloc and free.
What that means is that, if you allocate something and then lose track of the pointer to it, there's no way to ever free that memory. That's what's called a memory leak, and if your program leaks enough memory it will eventually use up all of it! So, you need to be careful to not lose your pointers.
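As a rough C illustration of "not losing your pointers", the sketch below keeps the only pointer to a heap block in one variable until the block has been freed. The helpers copy_ints and sum_of_copy are hypothetical names invented for this example.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: returns a heap copy of `src`, or NULL on failure.
   The caller owns the result and must free() it. */
int *copy_ints(const int *src, unsigned int count)
{
    int *copy = malloc(count * sizeof *copy);
    if (copy != NULL)
        memcpy(copy, src, count * sizeof *copy);
    return copy;
}

/* Uses a heap copy and releases it exactly once before returning. */
int sum_of_copy(const int *src, unsigned int count)
{
    int *copy = copy_ints(src, count);
    int total = 0;

    if (copy == NULL)
        return 0;
    for (unsigned int i = 0; i < count; i++)
        total += copy[i];

    /* Assigning a new block to `copy` before this point would overwrite the
       only pointer to the first block, and that block would be leaked. */
    free(copy);
    return total;
}
```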
That disadvantage, though, is also an advantage for using foreign functions. Since the garbage collector doesn't know about this memory, it will never move it around. External C code needs this, because it doesn't know how to follow data to where it moved, the way that Lisp code does. If you allocate data manually, you can pass it to foreign code and know that no matter what that code needs to do with it, it will be able to, until you deallocate it. Of course, you'd better be sure the foreign code is done with it before you do. Otherwise, your program will be unstable and might crash sometime in the future, and you'll have a hard time finding the cause, because there won't be anything pointing back and saying "you deallocated this too soon."
And, so, on to the code...
As in the last tutorial, our first step is to create a local dynamic library in order to help show what is actually going on between CCL and C. So, create the file ptrtest.c, with the following code:
    #include <stdio.h>

    void reverse_int_array(int * data, unsigned int dataobjs)
    {
        int i, t;

        for(i=0; i<dataobjs/2; i++)
        {
            t = *(data+i);
            *(data+i) = *(data+dataobjs-1-i);
            *(data+dataobjs-1-i) = t;
        }
    }

    void reverse_int_ptr_array(int **ptrs, unsigned int ptrobjs)
    {
        int *t;
        int i;

        for(i=0; i<ptrobjs/2; i++)
        {
            t = *(ptrs+i);
            *(ptrs+i) = *(ptrs+ptrobjs-1-i);
            *(ptrs+ptrobjs-1-i) = t;
        }
    }

    void reverse_int_ptr_ptrtest(int **ptrs)
    {
        reverse_int_ptr_array(ptrs, 2);

        reverse_int_array(*(ptrs+0), 4);
        reverse_int_array(*(ptrs+1), 4);
    }
This defines three functions. reverse_int_array takes a pointer to an array of ints, and a count telling how many items are in the array, and loops through it putting the elements in reverse. reverse_int_ptr_array does the same thing, but with an array of pointers to ints. It only reverses the order the pointers are in; each pointer still points to the same thing. reverse_int_ptr_ptrtest takes an array of pointers to arrays of ints. (With me?) It doesn't need to be told their sizes; it just assumes that the array of pointers has two items, and that both of those are arrays which have four items. It reverses the array of pointers, then it reverses each of the two arrays of ints.
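To make the effect of reverse_int_ptr_ptrtest concrete, here is a small C sketch that exercises it. The three functions are repeated (in array-index notation, otherwise the same logic) so the example stands alone; the driver ptrtest_demo is a hypothetical name added for illustration and is not part of ptrtest.c.

```c
/* Same logic as ptrtest.c, written with array indexing. */
void reverse_int_array(int *data, unsigned int dataobjs)
{
    for (unsigned int i = 0; i < dataobjs / 2; i++) {
        int t = data[i];
        data[i] = data[dataobjs - 1 - i];
        data[dataobjs - 1 - i] = t;
    }
}

void reverse_int_ptr_array(int **ptrs, unsigned int ptrobjs)
{
    for (unsigned int i = 0; i < ptrobjs / 2; i++) {
        int *t = ptrs[i];
        ptrs[i] = ptrs[ptrobjs - 1 - i];
        ptrs[ptrobjs - 1 - i] = t;
    }
}

void reverse_int_ptr_ptrtest(int **ptrs)
{
    reverse_int_ptr_array(ptrs, 2);
    reverse_int_array(ptrs[0], 4);
    reverse_int_array(ptrs[1], 4);
}

/* Hypothetical driver: returns 1 if the result matches the description. */
int ptrtest_demo(void)
{
    int first[4]  = {1, 2, 3, 4};
    int second[4] = {5, 6, 7, 8};
    int *ptrs[2]  = {first, second};

    reverse_int_ptr_ptrtest(ptrs);

    /* The two pointers have traded places, and each array has been
       reversed in place: first is {4,3,2,1}, second is {8,7,6,5}. */
    return ptrs[0] == second && ptrs[1] == first
        && first[0] == 4 && first[3] == 1
        && second[0] == 8 && second[3] == 5;
}
```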
Now, compile ptrtest.c into a dynamic library using the command:
gcc -dynamiclib -Wall -o libptrtest.dylib ptrtest.c -install_name ./libptrtest.dylib
The function make-heap-ivector is the primary tool for allocating objects in heap memory. It allocates a fixed-size CCL object in heap memory. It returns both an array reference, which can be used directly from CCL, and a macptr, which can be used to access the underlying memory directly. For example:
    ? ;; Create an array of 3 4-byte-long integers
    (multiple-value-bind (la lap)
        (make-heap-ivector 3 '(unsigned-byte 32))
      (setq a la)
      (setq ap lap))
    ;Compiler warnings :
    ;   Undeclared free variable A, in an anonymous lambda form.
    ;   Undeclared free variable AP, in an anonymous lambda form.
    #<A Mac Pointer #x10217C>
    ? a
    #(1396 2578 97862649)
    ? ap
    #<A Mac Pointer #x10217C>
It's important to realize that the contents of the ivector we've just created haven't been initialized, so their values are unpredictable, and you should be sure not to read from them before you set them, to avoid confusing results.
At this point, a references an object which works just like a normal array. You can refer to any item of it with the standard aref function, and set them by combining that with setf. As noted above, the ivector's contents haven't been initialized, so that's the next order of business:
    ? a
    #(1396 2578 97862649)
    ? (aref a 2)
    97862649
    ? (setf (aref a 0) 3)
    3
    ? (setf (aref a 1) 4)
    4
    ? (setf (aref a 2) 5)
    5
    ? a
    #(3 4 5)
In addition, the macptr allows direct access to the same memory:
    ? (setq *byte-length-of-long* 4)
    4
    ? (%get-signed-long ap (* 2 *byte-length-of-long*))
    5
    ? (%get-signed-long ap (* 0 *byte-length-of-long*))
    3
    ? (setf (%get-signed-long ap (* 0 *byte-length-of-long*)) 6)
    6
    ? (setf (%get-signed-long ap (* 2 *byte-length-of-long*)) 7)
    7
    ? ;; Show that a actually got changed through ap
    a
    #(6 4 7)
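Note that the offsets passed through the macptr are counted in bytes, not in elements, which is why each index is multiplied by 4. A rough C analogue of that addressing is sketched below. The function names are invented for this example, and it assumes the accessor reads and writes a 32-bit signed value, as the 4-byte element size in the transcript suggests.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical analogue of reading through the macptr: fetch a 32-bit
   signed integer located `byte_offset` bytes past `base`. */
int32_t get_signed_long(const void *base, size_t byte_offset)
{
    int32_t value;
    memcpy(&value, (const unsigned char *)base + byte_offset, sizeof value);
    return value;
}

/* Hypothetical analogue of the setf form: store a 32-bit signed integer
   `byte_offset` bytes past `base`. */
void set_signed_long(void *base, size_t byte_offset, int32_t value)
{
    memcpy((unsigned char *)base + byte_offset, &value, sizeof value);
}
```

Element 2 of an array of 4-byte integers therefore lives at byte offset 8, so get_signed_long(array, 2 * 4) reads the same slot that array[2] does.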
So far, there is nothing about this object that could not be done much better with standard Lisp. However, the macptr can be used to pass this chunk of memory off to a C function. Let's use the C code to reverse the elements in the array:
    ? ;; Insert the full path to your copy of libptrtest.dylib
    (open-shared-library "/Users/andrewl/openmcl/openmcl/gtk/libptrtest.dylib")
    #<SHLIB /Users/andrewl/openmcl/openmcl/gtk/libptrtest.dylib #x639D1E6>
    ? a
    #(6 4 7)
    ? ap
    #<A Mac Pointer #x10217C>
    ? (external-call "_reverse_int_array" :address ap :unsigned-int (length a) :address)
    #<A Mac Pointer #x10217C>
    ? a
    #(7 4 6)
    ? ap
    #<A Mac Pointer #x10217C>
The array gets passed correctly to the C function, reverse_int_array. The C function reverses the contents of the array in-place; that is, it doesn't make a new array, just keeps the same one and reverses what's in it. Finally, the C function passes control back to CCL. Since the allocated array memory has been directly modified, CCL reflects those changes directly in the array as well.
There is one final bit of housekeeping to deal with. Before moving on, the memory needs to be deallocated:
    ? (dispose-heap-ivector a ap)
    NIL
The dispose-heap-ivector macro actually deallocates the ivector, releasing its memory into the heap for something else to use. Both a and ap now have undefined values.
When do you call dispose-heap-ivector? Anytime after you know the ivector will never be used again, but no sooner. If you have a lot of ivectors, say, in a hash table, you need to make sure that when whatever you were doing with the hash table is done, those ivectors all get freed. Unless there's still something somewhere else which refers to them, of course! Exactly what strategy to take depends on the situation, so just try to keep things simple unless you know better.
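The "free everything in the collection when you are done with it" strategy can be sketched in C as follows. This is only an analogy: a plain array of pointers stands in for the hash table of ivectors, and the names block_table, block_table_init and block_table_free are hypothetical.

```c
#include <stdlib.h>

/* Hypothetical collection of heap blocks, standing in for a hash table
   whose values are heap-allocated ivectors. */
struct block_table {
    int **blocks;
    unsigned int count;   /* number of blocks successfully allocated */
};

/* Frees every block, then the table itself. Safe to call more than once,
   because the fields are reset after the first call. */
void block_table_free(struct block_table *t)
{
    for (unsigned int i = 0; i < t->count; i++)
        free(t->blocks[i]);
    free(t->blocks);
    t->blocks = NULL;
    t->count = 0;
}

/* Allocates `count` zero-filled blocks of `ints_per_block` ints each.
   Returns 0 on success, -1 on failure (nothing is left allocated). */
int block_table_init(struct block_table *t, unsigned int count,
                     unsigned int ints_per_block)
{
    t->count = 0;
    t->blocks = calloc(count, sizeof *t->blocks);
    if (t->blocks == NULL)
        return -1;
    for (unsigned int i = 0; i < count; i++) {
        t->blocks[i] = calloc(ints_per_block, sizeof **t->blocks);
        if (t->blocks[i] == NULL) {
            block_table_free(t);   /* release what was allocated so far */
            return -1;
        }
        t->count = i + 1;
    }
    return 0;
}
```

This sketch only works because the table is the sole owner of its blocks. If something else still refers to one of them, as the paragraph above warns, freeing the whole table at once is no longer safe.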
The simplest situation is when you have things set up so that a Lisp object "encapsulates" a pointer to foreign data, taking care of all the details of using it. In this case, you don't want those two things to have different lifetimes: You want to make sure your Lisp object exists as long as the foreign data does, and no longer; and you want to make sure the foreign data doesn't get deallocated while your Lisp object still refers to it.
If you're willing to accept a few limitations, you can make this easy. First, you can't let foreign code keep a permanent pointer to the memory; it has to always finish what it's doing, then return, and not refer to that memory again. Second, you can't let any Lisp code that isn't part of your encapsulating "wrapper" refer to the pointer directly. Third, nothing, either foreign code or Lisp code, should explicitly deallocate the memory.
If you can make sure all of these are true, you can at least ensure that the foreign pointer is deallocated when the encapsulating object is about to become garbage, by using CCL's nonstandard "termination" mechanism, which is essentially the same as what Java and other languages call "finalization".
Termination is a way of asking the garbage collector to let you know when it's about to destroy an object which isn't used anymore. Before destroying the object, it calls a function which you write, called a terminator.
So, you can use termination to find out when a particular macptr is about to become garbage. That's not quite as helpful as it might seem: It's not exactly the same thing as knowing that the block of memory it points to is unreferenced. For example, there could be another macptr somewhere to the same block; or, if it's a struct, there could be a macptr to one of its fields. Most problematically, if the address of that memory has been passed to foreign code, it's sometimes hard to know whether that code has kept the pointer. Most foreign functions don't, but it's not hard to think of exceptions.
You can use code such as this to make all this happen:
    (defclass wrapper (whatever)
      ((element-type :initarg :element-type)
       (element-count :initarg :element-count)
       (ivector)
       (macptr)))

    (defmethod initialize-instance ((wrapper wrapper) &rest initargs)
      (declare (ignore initargs))
      (call-next-method)
      (ccl:terminate-when-unreachable wrapper)
      (with-slots (ivector macptr element-type element-count) wrapper
        (multiple-value-bind (new-ivector new-macptr)
            (make-heap-ivector element-count element-type)
          (setq ivector new-ivector
                macptr new-macptr))))

    (defmethod ccl:terminate ((wrapper wrapper))
      (with-slots (ivector macptr) wrapper
        (when ivector
          (dispose-heap-ivector ivector macptr)
          (setq ivector nil
                macptr nil))))
The ccl:terminate method will be called on some arbitrary thread sometime (hopefully soon) after the GC has decided that there are no strong references to an object which has been the argument of a ccl:terminate-when-unreachable call.
If it makes sense to say that the foreign object should live as long as there's Lisp code that references it (through the encapsulating object) and no longer, this is one way of doing that.
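C has no garbage collector to call a terminator for you, but the ownership discipline that the wrapper depends on can be sketched there too. In the hypothetical example below (none of these names come from CCL), the raw pointer never leaves the wrapper's accessor functions, and the release function checks for an already-released block in the same way the terminate method checks its ivector slot.

```c
#include <stdlib.h>

/* Hypothetical C counterpart of the wrapper class: it owns one heap block. */
struct wrapper {
    int *data;            /* owned block; NULL once released */
    unsigned int count;   /* number of ints in the block */
};

/* Allocates a zero-filled block. Returns 0 on success, -1 on failure. */
int wrapper_init(struct wrapper *w, unsigned int count)
{
    w->data = calloc(count, sizeof *w->data);
    w->count = (w->data != NULL) ? count : 0;
    return (w->data != NULL) ? 0 : -1;
}

/* Bounds-checked accessors, so callers never hold the raw pointer. */
int wrapper_set(struct wrapper *w, unsigned int index, int value)
{
    if (w->data == NULL || index >= w->count)
        return -1;
    w->data[index] = value;
    return 0;
}

int wrapper_get(const struct wrapper *w, unsigned int index, int *out)
{
    if (w->data == NULL || index >= w->count)
        return -1;
    *out = w->data[index];
    return 0;
}

/* Plays the role of the terminate method: frees the block at most once
   and clears the fields so later calls are harmless. */
void wrapper_release(struct wrapper *w)
{
    if (w->data != NULL) {
        free(w->data);
        w->data = NULL;
        w->count = 0;
    }
}
```

The difference from the Lisp version is who decides when release happens: here the programmer must call wrapper_release, whereas with termination the garbage collector triggers it once the wrapper is unreachable.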
Now we've covered passing basic types back and forth with C, and we've done the same with pointers. You may think this is all... but we've only done pointers to basic types. Join us next time for pointers... to pointers.