Not every foreign function is so marvelously easy to use as the ones we saw in the last section. Some of them require you to allocate a C struct, fill it in with your own information, and pass the function a pointer to the struct. Some of them require you to allocate an empty struct so they can fill it in, and then you can read the information out of it.
Also, some of them have their own structs and return a pointer to that same struct every time you call them, but those are easier to deal with, so they won't be covered in this section.
You might know that Lisp (and, indeed, most programming languages) has two separate regions of memory. There's the stack, which is where variable bindings are kept. Memory on the stack is allocated every time any function is called, and deallocated when it returns, so it's useful for anything that doesn't need to last longer than one function call and is only used by one thread. If that's all you need, you can allocate your foreign data on the stack instead.
Then, there's the heap, which holds everything else, and is our subject here. There are two advantages and one big disadvantage to putting things on the heap rather than the stack. First, data allocated on the heap can be passed outside of the scope in which it was created. This is useful for data which may need to be passed between multiple C calls or multiple threads. Also, some data may be too large to copy multiple times or may be too large to allocate on the stack.
The second advantage is security. If incoming data is copied directly into a buffer on the stack, oversized input can overrun that buffer and overwrite neighboring stack contents, such as a return address. This is not something which Lisp users usually worry about, since garbage collection normally handles memory management. However, "stack smashing" is one of the classic exploits in C which malicious hackers can use to gain control of a machine. Not checking external data is always a bad idea; however, allocating it on the heap at least offers more protection than direct stack allocation.
The big disadvantage to allocating data on the heap is that it must be explicitly deallocated—you need to "free" it when you're done with it. Ordinarily, in Lisp, you wouldn't allocate memory yourself, and the garbage collector would know about it, so you wouldn't have to think about it again. When you're doing it manually, it's very different. Memory management becomes a manual process, just like in C and C++.
What that means is that, if you allocate something and then lose track of the pointer to it, there's no way to ever free that memory. That's what's called a memory leak, and if your program leaks enough memory it will eventually use up all of it! So, you need to be careful to not lose your pointers.
That disadvantage, though, is also an advantage for using foreign functions. Since the garbage collector doesn't know about this memory, it will never move it around. External C code needs this because, unlike Lisp code, it has no way to follow an object to wherever it was moved. If you allocate data manually, you can pass it to foreign code and know that no matter what that code needs to do with it, it will be able to, until you deallocate it. Of course, you'd better be sure the foreign code is done with it before you do. Otherwise, your program will be unstable and might crash sometime in the future, and you'll have a hard time figuring out what caused the problem, because there won't be anything pointing back and saying "you deallocated this too soon."
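These lifetime rules are the same ones C programmers live with every day. As a minimal sketch in plain C (the helper names make_squares and sum_ints are made up for illustration and are not part of CCL or of ptrtest.c), the code below allocates a block on the heap, lets it outlive the function that created it, and leaves the single free to the caller:

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical helper: returns a heap-allocated array holding 0, 1, 4, 9, ...
   The array outlives this call; the caller owns it and must free() it. */
int *make_squares(size_t n)
{
    int *data = malloc(n * sizeof *data);
    if (data == NULL)
        return NULL;                /* allocation failed; nothing to free */
    for (size_t i = 0; i < n; i++)
        data[i] = (int)(i * i);
    return data;
}

/* Hypothetical consumer: reads the block but does not keep the pointer,
   so the caller is free to deallocate as soon as this returns. */
long sum_ints(const int *data, size_t n)
{
    long total = 0;
    assert(data != NULL);
    for (size_t i = 0; i < n; i++)
        total += data[i];
    return total;
}
```

A caller would write something like int *sq = make_squares(4); followed by sum_ints(sq, 4); and finally free(sq);. Forgetting the free, or overwriting sq before it, is the memory leak described above; touching sq after the free is the too-early deallocation.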
And, so, on to the code...
As in the last tutorial, our first step
is to create a local dynamic library in order to help show
what is actually going on between CCL and C. So, create the file
ptrtest.c, with the following code:
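#include <stdio.h>

/* Reverse an array of dataobjs ints in place. */
void reverse_int_array(int * data, unsigned int dataobjs)
{
    unsigned int i;
    int t;

    for(i=0; i<dataobjs/2; i++)
    {
        t = *(data+i);
        *(data+i) = *(data+dataobjs-1-i);
        *(data+dataobjs-1-i) = t;
    }
}

/* Reverse an array of ptrobjs pointers in place.
   Only the pointers move; the ints they point to are untouched. */
void reverse_int_ptr_array(int **ptrs, unsigned int ptrobjs)
{
    int *t;
    unsigned int i;

    for(i=0; i<ptrobjs/2; i++)
    {
        t = *(ptrs+i);
        *(ptrs+i) = *(ptrs+ptrobjs-1-i);
        *(ptrs+ptrobjs-1-i) = t;
    }
}

/* Assumes ptrs holds exactly two pointers, each to an array of four ints. */
void
reverse_int_ptr_ptrtest(int **ptrs)
{
    reverse_int_ptr_array(ptrs, 2);

    reverse_int_array(*(ptrs+0), 4);
    reverse_int_array(*(ptrs+1), 4);
}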
This defines three functions.
reverse_int_array takes a pointer to an array
of ints, and a count telling how many items
are in the array, and loops through it putting the elements in
reverse. reverse_int_ptr_array does the same
thing, but with an array of pointers to ints.
It only reverses the order the pointers are in; each pointer
still points to the same thing.
reverse_int_ptr_ptrtest takes an array of
pointers to arrays of ints. (With me?) It
doesn't need to be told their sizes; it just assumes that the
array of pointers has two items, and that both of those are
arrays which have four items. It reverses the array of
pointers, then it reverses each of the two arrays of
ints.
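Before involving Lisp at all, it can help to check that description from C. The sketch below repeats the three definitions so that it stands alone; the helper run_ptrtest and its out parameter are assumptions added purely for illustration and are not part of ptrtest.c.

```c
#include <assert.h>
#include <stddef.h>

void reverse_int_array(int *data, unsigned int dataobjs)
{
    unsigned int i;
    int t;
    for (i = 0; i < dataobjs / 2; i++) {
        t = data[i];
        data[i] = data[dataobjs - 1 - i];
        data[dataobjs - 1 - i] = t;
    }
}

void reverse_int_ptr_array(int **ptrs, unsigned int ptrobjs)
{
    unsigned int i;
    int *t;
    for (i = 0; i < ptrobjs / 2; i++) {
        t = ptrs[i];
        ptrs[i] = ptrs[ptrobjs - 1 - i];
        ptrs[ptrobjs - 1 - i] = t;
    }
}

void reverse_int_ptr_ptrtest(int **ptrs)
{
    reverse_int_ptr_array(ptrs, 2);
    reverse_int_array(ptrs[0], 4);
    reverse_int_array(ptrs[1], 4);
}

/* Hypothetical helper: builds {1,2,3,4} and {5,6,7,8}, runs
   reverse_int_ptr_ptrtest on them, then copies what ptrs[0] and ptrs[1]
   point to afterwards into out[0..3] and out[4..7]. */
void run_ptrtest(int out[8])
{
    int first[4] = {1, 2, 3, 4};
    int second[4] = {5, 6, 7, 8};
    int *ptrs[2] = {first, second};
    int i;

    assert(out != NULL);
    reverse_int_ptr_ptrtest(ptrs);
    for (i = 0; i < 4; i++) {
        out[i] = ptrs[0][i];
        out[4 + i] = ptrs[1][i];
    }
}
```

After the call, ptrs[0] points at what used to be the second array, now reading 8 7 6 5, and ptrs[1] points at the first, now reading 4 3 2 1: the pointers were swapped, and then each array was reversed in place.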
Now, compile ptrtest.c into a dynamic library using the command:
gcc -dynamiclib -Wall -o libptrtest.dylib ptrtest.c -install_name ./libptrtest.dylib
If that command doesn't make sense to you, feel free to go back to the previous tutorial, which builds a dynamic library the same way.
Now, start CCL and enter:
? ;; make-heap-ivector courtesy of Gary Byers
(defun make-heap-ivector (element-count element-type)
  (let* ((subtag (ccl::element-type-subtype element-type)))
    (unless (= (logand subtag target::fulltagmask)
               target::fulltag-immheader)
      (error "~s is not an ivector subtype." element-type))
    (let* ((size-in-bytes (ccl::subtag-bytes subtag element-count)))
      (ccl::%make-heap-ivector subtag size-in-bytes element-count))))
MAKE-HEAP-IVECTOR
? ;; dispose-heap-ivector created for symmetry
(defmacro dispose-heap-ivector (a mp)
  `(progn
     (ccl::%dispose-heap-ivector ,a)
     ;; Demolish the arguments for safety
     (setf ,a nil)
     (setf ,mp nil)))
DISPOSE-HEAP-IVECTOR
If you don't understand how those functions do what they do, that's okay; it gets into very fine detail which really doesn't matter, because you don't need to change them.
The function make-heap-ivector is the
primary tool for allocating objects in heap memory. It
allocates a fixed-size CCL object in heap memory. It
returns both an array reference, which can be used directly from
CCL, and a macptr, which can be used to
access the underlying memory directly. For example:
? ;; Create an array of 3 4-byte-long integers
(multiple-value-bind (la lap)
    (make-heap-ivector 3 '(unsigned-byte 32))
  (setq a la)
  (setq ap lap))
;Compiler warnings :
; Undeclared free variable A, in an anonymous lambda form.
; Undeclared free variable AP, in an anonymous lambda form.
#<A Mac Pointer #x10217C>
? a
#(1396 2578 97862649)
? ap
#<A Mac Pointer #x10217C>
It's important to realize that the contents of the
ivector we've just created haven't been
initialized, so their values are unpredictable, and you should
be sure not to read from them before you set them, to avoid
confusing results.
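The same caution applies to memory obtained from C's malloc, whose contents are also indeterminate until written. As a rough C analogy (the helper make_filled_u32 is a made-up name, not a CCL or libc function), one way to stay safe is to never hand out a block before every element has been given a value:

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical helper: allocates n unsigned 32-bit elements and stores
   `fill` in each one, so callers never see indeterminate contents. */
uint32_t *make_filled_u32(size_t n, uint32_t fill)
{
    uint32_t *v;

    assert(n <= SIZE_MAX / sizeof *v);   /* guard the size multiplication */
    v = malloc(n * sizeof *v);           /* contents are indeterminate here */
    if (v == NULL)
        return NULL;
    for (size_t i = 0; i < n; i++)
        v[i] = fill;                     /* now every element is defined */
    return v;
}
```

The Lisp session takes the equivalent step by hand, setting each element with setf before reading it back.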
At this point, a references an object
which works just like a normal array. You can refer to any item
of it with the standard aref function, and
set them by combining that with setf. As
noted above, the ivector's contents haven't
been initialized, so that's the next order of business:
? a
#(1396 2578 97862649)
? (aref a 2)
97862649
? (setf (aref a 0) 3)
3
? (setf (aref a 1) 4)
4
? (setf (aref a 2) 5)
5
? a
#(3 4 5)
In addition, the macptr allows direct
access to the same memory:
? (setq *byte-length-of-long* 4)
4
? (%get-signed-long ap (* 2 *byte-length-of-long*))
5
? (%get-signed-long ap (* 0 *byte-length-of-long*))
3
? (setf (%get-signed-long ap (* 0 *byte-length-of-long*)) 6)
6
? (setf (%get-signed-long ap (* 2 *byte-length-of-long*)) 7)
7
? ;; Show that a actually got changed through ap
a
#(6 4 7)
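What the session shows is one block of memory seen through two views: indexed by element through the array, and addressed by byte offset through the pointer. A small C sketch of the same idea follows; the function names are chosen here to echo %get-signed-long and are assumptions, not CCL's implementation.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Reads the signed 32-bit integer stored byte_offset bytes past base.
   memcpy is used so the read is valid at any offset. */
int32_t get_signed_long(const void *base, size_t byte_offset)
{
    int32_t value;
    assert(base != NULL);
    memcpy(&value, (const unsigned char *)base + byte_offset, sizeof value);
    return value;
}

/* Stores a signed 32-bit integer byte_offset bytes past base. */
void set_signed_long(void *base, size_t byte_offset, int32_t value)
{
    assert(base != NULL);
    memcpy((unsigned char *)base + byte_offset, &value, sizeof value);
}
```

With int32_t a[3] = {3, 4, 5}, reading at byte offset 2 * 4 yields the same value as a[2], and a store through set_signed_long is visible through a afterwards, just as changes made through ap show up in a.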
So far, there is nothing about this object that could not
be done much better with standard Lisp. However, the
macptr can be used to pass this chunk of
memory off to a C function. Let's use the C code to reverse the
elements in the array:
? ;; Insert the full path to your copy of libptrtest.dylib
(open-shared-library "/Users/andrewl/openmcl/openmcl/gtk/libptrtest.dylib")
#<SHLIB /Users/andrewl/openmcl/openmcl/gtk/libptrtest.dylib #x639D1E6>
? a
#(6 4 7)
? ap
#<A Mac Pointer #x10217C>
? ;; reverse_int_array returns void, so declare the result type as :void
(external-call "_reverse_int_array" :address ap :unsigned-int (length a) :void)
NIL
? a
#(7 4 6)
? ap
#<A Mac Pointer #x10217C>
The array gets passed correctly to the C function,
reverse_int_array. The C function reverses
the contents of the array in-place; that is, it doesn't make a
new array, just keeps the same one and reverses what's in it.
Finally, the C function passes control back to CCL. Since
the allocated array memory has been directly modified, CCL
reflects those changes directly in the array as well.
There is one final bit of housekeeping to deal with. Before moving on, the memory needs to be deallocated:
? ;; dispose-heap-ivector created for symmetry
;; Macro repeated here for pedagogy
(defmacro dispose-heap-ivector (a mp)
  `(progn
     (ccl::%dispose-heap-ivector ,a)
     ;; Demolish the arguments for safety
     (setf ,a nil)
     (setf ,mp nil)))
DISPOSE-HEAP-IVECTOR
? (dispose-heap-ivector a ap)
NIL
? a
NIL
? ap
NIL
The dispose-heap-ivector macro actually
deallocates the ivector, releasing its memory back to the heap for
something else to use. In addition, it makes sure that the
variables which it was called with are set to nil, because
otherwise they would still be referencing the memory of the
ivector, which is no longer allocated, so using them would be a bug.
Making sure there are no other variables still referring to it is up to
you.
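The "free it, then clear the variable" habit is common in C as well. As a minimal sketch (dispose_block is a hypothetical name chosen to mirror dispose-heap-ivector), a helper can take the address of the caller's pointer variable so that it can overwrite it after freeing:

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical helper: frees the block *slot points to, then overwrites
   the caller's variable with NULL so it no longer holds a stale address. */
void dispose_block(int **slot)
{
    assert(slot != NULL);
    free(*slot);        /* free(NULL) is a no-op, so a repeat call is harmless */
    *slot = NULL;
}
```

A caller would use it as dispose_block(&p);. As with the Lisp macro, only the variable passed in is cleared; any other copy of the pointer still holds the old address and must not be used.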
When do you call dispose-heap-ivector?
Anytime after you know the ivector will never be used again, but
no sooner. If you have a lot of ivectors, say, in a hash table,
you need to make sure that when whatever you were doing with the
hash table is done, those ivectors all get freed. Unless
there's still something somewhere else which refers to them, of
course! Exactly what strategy to take depends on the situation,
so just try to keep things simple unless you know better.
The simplest situation is when you have things set up so that a Lisp object "encapsulates" a pointer to foreign data, taking care of all the details of using it. In this case, you don't want those two things to have different lifetimes: You want to make sure your Lisp object exists as long as the foreign data does, and no longer; and you want to make sure the foreign data doesn't get deallocated while your Lisp object still refers to it.
If you're willing to accept a few limitations, you can make this easy. First, you can't let foreign code keep a permanent pointer to the memory; it has to always finish what it's doing, then return, and not refer to that memory again. Second, you can't let any Lisp code that isn't part of your encapsulating "wrapper" refer to the pointer directly. Third, nothing, either foreign code or Lisp code, should explicitly deallocate the memory.
If you can make sure all of these are true, you can at
least ensure that the foreign pointer is deallocated when the
encapsulating object is about to become garbage, by using
CCL's nonstandard "termination" mechanism, which is
essentially the same as what Java and other languages call
"finalization".
Termination is a way of asking the garbage collector to let you know when it's about to destroy an object which isn't used anymore. Before destroying the object, it calls a function which you write, called a terminator.
So, you can use termination to find out when a particular
macptr is about to become garbage. That's
not quite as helpful as it might seem: It's not exactly the same
thing as knowing that the block of memory it points to is
unreferenced. For example, there could be another
macptr somewhere to the same block; or, if
it's a struct, there could be a macptr to one
of its fields. Most problematically, if the address of that
memory has been passed to foreign code, it's sometimes hard to
know whether that code has kept the pointer. Most foreign
functions don't, but it's not hard to think of
exceptions.
You can use code such as this to make all this happen:
(defclass wrapper (whatever)   ; WHATEVER stands for your own superclass(es)
  ((element-type :initarg :element-type)
   (element-count :initarg :element-count)
   (ivector)
   (macptr)))
(defmethod initialize-instance ((wrapper wrapper) &rest initargs)
  (declare (ignore initargs))
  (call-next-method)
  ;; Ask the GC to call CCL:TERMINATE once WRAPPER becomes unreachable
  (ccl:terminate-when-unreachable wrapper)
  (with-slots (ivector macptr element-type element-count) wrapper
    (multiple-value-bind (new-ivector new-macptr)
        (make-heap-ivector element-count element-type)
      (setq ivector new-ivector
            macptr new-macptr))))
(defmethod ccl:terminate ((wrapper wrapper))
  (with-slots (ivector macptr) wrapper
    ;; Dispose only if the ivector has not already been released
    (when ivector
      (dispose-heap-ivector ivector macptr)
      (setq ivector nil
            macptr nil))))
The ccl:terminate method will be called
on some arbitrary thread sometime (hopefully soon) after the GC
has decided that there are no strong references to an object
which has been the argument of a
ccl:terminate-when-unreachable call.
If it makes sense to say that the foreign object should live as long as there's Lisp code that references it (through the encapsulating object) and no longer, this is one way of doing that.
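C has no garbage collector, so nothing there corresponds to ccl:terminate itself. The encapsulation half of the idea does carry over, though: one object owns the pointer, and its release routine checks whether the block is still held before freeing it, much as the terminate method checks the ivector slot. The following is only an illustrative sketch; struct int_block and both function names are made up for this example.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical wrapper: the only place that holds the pointer to the block. */
struct int_block {
    int *data;
    size_t count;
};

/* Allocates a zero-filled block of `count` ints inside the wrapper.
   Returns 1 on success and 0 if the allocation failed. */
int int_block_init(struct int_block *b, size_t count)
{
    assert(b != NULL);
    assert(count > 0);
    b->data = calloc(count, sizeof *b->data);
    b->count = (b->data != NULL) ? count : 0;
    return b->data != NULL;
}

/* Releases the block if it is still held; calling it again does nothing. */
void int_block_release(struct int_block *b)
{
    assert(b != NULL);
    if (b->data != NULL) {
        free(b->data);
        b->data = NULL;
        b->count = 0;
    }
}
```

The difference is who decides when release happens: in C the programmer must call int_block_release explicitly, whereas with termination the garbage collector makes that call once the wrapper is unreachable.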
Now we've covered passing basic types back and forth with C, and we've done the same with pointers. You may think this is all... but we've only done pointers to basic types. Join us next time for pointers... to pointers.