OpenMCL provides facilities which enable multiple threads of execution (threads, sometimes called lightweight processes or just processes, though the latter term shouldn't be confused with the OS's notion of a process) within a lisp session. This document describes those facilities and issues related to multithreaded programming in OpenMCL.
Wherever possible, I'll try to use the term “thread” to denote a lisp thread, even though many of the functions in the API have the word “process” in their name.
Lisp threads share the same address space, but maintain their own execution context (stacks and registers) and their own dynamic binding context.
Traditionally, OpenMCL's threads have been cooperatively scheduled: through a combination of compiler and runtime support, the currently executing lisp thread arranged to be interrupted at certain discrete points in its execution (typically on entry to a function and at the beginning of any looping construct). This interrupt occurred several dozen times per second; in response, a handler function might observe that the current thread had used up its time slice and another function (the lisp scheduler) would be called to find some other thread that was in a runnable state, suspend execution of the current thread, and resume execution of the newly selected thread. The process of switching contexts between the outgoing and incoming threads happened in some mixture of Lisp and assembly language code; as far as the OS was concerned, there was one native thread running in the Lisp image and its stack pointer and other registers just happened to change from time to time.
Under OpenMCL's cooperative scheduling model, it was possible (via the use of the CCL:WITHOUT-INTERRUPTS construct) to defer handling of the periodic interrupt that invoked the lisp scheduler; it was not uncommon to use WITHOUT-INTERRUPTS to gain safe, exclusive access to global data structures. In some code (including much of OpenMCL itself) this idiom was very common: it was (justifiably) believed to be an efficient way of inhibiting the execution of other threads for a short period of time.
The timer interrupt that drove the cooperative scheduler was only able to (pseudo-)preempt lisp code: if any thread called a blocking OS I/O function, no other thread could be scheduled until that thread resumed execution of lisp code. Lisp library functions were generally attuned to this constraint, and did a complicated mixture of polling and “timed blocking” in an attempt to work around it. Needless to say, this code is complicated and less efficient than it might be; it meant that the lisp was a little busier than it should have been when it was “doing nothing” (waiting for I/O to be possible.)
For a variety of reasons - better utilization of CPU resources on single and multiprocessor systems and better integration with the OS in general - threads in OpenMCL 0.14 and later are preemptively scheduled. In this model, lisp threads are native threads and all scheduling decisions involving them are made by the OS kernel. (Those decisions might involve scheduling multiple lisp threads simultaneously on multiple processors on SMP systems.) This change has a number of subtle effects:
it is possible for two (or more) lisp threads to be executing simultaneously, possibly trying to access and/or modify the same data structures. Such access really should have been coordinated through the use of synchronization objects regardless of the scheduling model in effect; preemptively scheduled threads increase the chance of things going wrong at the wrong time and do not offer lightweight alternatives to the use of those synchronization objects.
even on a single-processor system, a context switch can happen on any instruction boundary. Since (in general) other threads might allocate memory, this means that a GC can effectively take place at any instruction boundary. That's mostly an issue for the compiler and runtime system to be aware of, but it means that certain practices (such as trying to pass the address of a lisp object to foreign code) that were always discouraged are now discouraged ... vehemently.
there is no simple and efficient way to “inhibit the scheduler” or otherwise gain exclusive access to the entire CPU. (There are a variety of simple and efficient ways to synchronize access to particular data structures.)
As a broad generalization: code that's been aggressively tuned to the constraints of the cooperative scheduler may need to be redesigned to work well with the preemptive scheduler (and code written to run under OpenMCL's interface to the native scheduler may be less portable to other CL implementations, many of which offer a cooperative scheduler and an API similar to that of OpenMCL versions before 0.14.) At the same time, there's a large overlap in functionality in the two scheduling models, and it'll hopefully be possible to write interesting and useful MP code that's largely independent of the underlying scheduling details.
The keyword :OPENMCL-NATIVE-THREADS is on *FEATURES* in 0.14 and later and can be used for conditionalization where required.
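As a minimal sketch of that kind of conditionalization, the reader conditionals below pick a value at read time depending on whether :OPENMCL-NATIVE-THREADS is on *FEATURES*. The function name SCHEDULER-KIND is hypothetical and only serves the illustration.

```lisp
;; SCHEDULER-KIND is a hypothetical helper, not part of OpenMCL.
(defun scheduler-kind ()
  "Return a keyword naming the scheduling model this image provides."
  ;; Exactly one of the two forms below survives reading.
  #+openmcl-native-threads :preemptive
  #-openmcl-native-threads :cooperative)
```

The same #+ / #- pair can wrap whole definitions when the two scheduling models need different code.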
all-processes
Returns a list of all lisp threads known to OpenMCL as of the precise instant it's called. It's safe to traverse this list and to modify the cons cells that comprise that list (it's freshly consed.) Since other threads can create and kill threads at any time, there's generally no way to get an “accurate” list of all threads, and (generally) no sense in which such a list can be accurate.
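A small sketch of how such a snapshot might be used follows. DESCRIBE-THREADS is a hypothetical helper; symbols are written with CCL:: so the sketch doesn't depend on which names are exported, and PROCESS-WHOSTATE is the function described later in this document.

```lisp
;; DESCRIBE-THREADS is a hypothetical helper, not part of OpenMCL.
(defun describe-threads (&optional (stream *standard-output*))
  "Print one line per thread in a snapshot; return the snapshot's length."
  (let ((snapshot (ccl::all-processes)))   ; freshly consed, safe to traverse
    (dolist (p snapshot)
      (format stream "~&~s: ~a~%" p (ccl::process-whostate p)))
    (length snapshot)))
```

The list may already be out of date by the time it's printed, which is harmless for a debugging aid like this one.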
make-process name &key persistent (priority 0) (stack-size ccl:*default-control-stack-size*) (vstack-size ccl:*default-value-stack-size*) (tstack-size ccl:*default-temp-stack-size*) initial-bindings (use-standard-initial-bindings t)
Creates and returns a new process with the specified attributes. The newly created process will be incapable of execution; it will need to be preset (given an initial function to run) and enabled (allowed to execute) before it's able to actually do anything.
name: a string, used to identify the process
persistent: if true, requests that information about the process be retained by SAVE-APPLICATION so that an equivalent process can be restarted when a saved image is run.
priority: this argument is currently ignored.[1]
stack-size: the size, in bytes, of the newly-created process's control stack (used for foreign function calls and to save function return address context.)
vstack-size: the size, in bytes, of the newly-created process's value stack (used for lisp function arguments, local variables, etc.)
tstack-size: the size, in bytes, of the newly-created process's temp stack (used for the allocation of dynamic-extent objects.)
use-standard-initial-bindings: when true (the default), the global “standard initial bindings” are put into effect in the new thread before any bindings specified by :initial-bindings are. See DEF-STANDARD-INITIAL-BINDING.
initial-bindings: an alist of (SYMBOL . VALUEFORM) pairs, which can be used to initialize special variable bindings in the new thread. Each valueform is used to effect the binding of the corresponding symbol according to the following rules:
if valueform is a function, it's called (with no arguments) in the execution environment of the newly created thread; the value returned from this call is used to initialize the corresponding variable.
if valueform is a constant (or a list whose CAR is QUOTE), the constant value (or the CADR of the QUOTE form) is used to initialize the corresponding variable.
if valueform is a symbol, that symbol's SYMBOL-VALUE - in the context of the calling thread as of the time that MAKE-PROCESS is called - is used to initialize the corresponding variable.
if valueform is a list, its CAR is applied to its CDR in the execution environment of the newly created thread; the value returned from this call is used to initialize the corresponding variable.
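The create / preset / enable sequence and the :initial-bindings argument can be sketched as follows. This is an illustration under assumptions, not a definitive recipe: *GREETING* and RUN-WITH-BINDING are hypothetical names, the valueform used is a constant string (the simplest of the cases listed above), and a semaphore (described later in this document) is used to wait for the new thread.

```lisp
;; *GREETING* and RUN-WITH-BINDING are hypothetical names for this sketch.
(defvar *greeting* "hello from the global value")

(defun run-with-binding ()
  "Start a thread with its own binding of *GREETING*; return what it saw."
  (let* ((done (ccl::make-semaphore))
         (seen nil)
         (p (ccl::make-process
             "binding-demo"
             ;; Constant valueform: used as-is to bind *GREETING* in the thread.
             :initial-bindings '((*greeting* . "hello from the new thread")))))
    ;; A fresh process can't run yet: preset it, then enable it.
    (ccl::process-preset p (lambda ()
                             (setq seen *greeting*)
                             (ccl::signal-semaphore done)))
    (ccl::process-enable p)
    (ccl::wait-on-semaphore done)   ; block until the thread has stored SEEN
    seen))
```

The global value of *GREETING* is untouched; only the new thread's binding differs.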
process-disable process
Disables the specified process (i.e., prevents it from running). This is a fairly expensive operation (it involves a few calls to the OS) and can be somewhat dangerous (for instance, if the process being disabled owns a lock or other resource.) Each call to PROCESS-DISABLE must be paired with a matching PROCESS-ENABLE call before the process is able to run. Returns T if the process had been enabled and is now disabled, NIL otherwise (that is, returns T if the process's PROCESS-DISABLED-COUNT transitioned from 0 to 1.)
A thread can disable itself; if it's successful in doing so, then it can obviously only be reenabled by some other thread.
process: the process to disable
process-enable process
Undoes the effect of a previous call to PROCESS-DISABLE; if all such calls are undone, makes the process runnable. Has no effect if the process is not disabled. Returns T if the process had been disabled and is now enabled (if the process's PROCESS-DISABLED-COUNT transitioned from 1 to 0.)
process: the process to enable
process-disabled-count process
Returns the number of “outstanding” PROCESS-DISABLE calls on the specified process (those that don't have a matching PROCESS-ENABLE), or NIL if the process has expired (if its initial function has returned.) Newly created processes have a (PROCESS-DISABLED-COUNT) of 1.
process: the process
process-preset process function &rest args
Typically used to initialize a newly-created process, setting things up so that it'll begin execution by applying the specified function to the specified args when it's enabled.
process: the process to preset
function: the initial function (or a symbol which names a function)
args: a list of values, appropriate for the specified function.
process-run-function name function &rest args
Creates a process (via MAKE-PROCESS), presets it (via PROCESS-PRESET), enables it (via PROCESS-ENABLE), and returns that process. This is the simplest way to create and run a lisp thread.
name: either a string used to name the process or a list of keyword/value pairs used to supply additional arguments to MAKE-PROCESS. In the latter case, the additional keyword :NAME can be used to specify the name of the new process.
function: the initial function (or a symbol which names a function)
args: a list of values, appropriate for the specified function.
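As a hedged illustration of PROCESS-RUN-FUNCTION, the sketch below starts a thread, passes it one argument, and waits for its result. RUN-AND-COLLECT is a hypothetical helper; the semaphore functions it uses are the ones described later in this document.

```lisp
;; RUN-AND-COLLECT is a hypothetical helper, not part of OpenMCL.
(defun run-and-collect (n)
  "Square N in a new thread and return the result to the caller."
  (let ((done (ccl::make-semaphore))
        (result nil))
    (ccl::process-run-function
     "squarer"                        ; thread name
     (lambda (x)                      ; initial function, applied to the args
       (setq result (* x x))
       (ccl::signal-semaphore done))
     n)
    (ccl::wait-on-semaphore done)     ; wait for the worker to finish
    result))
```

Waiting on the semaphore, rather than polling RESULT, lets the OS decide when the calling thread should run again.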
process-interrupt process function &rest args
Arranges for the target process to apply function to args at some point in the near future (interrupting whatever the process was doing.) If function returns normally, the process resumes execution at the point at which it was interrupted.
process: the target process. It's perfectly legal for a process to interrupt itself. A process must be in an enabled state in order to respond to a PROCESS-INTERRUPT request.[2]
function: the function that the target process should run in response to an interrupt
args: a list of values, appropriate for the specified function.
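The following sketch shows one way PROCESS-INTERRUPT might be exercised. INTERRUPT-DEMO and its three semaphores are hypothetical; the handshake on READY is there so that the target is known to be running before it's interrupted, which is a precaution of this sketch rather than a documented requirement.

```lisp
;; INTERRUPT-DEMO is a hypothetical helper, not part of OpenMCL.
(defun interrupt-demo ()
  "Interrupt a blocked thread; return T once its interrupt function has run."
  (let* ((ready (ccl::make-semaphore))    ; target -> caller: now running
         (handled (ccl::make-semaphore))  ; interrupt function -> caller
         (finish (ccl::make-semaphore))   ; caller -> target: free to exit
         (target (ccl::process-run-function
                  "interrupt-target"
                  (lambda ()
                    (ccl::signal-semaphore ready)
                    (ccl::wait-on-semaphore finish)))))
    (ccl::wait-on-semaphore ready)
    ;; Ask TARGET to apply SIGNAL-SEMAPHORE to HANDLED, then resume its wait.
    (ccl::process-interrupt target #'ccl::signal-semaphore handled)
    (ccl::wait-on-semaphore handled)
    (ccl::signal-semaphore finish)
    t))
```

How quickly the interrupt is serviced depends on the mechanism in use; see footnote [2].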
*current-process*
Bound to (of all things) the current thread in each thread. Shouldn't be set by user code.
process-reset process &optional unwind-option kill-option without-aborts-option
Generally used to cause a running process to cleanly exit from any ongoing computation and enter a state where it can be terminated (via PROCESS-KILL) or preset. There clearly needs to be something like this, but the current implementation still contains code that's oriented to the concerns of a cooperative scheduler. If the target process is not the current process, uses PROCESS-INTERRUPT to force the target to reset itself (and is therefore subject to any constraints imposed by PROCESS-INTERRUPT.)
process-kill process &optional (without-aborts :ask)
Causes the target process to reset itself and then exit from its initial function. Uses PROCESS-RESET (and therefore PROCESS-INTERRUPT) internally.
without-aborts: totally ignored.
*ticks-per-second*
Initialized to the OS scheduler's clock resolution every time a lisp image starts up; shouldn't be modified by user code. The scheduler's clock resolution is ordinarily of marginal interest at best, but (for backward compatibility) some functions accept “timeout” values expressed in “ticks”. Currently, both LinuxPPC and DarwinPPC cause this variable to be initialized to 100.
process-whostate process
Returns a string which describes the “state” of the specified process, primarily for the benefit of debugging tools. [4]
process: a process
process-allow-schedule
Advises the OS scheduler that the current thread has nothing useful to do and that it should try to find some other thread to schedule in its place. There's almost always a better alternative (involving waiting for some specific event to occur.)
process-wait whostate function &rest args
Causes the current process to repeatedly apply function to args until the call returns a true result, then returns NIL. After each failed call, yields the CPU as if by PROCESS-ALLOW-SCHEDULE. Again, it's almost always more efficient to wait for some specific event to occur; this isn't exactly busy-waiting, but the OS scheduler can do a better job of scheduling if it's involved in the process.
whostate: a string, which will be the value of PROCESS-WHOSTATE while the process is waiting
function: a function or function name, treated as a predicate
args: arguments to provide to the predicate
process-wait-with-timeout whostate ticks function &rest args
If ticks is NIL, behaves exactly like PROCESS-WAIT (and then returns T.) Otherwise, ticks should be a small positive integer expressing a time interval in “ticks” (see *TICKS-PER-SECOND*). In this case, the predicate will be tested repeatedly (in the same kind of test/yield loop as in PROCESS-WAIT) until the predicate returns true (in which case PROCESS-WAIT-WITH-TIMEOUT returns T) or the time interval is exceeded (in which case NIL is returned.) The astute reader has no doubt anticipated the observation that better alternatives should be used whenever possible.
whostate: a string, which will be the value of PROCESS-WHOSTATE while the process is waiting
ticks: a small positive integer or NIL
function: a function or function name, treated as a predicate
args: arguments to provide to the predicate
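Since the timeout is expressed in ticks, callers that think in seconds need a conversion. The sketch below shows one way to do it; WAIT-FOR is a hypothetical helper, and the rounding and the lower bound of one tick are choices made for this illustration.

```lisp
;; WAIT-FOR is a hypothetical helper, not part of OpenMCL.
(defun wait-for (predicate seconds)
  "Poll PREDICATE for up to SECONDS seconds; T if it became true, else NIL."
  (ccl::process-wait-with-timeout
   "waiting for predicate"                                ; shown by PROCESS-WHOSTATE
   (max 1 (round (* seconds ccl::*ticks-per-second*)))    ; seconds -> ticks
   predicate))
```

As the text above says, this is still a test/yield loop; a semaphore or lock is usually the better tool when one fits.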
without-interrupts &body body
Executes the body (and returns whatever value(s) it returns) in an environment in which PROCESS-INTERRUPT requests are deferred. As noted above, this has nothing to do with the scheduling of other threads; it may be necessary to inhibit PROCESS-INTERRUPT handling when (for instance) modifying some data structure (for which the current thread holds an appropriate lock) in some manner that's not reentrant.
body: a sequence of Lisp forms
make-lock &optional name
Creates and returns an object of type CCL::LOCK, which can be used to synchronize access to some shared resource. The lock is initially in a “free” state; locks can also be “owned” by a thread.
name: any value; typically a string or symbol which may appear in some PROCESS-WHOSTATEs of threads that're waiting for the lock.
with-lock-grabbed (lock) &body body
Waits until the lock is either free or owned by the calling thread, then executes the body as an implicit PROGN and with the lock owned by the calling thread. If the lock was originally free, it's restored to a free state. Returns whatever value(s) the body returns.
lock: a lock, as returned by MAKE-LOCK
body: a sequence of Lisp forms.
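A minimal sketch of MAKE-LOCK and WITH-LOCK-GRABBED protecting a shared counter follows. LOCKED-COUNT is a hypothetical helper; threads are started with PROCESS-RUN-FUNCTION and joined with a semaphore, both described elsewhere in this document.

```lisp
;; LOCKED-COUNT is a hypothetical helper, not part of OpenMCL.
(defun locked-count (nthreads iterations)
  "Increment a shared counter from NTHREADS threads; return the final count."
  (let ((lock (ccl::make-lock "counter"))
        (done (ccl::make-semaphore))
        (count 0))
    (loop repeat nthreads
          do (ccl::process-run-function
              "incrementer"
              (lambda ()
                (loop repeat iterations
                      ;; Only one thread at a time runs the INCF.
                      do (ccl::with-lock-grabbed (lock)
                           (incf count)))
                (ccl::signal-semaphore done))))
    ;; Wait once per worker before reading the result.
    (loop repeat nthreads
          do (ccl::wait-on-semaphore done))
    count))
```

Without the lock, two threads could read the same old value of COUNT and one increment would be lost.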
make-recursive-lock
Creates and returns an object of type CCL::RECURSIVE-LOCK. The newly returned lock is in the “free” state.
with-recursive-lock (lock) &body body
Waits until the lock is either free or owned by the calling thread. Executes the body and returns whatever value(s) it returns, after restoring the lock to its original state. In the contended case, the waiting is much more efficient than it may be when WITH-LOCK-GRABBED is used; the locking and unlocking operations are often less efficient.
lock: an object of type CCL::RECURSIVE-LOCK, as created by MAKE-RECURSIVE-LOCK.
body: a sequence of Lisp forms
make-read-write-lock
Creates and returns an object of type CCL::READ-WRITE-LOCK. The returned object has no “writer” and no “readers”. READ-WRITE-LOCKs allow multiple threads to be “readers” (with presumed read access to the objects protected by the lock) or a single thread to be a “writer” (with exclusive read-write access to the protected object.)
with-read-lock (lock) &body body
Waits until the specified READ-WRITE-LOCK has no writer, then ensures that the current thread is a reader. Executes the body and returns whatever value(s) it returns, restoring the current thread's “reader” status to what it was on entry.
lock: a READ-WRITE-LOCK, as returned by MAKE-READ-WRITE-LOCK.
body: a sequence of lisp forms
with-write-lock (lock) &body body
Waits until the specified READ-WRITE-LOCK has no readers and no other writer, then ensures that the current thread is the writer. Executes the body and returns whatever value(s) it returns, restoring the current thread's “writer” status to what it was on entry.
lock: a READ-WRITE-LOCK, as returned by MAKE-READ-WRITE-LOCK.
body: a sequence of lisp forms
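The reader/writer split can be sketched with a shared table that's read often and written rarely. The names *TABLE*, *TABLE-LOCK*, LOOKUP-ENTRY and STORE-ENTRY are hypothetical and exist only for this illustration.

```lisp
;; All four names below are hypothetical, chosen for this sketch.
(defvar *table* (make-hash-table :test #'equal))
(defvar *table-lock* (ccl::make-read-write-lock))

(defun lookup-entry (key)
  "Read KEY's value; any number of readers may do this at once."
  (ccl::with-read-lock (*table-lock*)
    (values (gethash key *table*))))

(defun store-entry (key value)
  "Set KEY's value; the writer has the table to itself while it does so."
  (ccl::with-write-lock (*table-lock*)
    (setf (gethash key *table*) value)))
```

A thread that needs to read and then conditionally write should take the write lock for the whole operation, since (as footnote [8] notes) there's no way to promote a reader.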
make-semaphore
Creates and returns an object of type CCL::SEMAPHORE. The returned object has a “count” of 0.
signal-semaphore semaphore
Atomically increments the semaphore's count by 1; this may enable a waiting thread to resume execution. Returns an OS error indication (which should probably be interpreted and processed; the most common error would probably involve trying to operate on something that's not a semaphore.)
semaphore: a semaphore
wait-on-semaphore semaphore
Waits until the semaphore has a positive count that can be atomically decremented; this will succeed exactly once for each corresponding call to SIGNAL-SEMAPHORE. Returns an OS error indication.
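One common use of a semaphore is to count items available in a queue. The sketch below pairs a semaphore with a lock to build a tiny mailbox; MAILBOX, MAILBOX-SEND and MAILBOX-RECEIVE are hypothetical names, not part of OpenMCL.

```lisp
;; MAILBOX and its functions are hypothetical, defined only for this sketch.
(defstruct mailbox
  (lock (ccl::make-lock "mailbox"))     ; protects ITEMS
  (available (ccl::make-semaphore))     ; count = number of queued items
  (items '()))

(defun mailbox-send (box item)
  "Append ITEM to BOX and announce it to one waiting receiver."
  (ccl::with-lock-grabbed ((mailbox-lock box))
    (setf (mailbox-items box)
          (append (mailbox-items box) (list item))))
  (ccl::signal-semaphore (mailbox-available box)))

(defun mailbox-receive (box)
  "Block until BOX has an item, then remove and return the oldest one."
  (ccl::wait-on-semaphore (mailbox-available box))
  (ccl::with-lock-grabbed ((mailbox-lock box))
    (pop (mailbox-items box))))
```

Because each successful wait corresponds to exactly one signal, a receiver never finds the list empty after its wait returns.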
process-input-wait fd &optional timeout
Wait until input is available on the file descriptor fd. This uses the select system call and is generally a fairly efficient way of blocking while waiting for input. More accurately, this function waits until it's possible to read from fd without blocking or until the timeout value (if any) expires. Note that it's possible to read without blocking if an end-of-file condition exists.
fd: a small non-negative integer used by the OS to denote an open file, socket, or similar I/O connection. The generic function (CCL::STREAM-DEVICE (s stream) direction) - where “direction” is one of :INPUT or :OUTPUT - will return the file descriptor associated with a stream, if any.
timeout: either NIL (the default) or a non-negative integer expressing a timeout interval in “ticks”. There are CCL::*TICKS-PER-SECOND* (typically 100) ticks per second.
process-output-wait fd
Wait until output is possible on the file descriptor fd. This uses the select system call and is generally a fairly efficient way of blocking while waiting for output to become possible. (It can also be used to determine when a stream socket has established a connection, for instance.)
fd: a small non-negative integer used by the OS to denote an open file, socket, or similar I/O connection. The generic function (CCL::STREAM-DEVICE (s stream) direction) - where “direction” is one of :INPUT or :OUTPUT - will return the file descriptor associated with a stream, if any.
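Going from a stream to a blocking wait might look like the sketch below. WAIT-FOR-INPUT is a hypothetical helper; it assumes the stream is one for which CCL::STREAM-DEVICE is defined, and it simply passes along whatever PROCESS-INPUT-WAIT returns, since the return value isn't specified above.

```lisp
;; WAIT-FOR-INPUT is a hypothetical helper, not part of OpenMCL.
(defun wait-for-input (stream &optional seconds)
  "Block until STREAM's descriptor is readable or SECONDS elapse.
SECONDS of NIL means no timeout."
  (let ((fd (ccl::stream-device stream :input)))
    (when fd                              ; some streams have no descriptor
      (ccl::process-input-wait
       fd
       ;; The timeout is expressed in ticks, as described above.
       (and seconds
            (round (* seconds ccl::*ticks-per-second*)))))))
```

A readable descriptor may simply mean end-of-file, so callers still need to check what the subsequent read returns.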
Much of the functionality described above is similar to that provided by OpenMCL's cooperative scheduler, some other parts of which make no sense in a native threads implementation.
PROCESS-RUN-REASONS and PROCESS-ARREST-REASONS were SETFable process attributes; each was just a list of arbitrary tokens. A thread was eligible for scheduling (roughly equivalent to being “enabled”) if its arrest-reasons list was empty and its run-reasons list was not. I don't think that it's appropriate to encourage a programming style in which otherwise runnable threads are enabled and disabled on a regular basis (it's preferable for threads to wait for some sort of synchronization event to occur if they can't occupy their time productively.)
There are a number of primitives for maintaining process queues; that's now the OS's job.
Cooperative threads were based on coroutining primitives associated with objects of type STACK-GROUP. STACK-GROUPs no longer exist.
The following was written in April 2003:
The current limitations on PROCESS-INTERRUPT have pervasive effects: it's perfectly legitimate for a given thread to block indefinitely in the OS (waiting for a lock, semaphore, file system condition, etc.): this doesn't generally prevent other threads from running and may actually increase throughput (since the blocked thread isn't contending for resources with threads that have work to do), but it's currently not possible to interrupt a thread while it's running foreign code. Since PROCESS-KILL and PROCESS-RESET depend on PROCESS-INTERRUPT and since normal exit from a lisp session depends on using those functions to shut down threads in an orderly manner, it follows that threads should avoid blocking indefinitely (in fact, should avoid blocking any longer than one is willing to wait for a PROCESS-INTERRUPT request to be serviced.)
Alpha versions of OpenMCL 0.14 have compromised on this issue: blocking primitives generally wait for 1 second at a time, and do so repeatedly until the blocking primitive returns or an asynchronous control transfer occurs; lisp code is able to service interrupt requests “between” blocking calls. This 1-second delay can mean that a session in which several threads are running can take several seconds to shut down; the scheme also makes the programming model more complicated than it needs to be.
A better solution would involve using asynchronous signals to implement PROCESS-INTERRUPT. That's complicated a bit by the fact that many C library functions (including many of the functions used in the traditional Linux Threads library) are non-reentrant. That situation may improve substantially as a new Linux threads library (NPTL) is adopted; until NPTL is generally available on LinuxPPC, it may be necessary to be aware of this issue (and to introduce WITHOUT-INTERRUPTS forms around certain foreign calls.) The current situation on OSX is actually somewhat better, in that most blocking primitives are implemented in the Mach Kernel which has a (relatively) clear notion of “interrupted system calls”.
As of May 2003, this issue has been (largely) addressed: POSIX signals are used to implement PROCESS-INTERRUPT and the latency involved in PROCESS-INTERRUPT has been greatly reduced as a result. There are some cases (e.g., if a thread receives an interrupt signal while in the middle of exception-handling code) that aren't handled correctly.
It's hard to give step-by-step instructions; there are certainly a few things that one should look at carefully:
It's wise to be suspicious of most uses of WITHOUT-INTERRUPTS; there may be exceptions, but WITHOUT-INTERRUPTS is often used as shorthand for WITH-APPROPRIATE-LOCKING. Determining what type of locking is appropriate and writing the code to implement it is likely to be straightforward and simple most of the time.
I've only seen one case where a process's “run reasons” were used to communicate information as well as to control execution; I don't think that this is a common idiom, but may be mistaken about that.
It's certainly possible that programs written for cooperatively scheduled lisps that have run reliably for a long time have done so by accident: resource-contention issues tend to be timing-sensitive, and decoupling thread scheduling from lisp program execution affects timing. I know that there is or was code in both OpenMCL and commercial MCL that was written under the explicit assumption that certain sequences of open-coded operations were uninterruptible; it's certainly possible that the same assumptions have been made (explicitly or otherwise) by application developers.
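The first point above, replacing WITHOUT-INTERRUPTS with appropriate locking, can be sketched as a before/after pair. All of the names below (*EVENTS*, *EVENTS-LOCK*, NOTE-EVENT/COOPERATIVE, NOTE-EVENT) are hypothetical; the older function is shown only for contrast and shouldn't be taken as code from OpenMCL itself.

```lisp
;; All names below are hypothetical, chosen for this sketch.
(defvar *events* '())
(defvar *events-lock* (ccl::make-lock "events"))

;; Cooperative-era idiom: WITHOUT-INTERRUPTS kept other threads out.
;; Under the native scheduler it only defers PROCESS-INTERRUPT requests,
;; so another thread may still be modifying *EVENTS* at the same time.
(defun note-event/cooperative (event)
  (ccl::without-interrupts
    (push event *events*)))

;; Preemptive-safe version: an explicit lock protects the shared list.
(defun note-event (event)
  (ccl::with-lock-grabbed (*events-lock*)
    (push event *events*)))
```

Every reader and writer of *EVENTS* has to take the same lock for this to help; a lock used by only some of the accessors protects nothing.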
[1] It shouldn't be ignored of course, but there are complications on some platforms.
[2] Currently (as of April 2003), the mechanism used by PROCESS-INTERRUPT is similar to the mechanism used to interrupt threads in the cooperative scheduler; as such, threads only “handle interrupt requests” when running lisp code. It's hoped that this limitation can be removed in the very near future. Update: As of May 2003, PROCESS-INTERRUPT uses asynchronous POSIX signals to interrupt threads. If the thread being interrupted is executing lisp code, it can respond to the interrupt almost immediately (as soon as it has finished pseudo-atomic operations like consing and stack-frame initialization.) If the interrupted thread is blocking in a system call, that system call is aborted by the signal and the interrupt is handled on return. It is still difficult to reliably interrupt arbitrary foreign code (that may be stateful or otherwise non-reentrant); the interrupt request is handled when such foreign code returns to or enters lisp.
[3] The documentation and the function both clearly need further work.
[4] This should be SETFable, but doesn't seem to ever have been.
[5] There are currently a few different flavors of locks, and different constructs for creating them and waiting on them. There may be good reasons for having multiple types of locks, but it'd be desirable to simplify the syntax a bit.
[6] This is deprecated. Use MAKE-LOCK instead.
[7] This construct is also deprecated; use WITH-LOCK-GRABBED (which is entirely equivalent) instead.
[8] There probably should be some way to atomically “promote” a reader (making it a writer).