MacOSX effectively supports two distinct line-termination conventions. Programs in its Darwin substrate follow the Unix convention of recognizing #\LineFeed as a line terminator; traditional MacOS programs use #\Return for this purpose. OpenMCL follows the Unix convention on both Darwin and LinuxPPC, but offers (as of version 0.11) some support for reading and writing files that use the MacOS convention as well.
This support (and anything like it) is by nature heuristic: it can successfully hide the distinction between newline conventions much of the time, but could mistakenly change the meaning of otherwise correct programs (typically when files contain both #\Return and #\Linefeed characters or when files contain mixtures of text and binary data.) Because of this concern, the default settings of some of the variables that control newline translation and interpretation are somewhat conservative.
Although the issue of multiple newline conventions primarily affects MacOSX users, the functionality described here is available under LinuxPPC as well (and may occasionally be useful there.)
None of this addresses (or attempts to address) issues related to the third newline convention ("CRLF") in widespread use (since that convention isn't native to any platform on which OpenMCL currently runs). If OpenMCL is ever ported to such a platform, that issue might be revisited.
Note that some MacOS programs (including some versions of commercial MCL) may use HFS file type information to recognize TEXT and other file types and so may fail to recognize files created with OpenMCL or other Darwin applications (regardless of line termination issues.)
Unless otherwise noted, the symbols mentioned in this documentation are exported from the CCL package.
The variable *alternate-line-terminator* is currently used only by the standard reader macro function for #\; (single-line comments); that function reads successive characters until it reaches EOF, reads a #\Newline, or reads a character EQL to the value of *alternate-line-terminator*. In OpenMCL for Darwin, the value of this variable is initially #\Return; in OpenMCL for LinuxPPC, it's initially NIL.
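The comment-skipping rule described above can be sketched in portable Common Lisp. This is only an illustration, not OpenMCL's actual reader macro: the names *sketch-alternate-line-terminator* and skip-line-comment are hypothetical stand-ins introduced here, so that the example runs without the CCL package.

```lisp
;; Hypothetical stand-in for CCL:*ALTERNATE-LINE-TERMINATOR*; this variable
;; and the function below are illustrative names, not part of OpenMCL.
(defvar *sketch-alternate-line-terminator* #\Return)

(defun skip-line-comment (stream)
  "Consume characters from STREAM through the end of a single-line comment.
Returns the terminating character, or :EOF if the stream ran out first."
  (loop for ch = (read-char stream nil :eof)
        until (or (eq ch :eof)
                  (char= ch #\Newline)
                  ;; EQL is never true for a character when the variable is NIL,
                  ;; so NIL simply disables the alternate terminator.
                  (eql ch *sketch-alternate-line-terminator*))
        finally (return ch)))
```

With the stand-in variable bound to NIL (the LinuxPPC default described above), only #\Newline or EOF ends the comment.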
Their default treatment by the #\; reader macro is the primary way in which #\Return and #\Linefeed differ syntactically; extending the #\; reader macro to (conditionally) treat #\Return as a comment terminator eliminates that distinction. This seems to make LOAD and COMPILE-FILE insensitive to line-termination issues in many cases. It could fail in the (hopefully rare) case where an LF-terminated (Unix) text file contains embedded #\Return characters, and this mechanism isn't adequate to handle cases where newlines are embedded in string constants or other tokens (and presumably should be translated from an external convention to the internal one): it doesn't change what READ-CHAR or READ-LINE "see", and that may be necessary to handle some more complicated cases.
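To illustrate the limitation just mentioned, the following portable sketch counts the lines that READ-LINE "sees" in a string. The function count-lines is a hypothetical helper introduced for this example, and the sketch assumes an implementation in which #\Return and #\Newline are distinct characters (as they are in OpenMCL).

```lisp
(defun count-lines (text)
  "Return the number of lines READ-LINE finds in the string TEXT."
  (with-input-from-string (s text)
    ;; READ-LINE returns NIL only at EOF; an empty line is the non-NIL string "".
    (loop while (read-line s nil nil) count t)))

;; Three lines separated by #\Newline, and the same text separated by #\Return.
(defparameter *lf-text* (format nil "a~%b~%c"))
(defparameter *cr-text* (format nil "a~Cb~Cc" #\Return #\Return))
```

Without any translation, (count-lines *lf-text*) finds three lines while (count-lines *cr-text*) finds a single line containing embedded #\Return characters, which is why treating #\Return as a comment terminator alone doesn't help code that reads line-oriented data.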
Per ANSI CL, OpenMCL supports the :EXTERNAL-FORMAT keyword argument to the functions OPEN, LOAD, and COMPILE-FILE. This argument is intended as a standard way of supplying implementation-dependent information about the format of files opened with an element-type of CHARACTER. In OpenMCL, this argument can meaningfully take on the values :DEFAULT (the default), :MACOS, :UNIX, or :INFERRED.
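As a minimal sketch of how the argument might be passed, the function below opens a character stream with a caller-supplied external format. The name read-first-line and the file name in the comments are hypothetical; the :MACOS and :INFERRED values are only meaningful in OpenMCL 0.11 or later, and other implementations may reject them.

```lisp
(defun read-first-line (path &optional (external-format :default))
  "Open PATH as a character input stream using EXTERNAL-FORMAT and
return its first line, or NIL if the file is empty."
  (with-open-file (in path
                      :direction :input
                      :element-type 'character
                      :external-format external-format)
    (read-line in nil nil)))

;; Possible calls in OpenMCL ("notes.txt" is a hypothetical file):
;; (read-first-line "notes.txt" :macos)
;; (read-first-line "notes.txt" :inferred)
```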
When defaulted to or specified as :DEFAULT, the format of the file stream is determined by the value of the variable CCL:*DEFAULT-EXTERNAL-FORMAT*. See below.
When specified as :UNIX, all characters are read from and written to files verbatim.
When specified as :MACOS, all #\Return characters read from the file are immediately translated to #\Linefeed (#\Newline); all #\Newline (#\Linefeed) characters are written externally as #\Return characters.
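The :MACOS translation can be modeled on strings with a pair of one-line functions. This is a sketch of the character mapping only, not of OpenMCL's stream implementation; macos->internal and internal->macos are hypothetical names.

```lisp
(defun macos->internal (text)
  "Model of input translation: every #\Return becomes #\Newline."
  (substitute #\Newline #\Return text))

(defun internal->macos (text)
  "Model of output translation: every #\Newline becomes #\Return."
  (substitute #\Return #\Newline text))
```

Note that the input mapping is lossy when a file contains both characters: after translation, an original #\Linefeed and a translated #\Return are indistinguishable, which is one way this kind of translation can change the meaning of a file.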
When specified as :INFERRED and the file is open for input, the first bufferful of input data is examined; if a #\Return character appears in the buffer before the first #\Linefeed, the file stream's external-format is set to :MACOS; otherwise, it is set to :UNIX.
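The inference rule above can be sketched as a function over a string standing in for the first bufferful of input. The name infer-line-termination is hypothetical, and the sketch makes no claim about the buffer size or internals OpenMCL actually uses.

```lisp
(defun infer-line-termination (buffer)
  "Return :MACOS if a #\Return occurs in BUFFER before the first
#\Newline (or if there is a #\Return and no #\Newline), else :UNIX."
  (let ((cr (position #\Return buffer))
        (lf (position #\Newline buffer)))
    (if (and cr (or (null lf) (< cr lf)))
        :macos
        :unix)))
```

Under this reading of the rule, a buffer with no line terminators at all is classified as :UNIX, and a buffer that begins with CRLF-terminated text would be classified as :MACOS, which is consistent with the earlier statement that the CRLF convention isn't addressed.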
All other values of :EXTERNAL-FORMAT - and any combinations that don't make sense, such as trying to infer the format of a newly-created output file stream - are treated as if :UNIX were specified. As mentioned above, the :EXTERNAL-FORMAT argument doesn't apply to binary file streams.
The translation performed when :MACOS is specified or inferred has a somewhat greater chance of doing the right thing than the *alternate-line-terminator* mechanism does; it probably has a somewhat greater chance of doing the wrong thing, as well.
The value of the variable CCL:*DEFAULT-EXTERNAL-FORMAT* is used when :EXTERNAL-FORMAT is unspecified or specified as :DEFAULT. It can meaningfully be given any of the values :UNIX, :MACOS, or :INFERRED, each of which is interpreted as described above.
Because there's some risk that unsolicited newline translation could have undesirable consequences, the initial value of this variable in OpenMCL is :UNIX.
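Putting the rules for :DEFAULT and for unrecognized values together, the choice of format can be sketched as follows. The variable *sketch-default-external-format* is a hypothetical stand-in for CCL:*DEFAULT-EXTERNAL-FORMAT*, resolve-external-format is an illustrative name, and treating only :INPUT streams as eligible for inference is an assumption based on the description above.

```lisp
;; Hypothetical stand-in for CCL:*DEFAULT-EXTERNAL-FORMAT*, with the same
;; conservative initial value.
(defvar *sketch-default-external-format* :unix)

(defun resolve-external-format (requested direction)
  "Return the format that would govern a character file stream opened
with :EXTERNAL-FORMAT REQUESTED and :DIRECTION DIRECTION."
  (let ((fmt (if (eq requested :default)
                 *sketch-default-external-format*
                 requested)))
    (cond ((member fmt '(:unix :macos)) fmt)
          ;; Assumption: inference only makes sense when there is input to examine.
          ((and (eq fmt :inferred) (eq direction :input)) :inferred)
          ;; Everything else, including nonsensical combinations, falls back to :UNIX.
          (t :unix))))
```

A program that wants inference for every file it reads could, in this model, bind the default to :INFERRED around the relevant calls rather than passing :EXTERNAL-FORMAT each time.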