In Hemlock, text is represented as a sequence of lines. Newline characters are never stored but are implicit between lines. The implicit newline character is treated as the single character #\Newline by the text primitives.
Text is broken into lines when it is first introduced into Hemlock. Text enters Hemlock from the outside world in two ways: by reading a file, or by pasting text from the system clipboard. Hemlock uses heuristics (which should be documented here!) to decide which newline convention to use when converting the incoming text into its internal representation as a sequence of lines. Similarly, it uses heuristics (which should be documented here!) to convert the internal representation into a string with embedded newlines in order to write a file or paste a region into the clipboard.
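The conversion in both directions can be illustrated with a minimal sketch. This is a toy Python model, not Hemlock's code: the function names `text_to_lines` and `lines_to_text` are hypothetical, and the sketch assumes the text already uses a single "\n" convention, so it does not model the undocumented heuristics at all.

```python
def text_to_lines(text):
    """Split a string into newline-free lines.

    str.split keeps empty pieces, so "a\n" becomes ["a", ""]:
    two lines separated by one implicit newline.
    """
    return text.split("\n")


def lines_to_text(lines):
    """Rebuild a string by making the implicit newlines explicit."""
    return "\n".join(lines)


incoming = "first\n\nthird"
lines = text_to_lines(incoming)
# No stored line contains a newline; the newlines live only "between" lines.
assert all("\n" not in line for line in lines)
# Converting back reproduces the original text exactly.
assert lines_to_text(lines) == incoming
```

Under this model, a text of k lines always carries exactly k - 1 implicit newlines.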
A line is an object representing a sequence of characters with no line breaks.
Given a line, line-string returns the characters in the line as a simple string. It can be used with setf to set the line-string to any string that does not contain newline characters. It is an error to destructively modify the result of line-string, or to destructively modify any string after the line-string of some line has been set to that string.
This function returns an object that serves as a signature for a line's contents. It is guaranteed that any modification of text on the line will result in the signature changing so that it is not eql to any previous value. The signature may change even when the text remains unmodified, but this does not happen often.
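One way to satisfy the signature guarantee is sketched below. It is a toy Python model under stated assumptions, not Hemlock's implementation: the `Line` class, its `string` property, and the counter-based `signature` are all hypothetical names chosen for illustration.

```python
import itertools

# A shared counter hands out values that have never been used before.
_next_signature = itertools.count(1)


class Line:
    """Toy line: a newline-free string plus a change signature."""

    def __init__(self, text=""):
        if "\n" in text:
            raise ValueError("a line may not contain newline characters")
        self._chars = text
        self.signature = next(_next_signature)

    @property
    def string(self):
        return self._chars

    @string.setter
    def string(self, new_text):
        if "\n" in new_text:
            raise ValueError("a line may not contain newline characters")
        self._chars = new_text
        # The signature is bumped on every set, even if the text is the same.
        # That is allowed: it may change without a modification, but it must
        # never stay the same across one.
        self.signature = next(_next_signature)
```

A caller could save a line's signature alongside some derived result and recompute only when the signature no longer matches; an unnecessary recomputation is harmless, while a missed modification cannot occur in this model.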
A mark indicates a specific position within the text represented by a
line and a character position within that line. Although a mark is
sometimes loosely referred to as pointing to some character, it in
fact points between characters. If the charpos is zero, the previous
character is the newline character separating the previous line from
the mark's line. If the charpos is equal to the number of characters
in the line, the next character is the newline character separating
the current line from the next. If the mark's line has no previous
line, a mark with charpos of zero has no previous character; if the
mark's line has no next line, a mark with charpos equal to the length of
the line has no next character.
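The four boundary cases above can be sketched as follows. This is a toy Python model, not Hemlock's API: a mark is represented as a line index plus a charpos into a list of strings, and the function names `next_character` and `previous_character` are hypothetical.

```python
def next_character(lines, line_index, charpos):
    """Return the character after the mark, or None if there is none."""
    line = lines[line_index]
    if charpos < len(line):
        return line[charpos]
    if line_index + 1 < len(lines):
        return "\n"  # implicit newline separating this line from the next
    return None  # end of the last line: no next character


def previous_character(lines, line_index, charpos):
    """Return the character before the mark, or None if there is none."""
    if charpos > 0:
        return lines[line_index][charpos - 1]
    if line_index > 0:
        return "\n"  # implicit newline separating the previous line from this one
    return None  # start of the first line: no previous character


lines = ["ab", "cd"]
# A mark at charpos 1 on the first line sits between "a" and "b".
assert previous_character(lines, 0, 1) == "a"
assert next_character(lines, 0, 1) == "b"
```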
This section discusses the very basic operations involving marks, but a lot of Hemlock programming is built on altering some text at a mark. For more extended uses of marks see Altering And Searching Text.
A mark may have one of two lifetimes: temporary or permanent. Permanent marks remain valid after arbitrary operations on the text; temporary marks do not. Temporary marks exist because creating and using them involves less bookkeeping overhead. If a temporary mark is used after the text it points to has been modified, the results are unpredictable. Permanent marks continue to point between the same two characters regardless of insertions and deletions made before or after them.
There are two kinds of permanent marks, which differ only in their behavior when text is inserted at the position of the mark: text is inserted to the left of a left-inserting mark and to the right of a right-inserting mark.
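The difference between the two kinds can be sketched with a toy Python model of a single line. Nothing here is Hemlock's API: the `Mark` class, its `kind` field, and `insert_text` are hypothetical names, and the bookkeeping shown is only one way to obtain the described behavior.

```python
from dataclasses import dataclass


@dataclass
class Mark:
    charpos: int
    kind: str  # "left-inserting" or "right-inserting"


def insert_text(line, marks, position, text):
    """Insert text into line at position and adjust permanent marks.

    Returns the new line string; the marks are updated in place.
    """
    for mark in marks:
        if mark.charpos > position:
            # The mark lies after the insertion point: shift it along.
            mark.charpos += len(text)
        elif mark.charpos == position and mark.kind == "left-inserting":
            # The new text goes to the left of the mark, so the mark
            # moves to stay after the inserted text.
            mark.charpos += len(text)
        # A right-inserting mark at position stays put, so the new
        # text ends up to its right.
    return line[:position] + text + line[position:]


left = Mark(2, "left-inserting")
right = Mark(2, "right-inserting")
new_line = insert_text("abcd", [left, right], 2, "XY")
# left now follows the inserted text; right still precedes it.
assert (new_line, left.charpos, right.charpos) == ("abXYcd", 4, 2)
```

Marks strictly before or after the insertion point behave identically for both kinds; only a mark exactly at the insertion point distinguishes them.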
These functions destructively modify marks to point to new positions. Other sections of this document describe mark moving routines specific to higher level text forms than characters and lines, such as words, sentences, paragraphs, Lisp forms, etc.
This function changes mark to point n lines after the current position (or before it, if n is negative). The character position of the resulting mark is (min (line-length resulting-line) (mark-charpos mark)) if charpos is unspecified, or (min (line-length resulting-line) charpos) if it is specified. As with character-offset, if there are not n lines in the requested direction, nil is returned and mark is not modified.
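The clamping rule can be sketched in a toy Python model. This is an illustration under stated assumptions, not Hemlock's implementation: the text is a list of strings, a mark is a mutable two-element list `[line_index, charpos]`, the function name `line_offset` is hypothetical, and returning the mark on success is an assumption of the sketch.

```python
def line_offset(lines, mark, n, charpos=None):
    """Move mark n lines forward (backward if n is negative).

    Returns the mark on success. Returns None and leaves the mark
    untouched if the target line does not exist.
    """
    target = mark[0] + n
    if not 0 <= target < len(lines):
        return None
    # Use the explicit charpos if given, else the mark's current one,
    # clamped to the length of the destination line.
    wanted = mark[1] if charpos is None else charpos
    mark[0] = target
    mark[1] = min(len(lines[target]), wanted)
    return mark


lines = ["hello world", "hi", "goodbye"]
mark = [0, 8]
line_offset(lines, mark, 1)
# "hi" has only two characters, so the charpos is clamped from 8 to 2.
assert mark == [1, 2]
```

Note that in this model the clamped charpos is what the mark remembers: moving on to a longer line afterwards does not restore the original column.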
A region is simply a pair of marks: a starting mark and an ending
mark. The text in a region consists of the characters following the
starting mark and preceding the ending mark (keep in mind that a mark
points between characters on a line, not at them). By modifying the
starting or ending mark in a region it is possible to produce regions
with a start and end which are out of order or even in different
buffers. The use of such regions is undefined and may result in
arbitrarily bad behavior.
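As a minimal sketch of which characters a region contains, the following toy Python model extracts a region's text from a list of lines. It is not Hemlock's API: marks are modeled as `(line_index, charpos)` tuples within a single buffer, and the function name `region_text` is hypothetical. Because the real behavior for out-of-order regions is undefined, the sketch simply rejects them.

```python
def region_text(lines, start, end):
    """Return the characters after the start mark and before the end mark."""
    (start_line, start_pos), (end_line, end_pos) = start, end
    if (start_line, start_pos) > (end_line, end_pos):
        raise ValueError("region start must not follow region end")
    if start_line == end_line:
        return lines[start_line][start_pos:end_pos]
    pieces = [lines[start_line][start_pos:]]   # tail of the first line
    pieces.extend(lines[start_line + 1:end_line])  # whole lines in between
    pieces.append(lines[end_line][:end_pos])   # head of the last line
    # Each line boundary crossed contributes one implicit newline.
    return "\n".join(pieces)


lines = ["hello", "big", "world"]
assert region_text(lines, (0, 1), (0, 4)) == "ell"
# From the end of one line to the start of the next: just the newline.
assert region_text(lines, (0, 5), (1, 0)) == "\n"
```

Since marks point between characters, a region whose two marks coincide is empty rather than one character long.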
This function returns the number of lines in the region, first and last lines inclusive. A newline is associated with the line it follows, thus a region containing some number of non-newline characters followed by one newline is one line, but if a newline were added at the beginning, it would be two lines.