Re: dealing with new-lines

Tim Berners-Lee (
Fri, 8 Jan 93 12:09:41 +0100

For what it is worth, the way the CERN implementations are supposed to work is:

MIXED elements

Within a MIXED element, newlines are treated as word breaks. The effect of
this is that if the imaginary output cursor has a non-white character to the
left of it, then a space is introduced. This means that any number of newlines
will only produce one white space character. [It involves a horrid "are we in
the middle of a word?" flag.]

Spaces, however, always produce spaces, so multiple spaces will come out as
multiple spaces.
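
As a rough illustration of the rule above (not the actual CERN code, which is not shown in this message), the behaviour can be sketched with a single flag. The function name `render_mixed` and the flag name `in_word` are hypothetical, and only newline and space characters are considered, which is an assumption.

```python
def render_mixed(text: str) -> str:
    """Sketch: newlines act as word breaks, spaces are kept as they are."""
    out = []
    in_word = False  # the "are we in the middle of a word?" flag
    for ch in text:
        if ch == "\n":
            # A newline produces one space only if a non-white character
            # sits immediately to the left of the output cursor.
            if in_word:
                out.append(" ")
                in_word = False
        elif ch == " ":
            # Spaces always produce spaces.
            out.append(" ")
            in_word = False
        else:
            out.append(ch)
            in_word = True
    return "".join(out)
```

With this sketch, any run of newlines between two words yields a single space, while a run of spaces is passed through unchanged.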


Within <XMP> I went all the way and said that from the trailing > of <XMP>
to the beginning < of </XMP> all data was literal, including newlines.
Therefore example sections typically are marked up as

<XMP>This is an example line

That was an example. There are no new lines between the <XMP> and the example
line because the XMP section causes a paragraph break, and the style for the
normal paragraph specifies a minimum white space after each paragraph. Because
each XMP section is like a black box, any white space inside it would not be
seen by the white space management logic which overlaps the white spaces
required around successive paragraphs, and extra white space would result.
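
To make the "everything is literal" rule concrete, here is a minimal sketch that pulls out the data between the trailing > of the open tag and the leading < of the close tag. The helper name `xmp_literal` is hypothetical. Case-sensitive tag matching and handling only the first XMP section are simplifying assumptions, not a description of the CERN parser.

```python
def xmp_literal(source: str) -> str:
    """Sketch: return the literal data of the first XMP section."""
    open_tag, close_tag = "<XMP>", "</XMP>"
    # Literal data starts right after the trailing > of the open tag...
    start = source.index(open_tag) + len(open_tag)
    # ...and ends just before the leading < of the close tag.
    end = source.index(close_tag, start)
    # Nothing in between is interpreted: newlines and tags are kept as is.
    return source[start:end]
```

A newline placed directly after the open tag would be part of the returned data, which is why the example line above starts on the same line as the tag.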


By the way, I think we agreed (I gave in) that the <PRE> sections would have
significant newlines. Your manuals, Tom, have <p> as well as newlines, which
gives double spacing on my browsers. So I treat newlines as newlines throughout
the <PRE> element, just as in XMP.

SGML generation

Off the top of my head, I think what happens is that in MIXED elements I
generate newlines instead of spaces in PCDATA when the line gets a bit long,
and also insert them before open tags like <H3> which cause paragraph breaks
anyway. As an anchor tag need not even cause a word break, I do NOT insert
them before an <A>, but rather always output "<A\nNAME=%s" so that the
HREF attribute (generally the line length buster) starts at the beginning of a