Line breaks and multibody documents

Peter Lister, Cranfield Computer Centre (ccprl@xdm001.ccc.cranfield.ac.uk)
Tue, 01 Jun 93 12:30:12 BST


Well, I knew everyone would disagree with me. Thank you for doing so politely.

I'd like to make it clear that I do *NOT* suggest line breaks and/or
page breaks to annoy people, nor because I have a specific interest in
poetry (though I do). Dave Raggett suggested a <poem> tag, and as I saw
no smiley, I assume he was at least semi-serious. IMHO, the idea sucks.
T V Raman made a similar comment about representing the high-level
structure of poems, but I would claim that that is not the business of HTML.

The general philosophy is "Keep HTML minimal". I agree, and concur
with the idea of embedding e.g. GIF and TeX. You are right, Dave, that
HTML must not be all things to all men. I would add "or anyone for that
matter". So - let us maintain the high-level representations of our
docs (e.g. poetry) in our authoring systems, and then PUBLISH them as
HTML. Some poetry is published on e.g. audio tape, but the vast
majority is published on paper and obeys the conventions of line
breaks, verse breaks, page breaks etc, even though the author may
COMPOSE it in very different terms that just don't come across when
reading it. My feeling is that HTML should provide low-level PUBLISHING
functions, but not aetherial concepts or <poem> tags.

My best suggestion is an argument to <pre>; e.g. <pre monofont> and
<pre normalfont>. Or maybe <pre style="em">, where the style argument
can be any valid text style tag or "plain"? This allows me to separate
the lines of my address (or poem) properly, introduces no extra tags,
and <pre> with no argument can default to the normal meaning of
"monospaced, to suit a listing or programming example".
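
To make the proposal concrete, here is a minimal sketch of how a browser
might resolve the argument to <pre>. Everything in it is an assumption made
for illustration: the function name, the set of style tags, the reading that
a style tag implies the normal (proportional) font, and the fallback for an
unknown argument are not taken from any DTD or existing browser.

```python
# Hypothetical sketch only: maps the proposed <pre> argument to a rendering.
# Line breaks are preserved in every case; only the font/style changes.

# Text style tags assumed valid for this illustration.
KNOWN_STYLE_TAGS = {"em", "strong", "code", "tt", "b", "i"}


def resolve_pre_style(style=None):
    """Return (font, wrapping_tag) for a <pre> element.

    style=None     -> plain <pre>: monospaced, the existing meaning.
    style="plain"  -> normal font, no extra styling.
    style=<tag>    -> normal font, rendered as if wrapped in that tag
                      (an assumption about what the proposal intends).
    """
    if style is None:
        return ("monospace", None)
    if style == "plain":
        return ("normal", None)
    if style in KNOWN_STYLE_TAGS:
        return ("normal", style)
    # Unknown argument: assume we fall back to the default meaning.
    return ("monospace", None)
```

Under these assumptions, an address or poem marked up with
<pre style="plain"> keeps its line breaks but is shown in the normal font,
while a bare <pre> behaves exactly as it does today.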

OK, on to "page breaks". Or, since this kind of terminology seems to
offend people, may I suggest "multibody files"? It occurs to me that a
special "page break" tag is unnecessary. All we need is a well
established method of handling multiple <body></body> pairs in one file.

* Browsers can search across a whole set of short bodies.
* Browsers don't have to try to whack the whole of an enormous file onto a text
widget. XMosaic already has to break large files into "pages", so it seems
like a good idea to make the "pages" more logical.
* Most authors (and the tools they use) consider multiple chapters of one
document to logically belong in one file.
* Most readers don't read large documents from page one and progress to the end.
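
The points above can be sketched roughly as follows: split one file into its
<body></body> pairs, then search across the bodies and display only the ones
that match. This is a sketch under stated assumptions, not a description of
how XMosaic or any other browser works. The function names are hypothetical,
and a regular expression stands in for a real SGML parser (it would not cope
with nested or malformed markup).

```python
import re

# Matches each <body>...</body> pair; case-insensitive, spanning newlines.
# A regex is an assumption of convenience here, not a real HTML parser.
BODY_RE = re.compile(r"<body\b[^>]*>(.*?)</body\s*>", re.IGNORECASE | re.DOTALL)


def split_bodies(html_text):
    """Return the contents of every <body></body> pair, in file order."""
    return [content.strip() for content in BODY_RE.findall(html_text)]


def search_bodies(bodies, term):
    """Return the indexes of the bodies containing term (case-insensitive)."""
    needle = term.lower()
    return [i for i, body in enumerate(bodies) if needle in body.lower()]


if __name__ == "__main__":
    # A made-up three-body file: a table of contents plus two chapters.
    sample = (
        "<body><h1>Contents</h1>\n1. Setup\n2. Problems</body>\n"
        "<body><h1>Setup</h1>\nUnpack the archive.</body>\n"
        "<body><h1>Problems</h1>\nCheck the log file.</body>\n"
    )
    bodies = split_bodies(sample)
    # Show the first body (the contents), then only the matching chapters.
    print(bodies[0])
    for index in search_bodies(bodies, "log file"):
        print(bodies[index])
```

The browser only ever has to lay out one short body at a time, and which
body comes next is left to the reader.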

An obvious example is a technical manual or a Usenet FAQ. All I want to
see is the index and/or table of contents, then the chapter or section
which answers my question. I don't want to wait for vast quantities of
gunk to be mapped into a window ESPECIALLY if it contains inlined
images (e.g. the Vatican library document index). Just show me the
section which is relevant. OK, if I then want to browse down the index,
fine, but let it be my choice.

It's possible that these things are solved better by Dave Raggett's
html2 DTD, but I can't get to hplose.hpl.hp.com. Care to give us the IP
address while your netbods set up the name, Dave?

Peter Lister p.lister@cranfield.ac.uk
Computer Centre,
Cranfield Institute of Technology, Voice: +44 234 754200 ext 2828
Cranfield, Bedfordshire MK43 0AL UK Fax: +44 234 750875