Re: Questions about HTML conventions

Daniel W. Connolly (connolly@hal.com)
Tue, 8 Mar 1994 02:05:08 --100


In message <199403021725.SAA06528@tklab3.cs.uit.no>, Bjoern Stabell writes:
>
>First, where should meta information for documents be?
>
>Already, some information is in the MIME-headers (like
>Content-type and Last-modified) but the HEAD-part of HTML
>documents has some potential.

Ahh... very good question. I've been thinking about the same
thing lately. It seems to me that the HTML head/body structure should
be analogous to -- rather than subordinate to -- the RFC822 head/body
structure.

For example, I hope that one day the two following examples will be
treated identically by WWW clients:

ex 1:
To: Multiple recipients of list <www-talk@info.cern.ch>
Subject: Bird Watching
Date: Sun, 06 Mar 1994 16:56:53
Mime-Version: 1.0
Content-Type: text/x-html

<H1>Connolly on Bird Watching</H1>
[blah blah blah]

ex 2:
<HTML><HEAD>
<TO>Multiple recipients of list <mailaddr mbox="www-talk"
domain="info.cern.ch"></TO>
<Subject>Bird Watching</Subject> <!-- or <TITLE> -->
<Date ZULU="19940306165653">Sun, 06 Mar 1994 16:56:53</Date>
</HEAD>
<BODY>
<H1>Connolly on Bird Watching</H1>
[blah blah blah]
</BODY>
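
A rough sketch, in Python, of what "treated identically" could mean
for a client: both forms are reduced to the same small metadata record.
The element names come from ex 2; the HeadFields class, the two
meta_from_* helpers, and the choice of only subject and date are
assumptions made for illustration, not part of any HTML specification.

```python
from email import message_from_string
from html.parser import HTMLParser


class HeadFields(HTMLParser):
    """Collect the text of each top-level element inside <HEAD>."""

    def __init__(self):
        super().__init__()
        self.in_head = False
        self.current = None  # name of the head element being read
        self.fields = {}

    def handle_starttag(self, tag, attrs):
        # HTMLParser lowercases tag names, so <Subject> arrives as "subject".
        if tag == "head":
            self.in_head = True
        elif self.in_head and self.current is None:
            self.current = tag
            self.fields[tag] = ""

    def handle_endtag(self, tag):
        if tag == "head":
            self.in_head = False
        elif tag == self.current:
            self.current = None

    def handle_data(self, data):
        if self.current is not None:
            self.fields[self.current] += data


def meta_from_mail(text):
    """Read the metadata from RFC822 headers (the ex 1 form)."""
    msg = message_from_string(text)
    return {"subject": msg["Subject"], "date": msg["Date"]}


def meta_from_html(text):
    """Read the same metadata from HEAD elements (the ex 2 form)."""
    parser = HeadFields()
    parser.feed(text)
    parser.close()
    fields = parser.fields
    # Assumption: <TITLE> is accepted as a synonym for <Subject>.
    subject = fields.get("subject", fields.get("title"))
    date = fields.get("date")
    return {
        "subject": subject.strip() if subject else None,
        "date": date.strip() if date else None,
    }


MAIL = """\
Subject: Bird Watching
Date: Sun, 06 Mar 1994 16:56:53
Mime-Version: 1.0
Content-Type: text/x-html

<H1>Connolly on Bird Watching</H1>
"""

PAGE = """\
<HTML><HEAD>
<Subject>Bird Watching</Subject>
<Date>Sun, 06 Mar 1994 16:56:53</Date>
</HEAD>
<BODY>
<H1>Connolly on Bird Watching</H1>
</BODY>
"""

print(meta_from_mail(MAIL) == meta_from_html(PAGE))
```

Once both forms are reduced to one record, the rest of the client
need not care which envelope the document arrived in.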

>There is a trend today, it is even a recommended practice, to put
>meta info like 'last updated', 'creation date', 'creator',
>'modified by' and 'next document' in the body of documents. At
>least Last-modified is transmitted by many http daemons in the
>MIME headers and so the browser should be able to present this,
>putting it in the body of the documents is superfluous.

Agreed. Meta-information should be separate from document content. But
the whole idea behind hypertext is that the distinction is somewhat
blurry. For example, the server might be sending a cached HTML
conversion of a LaTeX document that was last modified 1/1/94. Should
the Last-Modified be the date of the converted HTML or the original
LaTeX file?

Actually, I'd like to see both. I'd like to see the server express in
machine readable form "I'm sending you foo.html, which was generated
on 19930701 by fred@foo.com from foo.tex (which was last modified
19930405)." This requires an extension to the linking formalism of
HTML. For example:

Subject: Connolly On BirdWatching
Mime-Version: 1.0
Content-Type: multipart/x-sgml; boundary="cut-here";
dtd="ftp://info.cern.ch/new_html_dtd.sgml"

--cut-here
<!DOCTYPE HTML SYSTEM "ftp://info.cern.ch/new_html_dtd.sgml" [
<!ENTITY bodyContent SYSTEM "cid:19940305.lkjsdf@hal.com">
]>
<HTML><HEAD>
<resource id="r1">
doc-id:hal.com/connolly/bird-watching</resource>
<resource id="r2" octets="10754">
http://www.hal.com/users/connolly/bird-watching.html</resource>
<resource id="r3" octets="9432" notation="text/x-latex"
last-modified="19940301">
ftp://ftp.hal.com/pub/connolly/bird-watching.tex</resource>
<translation anchroles="original translation" degradation=".75"
expiration="19940310"
linkends="r3 r2">
<locator anchroles="name location" expiration="19941001"
linkends="r1 r3">
<copy anchroles="original copy"
linkends="r2 bodyContent">
</HEAD>
<BODY conref="bodyContent">
</HTML>

--cut-here
Content-Type: text/x-html

<H1>Bird Watching</H1>
<H2>Abstract</H2>
[... HTML conversion of bird-watching.tex]

--cut-here--

>At least, it shouldn't be necessary to present a 'back to
>previous page' button at the bottom of every page. It's not that
>hard to maintain a stack of previously visited URLs in the
>browser. Also, having such stuff in the document makes them less
>usable in other environments (like if we incorporate a general
><INCLUDE> scheme in HTML+).

Amen. Another astute observation. The trick here is that (1) we're
pretty much using an SGML DTD to specify HTML, (2) SGML DTDs are by
nature not very extensible, and (3) not everybody will want to use the
same schema for next/previous/up/down type stuff.

I expect what we need is for documents to be transmitted with all the
up/down/etc. stuff in pretty much every node, but labelled in such a
way that the browser can recognize the <next> tag or whatever and
display it as a button in a familiar place, rather than a link in the
text flow.
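
As a minimal sketch of that idea in Python: the browser pulls the
labelled navigation links out of the document and keeps them apart
from the text it renders. The tag names in NAV_TAGS, the href
attribute, and the assumption that these are empty elements are all
hypothetical; no such tags are defined in HTML.

```python
from html.parser import HTMLParser

# Hypothetical navigation elements, assumed to be empty tags with an href.
NAV_TAGS = {"next", "previous", "up", "down"}


class NavSplitter(HTMLParser):
    """Separate labelled navigation links from the text flow."""

    def __init__(self):
        super().__init__()
        self.nav = {}    # role -> target, to be shown as buttons
        self.text = []   # everything else, to be shown as the document

    def handle_starttag(self, tag, attrs):
        if tag in NAV_TAGS:
            self.nav[tag] = dict(attrs).get("href")

    def handle_data(self, data):
        self.text.append(data)


def split_navigation(markup):
    """Return (navigation links, remaining text) for one document."""
    parser = NavSplitter()
    parser.feed(markup)
    parser.close()
    return parser.nav, "".join(parser.text).strip()


nav, text = split_navigation(
    '<next href="ch3.html"><previous href="ch1.html">'
    "<H1>Chapter 2</H1>Body text."
)
print(nav)
```

A browser that does not know these tags would simply ignore them,
while one that does can put each link on a button in a fixed place.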

>Secondly, how should <TITLE> and <H1> be used?

Title is meta-information, and H1 is content. At least that's the
design. Yes, it's redundant, but that's not bad, if you ask me.

Dan

------- End of Forwarded Message