Future of meta-indices: site indexing proposal and Perl script

Robert S. Thau (rst@ai.mit.edu)
Thu, 24 Mar 1994 19:33:37 --100


Date: Thu, 24 Mar 1994 17:38:28 --100
From: Tim Berners-Lee <timbl@ptpc00.cern.ch>

This suggestion (on www-talk@info.cern.ch) happens to overlap with
an SGML suggestion on uri@bunyip.com, in a discussion of URC
(Universal Resource Citations, aka Metainformation?),
so I am cross-posting.

Another possibility is to use

<meta name="summary">
MIT AI lab events, including seminars, conferences, and tours
</meta>

which has the advantage that it can be nested:

<meta name="author">
<meta name="name">Jane Doe</meta>
<meta name="email">jd@weird.com</meta>
<meta name="urn">/people/1967/us/va/12437234hgj3246h</meta>
</meta>

and is equivalent to the LISP which was also proposed on
the uri list.

Unfortunately, over the short term, it also has a disadvantage, in that
documents with this particular form of metainformation coding would
probably be mishandled by plain-jane HTML browsers --- these would ignore
the <meta> and </meta> tags (as they generally ignore any tags which they
aren't specifically prepared for), and present the values of the
metainformation as part of the document text.

By contrast, with the <meta name="..." value="..."> scheme which I (and a
few others) have been discussing recently, the browsers don't wind up
displaying the metainformation, since it's *all* buried in tags which they
simply ignore.
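
To make the contrast concrete, here is a minimal sketch in Python (not the
Perl indexing script itself). It assumes a browser that simply discards
every tag and keeps the text between them, which is a simplification of
what real browsers do. The function name and the sample "Welcome to the
lab." paragraph are made up for illustration.

```python
import re


def render_ignoring_tags(html):
    """Crude stand-in for a browser: drop every tag, keep the text between them."""
    text = re.sub(r"<[^>]*>", "", html)
    # Collapse runs of whitespace the way a browser would when flowing text.
    return " ".join(text.split())


# Container form: the value is element content, so it survives tag stripping.
container_form = (
    '<meta name="summary">\n'
    "MIT AI lab events\n"
    "</meta>\n"
    "<p>Welcome to the lab.</p>"
)

# Attribute form: the value lives inside the tag, so it vanishes with the tag.
attribute_form = (
    '<meta name="summary" value="MIT AI lab events">\n'
    "<p>Welcome to the lab.</p>"
)

container_rendered = render_ignoring_tags(container_form)
attribute_rendered = render_ignoring_tags(attribute_form)

print(container_rendered)
print(attribute_rendered)
```

With these inputs the container form leaks the summary into the displayed
text, while the attribute form leaves only the paragraph.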

(Notice of covert agenda: the reason I'm particularly concerned about this
is that I'm looking for something I can use to drive my autoindexing script
now, meaning that it has to cope well with the existing infrastructure,
including browsers which have never heard of any sort of <meta ...> tag).

Still, if there were a nested structure which the existing browsers would
ignore, I and my indexer could easily live with that. There's a hint of a
way to get one in the distinction below:

Perhaps it would be useful to distinguish between two
semantics:

1. A noun clause for the object which has properties
urn=sdfgwkedf, height=1237123, fsize=9.5

2. A *statement* that the object defined by
urn=sdfhjsdf
has properties
height=1237123, fsize=9.5

If we use different tags for the two levels, we could have a structure like
this (with apologies in advance for any unintended breach of SGML convention):

<metaobject name="author">
<metastmt name="name" value="Jane Doe">
<metastmt name="email" value="jd@weird.com">
<metastmt name="urn" value="/people/1967/us/va/12437234hgj3246h">
</metaobject>

One thing that is lost this way is that you can't put HTML tags in the
metavalues, since they are now attribute values rather than element
content, but it's not clear that permitting them would be wise anyway.
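
As a rough illustration of how an indexer might read the two-tag structure
above, here is a sketch in Python using the standard html.parser module,
which reports tags it does not know about just like any others. The class
and function names (MetaCollector, extract_meta) are hypothetical, and the
choice of nested dictionaries as the result is an assumption, not part of
the proposal.

```python
from html.parser import HTMLParser


class MetaCollector(HTMLParser):
    """Collect <metaobject>/<metastmt> tags into nested dictionaries."""

    def __init__(self):
        super().__init__()
        self.root = {}
        self.stack = [self.root]  # innermost open metaobject is last

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "metaobject":
            # Open a new nesting level under the given name.
            child = {}
            self.stack[-1][attrs.get("name")] = child
            self.stack.append(child)
        elif tag == "metastmt":
            # A statement is a leaf: record name -> value at the current level.
            self.stack[-1][attrs.get("name")] = attrs.get("value")

    def handle_endtag(self, tag):
        if tag == "metaobject" and len(self.stack) > 1:
            self.stack.pop()


def extract_meta(html):
    parser = MetaCollector()
    parser.feed(html)
    parser.close()
    return parser.root


sample = """
<metaobject name="author">
<metastmt name="name" value="Jane Doe">
<metastmt name="email" value="jd@weird.com">
<metastmt name="urn" value="/people/1967/us/va/12437234hgj3246h">
</metaobject>
<p>Ordinary document text.</p>
"""

meta = extract_meta(sample)
print(meta["author"]["email"])
```

Since every value sits inside a tag, a browser that ignores both tags shows
none of it, and the indexer still recovers the nesting from the
metaobject open and close tags.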

Comments?

timbl

rst