Re: Meta Tag - proposal (suggestions ???)

Jon Wallis (j.wallis@wlv.ac.uk)
Thu, 16 Nov 1995 07:50:27 +0000


At 17:23 15/11/95 -0500, "Joe Budge" <budge@clark.net> wrote:
>> Request for comments, suggestions, etc...
>
>The META tag could benefit greatly from HTTP-EQUIV's for "revision"
>(as in 'revision number') and 'timestamp' (as in 'date/time the
>document was authored').
>
[snip]
>An interesting "nice" feature would be an HTTP-EQUIV for 'period' (as
>in 'the document covers the stated period'). This would be used to
>organize information so that one can organize/retrieve by historical
>time period (eg: "give me all documents where 'title' contains
>'United Nations' and 'period' contains '1945').
[snip]

>> It is possible to use any text string, but if you want to define these
>> properties you have to use the following words:
>>
>> keywords: to indicate the keywords of the document
>> author: to indicate the author of the document
>> expire: to indicate the expire date of the document
>> language: to indicate the language of the document
>> abstract: to indicate the abstract of the document
>> organization: to indicate the organization of the author
>> public (yes,no): to indicate if the document is available to everybody
>> or not

What about a META element for "subject classification" - using the Dewey
Decimal or Universal Decimal system?

classification: to indicate the subject classification of the document

This would be of great use in broad, high-level searching, obviating the need
to do low-level content-based searching from the outset, which, in any
case, tends to return lots of "false-positive" results.

Class-based searching, using the CONTENT field of an element like

<META NAME="Class" CONTENT="123.4">

would significantly reduce the problems of homonyms, synonyms, variant
spellings and different languages.
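
To make the element concrete, here is a minimal sketch in Python of how an
indexer might read it from a page. The names ClassMetaParser and
extract_class_codes and the sample page are assumptions made up for
illustration; only the META element itself comes from the proposal.

```python
from html.parser import HTMLParser


class ClassMetaParser(HTMLParser):
    """Collects the CONTENT of every <META NAME="Class" ...> element."""

    def __init__(self):
        super().__init__()
        self.class_codes = []

    def handle_starttag(self, tag, attrs):
        # HTMLParser lowercases tag and attribute names before calling us.
        if tag != "meta":
            return
        attributes = dict(attrs)
        name = attributes.get("name") or ""
        content = attributes.get("content")
        # Compare the NAME value case-insensitively ("Class", "class", ...).
        if name.lower() == "class" and content:
            self.class_codes.append(content.strip())


def extract_class_codes(html_text):
    """Return the class codes declared in a page (hypothetical helper)."""
    parser = ClassMetaParser()
    parser.feed(html_text)
    parser.close()
    return parser.class_codes


# Sample page invented for this sketch.
sample_page = """<HTML><HEAD>
<TITLE>Example</TITLE>
<META NAME="Class" CONTENT="123.4">
</HEAD><BODY>Text</BODY></HTML>"""

print(extract_class_codes(sample_page))  # ['123.4']
```

A page without such an element simply yields an empty list, so an indexer
could fall back to content-based indexing for unclassified documents.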

e.g.,

homonyms
you search for "bass" - looking for the fish of that name - and get
documents about "bass" the musical instrument.

synonyms
you search for "theology", but my document only contains the words
"religious dogma", or you look for "car" but my document says "automobile"

variant spelling
you search for "colour", my word is spelt "color"

different languages
I look for "car", but your document is in French and says "voiture"

In a classification-based approach, supported by a META "class" entry added
by the author (or by indexation in a "Web Library" that uses Dewey/Universal
Decimal), all the above problems could potentially be eliminated:

- "bass" fish would be under "597", bass the instrument would be "787"

- theology and religious dogma would both be under "2", car and automobile
would both be under "629.222"

- colour and color would both be under "535.6" (NB this is an extension of
Dewey)

- car and voiture would both be under "629.222"
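
The list above can be sketched as a lookup from search terms to class
numbers. The table below is hypothetical and holds only the examples from
this message; a real service would need a full Dewey/Universal Decimal index,
and the function names are assumptions for illustration.

```python
# Hypothetical term table, built only from the examples in this message.
TERM_TO_CLASSES = {
    "bass": ["597", "787"],        # homonym: the fish and the instrument
    "theology": ["2"],
    "religious dogma": ["2"],      # synonym of "theology" for our purposes
    "car": ["629.222"],
    "automobile": ["629.222"],     # synonym
    "voiture": ["629.222"],        # different language
    "colour": ["535.6"],
    "color": ["535.6"],            # variant spelling
}


def classes_for(term):
    """Return the candidate class numbers for a search term."""
    return TERM_TO_CLASSES.get(term.strip().lower(), [])


def same_subject(term_a, term_b):
    """True if the two terms share at least one class number."""
    return bool(set(classes_for(term_a)) & set(classes_for(term_b)))


print(classes_for("bass"))             # ['597', '787']
print(same_subject("car", "voiture"))  # True
```

A homonym such as "bass" comes back with two candidate classes, so the
searcher (or the search form) picks "597" or "787" before any text matching
takes place.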

So, when searching, class-based searching would be used first to identify a
set of "candidate" documents relevant to the subject in question;
low-level content-based text searching would then be used, if necessary, to
focus the search on highly specific topics.
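
A rough sketch of that two-stage search, in Python. It assumes that a class
number can be matched by simple string prefix (so "629" also selects
"629.222"), which is a simplification of how decimal classification works;
the Document type, the search function and the sample documents are all
invented for illustration.

```python
from dataclasses import dataclass


@dataclass
class Document:
    url: str
    class_code: str  # value taken from the document's META "Class" entry
    text: str


def in_class(document, class_prefix):
    # Assumption: a longer code is a narrower subject, so a prefix test
    # selects a class together with all of its subdivisions.
    return document.class_code.startswith(class_prefix)


def search(documents, class_prefix, phrase=None):
    # Stage 1: class-based search picks the candidate documents.
    candidates = [d for d in documents if in_class(d, class_prefix)]
    if phrase is None:
        return candidates
    # Stage 2: optional content-based text search narrows the candidates.
    needle = phrase.lower()
    return [d for d in candidates if needle in d.text.lower()]


# Sample documents made up for this sketch.
DOCUMENTS = [
    Document("http://example.org/fish", "597", "Bass are freshwater fish."),
    Document("http://example.org/music", "787", "The bass is a stringed instrument."),
    Document("http://example.org/voiture", "629.222", "La voiture est rouge."),
    Document("http://example.org/car", "629.222", "A car with a diesel engine."),
]

# "bass" the fish: the class filter removes the instrument before text search.
print([d.url for d in search(DOCUMENTS, "597", "bass")])
# Cars in any language: no text search needed at all.
print([d.url for d in search(DOCUMENTS, "629.222")])
```

The text search only ever runs over the candidate set, which is where the
reduction in "false-positive" results would come from.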

I would very much welcome comments on this idea.

Regards to all,



--
Jon Wallis         Senior Lecturer in Information Systems Engineering
School of Computing & I.T., University of Wolverhampton, UK - WV1 1SB
   Personal WWW Home Page   <URL:http://www.scit.wlv.ac.uk/~cm1906>
     University WWW Home Page <URL:http://www.scit.wlv.ac.uk/> 
-----------------"That's some catch, that catch-22"------------------