Suggestion: URL string-search syntax

Mark Torrance - Sun BOS Sunlabs (torrance@pompeii.east.sun.com)
Tue, 24 May 1994 15:56:47 +0500


I have a suggestion for an addition to the #key syntax currently used
to enable a URL to refer to a labelled anchor within a document.

The suggestion is to extend this syntax to support reference to an
arbitrary text string contained within the referenced document. As
with present #key URLs, this search would be done by the client after
the document is retrieved, and the # and whatever follows it would not
be sent to the server as part of the URL.

My suggestion is that #! be reserved as the prefix for a string to
be searched for in this fashion. So <http://www.ai.mit.edu#!finance> would
retrieve the URL <http://www.ai.mit.edu> and then search for the first
occurrence of finance. If a case-sensitive match is found, the browser would
display that portion of the document (and probably highlight the string, or
otherwise indicate where it is). If not found, I suggest just going to the
top of the retrieved document with nothing highlighted. I also suggest
allowing spaces in the search string; this works at present in NCSA X Mosaic.
Should they be escaped? Should we use a different syntax in the URL for this?
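
As a rough illustration of the client-side behaviour described above, here
is a minimal sketch in Python. It is not the NCSA X Mosaic change itself; the
function names split_search_url and locate are hypothetical, and the search
string is used exactly as written, since whether spaces should be escaped is
still an open question.

```python
# Hypothetical helper names; nothing here is taken from the NCSA X Mosaic changes.

def split_search_url(url):
    """Split a URL into (url_to_fetch, search_string).

    search_string is None unless the fragment starts with "!".
    The "#" and everything after it are never sent to the server.
    """
    base, sep, fragment = url.partition("#")
    if sep and fragment.startswith("!"):
        return base, fragment[1:]
    return base, None  # ordinary #key anchors are handled elsewhere


def locate(document_text, search_string):
    """Return (scroll_offset, highlight_length) for the retrieved document."""
    if search_string:
        # str.find is case sensitive and returns the first occurrence, or -1.
        pos = document_text.find(search_string)
        if pos != -1:
            return pos, len(search_string)
    return 0, 0  # not found: top of document, nothing highlighted
```

A client would call split_search_url on the full URL, retrieve the first
element of the result, and then pass the retrieved text and the search string
to locate to decide where to scroll and what to highlight.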

I have implemented this functionality by making small changes to NCSA X Mosaic.
If someone from NCSA would like to consider adding these changes, write me and
I will send them to you.

Reasons for this change:
I would like to be able to refer to parts of a document that was not
structured to facilitate such reference; for example, a document of which I
am not the author. Eventually, I would like to see byte-offset ranges
available as a way to refer to parts of other documents as well.

This will support annotations which refer to specific parts of a source
document. Smart browsers may be able to indicate the presence of these
annotations with marginal notes or changebars next to the sections to which
they refer.

There may be concern that the URL will break if the document is changed.
This is a concern already with references to internal named anchors.
One possible solution would be some sort of short identifier, automatically
computed by a browser/poster at the time a user is making an annotation,
which is unlikely to match anything but the original section, and which
may be designed to match that section even if it changes slightly. We are
currently working on the design of such an identifier; suggestions are
welcome.
______________________________________________________________________________

Mark Torrance Tel: (508) 442-0812
Sun Microsystems Laboratories, Inc. Fax: (508) 250-5067
2 Elizabeth Drive (Mailstop: UCHL03-207) Net: torrance@east.sun.com
Chelmsford, MA 01824-4195 USA
______________________________________________________________________________