Re: Client-side highlighting; tag proposal

Joe English (jenglish@crl.com)
Tue, 14 Mar 1995 23:13:58 +0500


sjd@ebt.com (Steven J. DeRose) wrote:

> At 1:35 AM 3/14/95 +0500, Joe English wrote:
> >This is not much of an issue for HTML documents on the Web,
> >since they tend to be small and are rendered as a single unit
> >anyway. It's not like a browser is going to display the book of
> >Leviticus and have to worry about a marked region starting in Exodus
> >and ending in Deuteronomy.
>
> On the contrary, that is *exactly* the problem. I do have Leviticus on a
> web site, and although my server is kind enough to break it into net-size
> chunks if/when asked, I sure do have to know whether there is some
> long-distance thing in effect, otherwise we can't know to send whatever
> start-tag caused it when sending a smaller piece.

The *browser* only needs to worry about the net-sized chunks though.

Would HyTime spanlocs (or equivalent) help the server in a case like this?
That is, would it be any easier for the server to recognize that
Leviticus is in the middle of a marked range if the range
were identified by an external locator instead of by embedded
<MARK> elements? (or <SPOT> elements, which I'm starting to
like better now.) Maybe it would -- there's an obvious optimization
for the case where both locators are treelocs, but in the general
case I really don't see how it would help.
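
To illustrate the treeloc case only: if each locator is modelled as a
path of 1-based child indices from the root (a simplification of how
treelocs address nodes, not actual HyTime syntax), then asking whether a
chunk starts inside the range reduces to comparing paths in document
order. The function name and the book/chapter numbering below are made
up for the example.

```python
def starts_inside(chunk, start, end):
    """True if the node at path `chunk` begins within the range start..end.

    Each argument is a tuple of 1-based child indices from the root.
    Python compares tuples lexicographically, and a prefix sorts before
    its extensions, which matches the order of start-tags in the document.
    """
    # A node inside the end node's subtree still belongs to the range,
    # even though its path sorts after the end path.
    in_end_subtree = chunk[:len(end)] == end
    return start <= chunk and (chunk <= end or in_end_subtree)


# Hypothetical numbering: root = 1, Genesis = 1, Exodus = 2,
# Leviticus = 3, Numbers = 4, Deuteronomy = 5; third index = chapter.
RANGE_START = (1, 2, 20)   # somewhere in Exodus
RANGE_END = (1, 5, 5)      # somewhere in Deuteronomy

leviticus_1 = (1, 3, 1)
genesis_1 = (1, 1, 1)

print(starts_inside(leviticus_1, RANGE_START, RANGE_END))
print(starts_inside(genesis_1, RANGE_START, RANGE_END))
```

The server never has to look at the text between the two locators here,
which is what makes this case cheap. Nothing comparable falls out when
the locators are of some other kind.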

> >> Likewise, one cannot easily build a stack-based
> >> formatter, e.g. that keys styles off the list of element types in one's
> >> ancestry.
> >
> >This is only partly true, and irrelevant besides.
> >If the browser is going to include this functionality --
> >highlighting regions that may cross element boundaries --
> >it can't use ancestor-driven style resolution in any case,
> >regardless of how the regions are identified.
>
> Your critique is incorrect. Existence proof: open a dynatext book, since
> dynatext does in fact use "ancestor-driven style resolution" for SGML.
> It quite happily supports "highlighting regions that may cross element
> boundaries" -- just do a drag-select or a phrase search and watch.

That's basically the point I was trying to make.
See if this makes sense:

> >> Likewise, one cannot easily build a stack-based
> >> formatter, e.g. that keys styles off the list of element types in one's
> >> ancestry.

[ my comment sneakily removed to make it look like Steven was
answering the above and not me --JE ]

> Your critique is incorrect. Existence proof: open a dynatext book, since
> dynatext does in fact use "ancestor-driven style resolution" for SGML.
> It quite happily supports "highlighting regions that may cross element
> boundaries" -- just do a drag-select or a phrase search and watch.

> >And lastly, you *can* use a single-pass parser with a stack-based
> >formatter to keep track of marked spans.
>
> Precisely my point: you must do O(n), not O(lg n). Is that not unfortunate?

How do you format a document with n elements in O(lg n) time?

Once you've *parsed* the document, you can *display any piece of it*
in O(m * d) time (m being the size of the piece, d being the
depth of the element hierarchy -- not exactly O(lg n), but close enough).
This is true of the scheme I had in mind too; tracking the marked
regions during parsing is no more expensive than processing an
LPD would be (that's cheap, BTW). It boils down to doing
something like #POSTLINK processing during the parse; you don't
need to scan backwards during rendering.
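
A minimal sketch of the single-pass idea, under assumptions made purely
for illustration: an empty <MARK> tag toggles a region on or off (this
is not the proposed syntax), the input is well nested, and the names
TOKEN, parse and render are hypothetical. Each text run is stored with
its ancestor list and its marked state, so any slice can be displayed
later without looking at what came before it.

```python
import re

# Start-tag, end-tag, or a run of text. Attributes are ignored on purpose.
TOKEN = re.compile(r"<(/?)([A-Za-z][A-Za-z0-9]*)>|([^<]+)")


def parse(source):
    """One pass over the source; returns (ancestors, marked, text) runs."""
    stack = []       # open element types, outermost first
    marked = False   # are we inside a marked region right now?
    runs = []
    for closing, name, text in TOKEN.findall(source):
        if text:
            runs.append((tuple(stack), marked, text))
        elif name.upper() == "MARK":
            marked = not marked      # empty marker: toggles, never pushed
        elif closing:
            stack.pop()              # assumes well-nested input
        else:
            stack.append(name.upper())
    return runs


def render(runs):
    """Display a slice of runs; each run costs O(d) for the style lookup."""
    out = []
    for ancestors, marked, text in runs:
        style = "/".join(ancestors)  # stand-in for ancestor-driven styles
        out.append(f"[{style}{'*' if marked else ''}] {text!r}")
    return out


doc = "<DIV><P>one <MARK>two</P><P>three<MARK> four</P></DIV>"
runs = parse(doc)
for line in render(runs[2:3]):   # display only the third run
    print(line)
```

The marked region here crosses a </P><P> boundary, yet the third run
still comes out flagged, and render never consults the earlier runs.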

I agree with the rest of your points (which I've deleted); HyTime or
HyTime-like locators would be a better approach than <MARK> or
<SPOT> elements. I *don't* agree that <MARK> should be
dropped from HTML 3 just because HyTime could do it better, though, any
more than <P align=center> should be eliminated once stylesheets come
along. <MARK> has the distinct advantage of simplicity, for both
browsers and search agents. The arguments against it are IMO not valid.

I also agree that <MARK>-like elements have no place in
DTDs intended for large-scale authoring. HTML is not such
a DTD, though. It's for lightweight presentation and delivery,
and <MARK> is a good lightweight mechanism.

I also also agree that universal support for HyTime in Web browsers
would be great. I have serious doubts that it will happen
any time soon though; the most popular browsers around still don't
support entity declarations or marked sections, and don't even get
attribute value literals right. (If the release of Panorama
causes a mass migration away from Netscape, I'll reconsider
this one, but until that happens I have little hope that HyTime
is a viable solution for the Web as a whole.)

--Joe English

jenglish@crl.com