Keeping HTML Simple & Format negotiation between Browser & Server

murphy@dccs.upenn.edu
Thu, 27 May 93 09:46:03 -0400


There's been a discussion on www-talk about how complex HTML should
become -- and the emerging opinion seems to be: keep HTML simple, and
allow more complicated formats to be embedded "in-line" in the HTML,
and negotiated between the server and the client.

Dave Raggett:
>> My suggestion is that we follow the approach taken with the PC and allow
>> object level embedding in HTML. This basically means you keep HTML small
>> and simple while allowing people to embed figures etc. in a foreign format.
>> The embedding uses a link like Mosaic's <IMG> tag. The embedded data can
>> be sent along with the main document using MIME's multipart capability.
>> This approach also allows simple browsers to negotiate the format, so
>> that the server can be asked to render equations into bitmaps etc.
>>
>> This approach avoids the danger of HTML being continuously extended to
>> support an ever-increasing variety of needs. By decoupling HTML
>> from special purpose formats, I believe that the latter can evolve
>> faster and more effectively, than if they were tied to revisions to HTML
>> itself.
>>
>> That said, I am drafting a proposal for HTML+ (a superset of HTML) and
>> would be happy to draft ideas for supporting equations in a presentation
>> independent format. I don't think we should tie ourselves to TeX's approach,
>> but should take the good ideas from a wide variety of sources: TeX, eqn,
>> Microsoft Word, Mac, ...
>>
>> Comments please!
>>
>> Best wishes,
>>
>> Dave Raggett,

So, for example, the client could send the server a list of the
pre-defined formats it can handle (e.g. IMG, TeX, PostScript, etc.),
and the server could then choose among the different formats of each
item embedded in the document to give the client something it can
handle. Sometimes the server might not be able to convert an item
into any of the formats that the client (browser) can deal with, and
then it's up to the browser to display an intelligent message about
it, perhaps even giving the user information so he can find software
for his desktop environment which DOES deal with that format.
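
Here is a rough sketch of that selection step, just to make the idea
concrete. It is not a proposed protocol: the function name
choose_format and the format names are made up for illustration, and
I'm assuming the client lists its formats in order of preference.

```python
from typing import Optional, Sequence


def choose_format(client_accepts: Sequence[str],
                  server_has: Sequence[str]) -> Optional[str]:
    """Return the first format the client listed that the server can supply.

    The client's list is treated as ordered by preference (an assumption).
    None means there is no common format, in which case the browser has
    to tell the user about it.
    """
    available = set(server_has)
    for fmt in client_accepts:
        if fmt in available:
            return fmt
    return None


# Hypothetical format names, for illustration only.
equation_formats = ["TeX", "PostScript", "bitmap"]
print(choose_format(["bitmap", "PostScript"], equation_formats))
print(choose_format(["audio"], equation_formats))
```

The second call finds nothing in common, which is the case where the
browser would have to display its "intelligent message".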

Elements needed:

1) An ever-growing dictionary of pre-defined format names and
agreement on what those format names mean. It has to be ever-growing
because it's a fact of life -- there are always going to be new types of
data formats coming onto the scene. The server and the browser
(client) both need access to this dictionary, perhaps getting a new
version of it every once in a while (either automatically, or perhaps
upon the user's request).

2) The document format (HTML, HMML, whatever) needs to accommodate
documents which have multiple items in them, which could be of
different types, such as an embedded graph and a mathematical formula.
AND each of the items themselves COULD be expressed in a number of
different format types. The information which the alternatives convey
would not have to be exactly equivalent -- sometimes a picture is worth a
thousand words! But if text is all your device can display, you might
settle for the words. The person who maintains the document can put
both text & image in the document, and the image would be displayed if
possible.

3) Either:

a) The protocol which the server and the client use must allow the
client to tell the server which pre-defined format types it can deal
with, and then the server can send back a version of the document
which contains those types.
OR
b) The protocol has no negotiation; the server sends back the entire
document and the client can sort it out. This COULD be costly in
terms of network bandwidth and computing resources at the desktop...
OR
c) Is there another alternative?

4) Whichever way is chosen for #3, the browser (client) needs a
lot of information in it about what the desktop environment is like
(e.g. it can display PostScript by starting up XPostScriptViewer
with arguments "-f <filename> -mono", and it can display bitmap images
by starting up "xv <filename>" or by starting up GIFFER, etc.).

5) The user will probably want to be able to customize how the browser
chooses desktop applications (e.g. he installs a new image viewer, or
a device driver & software for sound playback).
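
To illustrate element 2 in the list above, here is one way a document
with several items, each available in alternative formats, might be
represented. This is only a sketch under my own assumptions: the Item
class, the format names, and the rule "first alternative the device
can display wins" are all invented here, not part of HTML.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional, Set


@dataclass
class Item:
    """One item in a document, available in one or more formats."""
    # Maps a format name to the data in that format, listed in the
    # maintainer's order of preference (dicts keep insertion order).
    alternatives: Dict[str, bytes]

    def best_for(self, displayable: Set[str]) -> Optional[str]:
        """Return the first listed format the device can display, if any."""
        for fmt in self.alternatives:
            if fmt in displayable:
                return fmt
        return None


def formats_shown(doc: List[Item], displayable: Set[str]) -> List[Optional[str]]:
    """For each item, report which alternative would be displayed."""
    return [item.best_for(displayable) for item in doc]


# A two-item document: plain text, then a graph with a text fallback.
document = [
    Item({"text": b"Sales rose steadily in 1992."}),
    Item({"bitmap": b"<image bytes>", "text": b"[Graph: sales by quarter]"}),
]

print(formats_shown(document, {"text", "bitmap"}))
print(formats_shown(document, {"text"}))
```

A device that can show bitmaps gets the graph; a text-only device
settles for the words.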

The complexity of how to translate and display these "foreign" formats
could then be shifted to the OTHER applications in the desktop
environment, and kept out of the browser, and out of HTML.
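
As a sketch of what the browser's knowledge of the desktop (elements 4
and 5) might look like: a table from format name to viewer command,
with the user's own entries overriding the defaults. The viewer
commands are the ones mentioned above; the format names, the "{file}"
placeholder, and the function name viewer_command are assumptions made
up for this example.

```python
from typing import Dict, List, Optional

# Built-in defaults. "{file}" marks where the file name goes
# (a convention invented for this sketch).
DEFAULT_VIEWERS: Dict[str, List[str]] = {
    "postscript": ["XPostScriptViewer", "-f", "{file}", "-mono"],
    "bitmap": ["xv", "{file}"],
}


def viewer_command(fmt: str, filename: str,
                   user_viewers: Optional[Dict[str, List[str]]] = None
                   ) -> Optional[List[str]]:
    """Build the argument list for displaying `filename`.

    Entries in `user_viewers` override the built-in defaults.
    Returns None if no viewer is known for the format.
    """
    table = dict(DEFAULT_VIEWERS)
    table.update(user_viewers or {})
    template = table.get(fmt)
    if template is None:
        return None
    return [filename if arg == "{file}" else arg for arg in template]


print(viewer_command("postscript", "paper.ps"))
# The user has installed a different image viewer:
print(viewer_command("bitmap", "fig1.gif", {"bitmap": ["GIFFER", "{file}"]}))
print(viewer_command("sound", "hello.au"))
```

The resulting list is what the browser would hand to the operating
system to start the viewer; the browser itself never needs to
understand the format.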

Problems with this model:

1) If the browser starts up another application to handle a portion of
the document in a foreign format, what happens if that item in the
document was SUPPOSED to be in-line with other portions of the
document? How does the browser pull it back into the display of
the document and make it part of the whole?

Perhaps:
a) There's a way to tell the application to dump the results into a
file or a pipe, which the browser then reads & incorporates into
the document, or

b) HTML accommodates embedded binary data, with a tag introducing it
as sound, or a bitmap image, or whatever, and an integer which says
how long the binary data stream is in bytes. But I believe the
assumption here is that the server MUST know about the devices that
are at the desktop in order to generate, e.g. a reasonable image.

2) What if one wants to put hot-spots (links) into one of these
foreign format objects? For example, suppose one wants the user to
be able to point & click on a portion of the mathematical formula,
which is linked to another mathematical formula (or a graph, or a
piece of text, etc) which more fully explains it?
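
Going back to option (b) under problem 1, here is a sketch of how a
tag plus a byte count could carry binary data inside a document. The
<BINARY TYPE=... LENGTH=...> syntax is purely hypothetical, invented
for this example; nothing like it is defined in HTML as far as I know.

```python
import re
from typing import Tuple

# Hypothetical tag syntax, invented for this sketch:
#   <BINARY TYPE=bitmap LENGTH=4> followed by exactly LENGTH raw bytes.
TAG = re.compile(rb"<BINARY TYPE=(\w+) LENGTH=(\d+)>")


def embed(kind: str, data: bytes) -> bytes:
    """Wrap raw bytes in the hypothetical tag, recording their length."""
    return b"<BINARY TYPE=%s LENGTH=%d>" % (kind.encode("ascii"), len(data)) + data


def extract(stream: bytes, start: int = 0) -> Tuple[str, bytes, int]:
    """Find the next embedded object at or after `start`.

    Returns (type, data, offset just past the data). Because the length
    is given up front, the data is skipped rather than scanned, so it
    may freely contain "<" or ">" bytes.
    """
    match = TAG.search(stream, start)
    if match is None:
        raise ValueError("no embedded object found")
    length = int(match.group(2))
    begin = match.end()
    data = stream[begin:begin + length]
    if len(data) != length:
        raise ValueError("stream ended before the embedded data did")
    return match.group(1).decode("ascii"), data, begin + length


doc = b"<P>Figure 1: " + embed("bitmap", b"\x00<\xff>") + b"</P>"
kind, data, end = extract(doc)
print(kind, len(data), doc[end:])
```

The byte count is what lets the browser pick the text back up after
the binary portion without having to understand the data itself.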

________________________________________________________________________
| |
| Linda A Murphy Internet: murphy@dccs.upenn.edu |
| Network Engineering Data Communications and Computing Services |
| University of Pennsylvania (215) 898-9534 |
|______________________________________________________________________|