Miscellaneous points

J.Larmouth@iti.salford.ac.uk
2 Dec 93 11:38


=========================================================================
E-mail from: Prof J Larmouth J.Larmouth @ ITI.SALFORD.AC.UK
Director Telephone: +44 61 745 5657
IT Institute Fax: +44 61 745 8169
University of Salford Telex: 668680 (Sulib)
Salford M5 4WT
- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -

To: WWW-talk @ info.cern.ch

(I sent the following on another list, and it was suggested that the
WWW-talk list - of which I am not a member - would be a better place to
send it. I DO NOT KNOW HOW TO SUBSCRIBE - COULD SOMEBODY LET ME KNOW SO
THAT I CAN SEE ANY RESULTING DISCUSSION? Thanks. John L)

Some of the following (use of WWW in the UK.AC and Error and congestion
handling) is probably less relevant for WWW-talk than it was for the
original list, so please skip these sections.

The original mail follows:

>>>>>>>>>>>>>>>>>>>>>>

Subject: Miscellaneous points

(If you don't like long and rambling E-mails, delete this NOW!
If you do, and you think some of the thoughts are worth wider
discussion, please copy to other lists if you wish.)

Contents:

1. Use of WWW in UK.AC
2. Negative remarks about WWW
2.1 HTTP
2.2 HTML
2.3 Error and congestion handling
3. Philosophical thoughts on networked information
3.1 Stateless servers are good?
3.2 Synchronisation primitives
4. Proposed WWW/APW feasibility study project
5. Final philosophical mutterings

1. Use of WWW in UK.AC
===========================

I believe that the UK.AC needs a top-level WWW home page (probably
maintained by someone on the JNT or on contract to them) which carries
pointers to **ALL** known networked information sources in the UK.AC.

(When I say ALL, I mean top-level - a site may well contain a subsidiary
page giving sources at that site. These would be listed at the UK level
only if they were widely accessed from other sites and of general utility
such as the Lancaster archives or the BIDs database.)

It would contain sections on WWW home pages for sites, gopher, archie,
WAIS, etc servers, anonymous ftp sites, sites providing information
via an E-mail request, software archives, BIDs etc databases, and
mailing list servers with the lists they contain. In each case there
should be enough information to identify the sort of information
available from that site via that means, and (where it is not a simple
URL) details on how to go about getting the info.

I would suggest a primary document with this info, with indexes
(supported by hypertext links) that would reveal the data either by
subject category or by access mechanism.

I suspect the info should not only be an HTML document, but should also
be searchable with WAIS, Gopher, Archie, Winifred (or whatever her
name is!) etc.

Over to someone!

2. Negative remarks about WWW
==================================

2.1 HTTP
------------

The main function of HTTP seems to be to do the OSI presentation layer
negotiation of transfer syntaxes, with announcement of the abstract
syntaxes supported. I am *very* pleased these concepts have been
introduced, but would have preferred to have seen closer alignment of
terminology with (and recognition of) the OSI work.
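
(To make the negotiation concrete, here is a minimal sketch in Python. It
assumes a much simplified "Accept"-style header in which the client lists
the formats it can handle, each with an optional q preference value. The
function names parse_accept, matches and choose_representation are
invented for illustration and come from no specification; real HTTP
negotiation has further rules, such as preferring more specific types,
that are left out here.)

```python
def parse_accept(header):
    """Parse a simplified Accept header into (media_type, q) pairs."""
    prefs = []
    for item in header.split(","):
        parts = [p.strip() for p in item.split(";")]
        media_type = parts[0]
        if not media_type:
            continue
        q = 1.0  # a type with no q parameter is fully acceptable
        for param in parts[1:]:
            if param.startswith("q="):
                try:
                    q = float(param[2:])
                except ValueError:
                    q = 0.0  # treat an unreadable q value as "not acceptable"
        prefs.append((media_type, q))
    return prefs


def matches(pattern, media_type):
    """True if a client pattern such as 'image/*' covers media_type."""
    if pattern == "*/*":
        return True
    p_type, _, p_sub = pattern.partition("/")
    m_type, _, m_sub = media_type.partition("/")
    return p_type == m_type and p_sub in ("*", m_sub)


def choose_representation(accept_header, available):
    """Return the available format the client prefers most, or None."""
    best_type, best_q = None, 0.0
    for pattern, q in parse_accept(accept_header):
        for candidate in available:
            if matches(pattern, candidate) and q > best_q:
                best_type, best_q = candidate, q
    return best_type
```

(A server holding both a text/html and a text/plain form of a document
would call choose_representation with the client's header and its own
list of formats, and send back whichever form is returned.)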

2.2 HTML
------------

(This is based on MOSAIC documents, not the primary reference material,
so there could be a misunderstanding - someone please correct me.)

HTML is *NOT* SGML-conformant. It should be mended. What I am
referring to is the ridiculous use of <P> as an *end*-tag for a
paragraph, with compulsory omission of the start-tag - there is no
start-tag! This violates the rules of SGML on tag-omission, and makes
authoring an HTML document distinctly non-ergonomic.

If this is not mended quickly, it will be too late. Perhaps too late
now.

Does anyone have access to the right places to get this problem
addressed?

I have yet to locate an actual HTML DTD (as opposed to tutorials) but I
get the impression that omission of end-tags on <H1>, <H2>, etc is not
permitted (or at least not supported by most clients). If I am right,
then this is again very *unergonomic*, and should be addressed.

2.3 Error and congestion handling
-------------------------------------

When using both MOSAIC and CELLO, there are frequent occasions when
they cannot connect to a site and/or when the transfer of material just
grinds to a halt. Even when it doesn't, transfer rates of about 2K
bytes per minute seem quite common!

The error reports on failures to connect are not very helpful. This is
partly poor design of the client software and/or lack of diagnostic
parameters in WINSOCK specifications, and/or the fact that we are in
early releases, but I think it may also reflect a lack of appropriate
information at the network level.

For example, am I getting failures to connect and/or poor transfer rates
and/or halts because of the link from Salford to the Janet back-bone,
because of the pipe across the Atlantic, or because the server is very
congested/slow, or because my PC is too slow, or what? I have a
strongish suspicion it may be the Salford link and/or routers, but there is
no easy way to find out.

I have no answer to this sort of problem, but as world-wide access to
information servers becomes more common, pin-pointing where problems lie
will become more important. Certainly at present there is no way I would
want to expose the VC to either MOSAIC or CELLO - too many failures, and
far, far too slow. Hope it changes soon tho' - the idea is great!

3. Philosophical thoughts on networked information
=======================================================

I think we need to ponder a little on the appropriate metaphors for
developing multi-media services. There was some discussion earlier on
what to base JNT and European activity on, and the consensus seemed to
be WWW, which really means HTML. I don't wish to quarrel with that
decision - it looks like a fair starting point (but see the negative
comments above).

Someone, however, remarked in earlier discussion on this list that
there were two sorts of multi-media information/presentation, and WWW
was good for one but APW was optimised for the other.

I think we need to think a bit more about that.

3.1 Stateless servers are good?
-----------------------------------

As a starting point, I see the concept of STATELESS SERVERS as a
fundamental issue. WWW is based on the philosophy of a single
world-wide distributed server that is stateless. Stateless servers are
clearly easier to implement efficiently because there is no need for the
server to maintain per-user information. They are probably a GOOD THING.

But they *do* limit what you can do!

If you think about a computer game (take LEGEND as an example of one I
know fairly well), you will recognise that retention of state is
fundamental to the operation of the game. This goes from simple
parameterisation to let you give characters names of your own choosing
through to quite complex data-structures holding details of the objects
being carried at any point in time and spells that have been mixed.
There is also a database that relates to whether a particular hot-spot
has an object in it that can be picked up. Hitting particular hot-spots
up-dates these data-structures, sometimes with a conditional test (is
the character already carrying as much as can be carried?)

It is part of my thesis here that anything you need to handle a networked
computer game, you also need for more professionally oriented services.

HTML appears to have a tag for a "variable", but I have not seen it in
use in any actual document, and am not fully clear how powerful the
mechanism is, but it sounds as tho' it relates to the above, and could
perhaps form the basis for providing support for controlled (by the HTML
document) state retention by the client with a stateless server.

Indeed, I could foresee a useful world-wide standardisation of some
variables (please use OBJECT IDENTIFIERS for the variable names!), such
as the user's name and/or e-mail address. Information delivered to users
could then be personalised without introducing state into the server.
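
(A minimal sketch of the idea in Python, assuming the client keeps a
small table of variables and sends it along with every request. The
server is then a pure function of the request plus that table and
remembers nothing between requests. Everything named here - serve_page,
HOTSPOT_OBJECTS, MAX_CARRIED, the variable names user_name and carried -
is hypothetical, and plain names are used instead of OBJECT IDENTIFIERS
only to keep the sketch short.)

```python
# Fixed game data held by the server; it never changes per user.
HOTSPOT_OBJECTS = {"cave": "lamp", "well": "rope", "hut": "key", "pond": "coin"}
MAX_CARRIED = 3  # hypothetical limit on what a character can carry


def serve_page(hotspot, state):
    """Stateless server: the reply depends only on the request and on the
    state the client sent with it. Returns (page_text, new_state)."""
    name = state.get("user_name", "stranger")
    carried = list(state.get("carried", []))  # copy, never modify the input
    obj = HOTSPOT_OBJECTS.get(hotspot)
    if obj is None or obj in carried:
        text = f"{name}, there is nothing to pick up here."
    elif len(carried) >= MAX_CARRIED:
        # The conditional test: is the character already carrying
        # as much as can be carried?
        text = f"{name}, you cannot carry any more."
    else:
        carried.append(obj)
        text = f"{name}, you pick up the {obj}."
    new_state = dict(state, carried=carried)
    return text, new_state
```

(The client stores the returned state and presents it again when the next
hot-spot is hit. The personalisation by name and the "already carrying
too much" test both work, yet the server holds no per-user information.)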

Going far enough to support LEGEND in this way requires rather more
study, but I commend the intellectual activity!

3.2 Synchronisation primitives
----------------------------------

If we are seriously considering extending WWW to have the functionality
of APW, then we have an interesting task in hand.

WWW is a mesh of information, and the delivery of the various parts of
that mesh is largely uncoordinated: the distinction between in-line and
non-in-line images is about the only synchronisation that is present.

A fundamental need is to introduce into the HTML mark-up tags that
reflect various synchronisation requirements between the delivery of
different sources, and indeed the whole concept of multiple presentation
threads.
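
(As one possible reading of "synchronisation requirements", here is a
minimal sketch in Python. It assumes each source to be presented has a
known playing time and a list of other sources that must finish before it
starts; sources with no requirement between them run as parallel
presentation threads. The function name schedule and the data layout are
assumptions made for illustration only, and nothing here is taken from
HTML or from APW.)

```python
def schedule(items):
    """Work out when each source may start.

    items maps a source name to (duration_in_seconds, [names that must
    finish first]). Returns a dict mapping each name to its start time.
    """
    start = {}
    visiting = set()  # names currently being resolved, to detect loops

    def start_time(name):
        if name in start:
            return start[name]
        if name in visiting:
            raise ValueError(f"circular synchronisation requirement at {name!r}")
        visiting.add(name)
        _, must_finish_first = items[name]
        # A source starts when the last of its prerequisites has finished.
        t = max((start_time(d) + items[d][0] for d in must_finish_first),
                default=0)
        visiting.discard(name)
        start[name] = t
        return t

    for name in items:
        start_time(name)
    return start
```

(For example, a title lasting 2 seconds, followed by narration and slides
that both wait for the title and run together, followed by a quiz that
waits for both, gives two parallel threads joined by a synchronisation
point before the quiz.)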

As a starting point, it would be an excellent intellectual exercise to
take the functionality represented by an APW tree of icons and see if we
can produce a definition of tags that would completely reflect that
functionality.

(This is actually another way of saying "Extend HTML so that the
underlying format for APW *could* be HTML, with no loss of
functionality".)

Only if this is done can the vendors of Authorware (or anyone else if
Authorware were prepared to release their private formats) produce
software to turn an Authorware program into a "standard"
vendor-independent form.

I am, by the way, fairly convinced that the job can be done, that SGML
*is* man enough for it, but it will need a conscious effort and
coordinated programme.

4. Proposed WWW/APW feasibility study project
==================================================

It would be interesting to tackle the following project, which would
result in an extended HTML (XHTML) and a prototype APW look-alike that
would run on any XHTML client:

a) Examine the APW icon and flow diagram functionality and
define an appropriate set of tags to reflect that functionality.

(This would in principle make it possible for a human being to
translate any APW program into XHTML such that any XHTML client
could play the programme back with a result for the user that is
identical to the playing of the original APW programme.)

In order to do this, the problems of synchronisation tags
and of tags for state variables, and of tags for DO WHILE loops
will have to be addressed. The work of tag definition should,
of course, be done in a general way, but must provide the full
APW functionality - at least!

b) Consider a further extension of XHTML (XXHTML) to define tags
that, when encountered by an XXHTML client, cause it to take
actions that result in the production of an HTML document. This
could be extended to other sorts of actions, particularly
related to the disposal (for example, E-mailing) of the
document, or the automatic invocation (controlled by clicking
a hot-spot in the document) of FTP or TELNET or E-mail.

c) Using the work of b), define an XXHTML document that
provides a display similar to the APW display of programme icons,
and such that "browsing" that document in a suitable way causes
an XHTML document to be generated that is equivalent to an APW
programme. In other words, an XXHTML document that gives any
XXHTML client the functionality of APW to produce a multi-media
programme, albeit perhaps with a poorer user interface.

5. Final philosophical mutterings
======================================

If you have followed the last section, you will see the overall
direction I seem to be heading in.

I can already hear mutterings of "horses for courses". SGML is *not*,
and never can be, a programming language. Tags that cause actions (by
the client) when encountered, rather than merely identifying document
components, are outside the "spirit" of SGML. But I reply: so are tags
that cause hypertext links to be activated! And those are fundamental to
HTML.

It is possible the mutterers are right, but we will never know unless
we explore the feasibility in more detail.

If the mutterers *are* right, then WWW is, I think, *not* a suitable
tool for providing networked support for the sort of multi-media
programmes that APW is used to produce, and we need to investigate some
other base for these sorts of multi-media programmes.

Personally, however, I have little sympathy with the view that "there
are two sorts of programme, and we need different tools for each". I
believe there will be many programmes that will need both approaches,
and a common (or at least integrated) tool base would be important.

John L