from Dr. SGML^H^H^H^H Macro...

Marc Andreessen (marca@ncsa.uiuc.edu)
Fri, 18 Jun 93 20:17:17 -0500


From: drmacro@vnet.IBM.COM
Date: Fri, 18 Jun 93 08:40:05 EDT
Newsgroups: comp.text.sgml
Subject: HyTime Finite Coordinate Space Locations
Disclaimer: This posting represents the poster's views, not those of IBM
News-Software: Usenet 3.1

A typical problem with online presentation systems is defining
hyperlink anchors and other things within bitmap
images or other multimedia data notations that have no facility
for defining locations within themselves or that are not revisable.
Most systems I've seen provide some sort of addressing mechanism
that applies a location onto the object. For example, the IPF
online help system under OS/2 lets you create what it calls an
"art link" by specifying a graphic and then coding in the x and
y locations of a hot spot to create a link anchor. It works, but
it's not particularly transportable or flexible. HyTime provides a
more general solution to this problem through its finite coordinate
space (FCS) location element (fcsloc).

Fcsloc lets you impose or overlay a coordinate space onto a data
object such that the boundaries of the data object are aligned with
the coordinate space. Making this work takes a little setup on
the application side, but once the coordinate space is defined, using
an FCSLOC is no more difficult than using any other location mechanism.

The finite coordinate space element (FCS) defines a set of coordinate
axes. Each axis has a measurement domain associated with it, e.g.,
what system of measurement is used for each axis (inches, seconds,
pixels, etc.). A "semantic" FCS is defined in the DTD by declaring
an element that conforms to the FCS architectural form. For example,
to address into graphics, we might define a two-dimensional FCS.

Within the document instance, particular portions of the FCS are defined
by specifying instances of the FCS element defined in the DTD.

Thus, to solve the problem of addressing into images, we first declare
an instance of the FCS form in the DTD itself:

<!ELEMENT GraphicGrid - O (evsched | wand | baton)+ >
<!ATTLIST GraphicGrid
ID ID #IMPLIED
HyTime NAME #FIXED fcs
axisdefs NAMES #FIXED "xaxis yaxis"
>
<!ELEMENT (xaxis | yaxis) - O (#PCDATA) >
<!ATTLIST (xaxis | yaxis)
HyTime NAME #FIXED axis
axisdim CDATA #FIXED "1000" -- Length of axis --
axismeas CDATA #FIXED "gquantum" -- generic quantum --
>

These declarations define the semantic coordinate space GraphicGrid,
which for this example just uses the generic quantum as its measurement
unit, which is fine for overlaying onto graphics of varying resolutions.
To use the fcsloc element, however, we have to actually put an
instance of GraphicGrid in our document. It would look like this:

<GraphicGrid id=maingrid>
<evsched><event></event></evsched></GraphicGrid>

For this simple example, the finite coordinate space contains a single
empty event element. We aren't creating an event schedule here, but
simply defining an FCS to overlay onto a graphic image; however, the
element GraphicGrid has required content, so I had to put something.

The location source for the fcsloc is a
graphic entity, which is addressed via a nameloc element that associates
an ID with the entity name. The fcsloc element also refers to the FCS
element and contains the actual address within the FCS being located:

<nameloc id=graphic-object><nmlist nametype=entity>a-graphic-entity</nmlist></nameloc>
<fcsloc locsrc=graphic-object
impfcs=maingrid>
<extlist>100 150 200 100</extlist>
</fcsloc>

The Locsrc= attribute names the object against which the fcsloc is
applied (the location source). The Impfcs= attribute refers to the FCS
element that defines the grid to overlay onto the object, in this case,
the GraphicGrid element with an id of "maingrid". The extlist element
contains an "extent list", which is nothing more than a set of
dimension specifications, one for each axis, applied in the order the
axis elements are declared in the Axisdefs= attribute of the FCS
element.
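The pairing of extent-list values with axes can be sketched mechanically. This is a hypothetical helper, not part of any HyTime engine; it assumes each axis contributes one (first-quantum, extent) pair, taken in the order the axes appear in the Axisdefs= attribute:

```python
def parse_extent_list(extlist, axisdefs):
    """Split an FCSLOC extent list into per-axis (start, extent) pairs.

    Assumes each axis contributes one (first-quantum, extent) pair,
    taken in the order the axes appear in the FCS's axisdefs= attribute.
    """
    numbers = [int(tok) for tok in extlist.split()]
    if len(numbers) != 2 * len(axisdefs):
        raise ValueError("extent list must supply one pair per axis")
    return {axis: (numbers[2 * i], numbers[2 * i + 1])
            for i, axis in enumerate(axisdefs)}

# The extent list from the fcsloc example above:
dims = parse_extent_list("100 150 200 100", ["xaxis", "yaxis"])
# dims["xaxis"] -> (100, 150): starts at quantum 100, extends 150 quanta
```

So "100 150 200 100" reads as: on the x axis, start at quantum 100 for 150 quanta; on the y axis, start at quantum 200 for 100 quanta.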

Note that for a special-purpose application like IBMIDDoc, the FCS
element would not have to be actually coded in each document, but could
be defined in the application specification and treated as though it
really did exist, so that in practice, all the author has to specify is
the FCSLOC element. For interchange with other HyTime applications, the
real FCS element could be generated dynamically or included via entity
reference to an entity shipped as part of the IBMIDDoc package.

Notice that the measurement domain of the FCS is ignored for the
purpose of doing the location. If I understand it correctly,
the imposed FCS is aligned with the boundaries of the location
source (the graphic in this example). The dimension spec in the
FCSLOC then effectively defines a proportional part of the FCS,
rather than some absolute measurement. In other words, if the
FCS's measurement domain were centimeters, and you specified
a dimension of '1 450' thinking you were asking for 450 centimeters,
you would really be addressing some proportion of the FCS,
dependent on the length of each axis (defined with the axisdim=
attribute of the Axis elements).
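The proportional reading can be sketched with a little arithmetic. This is my own illustration (assuming 1-based quanta, so quantum q covers the slice ((q-1)/axisdim, q/axisdim] of the axis), not something the standard spells out in these terms:

```python
def to_pixels(start, extent, axisdim, pixels):
    """Map a (start, extent) dimension spec, given in axis quanta,
    onto a rendered object `pixels` pixels long.

    Quanta are treated proportionally: a quantum is 1/axisdim of
    whatever the location source's actual length turns out to be.
    """
    px_start = round((start - 1) / axisdim * pixels)
    px_len = round(extent / axisdim * pixels)
    return px_start, px_len

# '1 450' on a 1000-quantum axis addresses the first 45% of the axis,
# whatever the object's real-world size:
to_pixels(1, 450, 1000, 640)   # -> (0, 288)
```

The same '1 450' spec lands on different pixel spans for differently sized renditions, which is exactly the behavior described above.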

This behavior makes sense because computer-presented objects
tend not to have meaningful absolute extents. For example, if
you use a word processor that shows a scale across a document and
labels it as "inches", chances are the ruler on the screen won't match
the ruler on your desk. Also, it's unlikely that the presentation
system will know how big, in absolute measurements, a given object
thinks it is, as that information will always be notation-specific.

Note also that FCSLOC is really an expedient for systems that
don't support event schedules and whose data notations provide
no way, in the notation itself, to define anchor points within
them. Event schedules do provide the richness of expression to
allow the definition of position within an FCS using absolute
measurements. For example, instead of imposing a hyperlink
anchor onto a graphic using FCSLOC, I could create an event
schedule that places both the graphic and the hyperlink anchor
on the FCS and thus relates them to each other within the
same coordinate space.

<GraphicGrid><!-- FCS element, contains event schedule -->
<evsched><!-- Contains events in this coordinate space -->
<event id=the-graphic
extent="1 782 1 6687">
<!-- Contains reference to graphic -->
<object objectname=my-graphic>
</event>
<event id=anchor-in-graphic
extent="100 150 200 100">
<refkey linkends="the-graphic text-anchor">Link to something else</>
</event>
</evsched>
</GraphicGrid>

With this two-event event schedule, I've placed two objects
within a defined coordinate space, in this case, a graphic
and a link, thereby relating them spatially. The RefKey element
is also an Ilink element, so it specifies the graphic as one
of its anchors explicitly, even though its occurrence within
the schedule also relates it to the graphic. RefKey is an element
from the IBMIDDoc language. It has the defined semantic of creating
hyperlinks between multimedia objects and text.
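Because both events live in the same coordinate space, a system can check mechanically that the anchor region falls inside the graphic's extent. A minimal sketch (the flat (start, extent)-pair reading of the extent= values is my assumption):

```python
def within(outer, inner):
    """True if `inner` lies inside `outer`, where each is a flat list of
    (start, extent) pairs, one pair per axis of the shared FCS."""
    pairs = lambda e: list(zip(e[::2], e[1::2]))
    return all(i0 >= o0 and i0 + ilen <= o0 + olen
               for (o0, olen), (i0, ilen) in zip(pairs(outer), pairs(inner)))

# The graphic's extent from the schedule above, then the anchor's:
within([1, 782, 1, 6687], [100, 150, 200, 100])
```

For the schedule above this yields true: the anchor's 150-by-100 region sits well inside the graphic's 782-by-6687 extent.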

Note also that the elements in the event schedule are references
to other objects, not necessarily objects themselves. This is a
very powerful aspect of event schedules, because it lets you
position objects without modifying the objects themselves--anything
you can address can be placed in an event schedule by using standard
HyTime location elements to refer to it.

For a real online system to present this event schedule, it first
defines the location of the FCS itself (which might be
defined by another event schedule, see the next discussion), then
interprets each event, placing it on the FCS (in this example, a
two-dimensional grid) according to its defined extent. It then does
whatever processing is associated with the objects in each event,
e.g., display the graphic and display and manage the hyperlink anchor.
None of these functions are anything most sophisticated online
systems don't already do--we've just expressed the constructs using
a standard notation, HyTime. Implementing support for this level
of function should be a fairly small delta on top of an existing system
that already provides the presentation functions needed.
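The interpretation loop described above can be sketched in a few lines. The event records and handler names here are hypothetical stand-ins distilled from the GraphicGrid example; a real system would build them by parsing the document:

```python
# Hypothetical event records distilled from the schedule above; a real
# system would obtain these by parsing the GraphicGrid element.
events = [
    {"id": "the-graphic", "extent": (1, 782, 1, 6687), "object": "my-graphic"},
    {"id": "anchor-in-graphic", "extent": (100, 150, 200, 100), "object": "refkey"},
]

def present(events, handlers):
    """Place each event on the 2-D FCS, then run whatever processing is
    associated with its object (display a graphic, arm a link anchor, ...)."""
    placed = []
    for ev in events:
        x, xlen, y, ylen = ev["extent"]   # (start, extent) per axis
        placed.append((ev["id"], (x, xlen, y, ylen)))
        handlers[ev["object"]](ev)        # notation-specific processing
    return placed

hits = []
present(events, {"my-graphic": lambda ev: hits.append("display graphic"),
                 "refkey":     lambda ev: hits.append("arm link anchor")})
```

The handlers are exactly the presentation functions an existing online system already has; only the dispatch from the standard notation is new.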

Note also that an event could itself be a reference to another
event schedule. For example, consider the problem of defining
a multimedia presentation consisting of a series of "panels"
organized in time. You would have one event schedule representing
the main time dimension. The events in this schedule would be
panel events. Each panel could be described as a two-dimensional
FCS with the panel objects being events within it. Consider this
example:

<TimeLine><!-- FCS element with one axis, measured in abstract units -->
<evsched>
<event extent="1 1"><!-- For this example, each event takes one
quantum since we're just defining order -->
<panel contentref=panel-1><!-- Uses content of Panel element
defined elsewhere -->
</event>
<event extent="1"><!-- Starts at quantum following previous event -->
<panel contentref=panel-2>
</event>
</evsched>
</TimeLine>

You could think of this timeline element as being analogous to
a master document in a DTP system, defining at the highest organizational
level the order of inclusion of the constituent elements of the
presentation, with the added function of defining their extents.
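The "starts at the quantum following the previous event" convention makes the timeline positions computable from the durations alone. A small sketch of that bookkeeping (my own illustration of the convention, not HyTime machinery):

```python
def schedule_starts(durations):
    """Compute the starting quantum of each panel event on the timeline,
    assuming each event begins at the quantum following the previous
    event's last quantum, as in the TimeLine example above."""
    starts, t = [], 1
    for d in durations:
        starts.append(t)
        t += d
    return starts

# Two one-quantum panel events, as in the example:
schedule_starts([1, 1])   # -> [1, 2]
```

Giving a panel a longer extent pushes every following panel later, with no other edits, which is the point of declaring order rather than absolute times.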

Each panel would itself be an event schedule, this time either
containing or referring to content objects:

<PanelSpec><!-- FCS element with two axes, measured in abstract units -->
<evsched>
<TextBlock extent="10 100 20 236">
<!-- TextBlock is an event form element, now with the application-
specific semantic of containing text elements. -->
<parablock contentref="some-text-data">
<!-- Refers to some text data marked up with IBMIDDoc elements
elsewhere.-->
</TextBlock>
<ViewPort extent="50 100 60 400"><!-- Also of form event -->
<MultiMediaObject object=video1>
<TextAlternative>
<p>This video shows the following...
</TextAlternative>
</MultiMediaObject>
</ViewPort>
&standard-buttons;<!-- Get standard button events -->
</PanelSpec>

Note that the PanelSpec event schedule is a sort of "style spec" for
the panel, defining not the content but at least one aspect of its
physical presentation, namely the positions of the elements within the
panel space. However, in this sort of application, the lines between
style and content start to blur a little. They could be unblurred in
a concrete application by only allowing reference elements within
events. Conversely, you could define the panel content language
as a series of event schedules that contain their content directly.
But, the details of the presentation of each panel element, such as
border size, color, etc., would be purely a matter of style.

Note in these examples that I've used normal IBMIDDoc elements as
the content of the panel event schedule. This is to emphasize both
that IBMIDDoc is capable of applying to multimedia applications and
that the purpose of a multimedia-specific language is to do nothing
more than define the order and position of content within time
and space--there is no need to invent completely new content objects.

Going through these examples has convinced me that scheduling in
HyTime is (or can be) much simpler than it looks at first glance.
I was rather surprised at how easy it was to type in the event
schedules above. I was also surprised at how similar my "panel"
event schedule is to the structures defined by the IBM Dialog Tag
Language, which provides elements for logically organizing panels
into regions that contain the real content elements (fields, buttons,
text, etc). Regions are explicit coordinate spaces. Interesting.

Finally, note the indirection in all of this. This indirection is
crucial both to the implementation and the flexibility of the system.
It is the indirection of pointing from an event schedule to the
content that makes the system flexible and maintainable. It will be
the job of multimedia authoring tools to manage this indirection for
writers--that's the really difficult part of authoring in this
environment. For example, I would expect a multimedia editor to
build these sorts of event schedules under the covers as the author
creates their presentation, including providing an interface for
creating the references from events to actual content elements, as
well as doing the things it already does, such as calculating the
extents of events (e.g., the screen locations of viewports or
windows).

More finally, note that given this sort of HyTime language for
defining the structure of multimedia presentations, there's no
need to have a presentation system that interprets the event
schedules in real time (although that would be nice). You could
(and probably must in the short term) apply the same technique
that the Dialog Tag Language did, namely, compiling the declarative
source into a run-time form that does not need to be interpreted.
It is important to remember that just because your source is structured
with HyTime, it does not mean your presentation system must understand
HyTime, any more than using SGML for text means your ultimate formatter
has to understand SGML. It only requires the ability to create processors
that create from the source a runtime object that expresses the structures
in the source. It is really a question of binding time: when do you
bind the source to its presentation? In the near term, the binding
will be early, at the time you compile the deliverable into some
proprietary form (e.g., IPF, ToolBook, etc.). In the future, the
binding will be as late as possible using true HyTime engines that
interpret the source and create the presentation directly at
presentation time.

Eliot Kimber Internet: drmacro@vnet.ibm.com
Dept E14/B500 IBMMAIL: USIB2DK9@IBMMAIL
Network Programs Information Development Phone: 1-919-254-5160
IBM Corporation
Research Triangle Park, NC 27709