EVENT: Keynote Address to WWW '95

Mark D. Pesce (mpesce@netcom.com)
Sat, 15 Apr 95 18:47:44 -0700


[ This is the *relatively* complete text of my presentation before the
Developer's Day attendees at WWW '95 in Darmstadt, Germany. I have
MS-PowerPoint slides; these will be posted on tcc.net.org as well as by
w3.org - Mark ]

The VRML Equinox

Introduction - Cyberspace Begins

VRML is the latest in a series of improvements to the World Wide
Web which extend its usability and broaden the base of possible
users. The abandonment of interface - which is the essence of
"virtual reality" - creates an environment which requires no training,
no metaphor, and no interface.

VRML did not arise out of the void; there is a consistent history of
innovations upon which it capitalizes, and which it uses to be truly
effective. I will do my best to outline these developments in a way
that can put VRML into its historical perspective.

Part One - The History of Cyberspace * Context as Content

1969 to 1989 - The Birth of the Internet

In the generation from 27 October 1969, the day the Internet was "born",
through the birth of the Web in 1989, the Internet existed as a medium most
suited to the machines which communicated through it. Even though these
machines communicated for human beings, the essential nature of the
communication was computer-centered, rather than human-centered. A line
like "ftp 192.100.81.101" is not, in any relevant sense, human-centered
information. The result of this methodology (which we laughingly call
"interface") is that Internet access, during this period, was restricted
primarily to those individuals who could navigate an environment where
memory was the only aid to navigation. These "memory librarians" were often
called systems administrators, or sysops.

1989 to 1993 - The Birth of the World Wide Web

When TimBL created the first Web server and browser, he kicked off a
revolution on two fronts: in connectivity, because documents could be
connected through semantic reference, and in interface, because HTML could
specify features far beyond linkage and started to approach a universal
document description format. In the beginning, the Web remained a curiosity
of the academic community - text-based, but still powerfully enabling.

1993 - Baby's Got Legs

In 1993, the engineers at the United States National Center for
Supercomputing Applications (NCSA) released Mosaic, a W3 browser with a
fully graphical interface. As well as all of the HTML widgets for text,
Mosaic included an ability to handle images within the document, and could
use these images as "maps", so that a fully graphical point-and-click
environment could be authored within the Web. Support for forms was also
added at this time.

Together these were the pivotal events in the formation of the Web. These
improvements, in both interface (GIF inlining) and connectivity (forms),
created an explosion in accessibility. The extent of that explosion appears
to be complete, and, in the most positive sense, catastrophic. Not only is
the entire Internet reformulating itself to accommodate the Web (at both
physical and software levels), but all of the world-at-large is being
affected by it. The reasons for this are absolutely clear: the Internet is
finally usable by large numbers of people, and is therefore useful to them.

1994 - Problem Child

Even given these strengths, the Web has flaws. The overriding flaw rises
out of its essential nature as both hyperspatial and textual; it's very
difficult to find anything, except by accident. This is because, within the
Web, there's no there there. That is, every point is directly connected to
every other point, and most often, this point of connection is textual.
Thus, the URL, for all of its wonders of expression, is still
computer-centered data. No one can remember them; I can tell you to go to
the VRML Forum home page by accessing http://vrml.wired.com/. That's pretty
compact as URLs go, but chances are that you'd have trouble remembering it
in half an hour. If you didn't count yourself among the Web-literati,
you'd probably not understand it at all, nor would you remember it.

So, in some sense, the Web is still where the Internet was twenty-five years
ago; it has lowered the barrier to entry greatly, but we still need to tell
our computers something we ourselves don't understand. This will prevent
people from using the Web in truly worldwide numbers; my mother, for
example, has been able to master sending me email at my mpesce@netcom.com
address, but I doubt she could handle
http://www.net.org/~tcc/people/mpesce.html - and this isn't a comment on my
mother, but on the essential nature of the URL. In a perfect world, they
would be invisible, but because of the text-and-hyperspace nature of an HTML
web, they never could be.

1994 - The Sensual Web

A few years ago, research into "sensualized" interfaces began to receive
widespread attention in the press and the industry. A wide range of
technologies, which collectively came to be known as "Virtual Reality",
began a fundamental change in the nature of the user interface, moving it to
a human-centered design, where the space around the user became the
computing environment, and the entire sensorium was engaged in the
interface. All of this was in an effort to make computers more responsive
to the humans who used them, and focused around a basic realization: if
something is represented sensually, it is possible to make sense of it.
This has particular relevance for the World Wide Web; sensualizing the Web,
and specifically, putting the "space in cyberspace", gives us a perceptual
handhold in a space which now lacks them.

Early in 1994, working with Tony Parisi, I developed a three-dimensional
equivalent of HTML, and a "helper app" which could work in conjunction with
an HTML-based Web browser. We called the application "Labyrinth". In
February of 1994, looking through the Web pages at CERN, we found that TimBL
believed that a perceptualized Web would be an important step toward a rich
Web. Over a series of emails, we found ourselves invited to present our
work in Geneva, at the First International Conference on the World Wide Web.

At the conference, Dave Raggett and TimBL had organized a
"birds-of-a-feather" session to discuss "Virtual Reality Markup Languages
and the World Wide Web". It was clear that there was intense interest, at
CERN, NCSA and other places, to provide some more perceptualized interface
to the web. While our work was preliminary, it was also clear that a more
industrial-strength approach to a VRML (as it was now called) was necessary.
With plenty of help from Brian Behlendorf, the sysop for WIRED Magazine, and
the blessing of WIRED (who donated server space and bandwidth) we
established a mailing list for interested parties. To our pleasant
surprise, within a month we had two thousand people worldwide eagerly
engaged in a discussion of what VRML should be.

In the very early days, we confronted two issues directly. First, should we
"reinvent the wheel" or adapt an existing, even commercial implementation as
the basis for VRML? The consensus of the list membership was that an
existing solution would be preferable to something "hacked up" by us. It
would also bring us to an implementation phase considerably faster. We
asked the list members to nominate candidates, and after a three-month
process, settled on the ASCII format of Silicon Graphics' Open Inventor
language. They agreed to place this data format into the public domain, and
further, contributed QvLib, a C++ class library which can be linked into any
VRML application, and is used to parse VRML into an internal object
representation. This work, by Gavin Bell and Paul Strauss of SGI, is a core
technology of VRML, and forms the basis of something that has become as
integral to VRML as libwww is to the Web.

The second and more vexing issue concerned scope. While HTML grew
organically out of a community's needs for a hypermedia system where none
existed previously, the same could not be said for virtual reality. Films,
from "Brainstorm" to "Lawnmower Man", and implementations as varied as
"Dactyl Nightmare" and "PLACEHOLDER" gave their own rendering of the feature
set essential to a virtual world. The Web provided some guidelines for a
minimal feature set, but beyond that lay a gray area of possibilities so
broad they could easily choke any development effort in language wars and
other semantic skirmishes. For this reason the consensus of the list
membership is that VRML in its 1.0 specification is not highly interactive,
but merely replicates the functionality of HTML in its ability to "inline"
objects, and "anchor" objects (down to the polygon level) to other items
within the Web. As we well know, even that limited interactivity gives HTML
and VRML a very broad range of possibilities.

In Chicago, we presented the Draft Specification for VRML 1.0; this was used
as the basis for the implementation for two commercial VRML browsers; the
WebSpace browser, a collaborative effort of Silicon Graphics and Template
Graphics Software, and Intervista WorldView. During the last two months,
the implementors have found some deficiencies in the draft specification.
These have been repaired, and in the beginning of May, the VRML 1.0 Final
Specification will be released; WebSpace and WorldView will also be released
in the same time frame. VRML browsers will be common by Midsummer.

Part Two - Technology and Applications

What does VRML have in it that makes it useful? Essentially,
VRML is a three-dimensional equivalent of HTML, in that you can
have all of the features of an HTML environment - a document
oriented view of the Internet - but within a fully perceptualized
context. A VRML file, or "world" (these files are generally given the
extension ".wrl" as a result) contains a complete scene description,
with an exhaustive listing of all elements that are in the world.

VRML in large part follows the syntax developed for Open
Inventor. This means that each file begins with the simple tag
"#VRML V1.0 ascii", and is then followed by as many objects as are
required to define the given scene.

A sample file ("sample.wrl") is as follows:

#VRML V1.0 ascii
Separator {
    DirectionalLight {
        direction 0 0 -1 # Light shining from viewer into scene
    }
    PerspectiveCamera {
        position -8.6 2.1 5.6
        orientation -0.1352 -0.9831 -0.1233 1.1417
        focalDistance 10.84
    }
    Separator { # The red sphere
        Material {
            diffuseColor 1 0 0 # Red
        }
        Translation { translation 3 0 1 }
        Sphere { radius 2.3 }
    }
    Separator { # The blue cube
        Material {
            diffuseColor 0 0 1 # Blue
        }
        Transform {
            translation -2.4 .2 1
            rotation 0 1 1 .9
        }
        Cube {}
    }
}

VRML is in its essence an object-oriented scene description
language; that is, each object in a scene is self-contained, and
furthermore, objects can be nested within other objects. This has a
number of benefits; using this methodology, it is possible for a
VRML object to inherit or donate qualities (like color, translation,
rotation, etc.) from a parent or to a child.

Any VRML scene has three basic types of information: Separators,
Nodes and Fields.

Separators

A Separator establishes scope. Any fields or nodes within a
separator have scope only within the range of that separator. Thus
an Info node (which can be used to give names to things) or a
Translation node (which can place things within a scene) can be
scoped using the Separator.
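
As a rough illustration of this scoping rule, here is a minimal Python
sketch. It is not taken from QvLib or from any browser; the dictionary
representation of nodes, the function names, and the grey default color
are assumptions made for the example, and only Material and Translation
are modelled (rotation and the other property nodes are ignored).

```python
# Hypothetical, simplified model of scene traversal. Property nodes
# (Material, Translation) change the current state; a Separator works on
# a copy of that state, so its changes vanish when the Separator ends.

# Assumed starting state: a grey material and no translation.
DEFAULT_STATE = {"color": (0.8, 0.8, 0.8), "offset": (0.0, 0.0, 0.0)}


def traverse(node, state, drawn):
    """Walk one node, recording each shape with the state it sees."""
    kind = node["type"]
    if kind == "Separator":
        scoped = dict(state)  # copy on entry; discarded on exit
        for child in node.get("children", []):
            traverse(child, scoped, drawn)
    elif kind == "Material":
        state["color"] = node["diffuseColor"]
    elif kind == "Translation":
        state["offset"] = tuple(
            a + b for a, b in zip(state["offset"], node["translation"])
        )
    else:
        # Anything else is treated as a shape (Sphere, Cube, ...).
        drawn.append((kind, state["color"], state["offset"]))


def render(scene):
    """Return the list of (shape, color, offset) the traversal produces."""
    drawn = []
    traverse(scene, dict(DEFAULT_STATE), drawn)
    return drawn


# A cut-down version of the sample world: a red, translated sphere inside
# its own Separator, followed by a cube outside of it.
scene = {"type": "Separator", "children": [
    {"type": "Separator", "children": [
        {"type": "Material", "diffuseColor": (1, 0, 0)},
        {"type": "Translation", "translation": (3, 0, 1)},
        {"type": "Sphere"},
    ]},
    {"type": "Cube"},
]}

for shape in render(scene):
    print(shape)
```

The cube is recorded with the default color and no offset, because the
Material and Translation nodes were scoped by the inner Separator.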

Nodes

Nodes are the "doing" part of VRML; they actually describe the
material features of a scene. Valid node types in VRML are (non-
exhaustively):

Cone
Coordinate3
Cube
Cylinder
DirectionalLight
Group
IndexedFaceSet
IndexedLineSet
Info
LevelOfDetail
Material
MaterialBinding
MatrixTransform
Normal
NormalBinding
OrthographicCamera
PerspectiveCamera
PointLight
PointSet
Rotation
Scale
Separator
ShapeHints
Sphere
SpotLight
Switch
Texture2
Texture2Transform
TextureCoordinate2
Transform
TransformSeparator
Translation
WWWAnchor
WWWInline

A complete description of all of these nodes can be found in the
VRML Specification at http://www.eit.com/vrml/vrmlspec.html

There are two nodes that are quite specific to VRML, and to the
Web in general - WWWAnchor and WWWInline.

WWWAnchor has the following syntax:

WWWAnchor {
    name "http://www.blah.org/linked.to.thing"
    map NONE | POINT
}

This node creates an anchor on an object at any point, down to the
single polygon level. The object is anchored to the URL given in
the name field. The map field creates the three-dimensional
equivalent of image mapping with inline images. If the field's value
is NONE, then no mapping data is returned; if the field's
value is POINT, then the URL is sent with ?x,y,z appended to it,
these values being the x, y, and z values within the scene where the
selection event occurred. This could be used, for example, to select
an area on a globe, and greatly enhances the usability of the VRML
anchor.
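
To make the POINT behaviour concrete, the following Python sketch shows
how a browser might build the URL it requests when an anchored object is
selected. The function name and the way the numbers are formatted are
assumptions for illustration; the text above only says that ?x,y,z is
appended.

```python
def anchor_url(name, map_mode="NONE", point=None):
    """Build the URL to fetch for a selected WWWAnchor (hypothetical helper).

    name     -- the URL from the anchor's name field
    map_mode -- "NONE" or "POINT", as in the map field
    point    -- (x, y, z) of the selection event in the scene, if known
    """
    if map_mode == "POINT" and point is not None:
        x, y, z = point
        # Assumed formatting: Python's default number-to-text conversion.
        return f"{name}?{x},{y},{z}"
    return name


print(anchor_url("http://www.blah.org/linked.to.thing"))
print(anchor_url("http://www.blah.org/linked.to.thing", "POINT",
                 (1.5, 0.0, -2.0)))
```

A server-side script could then split the query string on commas to
recover the selected point, much as an image map script does in HTML.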

The WWWInline is roughly equivalent to the IMG SRC tag for
inline images given in HTML. In this case, however, an entire
VRML file can be included within the inlined file, or it can simply
be a single object. The file must be a proper VRML file, however,
and must have all of the correct syntactical elements of a stand-
alone VRML file.

WWWInline has the following syntax:

WWWInline {
    name "http://www.bar.org/inline.vrml.doc"
    bboxSize bbox.x bbox.y bbox.z
    bboxCenter bbcent.x bbcent.y bbcent.z
}

The node specifies the URL (which can be either a script or a static
document), and also provides two fields for a description of the size
and center of the object. This feature allows nested VRML files to
be very dense in their description while being parsimonious in their
utilization of network bandwidth. A VRML browser can know the
"bounding box" of an object (that is, the maximal volume which an
object will occupy) before the object is loaded. This information
can be used by the browser to generate a view of a scene before all
of the elements in the scene have been loaded. As browsers move
into multi-threaded implementations, these features will become
increasingly important. It is not unreasonable to believe that as
VRML progresses, most scene elements will be referenced
indirectly, through the WWWInline node.
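
As a small sketch of how the two bounding box fields could be used, the
Python function below turns a size and a center into the two opposite
corners of a placeholder box that a browser might draw while the inlined
file is still loading. The function name is hypothetical, and the sketch
assumes the box is axis-aligned and that bboxSize gives the full extent
along each axis.

```python
def bbox_corners(size, center):
    """Return (min_corner, max_corner) of an axis-aligned bounding box.

    size   -- (x, y, z) extents, as in bboxSize (assumed to be full widths)
    center -- (x, y, z) position, as in bboxCenter
    """
    low = tuple(c - s / 2 for s, c in zip(size, center))
    high = tuple(c + s / 2 for s, c in zip(size, center))
    return low, high


# A box 2 x 4 x 6 units in size, centered at (1, 0, -1).
low, high = bbox_corners((2, 4, 6), (1, 0, -1))
print(low, high)
```

With only these six numbers in hand, the browser can lay out the scene
and position the camera before a single polygon of the inlined object
has arrived.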

Through a careful use of the WWWInline and LevelOfDetail
nodes, VRML files can be rich yet lean, leaving these decisions to
the browser application. It was here that VRML attempted to learn
from the shortcomings of HTML with respect to variable
bandwidth demands and capabilities of the browser.

Current Applications of VRML 1.0

VRML adds perceptualization capabilities to the World Wide Web
in a way that has not been possible before; as a result, an entire
new class of Web applications has been enabled. While it is
impossible to suggest all of the ways that VRML might be used in
the Web environments of the future, three projects do highlight
VRML's capabilities to service the needs of communities of three
kinds: geographic, demographic, and economic.

Virtual SoMa

The heart of the nation's multimedia industry is located in San
Francisco's South of Market (SoMa) neighborhood. Several
organizations have initiated a project to model SoMa in VRML and
make it accessible through the Web.

As of the beginning of May, you will be able to wander through the
streets of virtual SoMa, just as you can in the real world, but this
world is linked into the Web. Starting at the corner of a virtual 2nd
and Howard Streets, you can wander down to 520 3rd Street, the
offices of WIRED Magazine. Click the mouse button, and you'll
find yourself at HOTWIRED's home page. Virtual SoMa is the
pilot for a project which may someday cover all of San Francisco.
The uses of VRML for tourism and community service directories
are obvious.

Creating Virtual SoMa involved a community of architects
(Colleen & Associates, San Francisco), working with engineers
(The Community Company) and local businesses (WIRED,
Cyberlab7, IMF). AutoCAD DXF models were converted to
Open Inventor and then into VRML files; these files were then
hand edited (!) to add links, and were then published on Web
servers.

WaxWeb 2.0

David Blair, the avant-garde filmmaker of WAX: or The Discovery
of Television Among the Bees, and Tom Meyer, a doctoral
candidate at Brown University, created WaxWeb
[http://bug.village.virginia.edu/], a hypermedia web version of the
film. They have integrated VRML into the next release of
WaxWeb. WaxWeb 2.0 truly pushes the boundaries of
hypermedia. Using the computer to keep things stirred up, no two
trips through WaxWebVRML are ever exactly the same. Every
visitor gets a slightly different tour through the video, sounds, and
images that make up WaxWeb.

Technologically, this has been perhaps the most ambitious project
attempted with VRML to date. It includes extensive use of VRML
scripting - and VRML can be scripted just as HTML can - as well
as a clever binding of Xerox PARC's MOO (MUD Object-
Oriented) technology, so that the environment thus created is
extensible, both by the administrators of WaxWeb and by the
members of the WaxWeb community. There are several hundred
VRML models at the WaxWeb site, each of which can be linked,
on the fly, to other media within WaxWeb.

Internet Underground Music Archive

IUMA has become one of the major sites in the Web; its success is
as unprecedented as it is unexpected. In the beginning it was what
its name implied; a place where "underground" musical artists
could deliver their content. It began as an FTP site, but quickly
moved into the Web, leveraging the Web's sensuality and making
its content much more appealing and navigable. Sensing the future
of the recording industry, many major record labels have signed up
with IUMA, including such popular artists as Madonna.

The artistic staff at IUMA have created a "Sonic Lodge", an
environment which is both deceptively simple and quite profound.
Select a link on the IUMA home page, and soon you find yourself
in the "IUMA Living Room". Here you can sit on the sofa, or read
the magazines on the coffee table - each of these magazines links to
various IUMA home pages - or you can wander over to the IUMA
CD player, and hit the play button. This causes an MPEG Audio
File - their "pick of the week" - to be downloaded and played.

It is here that the enormous potential of VRML in the entertainment
marketplace begins to become quite obvious. While it is difficult
to establish any personality - or, as the entertainment and
advertising worlds call it, "branding" - within a textual
environment, it is quite easy to create a sensual environment for
entertainment in VRML. Madonna, who looks rather prosaic on
her home page, would be quite alluring in the "Madonna
Lounge/Living Room/Planet".

Part Three - The Future of Cyberspace

Making VRML

Where to from here? There are a lot of applications which need to
be written, a lot of places that need to be designed. A focus on
easy-to-use design tools, and equally easy-to-use publishing tools,
will lower the barrier to entry to VRML creation to levels
appropriate to mass access. Unlike HTML, VRML will almost
never be handwritten - and it's here that comparisons between
HTML and VRML start to break down. HTML is about text and
page design, so writing it in a text editor, be it Microsoft Word or
vi, makes some sense. But three-dimensional worlds are never
constructed in a word processor. These worlds are constructed in
CAD or walkthrough programs. These programs vary widely in
complexity - Autodesk's 3D Studio is sophisticated and
rich, while ParaGraph's Virtual Home Museum System is
inexpensive and intuitive, but not as expressive.

Over the next few months, companies such as Virtus and
ParaGraph will introduce applications that make it easy for anyone
to author VRML environments, and the other, higher-end packages
will also include VRML input/output modules as part of their
regular suite of services. Eventually, the Web will become well-
integrated with these products, and world creation and editing will
take place within the Web, in real time.

VRML 1.1 - Scalability

Despite features such as the WWWInline and LevelOfDetail nodes,
VRML does not scale particularly well. Looking at environments
like IUMA, it is possible to imagine that worlds can easily grow to
tens of millions of polygons. The real world is not page-oriented; it
consists of an inexhaustible supply of objects. At the same time,
there is a common canon of objects which we encounter everywhere:
telephones, pencils, doors and windows. It seems absurd to
download an object every time it is referenced, especially if it is
among a set of objects so commonplace as to be nearly invisible.

Bandwidth considerations are also important. If I go to visit IUMA
30 times a week, is it really necessary for me to download all of the
furniture in Madonna's Bedroom every time I hit the site? Or,
should I cache the data in the local VRML browser, so that just an
HTTP HEAD request can tell me that I don't need to download the
object again? VRML objects will undoubtedly be quite numerous;
it may well be that the enthusiastic VRML user will have 100 MB
or more of disk space set aside for the caching of objects - both
geometry and textures.
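
The cache check described here can be sketched in Python as follows. This
is an illustration under stated assumptions, not a description of any
existing VRML browser: the function names are hypothetical, and the
freshness test simply compares the Last-Modified header returned by a
HEAD request with the value remembered from the last full download.

```python
import urllib.request


def remote_last_modified(url):
    """Ask the server for headers only (HTTP HEAD) and return the
    Last-Modified value, or None if the server does not send one."""
    request = urllib.request.Request(url, method="HEAD")
    with urllib.request.urlopen(request) as response:
        return response.headers.get("Last-Modified")


def needs_download(cached_stamp, remote_stamp):
    """Decide whether a cached object must be fetched again.

    cached_stamp -- Last-Modified value stored with the cached copy
    remote_stamp -- Last-Modified value from the HEAD request
    """
    if cached_stamp is None or remote_stamp is None:
        return True  # nothing to compare against, so play it safe
    return cached_stamp != remote_stamp


# Example with made-up header values (no network access needed here).
stamp = "Sat, 15 Apr 1995 18:47:44 GMT"
print(needs_download(stamp, stamp))
print(needs_download(stamp, "Mon, 01 May 1995 09:00:00 GMT"))
```

Only the small header exchange crosses the network on a repeat visit;
the geometry and textures are read from the local disk whenever the two
values match.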

This all ties into an important advantage that VRML may well
have over HTML - this dictionary can be put onto a CD-ROM.
Using CD-ROM as a local cache, VRML can leverage the
flexibility of the Web with CD-ROM's enormous storage
capabilities. These CD-ROMs could have a global scope; that is,
they could have the essential dictionary of objects in a VRML
universe, or they could be application specific. This CD-ROM
works well with IUMA, this one with Internet Shopping Network,
etc., etc.

To construct a dictionary, we must be able to have a universal
name space for objects, both within an instance of a VRML
browser and within the Web itself. It should not be necessary
to give the canonical telephone as a WWWInline with some
complex URL; rather, we should be able to say WWWInline
"telephone", and let the rest take care of itself. (This assumes that
the canonical telephone is being used.) There are proposals on the
table for a universal naming mechanism (the Uniform Resource
Name) which spans the entire Web. Such a mechanism is an
essential part of this extension to VRML. If a likely URN
candidate does not exist by Midsummer, VRML designers will
either have to halt further development on scalable worlds, or will
have to use some other solution, such as ASN.1, to provide a
universal name space.
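
One way to picture such a dictionary is the Python sketch below. It is
purely illustrative: no naming mechanism has been chosen, and the
function name, the dictionary contents, and the CD-ROM paths are all
invented for the example. A name found in the local dictionary is served
from local storage; anything else is treated as an ordinary URL to be
fetched over the network.

```python
def resolve(name, dictionary):
    """Resolve the name given to a WWWInline (hypothetical helper).

    Returns ("local", path) when the name is a canonical object in the
    dictionary, and ("network", name) when it must be fetched as a URL.
    """
    if name in dictionary:
        return ("local", dictionary[name])
    return ("network", name)


# An invented dictionary, as it might be shipped on a CD-ROM.
canon = {
    "telephone": "/cdrom/canon/telephone.wrl",
    "pencil": "/cdrom/canon/pencil.wrl",
}

print(resolve("telephone", canon))
print(resolve("http://www.bar.org/inline.vrml.doc", canon))
```

An application-specific CD-ROM would simply supply a different
dictionary to the same lookup.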

While VRML 1.x will not provide any features for interactivity, we
will most likely extend the basic feature set to support sound (both
ambient and 3D localized), and video data streams which can be
applied as a texture to some object. This facilitates the creation of
a VRML "conference room in cyberspace" which has numerous
business and entertainment applications. Finally, some simple
animation features will most likely be added.

Draft specifications of VRML 1.1 will be available in late June.

[ List Members - I concluded the presentation with a series of comments
about VRML 2.0 and where I think the "killer apps" of VRML are, that is, why
VRML isn't just "DOOM meets Home Shopping". I will post my comments about
VRML 2.0 to the list at a later time. The discussion of VRML 2.0 which
began while I was in Germany is interesting and merits a full evaluation
before I comment upon it. ]

Mark Pesce
VRML List Moderator