Re: Redefining the markup language on the fly?

Glenn Vanderburg (
Fri, 10 Jun 1994 10:17:21 -0500

Murray Maloney writes:
> > [David Koblas writes:]
> > Correct me if I'm wrong, but I could say something like:
> >
> >
> > And now use <NEW_TAG> all over the place.
> However, it will make it impossible to completely
> regularize the language as an SGML DTD. That is,
> it would be impossible to parse a document that
> added new tags in this manner.

These two statements are not equivalent, and neither is quite true.

It may make it impossible to "completely regularize the language", if you
define that as "explicitly specifying the set of allowed tags." However,
it won't inhibit specifying the language as an SGML DTD in any way. SGML
explicitly allows a document to declare new element types, entities, and
the like in its own declaration subset, on a document-by-document basis.
In fact, SGML provides no way to prohibit it (although a given application,
such as HTML, can simply state in its specification that such mechanisms
are not to be used).

And it certainly would not be impossible to parse a document that adds new
tags to those defined by the base DTD. A conforming SGML parser (of which
there exist several) is, in fact, required to do so, and I know of at least
one SGML application which understands how to associate reasonable semantics
with the new tags, based on architectural forms. Of course, parsing and
understanding the DTD, and then parsing the document instance using the
grammar defined by the DTD, is more difficult than simply parsing a static
language (as HTML browsers currently do). But it is not impossible.
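
To make the two-step idea concrete, here is a toy sketch in Python. It is
nowhere near a conforming SGML parser: it only collects the element names
declared in a document's own declaration subset, adds them to a fixed base
vocabulary, and then reports any tags in the instance that nobody declared.
The base element list, the <!ELEMENT NEW_TAG ...> declaration in the sample
document, and the function names are all assumptions made for illustration.

```python
import re

# Toy stand-in for the element types a base DTD would declare.
BASE_ELEMENTS = {"HTML", "HEAD", "TITLE", "BODY", "P", "EM"}

ELEMENT_DECL = re.compile(r"<!ELEMENT\s+([A-Za-z][\w.-]*)", re.IGNORECASE)
START_TAG = re.compile(r"<([A-Za-z][\w.-]*)")
DOCTYPE = re.compile(r"\s*<!DOCTYPE[^\[>]*(?:\[(.*?)\]\s*)?>", re.DOTALL)


def split_document(text):
    """Return (declaration subset, document instance) for a document."""
    match = DOCTYPE.match(text)
    if match is None:
        return "", text
    return match.group(1) or "", text[match.end():]


def undeclared_elements(text, base=BASE_ELEMENTS):
    """List start tags in the instance that no declaration accounts for."""
    subset, instance = split_document(text)
    # Step 1: read the declarations and extend the allowed vocabulary.
    allowed = set(base) | {n.upper() for n in ELEMENT_DECL.findall(subset)}
    # Step 2: scan the instance against that vocabulary.
    used = {n.upper() for n in START_TAG.findall(instance)}
    return sorted(used - allowed)


# Hypothetical document: the ELEMENT declaration is an assumed example.
DOC = """<!DOCTYPE HTML PUBLIC "-//IETF//DTD HTML//EN" [
<!ENTITY % cextra "|NEW_TAG">
<!ELEMENT NEW_TAG - - (#PCDATA)>
]>
<HTML><BODY><P>Some <NEW_TAG>text</NEW_TAG> and <BLINK>more</BLINK></P></BODY></HTML>
"""

print(undeclared_elements(DOC))
```

NEW_TAG is accepted because the document declares it, while BLINK is
reported because nothing declares it. A program that skips step 1 and
works from a static tag list would have to reject both.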

For educational purposes, here is how David's example would be fully
specified using the current HTML 3.0 spec. I think this is accurate.

<!ENTITY % cextra "|NEW_TAG">

The "architectural forms" solution that has been discussed recently is
similar, but it uses a special attribute (such as "HTML") to do the mapping,
rather than an element in the document instance itself:

<!DOCTYPE HTML-Hypothetical [ <!-- Hypothetical example, not true HTML -->
<!ENTITY % cextra "|NEW_TAG">
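
As a rough illustration of the attribute-based mapping (a sketch, not
how any particular SGML application does it), the Python fragment below
assumes each new element type carries a #FIXED attribute named HTML whose
value names the standard element it should be treated as. The ATTLIST
declaration, the choice of EM as the target, and the helper names are
hypothetical.

```python
import re

# Matches declarations shaped like: <!ATTLIST NEW_TAG HTML NAME #FIXED "EM">
FIXED_FORM = re.compile(
    r'<!ATTLIST\s+([A-Za-z][\w.-]*)\s+HTML\s+NAME\s+#FIXED\s+"([^"]*)"',
    re.IGNORECASE,
)


def architectural_map(subset):
    """Map each new element type to the base element named by its HTML attribute."""
    return {elem.upper(): form.upper() for elem, form in FIXED_FORM.findall(subset)}


def resolve(element, forms):
    """Return the base element to use for rendering; unmapped names pass through."""
    name = element.upper()
    return forms.get(name, name)


# Hypothetical declaration subset; only the ENTITY line comes from the post.
SUBSET = """
<!ENTITY % cextra "|NEW_TAG">
<!ELEMENT NEW_TAG - - (#PCDATA)>
<!ATTLIST NEW_TAG HTML NAME #FIXED "EM">
"""

forms = architectural_map(SUBSET)
print(resolve("NEW_TAG", forms))
print(resolve("P", forms))
```

With a table like this, a browser that has never heard of NEW_TAG could
still fall back to the semantics of the element named in the attribute.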

---Glenn Vanderburg