Re: Extraneous Characters in Netscape Display

Paul Grosso
Thu, 25 May 1995 06:47:49 +0500

> From: "Alexander, Larry" <>
> I have been creating Web pages using the netscape rules file in HoTMetaL
> Pro. Problems appear when I have pages with special entities. Each file
> begins with:
> <!DOCTYPE HTML PUBLIC "-//Netscape Corp.//DTD HTML plus Tables//EN"
> "html-net.dtd"
> [
> <!ENTITY pound CDATA "">
> ]>
> When using NCSA Mosaic v2B4 everything looks fine. However, with Spry Air
> or Netscape, I see:
> ]>
> at the head of each page.
> Is HoTMetaL doing something illegal with the !ENTITY declaration?

The document type declaration--including the optional 'internal subset'
(the part from [ to ] inclusive)--is valid SGML.

However, it is probably the case that many browsers do not handle the
optional internal subset. It looks like some browsers blindly look
for the first > to end the doctype declaration.

Not handling the internal subset is probably understandable for non-SGML
browsers. However, it would be better [serious understatement!] if
browsers that don't handle the internal subset would at least skip over
it properly.
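
To illustrate the difference, here is a rough Python sketch (mine, not
taken from any browser's source) that contrasts "stop at the first >"
with a scan that skips the internal subset. The function names are
hypothetical, the sample page is the declaration quoted above plus an
invented TITLE line, and the scan is deliberately simplified: it tracks
quoted literals and the [ ... ] subset but ignores SGML comments and
marked sections.

```python
def naive_doctype_end(text: str) -> int:
    """Index just past the first '>' (the behaviour suspected above)."""
    return text.index(">") + 1


def doctype_end(text: str) -> int:
    """Index just past the '>' that actually closes the declaration.

    Simplified assumption: only quoted literals and the [ ... ] internal
    subset are tracked; SGML comments and marked sections are not.
    """
    quote = None        # the open quote character while inside a literal
    in_subset = False   # True between '[' and ']'
    for i, ch in enumerate(text):
        if quote:
            if ch == quote:
                quote = None
        elif ch in "\"'":
            quote = ch
        elif ch == "[":
            in_subset = True
        elif ch == "]":
            in_subset = False
        elif ch == ">" and not in_subset:
            return i + 1
    raise ValueError("unterminated document type declaration")


# The declaration from the quoted message, followed by an invented body line.
page = (
    '<!DOCTYPE HTML PUBLIC "-//Netscape Corp.//DTD HTML plus Tables//EN"\n'
    '"html-net.dtd"\n'
    '[\n'
    '<!ENTITY pound CDATA "">\n'
    ']>\n'
    '<TITLE>Example</TITLE>\n'
)

print(repr(page[naive_doctype_end(page):]))  # what a "first >" browser treats as content
print(repr(page[doctype_end(page):]))        # what remains after a proper skip
```

With the naive version, the leftover text begins with the stray ]> that
shows up at the head of the page; with the second version, it begins
with the document content.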

I also note, however, that the Public Identifier you show is not one
of the valid choices given in the HTML 2.0 spec. From the 95 March 31
spec by Dan:

4. Document Structure Elements

To identify information as an HTML document conforming to this
specification, each document should start with the prologue:

    <!DOCTYPE HTML PUBLIC "-//IETF//DTD HTML 2.0//EN">

Note: If the body of a text/html body part does not begin
with a document type declaration, an HTML user agent should
infer the above document type declaration.

[There are other valid Public Identifiers listed later in the spec, but
none like the one you show above.] So, while valid SGML, the document
you show is not valid HTML. As such, though you can expect your document
to work with SGML browsers (given that you make your DTD and whatever
style information that's needed available), you cannot necessarily expect
your document to work with non-SGML, HTML-compliant browsers. Unless I
misunderstand the HTML 2.0 spec, an HTML browser has every right to see
your doctype statement and say, "oops, this isn't HTML, so I won't attempt
to display this."

While it might be reasonable behavior for a very forgiving browser to
say "I don't recognize the Public Id in this doctype declaration, so
I'll skip over it and pretend it wasn't there and instead infer the
<!DOCTYPE HTML PUBLIC "-//IETF//DTD HTML 2.0//EN"> declaration and
continue in that manner with this document," I don't think it's safe
to assume that will always happen.
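
For what it's worth, the forgiving behavior I describe could look
something like the following Python sketch. This is only an
illustration under my own assumptions: the function name and the regular
expression are hypothetical, and the set of recognized Public
Identifiers holds just the one quoted above (the spec lists others that
a real implementation would need to include).

```python
import re

DEFAULT_PUBLIC_ID = "-//IETF//DTD HTML 2.0//EN"

# Illustrative only: the HTML 2.0 spec lists further valid identifiers.
KNOWN_PUBLIC_IDS = {DEFAULT_PUBLIC_ID}

# Matches the start of a doctype declaration and captures the Public Identifier.
_PUBLIC_ID = re.compile(r'<!DOCTYPE\s+HTML\s+PUBLIC\s+"([^"]*)"', re.IGNORECASE)


def effective_public_id(document: str) -> tuple[str, bool]:
    """Return (public_id, inferred) for a forgiving user agent.

    If the declaration is missing or its Public Identifier is not
    recognized, fall back to the default and report that it was inferred.
    """
    match = _PUBLIC_ID.match(document.lstrip())
    if match and match.group(1) in KNOWN_PUBLIC_IDS:
        return match.group(1), False
    return DEFAULT_PUBLIC_ID, True


print(effective_public_id(
    '<!DOCTYPE HTML PUBLIC "-//Netscape Corp.//DTD HTML plus Tables//EN" "html-net.dtd">'
))
```

A stricter browser would instead stop at the unrecognized identifier,
which is why I would not rely on the fallback.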


Paul Grosso
VP Research, ArborText, Inc.
Chief Technical Officer, SGML Open