I am dead set against processing instructions (PIs).
Sure, we could develop conventions,
but they could never be verified as conforming by an SGML parser.
No, PIs are bad! PIs are worse even than format-specific
SGML elements like <I> and <B>, which can readily be mapped
to any formatting desired at the reader's end.
I'd like to start from the top again and try to describe my view
of the "Formatting HTML" big picture. I invite everyone to
tear it apart if that is what it deserves...
1) At the highest level, we have structured HTML that has
been created by an HTML editor or a conversion tool.
It is clean, parsed SGML, and it has DSSSL style sheet(s)
associated with it.
The author's and the reader's style-sheets both have general
style rules for elements in context, and the author's style
sheet also has specific rules which are tied to unique IDs
in a document or document set. If the reader decides to
override the author's style sheet, the specific rules associated
with IDs may still be preserved at the discretion of the reader
-- this may be important to understanding the document.
This is the model that we expect SGML evangelists to follow
and evangelize. It is also the model that most closely
matches what applications like SoftQuad's Panorama will follow.
Well done. Congratulations all around. The day is saved.
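The precedence described in item 1 can be sketched in a few lines of
Python. This is only an illustration under stated assumptions: the
StyleSheet class, the resolve() function, and the dictionary rule format
are hypothetical and are not DSSSL or any browser's API. The sketch assumes
that "elements in context" means matching on the tail of the element's
ancestor path, and that the reader's override replaces the author's general
rules wholesale.

```python
from dataclasses import dataclass, field


@dataclass
class StyleSheet:
    # General rules, keyed by element context: ("UL", "LI") is an LI inside a UL.
    general: dict = field(default_factory=dict)
    # Specific rules, keyed by the unique ID of one element.
    by_id: dict = field(default_factory=dict)


def lookup_general(sheet, context):
    """Return the rule whose key matches the longest tail of the context."""
    for start in range(len(context)):
        rule = sheet.general.get(tuple(context[start:]))
        if rule is not None:
            return rule
    return {}


def resolve(context, element_id, author, reader,
            reader_overrides=False, keep_id_rules=True):
    """Compute the style of one element.

    General rules come from the reader's sheet if the reader overrides,
    otherwise from the author's. The author's ID-specific rules are applied
    on top unless the reader overrides and chooses to drop them.
    """
    general_sheet = reader if reader_overrides else author
    style = dict(lookup_general(general_sheet, context))
    if element_id is not None and (keep_id_rules or not reader_overrides):
        style.update(author.by_id.get(element_id, {}))
    return style


# Hypothetical sheets: the property names are made up for the example.
author = StyleSheet(general={("P",): {"font": "serif"}},
                    by_id={"warn1": {"color": "red"}})
reader = StyleSheet(general={("P",): {"font": "sans", "size": 14}})

print(resolve(["BODY", "P"], "warn1", author, reader))
print(resolve(["BODY", "P"], "warn1", author, reader, reader_overrides=True))
```

With reader_overrides=True the paragraph picks up the reader's font and
size, but the author's rule for the element with ID "warn1" still applies
unless the reader also passes keep_id_rules=False.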
2) At the next level, we have mostly structured HTML that
was created by hand or some automated tool, but it is
not necessarily clean or conforming to the HTML DTD.
It probably has a style sheet associated with it, even if
it is some sort of generic and generally acceptable style sheet
that is commonly used -- sort of like MM or MAN macros.
The style sheet has only general style rules for elements in context.
The style sheet might be a general use style sheet or it
might be a house style sheet -- either way the reader can override.
The author has chosen not to include a style sheet with specific
rules tied to unique IDs. Instead, the author has defined specific
style rules on specific elements by setting the values of attributes
on those elements. The author is exercising creative licence.
You can argue forever over the pros and cons of the author having
the right to exercise that licence, but in the end it is the
author's work that is being argued over, not yours -- you got
what you wanted in item 1 above.
All right. It isn't how you and I would have done it,
but the author got to exercise creative licence --
which was very important -- and the reader gets to use
the author's material.
3) Finally, we have arbitrary HTML. Who knows how it was created?
It may have been a random-HTML generator. Who cares how it was
generated; the point is that it is out there. It is almost
a certainty that there is no style sheet associated with it.
Fortunately, there is an implicit style-sheet (the default)
associated with HTML in every browser.
The author has chosen to define some specific style rules
on specific elements by setting the values of attributes
on those elements. Again, the author is exercising creative
licence. And again, you have nothing to say about it.
The reader, of course, can override the default style-sheet
associated with HTML by defining their own -- assuming that
the browser will allow/support that.
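Items 2 and 3 differ from item 1 mainly in where the specific rules live:
on the elements themselves, as attribute values, with a general or default
style sheet underneath. A minimal Python sketch of that layering follows.
Everything in it is an assumption made for illustration: the
BROWSER_DEFAULT table, the effective_style() function, the attribute names
in STYLE_ATTRS, and in particular the ordering (default sheet, then any
attached sheet, then the reader's sheet, then the author's attributes
last). It does not describe how any real browser behaves.

```python
# Hypothetical built-in sheet: general rules per element type only.
BROWSER_DEFAULT = {
    "H1": {"size": 24, "weight": "bold"},
    "P": {"size": 12},
}

# Hypothetical attributes that the sketch treats as carrying style.
STYLE_ATTRS = ("align", "color", "size")


def effective_style(tag, attributes, attached_sheet=None, reader_sheet=None):
    """Layer the sheets, then apply the element's own style attributes."""
    style = dict(BROWSER_DEFAULT.get(tag, {}))
    # Item 2: a general or house style sheet may be attached to the document.
    if attached_sheet is not None:
        style.update(attached_sheet.get(tag, {}))
    # The reader may override the sheets, if the browser supports that.
    if reader_sheet is not None:
        style.update(reader_sheet.get(tag, {}))
    # The author's per-element attributes are applied last (an assumption).
    for name in STYLE_ATTRS:
        if name in attributes:
            style[name] = attributes[name]
    return style


print(effective_style("P", {"color": "blue"}))
print(effective_style("P", {"color": "blue"}, reader_sheet={"P": {"size": 16}}))
```

In this sketch the reader's sheet changes the paragraph size, while the
colour set by the author on the element itself survives.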
Now you may be thinking that option 1 is the only way
to do styles and options 2 and 3 are abominations.
That's fine. Nobody is insisting that you -- as an
author or provider of information -- must use 2 or 3.
However, as a reader you will be forced to use all three.
That's ok too. As an informed reader, you will make choices
about the kind of information that you use. If providers
that are using methods 2 or 3 discover that their service
is being used less than competing services using method 1,
they will switch. But you have to appreciate that it might
be the providers who opt for pure style-sheets (option 1)
who have to switch to satisfy their readership -- I don't
expect that, but you have to be prepared for it.
Anyway, in the final analysis, here is my message:
-- formatting control is absolutely necessary
-- style sheets are good
-- architectural forms are good
-- choice is good
-- processing instructions are bad
because they can only be standardized
by convention, not in an SGML DTD