Re: Processing instructions for style tweaks?

Paul Grosso
Wed, 30 Nov 94 16:46:25 GMT

> Subject: Re: Processing instructions for style tweaks?
> Date: Wed, 30 Nov 1994 10:39:36 -0500 (EST)
> From: Murray Maloney
> I am dead set against PIs. Sure we could develop conventions,
> but they could never be verified as conforming by an SGML parser.
> No, PIs are bad! PIs are worse even than format-specific
> SGML elements like <I> and <B> which can readily be mapped
> to any formatting desired at the reader's end.
> . . .

I don't want to come out as if I'm championing PIs. I believe in
"clean SGML" [Sharon Adler used to talk of "polluting" the SGML
with format information] as much as anyone.

But, as Murray elegantly pointed out in the rest of his post (that
I elided), we must allow for other people with other viewpoints.
In particular, there are (at least sometimes for some people) good
reasons for wanting more control over style than can be achieved
via, say, DSSSL Lite location/query mechanisms.

However, I do disagree with "PIs are worse even than format-specific
SGML elements." I think you're wrong, here, Murray. Having formatting
markup *indistinguishable* from structural markup (i.e., having it all
be DTD elements--some with "good semantics" and some with "bad semantics")
is the worst way to go.

The advantage of using PIs for formatting-specific markup is that it's
easy to strip/ignore them when one wants to slough off the "pollution"
of embedded format-specific information.

For example, a PI might be used to force a page break or twiddle a line
break for certain esthetic reasons during final production (this
example may be more relevant to hardcopy, high-quality composition),
but as soon as the publication has gone to press and it's time to
database the information for reuse or subsequent revision, you want to
strip such markup that is not part of the base information per se but
only an artifact of a particular presentation situation that is now a
thing of the past. If I had a <newpage> element in there instead of a
<?DL newpage> processing instruction, I would need to have a more
sophisticated filter--that I would need to change with every new
format-specific element I added--to strip them all.
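
To make the maintenance cost concrete, here is a rough sketch of what an element-based filter might look like. It is a text-level approximation, not a real SGML parser, and the element names (newpage, linebreak, softhyphen) are hypothetical examples rather than anything from an actual DTD.

```python
import re

# Hypothetical list of format-specific elements. This set has to be
# extended by hand every time a new formatting element is introduced.
FORMAT_ELEMENTS = {"newpage", "linebreak"}

def build_tag_filter(names):
    """Build a regex matching start and end tags of the named elements."""
    alternation = "|".join(sorted(re.escape(n) for n in names))
    return re.compile(rf"</?(?:{alternation})\b[^>]*>", re.IGNORECASE)

def strip_format_elements(text, names=FORMAT_ELEMENTS):
    """Remove the tags of the listed elements, keeping their content."""
    return build_tag_filter(names).sub("", text)

doc = "<para>One.<newpage>Two.<softhyphen>Three.</para>"

# <softhyphen> is not in the list yet, so it survives the filter.
partly_clean = strip_format_elements(doc)

# Only after the list is updated does the new element get stripped.
fully_clean = strip_format_elements(doc, FORMAT_ELEMENTS | {"softhyphen"})
```

The filter itself is simple; the problem is that it is only as good as the list of names it is handed.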

With PIs, I can just strip everything of the form <?DL...>, or if my
software handles it, just say "write -nopi" and get a depolluted
version of the SGML. And, if I send the SGML--PIs and all--to another
conforming SGML system that hasn't been programmed to do anything
special with <?DL...> PIs, 'no harm, no foul,' it just works and the
PIs are ignored.
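
As a minimal sketch of the stripping step, the following assumes the reference concrete syntax, in which a PI opens with "<?" and closes with ">", and that the document text contains no ">" inside a PI. It works on raw text rather than through an SGML parser, so it would not respect comments or CDATA marked sections. The function names and the OTHER target are made up for illustration.

```python
import re

# Assumption: PIs look like "<?" ... ">" (SGML reference concrete syntax).
DL_PI = re.compile(r"<\?DL\b[^>]*>")   # only PIs whose target is DL
ANY_PI = re.compile(r"<\?[^>]*>")      # every PI, whatever its target

def strip_dl_pis(text: str) -> str:
    """Remove only the <?DL ...> processing instructions."""
    return DL_PI.sub("", text)

def strip_all_pis(text: str) -> str:
    """Remove every processing instruction (roughly the '-nopi' idea)."""
    return ANY_PI.sub("", text)

sample = "<para>First page.<?DL newpage>Second page.<?OTHER hint></para>"

without_dl = strip_dl_pis(sample)
without_any = strip_all_pis(sample)
```

Neither pattern needs to know which formatting instructions exist; a new <?DL ...> instruction is stripped without touching the filter.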

Finally, using formatting elements doesn't solve many of the problems
because they either can't be used everywhere one might want, or their
content models have to be so lax as to destroy the structure of the
original DTD. PIs don't have to drastically change the ESIS tree of
the document.

I do think there are better and worse ways of using PIs to implement
the kind of format-override control that's being discussed. My earlier
posting described in more detail how I would use PIs to allow for
instance-specific location mechanisms whose specific formatting effects
would still be specified in the style sheet.


Paul Grosso
VP Research, ArborText, Inc.
Chief Technical Officer, SGML Open