Re: Initial Draft --Cascaded Speech Style Sheets

Mary Holstege
Wed, 14 Feb 1996 08:57:41 -0800

I think the draft is very interesting and has a lot of good ideas in it.
And yet...

I just can't escape the feeling that attributes such as 'pitch-range' and
'richness' look a lot like attributes such as 'serif' or 'dpi'. That is:
they more properly belong in a font (voice) definition than in a style sheet
specification. In setting styles for rendering a document visually, we
pick "Times" or "Helvetica" or "Gothic" etc. because we know that the font
family has certain affectual characteristics that match our needs. Similarly,
one should be able to select the vocal equivalent of "Times" or "Gothic" for
the same purpose in the same fashion.

More radically, given that typographical features such as bold, italic, and
font size were invented precisely to render certain auditory features in
a visual medium, surely the reverse mapping works too? Can we not organize
voices in a manner analogous to fonts, indexed by a few basic attributes such
as volume, pitch, and stress, rather than trying to make every possible
variation of speech available at the style sheet level? I suspect that
exposing every variation would make the style sheet too cumbersome to use
(from both an implementor's and an author's standpoint).
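
To make the analogy concrete, here is a rough sketch of a voice family that
is looked up the way a font family is. Everything in it is hypothetical:
the names (Voice, VOICE_FAMILY, select_voice), the attribute values, and the
numbers are invented for illustration and do not come from the draft. The
point is only that 'pitch-range' and 'richness' sit inside the voice
definition, while the style sheet would select by a few basic attributes.

```python
from dataclasses import dataclass


# Hypothetical rendering-level parameters. They live inside the voice
# definition, much as 'serif' or 'dpi' live inside a font.
@dataclass(frozen=True)
class Voice:
    name: str
    pitch_range: float
    richness: float


# A voice family indexed by (volume, pitch, stress), analogous to a font
# family indexed by size, weight, and slant. All values are made up.
VOICE_FAMILY = {
    ("medium", "medium", "normal"): Voice("narrator-plain", 50.0, 50.0),
    ("medium", "medium", "strong"): Voice("narrator-emphatic", 70.0, 60.0),
    ("loud", "high", "strong"): Voice("narrator-heading", 80.0, 70.0),
}

DEFAULT_KEY = ("medium", "medium", "normal")


def select_voice(volume="medium", pitch="medium", stress="normal"):
    """Return the family member for the requested attributes.

    Falls back to the plain voice when no exact match exists, as a font
    matcher falls back to the regular face.
    """
    return VOICE_FAMILY.get((volume, pitch, stress), VOICE_FAMILY[DEFAULT_KEY])
```

Under this sketch a style rule would ask only for, say, stress="strong",
and the voice designer decides what pitch range and richness that implies.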

Indeed --- is it possible to use the *same* style sheet for voice and treat it
as a font mapping problem? Line spacing and hard line breaks are pauses
(map points to suitable time units), flush left is send-to-left-channel, left
margin is...

Eh. Probably not.

Still, have you invented a set of *style* sheet attributes or a set of
*rendering* attributes? This is the difference between, say, a word
processor style definition and a line drawing specification in that
same word processor.

-- Mary

Mary Holstege, PhD
Manager, Online Engineering
KnowledgeSet Corporation
555 Ellis Street Tel: (415) 254-5452
Mountain View, CA 94043 FAX: (415) 254-5451