Re: LANG: Binary formats.

Chris Marrin (cmarrin@ariel.engr.sgi.com)
Tue, 28 Feb 1995 10:34:51 -0800


On Feb 28, 7:05am, Peter Kennard wrote:
> Subject: Re: LANG: Binary formats.
> Well, I know you know but Venue(tm) is an appropriate foundation (when it
> gets a final linker) for a binary format. Basically it offers:

Remember who this is coming from (a member of the Inventor team), so I am
biased. The idea of a binary format is great, but the idea of something
structurally different from the Ascii format scares me. "Algorithmic
descriptions of generated geometries"? If the binary format can describe
something that does not fit in the Ascii format, it is not appropriate.
Inventor already has a binary format. It is very poor in that it does not
compress data very well, but it is basically just a binary representation of
the information and structure that is in the Ascii format. I think this is
what we need for a VRML binary format.

Here's what's good about the Inventor binary format (and what I think we need
for a VRML binary format):

- Structurally equivalent to the Ascii format
- Has child counts in the header of group nodes for faster parsing
- Allows support for unknown nodes (can have previously undefined nodes and
fields as strings in the format).
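
To make the child-count point concrete, here is a minimal sketch in Python. The byte layout (one-byte name length, ASCII name, 16-bit child count) is an assumption made up for illustration; it is not the actual Inventor binary layout, and write_node/read_node are hypothetical helpers.

```python
import struct

# Hypothetical layout (NOT the real Inventor binary format):
#   node := name_len:u8  name:ascii  child_count:u16 (big-endian)  children...
# Leaf nodes simply carry child_count == 0.

def write_node(name, children=()):
    """Serialize a (name, children) tree into the layout above."""
    raw = name.encode("ascii")
    out = struct.pack(">B", len(raw)) + raw
    out += struct.pack(">H", len(children))
    for child_name, grandchildren in children:
        out += write_node(child_name, grandchildren)
    return out

def read_node(buf, pos=0):
    """Parse one node starting at pos; return ((name, children), next_pos)."""
    (name_len,) = struct.unpack_from(">B", buf, pos)
    pos += 1
    name = buf[pos:pos + name_len].decode("ascii")
    pos += name_len
    (count,) = struct.unpack_from(">H", buf, pos)
    pos += 2
    children = []
    for _ in range(count):  # count is known up front, no scanning for a closing brace
        child, pos = read_node(buf, pos)
        children.append(child)
    return (name, children), pos

tree = ("Separator", [("Material", []), ("MyUnknownNode", [])])
data = write_node(*tree)
parsed, end = read_node(data)
```

Because names travel as strings here, a node the reader has never heard of (MyUnknownNode above) still round-trips, and the tree that comes back has the same shape as the one an Ascii parser would build.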

Here's what's bad about it (and what we'd need to do better):

- All numbers are 4-byte floats
    - need several compressed data types (integer and float)
    - need very compact representations of things like 0 and 1
    - need to compress arrays where all z values are 0, etc.

- All nodes and fields are Ascii strings
    - need tokens for the standard ones, with string support for unknown
      nodes/fields

- USE/DEF are Ascii strings
    - need some nice compact representation for these
    - perhaps need compressed or tokenized name strings

- There is no compressed format for embedded texture data
    - need to support JPEG or something
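
Two of the items above (compact numbers, and tokens with a string fallback) can be sketched roughly like this in Python. The tag values, the token table, and the function names are all assumptions invented for the example; nothing here is an agreed encoding.

```python
import struct

# Hypothetical tag bytes for a compact numeric field (made up for illustration).
TAG_ZERO, TAG_ONE, TAG_INT8, TAG_FLOAT32 = 0, 1, 2, 3

def encode_number(x):
    """Encode a value in 1, 2, or 5 bytes depending on how simple it is."""
    if x == 0:
        return bytes([TAG_ZERO])
    if x == 1:
        return bytes([TAG_ONE])
    if float(x).is_integer() and -128 <= x <= 127:
        return struct.pack(">Bb", TAG_INT8, int(x))
    return struct.pack(">Bf", TAG_FLOAT32, x)  # fall back to a full 4-byte float

def decode_number(buf, pos=0):
    """Return (value, next_pos) for the number stored at pos."""
    tag = buf[pos]
    if tag == TAG_ZERO:
        return 0.0, pos + 1
    if tag == TAG_ONE:
        return 1.0, pos + 1
    if tag == TAG_INT8:
        (v,) = struct.unpack_from(">b", buf, pos + 1)
        return float(v), pos + 2
    if tag == TAG_FLOAT32:
        (v,) = struct.unpack_from(">f", buf, pos + 1)
        return v, pos + 5
    raise ValueError("unknown number tag %d" % tag)

# Hypothetical token table: standard names cost one byte each.
STANDARD_TOKENS = {"Separator": 1, "Material": 2, "Cube": 3}
TOKEN_STRING = 0  # escape value: a length-prefixed Ascii name follows

def encode_name(name):
    """Use a one-byte token for known names, a string for everything else."""
    token = STANDARD_TOKENS.get(name)
    if token is not None:
        return bytes([token])
    raw = name.encode("ascii")
    return bytes([TOKEN_STRING, len(raw)]) + raw
```

With this scheme 0 and 1 cost one byte instead of four, small integers cost two, and an unknown node name still gets through as a plain string. The same tokenizing trick could be applied to USE/DEF names.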

There's probably also an opportunity to fold group child counts into the
group token (lower 8 bits?), along with other such compactions. I think this alone
will get us the 4:1 compression Mark dreams of. And we can use the same parser
since the syntax and semantics will be the same.
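
As a rough illustration of the "lower 8 bits" idea, here is a small Python sketch. The 16-bit token width and the GROUP_SEPARATOR type code are assumptions for the example only.

```python
# Hypothetical 16-bit group token:
#   upper 8 bits = node type code, lower 8 bits = child count.
GROUP_SEPARATOR = 0x81  # made-up type code for illustration

def pack_group_token(node_type, child_count):
    """Combine a node type and a child count into one 16-bit token."""
    if not 0 <= node_type <= 0xFF:
        raise ValueError("node type must fit in 8 bits")
    if not 0 <= child_count <= 0xFF:
        raise ValueError("child count must fit in 8 bits")
    return (node_type << 8) | child_count

def unpack_group_token(token):
    """Split a 16-bit token back into (node_type, child_count)."""
    return token >> 8, token & 0xFF

token = pack_group_token(GROUP_SEPARATOR, 3)
```

One consequence of this design is that a group with more than 255 children would need some escape value or a wider count field, so a real format would have to pick one.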

I think there is additional opportunity for compression by introducing new data
types. Text has already been mentioned. A 10 character text string can take
the place of about 1000 polygons used to represent it. NURBS and other
mathematically based primitives take up much less space than their polygonal
decompositions, but I think this can wait till later...

-- 
chris marrin                     ,,.                        
Silicon Graphics, Inc.        ,`` 1$`
(415) 390-5367             ,|`   ,$`
cmarrin@sgi.com           b`    ,P`                           ,,.
                        mP     b"                            , 1$'
        ,.`           ,b`    ,`                              :$$' 
     ,|`             mP    ,`                                             ,mm
   ,b"              b"   ,`                ,mm      m$$    ,m          ,,`P$$
  m$`             ,b`  .` ,mm          ,.`'|$P   ,|"1$`  ,b$P       ,,`   :$1
 b$`             ,$: :,`` |$$       ,:`    $$` ,|` ,$$,,`"$$      .`      :$|
b$|            _m$`,:`    :$1    ,:`      ,$Pm|`    `    :$$,..;"'        |$:
P$b,      _;b$$b$1"       |$$ ,,``       ,$$"             ``'             $$
 ```"```'"    b$P         `""`           ""`                             ,P`
             `"`                                              '$$b,,...-'

"As a general rule, don't solve puzzles that open portals to Hell." - excerpt from "A Horror Movie Character's Survival Guide"