>I do not have a Web browser to base my browser on, so my question is
>what is the best way to parse HTML for my purposes? I am using a
>Windows 3.x platform (16-bit).
>
>Some ideas I have thought of doing include:
>
>- using sgmls
This will work, but it may not be convenient.
>- using the W3C Reference Library
The HTML parsing code in the W3C reference library has gotten
kinda crufty. Henrik has been concentrating on protocols
for quite some time, and the SGML/HTML stuff hasn't been
revised much, even though we've found some bugs and changed
our minds about the best way to do some things.
I've been working on some code to update the library. I have
it working, but I haven't integrated it much with the library
yet.
A tech report describing my work is in progress at:

    "A Lexical Analyzer for HTML and Basic SGML"
    $Id: sgml-lex.html,v 1.8 1995/10/11 21:47:30 connolly Exp $
    http://www.w3.org/pub/WWW/MarkUp/SGML/sgml-lex/sgml-lex.html

It includes a lex spec. You probably can't run lex on a 16-bit
platform, but you should be able to compile the C code that lex
generates when I run it.
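
To give a rough idea of what a lexical analyzer for HTML does, here
is a minimal hand-written sketch in plain C. It is not the
lex-generated code and not the lex spec from the report; the names
tok_kind, token and next_token are made up for illustration. It is
deliberately naive: it only splits input into start tags, end tags
and data, and it assumes a tag ends at the first '>', so comments,
declarations and quoted attribute values containing '>' are not
handled properly.

```c
#include <stddef.h>
#include <string.h>

/* Token kinds for this sketch; hypothetical names, not from sgml-lex. */
enum tok_kind { TOK_END, TOK_DATA, TOK_START_TAG, TOK_END_TAG };

struct token {
    enum tok_kind kind;
    const char *text;   /* points into the input; not NUL-terminated */
    size_t len;         /* number of characters in the token */
};

/* Scan one token starting at *pp and advance *pp past it.
 * Assumption: the input is a NUL-terminated string held in memory. */
struct token next_token(const char **pp)
{
    const char *p = *pp;
    struct token t;

    t.text = p;
    t.len = 0;

    if (*p == '\0') {               /* end of input */
        t.kind = TOK_END;
        return t;
    }

    if (*p == '<') {
        const char *close = strchr(p, '>');
        if (close != NULL) {        /* "<...>" or "</...>" */
            t.kind = (p[1] == '/') ? TOK_END_TAG : TOK_START_TAG;
            t.len = (size_t)(close - p) + 1;
            *pp = close + 1;
            return t;
        }
        /* Unterminated '<': treat the rest of the input as data. */
        t.kind = TOK_DATA;
        t.len = strlen(p);
        *pp = p + t.len;
        return t;
    }

    /* Character data: everything up to the next '<' or end of input. */
    t.kind = TOK_DATA;
    t.len = strcspn(p, "<");
    *pp = p + t.len;
    return t;
}
```

A caller would loop on next_token until it returns TOK_END, and
dispatch on the kind field. Nothing here needs more than the
standard C library, so it should fit a 16-bit compiler, though I
have not tried it on one.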
Let me know if you want to be an alpha tester. I don't have
a public distribution ready.
Dan