> Is there a utility to strip away HTML tags? Yes, I know: WHY? I've been tasked
> with doing such a thing at work. Any info would be greatly appreciated.
Piping the output of sgmls through sgmlsasp with an empty
replacement file will do the trick:
sgmls html.decl YourFile.html | sgmlsasp /dev/null > YourFile.txt
This assumes that YourFile.html is valid HTML, of course...
The output will be the text portions of YourFile.html,
with references expanded and all other markup removed.
If you're on a DOS system, substitute any empty file for /dev/null;
I don't know about other systems.
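
For what it's worth, here is a rough sketch of the same pipeline wrapped
in a shell function that creates its own empty replacement file instead
of relying on /dev/null. The name strip_html is made up for illustration,
and the sketch assumes that html.decl is in the current directory and that
mktemp is available on your system.

```shell
#!/bin/sh
# Sketch only: strip_html is a hypothetical helper, not part of sgmls.
# Assumes html.decl sits in the current directory and mktemp exists.
strip_html() {
    in=$1     # HTML file to read
    out=$2    # text file to write

    # Create an empty replacement file to stand in for /dev/null.
    empty=$(mktemp) || return 1

    # Same pipeline as above, with the explicit empty file.
    sgmls html.decl "$in" | sgmlsasp "$empty" > "$out"
    status=$?   # exit status of sgmlsasp, the last command in the pipe

    rm -f "$empty"
    return $status
}
```

You would call it as: strip_html YourFile.html YourFile.txt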