I suggested this to at least one robot author a while ago in the
context of URL checking (Hi Roy :-), but there are a number of
problems: CGI-script-generated pages are excluded, access
authorisation is ignored, and you need to parse server config files to
look at URL mappings.
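
To make those problems concrete, here is a rough sketch of the trivial
approach: walk the document root and turn file paths into URLs. The
names build_url_list, doc_root, base_url and skip_dirs are made up for
illustration, and the sketch assumes files map one-to-one onto URLs
under a single root, which is exactly the assumption that breaks down
once the server config has aliases or access restrictions.

```python
import os
from urllib.parse import quote


def build_url_list(doc_root, base_url, skip_dirs=("cgi-bin",)):
    """Return sorted URLs for the regular files found under doc_root.

    Assumptions (not from any real server): every file under doc_root
    is served at the same relative path below base_url, and script
    directories are identified purely by name via skip_dirs.
    """
    urls = []
    for dirpath, dirnames, filenames in os.walk(doc_root):
        # Prune script directories in place so os.walk skips them.
        # Their output exists only at request time, so it cannot be listed.
        dirnames[:] = [d for d in dirnames if d not in skip_dirs]
        for name in filenames:
            rel = os.path.relpath(os.path.join(dirpath, name), doc_root)
            # Use forward slashes and percent-encode unsafe characters.
            path = quote(rel.replace(os.sep, "/"))
            urls.append(base_url.rstrip("/") + "/" + path)
    # No access check and no alias handling happens here, so some of
    # these URLs may be bogus or forbidden on the real server.
    return sorted(urls)
```

Something like build_url_list("/usr/local/www", "http://www.example.org/")
would then give the list to publish; both arguments are placeholders.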
> then offered to serve those indexes from here, would people use it?
Well, by just making the file available at a well-known place, anybody
can use a locally generated map. Ehr, /ls-R.txt?
> In other words, as was suggested here, you'd maintain your index locally,
> then ship it to Verity to be served by our Web server.
Or rather, you pull it whenever needed.
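
A minimal sketch of the pull side, assuming the map lives at /ls-R.txt
on each site and holds one URL per line. That file format is my
assumption, nothing has been agreed, and map_url, parse_map and
pull_map are hypothetical names.

```python
from urllib.parse import urljoin
from urllib.request import urlopen

# Assumed well-known location; only the name was floated, not a format.
MAP_PATH = "/ls-R.txt"


def map_url(site):
    """Return the assumed map location for any URL on a site."""
    return urljoin(site, MAP_PATH)


def parse_map(text):
    """Split map text into URLs, assuming one per line; skip blanks."""
    return [line.strip() for line in text.splitlines() if line.strip()]


def pull_map(site, timeout=10):
    """Fetch and parse a site's map whenever it is needed."""
    with urlopen(map_url(site), timeout=timeout) as resp:
        return parse_map(resp.read().decode("utf-8", errors="replace"))
```

An indexer would call pull_map("http://www.example.org/") on its own
schedule instead of waiting for sites to ship anything.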
I think the problems identified above are rather non-trivial, and that
a trivial solution may give a significant number of bogus URLs. Even
with a local HTTP robot you have access-permission issues, but at
least you know that correct URLs get out.
X-400: C=GB; A= ; P=Nexor; O=Nexor; S=koster; I=M
X-500: c=GB@o=NEXOR Ltd@cn=Martijn Koster