Re: Searchable Web info (was Finding CGI spec...)
web@sowebo.charm.net
Mon, 9 Jan 1995 18:29:27 +0100
Nick Arnett wrote:
>[...] 
> >        I did a search on "cgi" and got back a doc with a name I didn't
> >        recognise. Now although I have several hundreds of HTML files,
> >        like my children, I know most of them by name :*) I think you got
> >        the href from a file that has a Base tag pointing to another server.
> 
> Our spider doesn't follow links to servers other than the one where it
> starts (we trigger each index for each server individually).  Documents
> from other servers would have come from distinct indexing sessions.
> 
> Having said that, I'm not sure exactly what you're describing here.  Can
> you describe it a bit more?
> 
	OK: as I'm not sure what you're not sure of, please excuse me if
	I explain the obvious :*). Relative URLs are normally understood
	to be relative to the directory the file is in. But the Base tag
	can make the URL be relative to any other directory - and on any
	other server. In the particular instance I had noticed, the file
	was in fact adapted from the TOC of Ian Graham's HTML tutorial;
	I didn't want to move all the sub files over so I just made the
	Base tag point to the original TOC - not on my server. So if the
	spider finds a reference in this file to "server-cgi-bin.html"
	it should realise I don't actually *have* that file - it's where
	Base says it is, i.e. some other server, in this case. If it doesn't
	want to go on sidetrips to other sites I guess it's just going to
	have to ignore relative URLs in files having Bases pointing to other
	servers.
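
To make that concrete, here is a rough Python sketch of the check a spider could make. It is an illustration only, not how any particular spider works: the function names and the example.org / example.net URLs are hypothetical, and it assumes the spider has already pulled the Base href (if any) out of the file.

```python
from urllib.parse import urljoin, urlsplit


def resolve_link(doc_url, href, base_href=None):
    """Resolve href against the Base tag's URL if one is given,
    otherwise against the URL of the document itself."""
    # A Base href is normally absolute; urljoin also copes if it is not.
    base = urljoin(doc_url, base_href) if base_href else doc_url
    return urljoin(base, href)


def same_server(url_a, url_b):
    """True if both URLs share scheme, host and port (host compared
    case-insensitively, since urlsplit lowercases hostname)."""
    a, b = urlsplit(url_a), urlsplit(url_b)
    return (a.scheme, a.hostname, a.port) == (b.scheme, b.hostname, b.port)


def should_follow(doc_url, href, base_href=None):
    """Return (follow, target): follow is False when the resolved link
    lands on a server other than the one the document came from."""
    target = resolve_link(doc_url, href, base_href)
    return same_server(doc_url, target), target


# Hypothetical example: a local TOC whose Base points at another server.
doc = "http://www.example.org/docs/toc.html"
base = "http://www.example.net/tutorial/toc.html"

print(should_follow(doc, "server-cgi-bin.html"))        # no Base tag
print(should_follow(doc, "server-cgi-bin.html", base))  # Base on other server
```

With no Base the relative link stays on the local server; with the Base above it resolves to the other server, so a spider that stays on one server would skip it rather than index a file that isn't there.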
Alan.
        ________________Alan_&_Lucy_Richmond__________________________
         CyberWeb / Virtual Library: a wealth of information on World
          SoftWare      http://WWW.Charm.Net/~web/              Wide
              WWW Systems Engineering ***** web@Stars.com *****	Web