Re: Common Log format

Brian Behlendorf (brian@wired.com)
Wed, 12 Apr 1995 22:31:26 +0500


On Wed, 12 Apr 1995, Paul Phillips wrote:
> (Assuming that index.html is the dir index file, this too can vary.) Are
> the current logfile processing programs taking this vagary into
> account? I intend to log
>
> GET /dirname/index.html
>
> in all cases where index.html exists, and
>
> GET /dirname/
>
> in all cases where it doesn't, unless somebody can provide me with a
> really good reason not to.

I was going to say this is something you could do with a Perl script, one
that parsed your config files so it knew that an access to /dirname/ is
really an access to /dirname/index.html, but that information isn't
necessarily available at log-analysis time if your site changes frequently.
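
A rough sketch of that post-processing step, in Python rather than Perl.
The index name here is hard-coded as an assumption; a real script would
need the DirectoryIndex setting that was in effect when each line was
logged, which is exactly the information that may be missing. The function
names are made up for illustration.

```python
# Assumed index file name; in practice this would come from the server
# configuration as it stood when the request was logged.
DIRECTORY_INDEX = "index.html"


def normalize_path(path, directory_index=DIRECTORY_INDEX):
    """Map a directory request such as /dirname/ to /dirname/index.html."""
    if path.endswith("/"):
        return path + directory_index
    return path


def normalize_request(request, directory_index=DIRECTORY_INDEX):
    """Rewrite the path inside a request string like 'GET /dirname/ HTTP/1.0'."""
    parts = request.split(" ")
    if len(parts) >= 2:
        # parts[0] is the method, parts[1] is the requested path.
        parts[1] = normalize_path(parts[1], directory_index)
    return " ".join(parts)
```

With this, "GET /dirname/" and "GET /dirname/index.html" would be counted
as the same object by an analyser, at the cost of losing the original
request.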

The real question is - what do you do when the requested object and the
object actually delivered differ? This can happen because of shortcuts
(DirectoryIndex in httpd, soft links in the file system, etc.) or, now,
content negotiation, where a request for /dirname/ could return
/dirname/index, /dirname/index.html, /dirname/index.html3, or
/dirname/index.cgi. Logging the object actually served has its appeal;
however, there are good debugging-related reasons for knowing what the
actual request was.

The choice is yours if you're going to change from logging the actual
request to logging the actual object served; maybe this is an issue for
CLF-NG. A few of us have ideas about that - like maybe it should really
just be an API whose expression can be a simple string of variables,
like

$HOST $IDENTD $AUTHUSER [$MDAY/$MON/$YR:$HR:$MIN:$SC $GMT] "$REQUEST" $ERRCODE $LENGTH

to represent the current CLF. This string could be shared with log file
analysers and other applications that required it.
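
To make the idea concrete, here is a minimal sketch in Python of how an
analyser might consume such a string: it turns the template into a regular
expression with one named group per variable. Nothing here is an agreed
format. The function names are hypothetical, and the sketch assumes the
month and minute get distinct names ($MON and $MIN), since each variable
has to map to exactly one field.

```python
import re

# Assumed template for the current CLF; variable names are illustrative.
CLF_TEMPLATE = ('$HOST $IDENTD $AUTHUSER [$MDAY/$MON/$YR:$HR:$MIN:$SC $GMT] '
                '"$REQUEST" $ERRCODE $LENGTH')


def compile_template(template):
    """Turn a $VARIABLE template into a regex with one named group each."""
    pieces = []
    pos = 0
    for m in re.finditer(r"\$([A-Z]+)", template):
        # Literal text between variables must match exactly.
        pieces.append(re.escape(template[pos:m.start()]))
        name = m.group(1)
        # The request is quoted and may contain spaces; the other fields
        # stop at whitespace or at the date punctuation.
        body = r'[^"]*' if name == "REQUEST" else r"[^\s/:\]]+"
        pieces.append("(?P<%s>%s)" % (name, body))
        pos = m.end()
    pieces.append(re.escape(template[pos:]))
    return re.compile("^" + "".join(pieces) + "$")


def parse_line(line, regex):
    """Return a dict of field values, or None if the line does not match."""
    m = regex.match(line.rstrip("\n"))
    return m.groupdict() if m else None
```

An analyser handed the same template string could then do something like
parse_line(line, compile_template(CLF_TEMPLATE)) and look up fields by
name instead of hard-coding column positions.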

Wishing I was in Darmstadt,
Brian

--=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=--
brian@hotwired.com brian@hyperreal.com http://www.hotwired.com/Staff/brian/