Re: CGI spec revisited

Brian Behlendorf (brian@organic.com)
Wed, 3 May 1995 23:32:12 +0500


On Sat, 29 Apr 1995, Marc Hedlund wrote:
> Is there any interest for a new look at the CGI spec? I don't just mean at
> NCSA....

Yes, definitely. Though the limits of the CGI interface may dictate that
there's only so far we can go with this, at which point a real runtime
API might be a good thing to look at (like Netscape's). Perhaps someone
should start www-servers?

> A few issues to kick around:
>
> * I like NCSA's DOCUMENT_ROOT idea (which Paul mentions). A number of
> people have bitched about not being able to reliably determine the document
> root or server root across a variety of servers without asking for help
> from the humans.

I added this many moons ago to my hacked version of httpd, and it found
its way to the Apache team which is where I presume NCSA picked it up
from. Basically I use it because a lot of my site creation has to be
self-contained, where CGI scripts, libraries, and data all have to sit
within one subdirectory. It's also just good programming to abstract
away as much as you can: I had scripts that assembled a composite page
from a bunch of sub-objects, each individually accessible through the
web server, and those scripts had to know where to open() files.
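
A minimal sketch of that use, in Python purely for illustration: a
script turns a site-relative path into a filesystem path via
DOCUMENT_ROOT instead of hard-coding the server's layout. The helper
name resolve_under_docroot and the escape check are assumptions added
here, not part of any spec or server.

```python
import os


def resolve_under_docroot(relative_path, environ=None):
    """Map a site-relative path onto the filesystem using DOCUMENT_ROOT.

    Hypothetical helper: assumes the server exports DOCUMENT_ROOT to
    the CGI environment.
    """
    environ = os.environ if environ is None else environ
    docroot = environ.get("DOCUMENT_ROOT")
    if not docroot:
        raise RuntimeError("server did not set DOCUMENT_ROOT")

    root = os.path.realpath(docroot)
    full = os.path.realpath(os.path.join(root, relative_path.lstrip("/")))

    # Refuse anything that resolves outside the document root ("../" tricks).
    if os.path.commonpath([root, full]) != root:
        raise ValueError("path escapes the document root: %r" % relative_path)
    return full
```

A script assembling a composite page could then open
resolve_under_docroot("/parts/header.html") and behave the same on any
server that sets the variable.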

Anyways, if it makes its way into a spec I would be happy to condition it
on "this is only appropriate for those web servers that employ the
concept of a document root". For those web sites that change the file
system they point to based on IP number or path mapping, the concept of
the "document root" still holds.

> * Is there any consensus about what should happen to POSTed data if the
> client receives a redirect? I remember reading somewhere that POSTs get
> turned into GETs if redirected; and a couple of browsers mangle POSTs into
> PATH_INFO (!?!) if passed through a proxy, as I recall. Why, I ask you,
> why? Shouldn't POSTs stay POSTs?

Hmm... I think the problem is that the HTTP method is not expressible in
a URI, so I don't think the browser has any choice but to do a GET on a
redirect. I don't think this is a CGI issue.

> * A couple of people have suggested to me hashing out the horrible ACCEPT
> issue in a new CGI spec. I'm not fond of that idea; I think that's an
> HTTP-wg problem. However, maybe something could be done to improve the
> amount of information scripts receive from the client, apart from MIME-type
> content negotiation. If a server and a client are negotiating directly,
> the HTTP spec would govern; if a gateway stands between the two, content
> negotiation can also include the following.... etc.

Negotiation, horrible? :) Nah, what's needed is a common function that
the CGI script can call that takes as input the Accept: string and a list
of possible data types the script can return, and returns the most
appropriate data type (text/html vs. text/html3 for example). The server
can't do this ahead of time because only the CGI script knows what data
formats it can return. Hmm - it would seem to me that the server should
tell the CGI script what its "qs" values are, though, unless it wants to
handle the q*qs operations on the Accept: string before it gets passed
to the CGI script. Comments?
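
One possible shape for such a function, sketched in Python for
illustration only. It takes the Accept: string and a table of the types
the script can return, each with a source quality ("qs"), and picks the
type with the highest q*qs product. The names choose_type, parse_accept
and client_q are hypothetical, media-type parameters other than q are
ignored, and this is not taken from any server's code.

```python
def parse_accept(header):
    """Split an Accept: value into (media_range, q) pairs."""
    ranges = []
    for item in header.split(","):
        parts = [p.strip() for p in item.split(";")]
        if not parts[0]:
            continue
        q = 1.0  # q defaults to 1 when the client gives none
        for param in parts[1:]:
            name, _, value = param.partition("=")
            if name.strip().lower() == "q":
                try:
                    q = float(value)
                except ValueError:
                    q = 0.0
        ranges.append((parts[0].lower(), q))
    return ranges


def client_q(media_type, ranges):
    """Return the client's q for media_type; the most specific range wins."""
    major = media_type.split("/")[0]
    best_specificity, best_q = -1, 0.0
    for pattern, q in ranges:
        if pattern == media_type:
            specificity = 2
        elif pattern == major + "/*":
            specificity = 1
        elif pattern == "*/*":
            specificity = 0
        else:
            continue
        if specificity > best_specificity:
            best_specificity, best_q = specificity, q
    return best_q


def choose_type(accept_header, offered):
    """Pick the offered type with the highest q * qs, or None if none fits.

    offered maps each media type the script can produce to its qs value.
    """
    # Assumption: a missing or empty Accept: means "anything is fine".
    ranges = parse_accept(accept_header) if accept_header else [("*/*", 1.0)]
    best_type, best_score = None, 0.0
    for media_type, qs in offered.items():
        score = client_q(media_type.lower(), ranges) * qs
        if score > best_score:
            best_type, best_score = media_type, score
    return best_type
```

In a CGI script this might be called as
choose_type(os.environ.get("HTTP_ACCEPT", ""), {"text/html": 1.0,
"text/html3": 0.9}); ties go to whichever type is listed first.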

Also for the CGI 1.2 stew: NCSA and Apache unfortunately diverged on the
issue of CGI variables with internal redirects. In both, you can point
to another URI to be accessed when an error occurs - like having an
access which results in a 404 get pointed to /404.html or even /404.cgi,
which could potentially return some cool info. 404.cgi needs to know
some information about the original access to make some intelligent
decisions, but it would be incorrect to just make its CGI environment
exactly that of the error-causing access. Thus, Apache introduced a few
more variables using REDIRECT_ as the prefix, and NCSA used ERROR_ as
the prefix. We chose the former because redirects aren't necessarily
the result of an error (401 responses, for example), and once HTTP/1.1
allows us to send a Base: header, scripts like imagemap can be served in
one access instead of two using internal redirection.
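
As a rough illustration of what a /404.cgi could do with such
variables, here is a small Python sketch. The specific names
REDIRECT_URL and REDIRECT_STATUS are assumptions made for the example;
check what your server actually sets. The function name error_page is
likewise hypothetical.

```python
import os
from html import escape
from http.client import responses


def error_page(environ=None):
    """Build a CGI response describing the original, failed access.

    Assumes the server passes details of the original request in
    REDIRECT_-prefixed variables (names here are illustrative).
    """
    environ = os.environ if environ is None else environ
    original_url = environ.get("REDIRECT_URL", "(unknown)")
    status = environ.get("REDIRECT_STATUS", "404")
    if not status.isdigit():
        status = "404"  # fall back if the server gave us something odd
    reason = responses.get(int(status), "")

    # Escape the URL so a hostile request cannot inject markup.
    body = (
        "<title>%s %s</title>\n"
        "<h1>%s %s</h1>\n"
        "<p>The server could not deliver <code>%s</code>.</p>\n"
    ) % (status, reason, status, reason, escape(original_url))

    return "Status: %s %s\r\nContent-Type: text/html\r\n\r\n%s" % (
        status, reason, body)


if __name__ == "__main__":
    print(error_page(), end="")
```

The script's own REQUEST_URI-style variables describe /404.cgi itself,
which is why the original request has to arrive under separate names.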

Brian

--=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=--
brian@organic.com brian@hyperreal.com http://www.[hyperreal,organic].com/