Re: CGI/1.0 --- what's wrong with the status quo?

Rob McCool (robm@ncsa.uiuc.edu)
Thu, 30 Dec 1993 12:39:01 -0600


/*
* Re: CGI/1.0 --- what's wrong with the status quo? by Robert S. Thau (rst@ai.mit.edu)
* written on Dec 28, 5:32pm.
*
* First off --- CGI/1.0 already has a naming convention which some people
* find at least irksome, the 'nph-' business. Secondly --- if scripts and
* ordinary files coexist in the same directories, and the server can't tell
* them apart by the names, then how *can* it tell them apart? How is the
* server to know whether to read the file or to run it?
*
* I suppose one could do something with file permissions, but I honestly
* prefer suffixing the name with '.doit'. The trouble with using permission
* bits is that stray 'x' bits do occasionally get set on ordinary files.

I agree about stray x bits, but I'm not sold on a suffix convention
either. I tend to prefer a config file directive, since it gives the
administrator a bit more control over what is being executed on his/her
server.

* With my server the way it is, this doesn't matter. On the other hand, if
* the server were using the x bits to tell whether to run the file, and a
* stray 'x' bit landed on some gateway's coversheet, the server would
* wind up trying to exec() a file full of HTML, fail, and return a '500
* Server Error' which *really* confuses the hell out of some poor novice.
* ("The file is there. Why can't the server read it?").

Ayup.
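
To make the failure mode concrete, here is a minimal sketch of a
permission-bit test. It is not code from any server discussed here, and
the function name is a hypothetical chosen for illustration. It shows why
a stray x bit is enough to make a file of HTML look like a script.

```python
import os
import stat


def looks_executable(path):
    """Return True if path is a regular file with any execute bit set.

    Hypothetical helper: a server relying on permission bits alone would
    treat every such file as a script, whatever it actually contains.
    """
    mode = os.stat(path).st_mode
    any_x = stat.S_IXUSR | stat.S_IXGRP | stat.S_IXOTH
    return stat.S_ISREG(mode) and bool(mode & any_x)
```

An HTML file that has been chmod'ed to 755 by accident passes this test,
so the server would try to run it instead of sending it.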

* In short, the naming convention makes it obvious, simply by looking at a
* file, whether it is a script which the server should run, or an ordinary
* file which the server should just throw over the transom. From a *user's*
* perspective, that's simplicity --- even if it takes ten more lines of code
* in the server. (This is not an exaggeration, BTW --- see below).

A naming convention is just as flexible if you use a pseudo MIME type,
such as application/x-www-cgi-script, and then map multiple suffixes to
it (say, .cgi, .pl, .exe; you get the picture).
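
A rough sketch of that idea, assuming a simple suffix table. The table
contents, the default type, and both function names are hypothetical;
only the application/x-www-cgi-script type and the example suffixes come
from the paragraph above.

```python
import posixpath

CGI_TYPE = "application/x-www-cgi-script"

# Hypothetical suffix table; a real server would read this from its
# configuration rather than hard-coding it.
SUFFIX_TYPES = {
    ".cgi": CGI_TYPE,
    ".pl": CGI_TYPE,
    ".exe": CGI_TYPE,
    ".html": "text/html",
    ".txt": "text/plain",
}


def content_type(path, default="text/plain"):
    """Look up the type for a file name by its suffix (case-insensitive)."""
    _, suffix = posixpath.splitext(path)
    return SUFFIX_TYPES.get(suffix.lower(), default)


def should_execute(path):
    """True when the suffix maps to the pseudo type: run it, don't send it."""
    return content_type(path) == CGI_TYPE
```

With this table, should_execute("/docs/search.pl") is true and
should_execute("/docs/index.html") is false; adding a new script suffix
is a one-line change to the table.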

* As the author of several scripts, I regard
* your proposed changes as *adding* complexity, by giving me one more
* inessential detail to keep track of. Granted, the server code does become
* perhaps a little simpler, but see below for more on how I see the
* tradeoff...

I agree here...

* In any case, the amount of code I have added to the server is *minimal* ---
* the total number of lines changed or added is well under 200. If I deleted
* all of the code related to ScriptAlias (which I no longer actually use), I
* think the server would actually shrink substantially.

The code used by ScriptAlias is actually just a minor piece of the Alias
engine; the part you could remove would probably clock in under 100
lines.

* > All I am saying is SIMPLE IS GOOD. Unnecessary complexity
* > is bad.
*
* I suppose most people would agree with this in the abstract --- until you
* get around to the tricky issues of what exactly is "complexity", and what
* is "necessary", from whose perspective. In particular, as I've said, you
* are proposing to *add* complexity from the perspective of the script writer
* --- in terms of requiring a fixed form for the parameters of their scripts,
* which is one more inessential detail to keep track of and get right --- in
* order to keep *your* code simple and clean:

I would agree with that in the abstract too. Perhaps it would have been
better if Charles had made his suggestion a month ago. At that time, it
probably would have made it into the CGI spec.

But it's too late now. Any changes we make that will break old scripts are
going to have to be scrutinized and their costs and benefits weighed
carefully to avoid a negative impact on the growing base of script authors.

* The simplification is in whatever routine in the server identifies the
* PATH_INFO parameters to a CGI script. In the distributed NCSA server, this
* routine is 22 lines of code (get_path_info in http_script.c), two of which
* are blank. In my version, it's 62 lines, but I can shrink it to 29 by
* reverting to the original code's K&R brace style, and stripping out blank
* lines and comments. (BTW, I'm counting these 33 lines of braces and
* whitespace in the change count above. Also, BTW, the extra nine lines of
* executable code here are the ones that add the '.doit' and '.nph' suffixes
* before checking for the existence of the script --- the naming convention
* mentioned above. We are not talking about an enormous amount of code to
* implement *any* of this stuff).

Well, in NCSA httpd it's simple; I don't know about the others... At any
rate, most of the time this stuff is never a lot of code. The hard part is
agreeing on protocols, etc.
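
For readers who have not looked at this part of a server, here is a
sketch of the kind of lookup being discussed: walk the URL path, try each
prefix with the '.doit' and '.nph' suffixes, and treat whatever follows
the first match as PATH_INFO. This is not the get_path_info routine from
http_script.c or Robert's version of it. The function name and the
`exists` callback are assumptions made so the example runs without a
file system.

```python
SUFFIXES = (".doit", ".nph")  # the naming convention quoted above


def split_path_info(url_path, exists):
    """Split url_path into (script_file, path_info).

    `exists` is a caller-supplied predicate (hypothetical; a real server
    would stat() the file system). Returns (None, url_path) when no
    prefix of the path names a script.
    """
    parts = url_path.strip("/").split("/")
    for i in range(1, len(parts) + 1):
        prefix = "/" + "/".join(parts[:i])
        for suffix in SUFFIXES:
            candidate = prefix + suffix
            if exists(candidate):
                # Everything after the matched prefix becomes PATH_INFO.
                rest = "/".join(parts[i:])
                path_info = "/" + rest if rest else ""
                return candidate, path_info
    return None, url_path
```

The suffix handling is the inner loop, which is consistent with the
point that none of this takes much code.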

* The complication is in every CGI script that takes PATH_INFO. At my site,
* that includes 'imagemap' (which may well be the single most used CGI script
* anyplace), my info gateway, and several scripts which form a community
* hotlist system which I'm playing around with, along with a few more minor
* experiments.

I've seen countless others... the documentation for CGI currently ``sucks'',
which has limited its acceptance thus far, but there is still a substantial
script authoring community which we have to keep in mind when considering
protocol changes.

* > I would be interested in hearing from server writers, like Rob McCool, Tony
* > Sanders and the CERN server author. Also the views of script writers
* > would be valuable.
*
* I've never written a whole server, but I have written several nontrivial
* scripts. You've got my opinion...
*/

Certainly, I would like to hear from script authors as well, since their
input is a lot more important to me than the server authors'... the script
authors are the ones who have to use what we put out.

--Rob