Re: BGI-spec 1.4

Simon E Spero (ses@tipper.oit.unc.edu)
Mon, 04 Jul 94 17:42:50 -0400


Guido.van.Rossum@cwi.nl writes:
>> The design is somewhat inspired by the Plan 9 file system, and to a lesser
>> extent, the extension system used for the System V.4 name resolution library.
>
>Sure... Now tell me how you are planning to implement this. Suppose
>I'm using SVR4 style dynamic linking of libraries, and I want to mount
>an extension at /cwi/people/ -- where do I put my .so file, and how do
>I tell the server that it's there? (As an aside, can I tell the
>server to load a new version of the .so file without bringing it
>down?) Note that I'm not cynical -- I just like to know.

The configuration file looks like this:

# mount-point prefix module options
/ file file_handler.o root=/html
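
As a rough sketch of how a line in this format could be read: the struct, its field sizes, and the function name below are assumptions for illustration, not part of the spec, and the options column is treated as optional.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical record for one configuration line; names are assumptions. */
struct mount_entry {
    char mount_point[256];
    char prefix[64];
    char module[256];
    char options[256];
};

/* Parse "mount-point prefix module options" into *e.
 * Returns 1 on success, 0 for blank lines, comments, or malformed lines. */
int parse_mount_line(const char *line, struct mount_entry *e)
{
    int n;

    line += strspn(line, " \t");          /* skip leading whitespace */
    if (*line == '#' || *line == '\n' || *line == '\0')
        return 0;

    e->options[0] = '\0';                 /* options may be absent */
    n = sscanf(line, "%255s %63s %255s %255[^\n]",
               e->mount_point, e->prefix, e->module, e->options);
    return n >= 3;
}
```

Fed the example line above, this would yield mount point "/", prefix "file", module "file_handler.o", and options "root=/html".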

Restarting is a little tricky for multi-threaded processes; you need to leave
any active threads running until they complete, without making their symbols
disappear out from under them. The approach I use is to fork1(2) from inside
the accepting thread, with the parent closing the accept socket and calling
thr_exit(3T), and with the child duping the handle onto stdin and re-execing
the daemon with an argument telling it to use stdin as the socket to accept
on. This can add a little delay to the processing of pending connections, but
is the only way I've found to avoid the possibility of losing connections.
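
A minimal sketch of that hand-off, under some assumptions: POSIX fork() stands in for fork1(2), the "-stdin" argument is a made-up flag name, and the function name is hypothetical. The thread exit in the parent is left to the caller.

```c
#include <stdio.h>
#include <stdlib.h>
#include <sys/types.h>
#include <unistd.h>

/* Hand the listening socket to a fresh copy of the daemon.
 * Returns the child's pid in the parent, or -1 if the fork fails.
 * The child never returns from this function. */
pid_t restart_daemon(int listen_fd, const char *daemon_path)
{
    /* POSIX fork() duplicates only the calling thread, which is the
     * behaviour fork1(2) gives on Solaris. */
    pid_t pid = fork();

    if (pid < 0)
        return -1;

    if (pid == 0) {
        /* Child: put the accept socket on fd 0, then re-exec. */
        if (listen_fd != STDIN_FILENO) {
            if (dup2(listen_fd, STDIN_FILENO) < 0)
                _exit(127);
            close(listen_fd);
        }
        /* "-stdin" is a hypothetical flag meaning "accept on fd 0". */
        execl(daemon_path, daemon_path, "-stdin", (char *)NULL);
        _exit(127);                       /* exec failed */
    }

    /* Parent: stop accepting. The caller then ends the accepting thread
     * while the remaining worker threads run to completion. */
    close(listen_fd);
    return pid;
}
```

Because the old image keeps running until its threads finish, their symbols stay mapped; only new connections go to the re-exec'd copy.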

>
>Is this done on a pathname component basis, or on string comparison?
>If I had a directory named /pictures-huge/, would it be served by the
>picture_handler or by file_handler? (I hope the latter, but somehow
>your example doesn't make this clear -- especially since it
>explicitly shows the reverse case.)

It all depends on whether you mount the handler on '/pictures' or on
'/pictures/'. Using the principle of longest match, this lets you achieve
either behaviour; the former behaviour is useful if you want to mount a
handler for users' home directories on "/~".
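
To make the two cases concrete, here is a small sketch of longest-match lookup, assuming the comparison is a plain string-prefix test. The table layout and function name are illustrative only.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical mount table entry. */
struct mount {
    const char *point;    /* e.g. "/pictures" or "/pictures/" */
    const char *handler;  /* name of the module serving it */
};

/* Return the handler whose mount point is the longest string prefix of
 * uri, or NULL if no mount point matches. */
const char *find_handler(const struct mount *tab, size_t n, const char *uri)
{
    const char *best = NULL;
    size_t best_len = 0;
    size_t i;

    for (i = 0; i < n; i++) {
        size_t len = strlen(tab[i].point);
        if (strncmp(uri, tab[i].point, len) == 0 &&
            (best == NULL || len > best_len)) {
            best = tab[i].handler;
            best_len = len;
        }
    }
    return best;
}
```

With "/" and "/pictures" mounted, "/pictures-huge/a.gif" goes to the picture handler; with "/" and "/pictures/" mounted, the same URI falls through to the handler on "/".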

>
>> The value returned should either be 0, indicating that a problem
>> occurred
>
>Who's responsible for logging an error in this case? I'd like to be
>able to pass an error string on to the client that's unfortunate

There is no client at this stage; this is all handled at startup time.

>> int <module>_umount(char* mount_point, void* cookie)
>>
>
>Surely the cookie contains the mount point, so the mount_point
>argument is redundant. Also maybe rename to <module>_unmount (no need

The contents of the cookie are undefined; it's up to the module what it
keeps or discards.

>
>> uri: The uri passed for this request. All hex escapes will be replaced
>> by the corresponding characters before this routine is called.
>
>I think you will have to leave the hex escapes in. E.g. if a '?'
>occurs in a pathname, it should be encoded, but a '?' meaning a search

You missed the really stupid problem here :-) I forgot about the possibility
of %00; current version has reverted to leaving in the escapes.
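
For illustration, a decoder that a module might apply itself once the escapes are left in; it refuses %00 because the decoded byte would terminate the C string early. The function names are hypothetical and not part of the library.

```c
#include <ctype.h>
#include <stddef.h>

/* Value of one hex digit, or -1 if c is not a hex digit. */
static int hex_val(int c)
{
    if (c >= '0' && c <= '9')
        return c - '0';
    c = tolower(c);
    if (c >= 'a' && c <= 'f')
        return c - 'a' + 10;
    return -1;
}

/* Decode %XX escapes from src into dst (dst needs strlen(src)+1 bytes).
 * Returns the decoded length, or -1 for a malformed escape or %00. */
long uri_unescape(const char *src, char *dst)
{
    char *out = dst;

    while (*src) {
        if (*src == '%') {
            int hi = hex_val((unsigned char)src[1]);
            int lo = hi < 0 ? -1 : hex_val((unsigned char)src[2]);
            if (hi < 0 || lo < 0)
                return -1;                /* truncated or non-hex escape */
            if (hi == 0 && lo == 0)
                return -1;                /* %00 would cut the string short */
            *out++ = (char)(hi * 16 + lo);
            src += 3;
        } else {
            *out++ = *src++;
        }
    }
    *out = '\0';
    return (long)(out - dst);
}
```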

>
>> version: The version string passed in the request. If no version was passed,
>> this string will be set to null.
>
>Just to be sure, this would be "HTTP/1.0" currently, or NULL for HTTP
>0.9 GET requests, right?

Yes. It's not possible to break things down any further, as there is no
guarantee that the version string will always be of the form HTTP/xxxxx

>
>What can I expect to be in the buffer? A random amount of data after
>the first line of the request? Can I overwrite the data in the
>buffer? (I suppose so, otherwise a pointer and a count would be
>sufficient.)

The data can be over-written as needed; the buffer contains any data that
was available at the time the request was first consumed.

>
>> Result code:
>>
>> If no errors occur, the handler function should return 0 or 200. If an error
>> occurs, the handler should return either 0, or a valid HTTP error code. If
>> a status code other than 200 is returned, the server will generate an
>> appropriate error message.
>
>I'm sorry, this is totally ambiguous. Does a return value of 0 mean
>success or failure? If a handler encounters an error after it has

Yes. A return of zero means the handler either succeeded or failed :-) It also
means that the server shouldn't bother trying to generate an error message.
A result of 200 means that the handler definitely succeeded, and the
transaction is complete; a result of anything else means an error definitely
occurred and that the server should generate an error message for the client.

>started writing data to the socket, what should it do? (Since this is
>a high performance protocol, that could easily happen!)

The handler should return 0, indicating that the server shouldn't generate
any messages, and should close down the socket.
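
The three cases can be summarised in a few lines. This is only a sketch of the convention as described here, read together with the quoted spec text that has the server generating the message; the enum and function names are invented for the example.

```c
/* What the server does after a handler returns (names are hypothetical). */
enum after_handler {
    AFTER_NOTHING,      /* result 0: handler dealt with it, success or not */
    AFTER_COMPLETE,     /* result 200: transaction finished cleanly */
    AFTER_SEND_ERROR    /* anything else: an error message is generated */
};

/* Map a handler's return value onto the server's next step. */
enum after_handler classify_result(int result)
{
    if (result == 0)
        return AFTER_NOTHING;
    if (result == 200)
        return AFTER_COMPLETE;
    return AFTER_SEND_ERROR;
}
```

A handler that fails after it has already written part of a response would return 0, so that no error text is appended to the data it has already sent.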

>
>> All handler functions must be re-entrant.
>
>Are you planning to use multiple threads, or to call handlers from
>signal handlers? Do you provide synchronization primitives (e.g. to
>serialize access to the stuff in the *cookie buffer)?

The handlers are all running multi-threaded; the library doesn't currently
export any synchronisation primitives, although the modules themselves are
free to use any services the host system supports. It might be worth adding
some wrapper functions to the library to make code easier to port.
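
Such wrappers might look roughly like the following. This is a guess at a shape, not anything the library exports: the bgi_lock names are invented, and POSIX threads are used here as a stand-in for whatever the host system provides.

```c
#include <pthread.h>

/* Hypothetical portable lock type; a port to another thread library
 * would only need to change this struct and the four functions below. */
typedef struct {
    pthread_mutex_t m;
} bgi_lock;

/* Each wrapper returns 0 on success, like the underlying pthread call. */
int bgi_lock_init(bgi_lock *l)    { return pthread_mutex_init(&l->m, NULL); }
int bgi_lock_acquire(bgi_lock *l) { return pthread_mutex_lock(&l->m); }
int bgi_lock_release(bgi_lock *l) { return pthread_mutex_unlock(&l->m); }
int bgi_lock_destroy(bgi_lock *l) { return pthread_mutex_destroy(&l->m); }
```

A module could keep one of these inside its cookie and take it around any update to shared state.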

>
>> int http_error(int socket, int code, char* version)
>> Generate an error message corresponding to error 'code'
>
>"Generate"... what exactly does this do? Write a complete HTTP error
>response? Can I write some data to the socket afterwards?

Yes, and yes. http_error generates a response, and produces an error message.
The header for the response specifies "Content-Type: text/html", but does
not include a "Content-Length:" field.

Any extra data sent to the socket must be valid html code; the module can
rely on the html document being at the start of a paragraph.

>
>Some ideas... Decode % escapes in a string; (shallowly) parse the
>next RFC-822 header (something like return a pointer to the name, with
>the colon zapped, plus pointers to the start and end of the header
>text -- possibly spanning continuation lines); skip to the end of
>RFC-822 headers.

I currently have a header decoder, whose design may be changing, hence
its lack of inclusion in the spec. The current decoder bursts an entire
request into a data-structure, with defined fields being stored in named
slots.

The main source for new library functions is Jon Magid, who is working on the
cgi_emulation module by writing calls to non-existent library functions
and then asking me to implement them. :-)

>(Actually, at supposedly little cost, can't you use stdin instead of a
>raw socket? Use fdopen(sock, "r") to open a FILE and then you can
>just use fgets() to read the next line if you really want to parse

s/stdin/stdio/ (I got a bit confused when the first line was the last
line on the screen :-)

That's a possibility which I thought about; the lose with fgets is that it
copies data from the buffer into the passed-in argument. There is also some
lossage when using stdio streams in bi-directional mode. Also, since
the file zapper works by mapping the file, and then calling write with
the ptr to the mapped file, I don't want stdio even thinking about copying
the data into its buffer.
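
For readers unfamiliar with the map-and-write approach, a minimal sketch follows. It is not the actual file zapper; the function name is made up, and error handling is kept to the bare minimum (no retry on EINTR, for instance).

```c
#include <fcntl.h>
#include <sys/mman.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <unistd.h>

/* Send the file at path to out_fd by mapping it and writing straight
 * from the mapping, with no intermediate user-space buffer.
 * Returns the number of bytes written, or -1 on error. */
long send_mapped_file(const char *path, int out_fd)
{
    struct stat st;
    char *map;
    size_t done = 0;
    int fd = open(path, O_RDONLY);

    if (fd < 0)
        return -1;
    if (fstat(fd, &st) < 0) {
        close(fd);
        return -1;
    }
    if (st.st_size == 0) {            /* mmap rejects zero-length maps */
        close(fd);
        return 0;
    }
    map = mmap(NULL, (size_t)st.st_size, PROT_READ, MAP_PRIVATE, fd, 0);
    close(fd);                        /* the mapping outlives the fd */
    if (map == MAP_FAILED)
        return -1;

    /* write() may accept less than requested, so loop until done. */
    while (done < (size_t)st.st_size) {
        ssize_t n = write(out_fd, map + done, (size_t)st.st_size - done);
        if (n < 0) {
            munmap(map, (size_t)st.st_size);
            return -1;
        }
        done += (size_t)n;
    }
    munmap(map, (size_t)st.st_size);
    return (long)done;
}
```

Routing the same data through a stdio stream would add a copy into the stream's buffer, which is the cost being avoided.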

However, I am open to suggestions on this.

Simon