Warning of anomalous behavior of select() on 386/ix
Tim Bray
tbray at watdragon.waterloo.edu
Mon Feb 19 15:33:34 AEST 1990
To state the problem simply: a standard BSD-style co-operative socket setup via
socket(), bind(), listen(), socket(), connect(), accept(). The calls work
fine and the two processes are talking. The problem is that when the server
does a select() that includes the new connection in the bitmask right after
it's established, it may see an exception condition advertised. This is not the
case on `real' Berkeley systems. However, if that exception is ignored, the
two processes can proceed to use the connection just fine.
The details: Fairly standard TCP/IP application (*every* function return value
is checked; suppressed for brevity). The server process does the following:
int s, new_client, length;
struct sockaddr_in server;
s = socket(AF_INET, SOCK_STREAM, 0);
server.sin_family = AF_INET;
server.sin_addr.s_addr = INADDR_ANY;
server.sin_port = 0;    /* 0 = let the kernel pick a free port */
bind(s, (struct sockaddr *)&server, sizeof(server));
length = sizeof(server);
/* find out which port the kernel assigned */
getsockname(s, (struct sockaddr *)&server, &length);
/* advertise the socket id (server.sin_port) */
/* ... later ... */
listen(s, 5);
/* in event loop */
new_client = accept(s, 0, 0);
/* and chatter away */
The client does the following:
hp = gethostbyname(host);
server.sin_family = AF_INET;
bcopy(hp->h_addr, &server.sin_addr, hp->h_length);
/* kernel is the port number advertised by the server */
server.sin_port = htons(kernel);
session = socket(AF_INET, SOCK_STREAM, 0);
connect(session, (struct sockaddr *)&server, sizeof(server));
Everything's OK now, the client & server are talking.
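For anyone who wants to reproduce the setup on one machine, here is a minimal
sketch of the same call sequence collapsed into a single process over the
loopback interface. The helper name loopback_pair() is hypothetical and not
part of the original application; it also uses socklen_t and INADDR_LOOPBACK,
which assume a reasonably modern POSIX system rather than 386/ix.

```c
#include <string.h>
#include <arpa/inet.h>
#include <netinet/in.h>
#include <sys/socket.h>
#include <unistd.h>

/*
 * Hypothetical helper: bind a listener to a kernel-chosen loopback port,
 * connect to it, and accept the connection. On success returns 0 and
 * stores the two connected descriptors; returns -1 on any failure.
 */
int loopback_pair(int *client_fd, int *server_fd)
{
    struct sockaddr_in addr;
    socklen_t len = sizeof(addr);
    int lsock, c, s;

    lsock = socket(AF_INET, SOCK_STREAM, 0);
    if (lsock < 0)
        return -1;

    memset(&addr, 0, sizeof(addr));
    addr.sin_family = AF_INET;
    addr.sin_addr.s_addr = htonl(INADDR_LOOPBACK);
    addr.sin_port = 0;                  /* 0 = let the kernel choose */

    if (bind(lsock, (struct sockaddr *)&addr, sizeof(addr)) < 0 ||
        getsockname(lsock, (struct sockaddr *)&addr, &len) < 0 ||
        listen(lsock, 5) < 0) {
        close(lsock);
        return -1;
    }

    /* addr now carries the assigned port, already in network byte order */
    c = socket(AF_INET, SOCK_STREAM, 0);
    if (c < 0) {
        close(lsock);
        return -1;
    }
    if (connect(c, (struct sockaddr *)&addr, sizeof(addr)) < 0) {
        close(c);
        close(lsock);
        return -1;
    }

    /* the connection is already queued in the backlog, so this returns */
    s = accept(lsock, NULL, NULL);
    close(lsock);
    if (s < 0) {
        close(c);
        return -1;
    }

    *client_fd = c;
    *server_fd = s;
    return 0;
}
```

The descriptor stored in *server_fd plays the role of new_client above, and
*client_fd the role of session.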
Now the server drops into a loop where, among other things, it does a
non-blocking select() on several input files including the just-established
'new_client' to see what's up. The first time is very soon after the accept()
call just above. Depending on circumstances, the client may have sent a
message through the socket, which may or may not have arrived. Imagine the
server's surprise when, on 386/ix 2.0.2, TCP/IP 1.1.2, select() says there's
an exception! errno isn't set. So I tried doing a read() of one byte on the
exception-labelled file; it failed, complaining about EBADMSG (the sys_errlist
message, however, does not correspond to the comment in sys/errno.h, grrr).
This code is known to work on Sun, Ultrix, 4.3BSD, Sequent, etc. In a fit
of desperation, I put in a hack saying 'Ignore exceptional conditions on that
first select()' (a sick, twisted thing to do) and everything went just fine.
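The shape of that hack can be sketched roughly as follows. This is not the
actual code from the application: poll_client(), the POLL_* bits and the
ignore_except flag are hypothetical names used only for illustration. The
function does a zero-timeout select() on one descriptor and optionally
discards an exception indication, which is what the workaround does on the
first pass.

```c
#include <stddef.h>
#include <sys/select.h>
#include <sys/socket.h>
#include <sys/time.h>
#include <unistd.h>

/* Result bits returned by poll_client() (hypothetical names). */
#define POLL_READABLE  1
#define POLL_EXCEPTION 2

/*
 * Poll a single descriptor without blocking. Returns a mask of POLL_*
 * bits, or -1 if select() itself fails. When ignore_except is nonzero,
 * an exception reported by select() is silently dropped.
 */
int poll_client(int fd, int ignore_except)
{
    fd_set rfds, efds;
    struct timeval tv;
    int result = 0;

    FD_ZERO(&rfds);
    FD_ZERO(&efds);
    FD_SET(fd, &rfds);
    FD_SET(fd, &efds);

    tv.tv_sec = 0;          /* zero timeout: return immediately */
    tv.tv_usec = 0;

    if (select(fd + 1, &rfds, NULL, &efds, &tv) < 0)
        return -1;

    if (FD_ISSET(fd, &rfds))
        result |= POLL_READABLE;
    if (FD_ISSET(fd, &efds) && !ignore_except)
        result |= POLL_EXCEPTION;
    return result;
}
```

A caller would pass ignore_except = 1 for the first poll after accept() and 0
afterwards, so that genuine exceptional conditions later in the session are
still noticed.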
All this ugly stuff is hidden down in low-level routines so that the software
that's really doing work knows nothing about sockets or such ugliness. So one
smart(?) answer would be to recode it all using Real System Five stuff...
Well anyhow, Tim Bray
Open Text Systems, Waterloo, Ont.