limits on sockets
James H. Coombs
JAZBO at brownvm.brown.edu
Fri Feb 23 02:50:40 AEST 1990
Jerome Freedman writes
1) a read (or write) on a socket involves a buffer in which the data
to be read/written is contained. This buffer can be adjusted,
according to TFM, but what are the limits of the adjustment? What is
the default size?
How can the buffer be adjusted? You supply a buffer in the read/write call,
but that is not the same buffer that the transport layer uses. For
read/write, you can supply a buffer of any size you please, just as with a
read/write on any other file descriptor.
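
On Berkeley-style socket implementations, the transport-layer buffer is
normally sized per socket through the SO_RCVBUF and SO_SNDBUF options. The
following is a minimal sketch, assuming a POSIX system. The helper names
get_rcvbuf and set_rcvbuf are made up for illustration. The default size and
the upper limit are system-dependent, so the code queries the value rather
than assuming one.

```c
#include <sys/types.h>
#include <sys/socket.h>

/* Hypothetical helper: return the kernel receive-buffer size for sock,
   or -1 on error. */
int get_rcvbuf(int sock)
{
    int size = 0;
    socklen_t len = sizeof size;

    if (getsockopt(sock, SOL_SOCKET, SO_RCVBUF, &size, &len) < 0)
        return -1;
    return size;
}

/* Hypothetical helper: ask the kernel for a receive buffer of `size` bytes.
   Returns 0 on success, -1 on error. The kernel may round or clamp the
   request, so call get_rcvbuf() afterwards to see what was actually set. */
int set_rcvbuf(int sock, int size)
{
    return setsockopt(sock, SOL_SOCKET, SO_RCVBUF, &size, sizeof size);
}
```

The same pattern with SO_SNDBUF applies to the send side.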
2) Suppose I get more data (TCP socket) than I have buffer for. How
can I avoid dropping bytes on the floor when I read?
You can read in a loop. Just as you read a large file in chunks, so you can
read from a socket in chunks. You don't have to read all of the pending data
at once. The thing that you have to watch out for is reading before all of
the expected data has arrived. If, for example, you know that the writer has
sent 1 KB, or is supposed to have sent 1 KB, then you have to stay in the read
loop until you get the entire "packet." I handle this by sending a length
prefix ahead of the data. I also put the burden on the client to supply an
adequately sized buffer to the read routine. This latter constraint, however,
requires a clear protocol for the application.
The primary point is that data stays "in the socket" until you read it (or
something extraordinary occurs).
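
The read loop with a length prefix could be sketched as follows. This is one
possible layout, not necessarily the one used in the library described here:
it assumes a 4-byte length in network byte order ahead of each message, and
the names read_full and recv_msg are hypothetical.

```c
#include <stdint.h>
#include <unistd.h>
#include <arpa/inet.h>
#include <sys/types.h>

/* Read exactly n bytes from fd, looping over short reads.
   Returns 0 on success, -1 on error or if the peer closes early.
   For brevity, an interrupted read() is reported as an error here. */
int read_full(int fd, void *buf, size_t n)
{
    char *p = buf;

    while (n > 0) {
        ssize_t got = read(fd, p, n);
        if (got <= 0)
            return -1;          /* error, or EOF before all data arrived */
        p += got;
        n -= (size_t)got;
    }
    return 0;
}

/* Read one length-prefixed message into buf, which holds up to cap bytes.
   Returns the payload length, or -1 on error or if buf is too small. */
long recv_msg(int fd, void *buf, size_t cap)
{
    uint32_t netlen, len;

    if (read_full(fd, &netlen, sizeof netlen) < 0)
        return -1;
    len = ntohl(netlen);        /* assumed wire format: big-endian length */
    if (len > cap)
        return -1;              /* caller did not supply enough room */
    if (read_full(fd, buf, len) < 0)
        return -1;
    return (long)len;
}
```

Rejecting an oversized message, rather than truncating it, is what puts the
burden of an adequately sized buffer on the caller.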
3) What if I am writing data? (TCP socket - in fact all questions
refer to TCP sockets)
On a blocking socket, write() will send what it can and block until it can
send the rest. You don't have a problem unless you can't afford to block.
Some people prefer to use nonblocking reads and writes and call select() with
a timeout to control the blocking themselves. When the select() times out,
they decide that there is a probable communications failure. If you don't
have complete control over both ends of the communication, this more robust
approach is probably appropriate. If you are just getting started with
sockets, however, you can postpone the complications.
Another thing to watch out for is the interruption of a read/write (or
send/recv) by a signal. The return value will be -1, and errno will be EINTR.
There has not been a failure in the communications routine, and you should
simply retry the read/write in a loop. In a group development environment,
you just have to accept that signals will go off without your knowing or
caring what they are. Or, you may later decide that you need a signal for
some reason, and then you may find that it breaks your communications library.
Communications library, now there is a good point. If you write your own
library to provide a high-level interface to sockets, then you will be a lot
happier in the long run. For example, when someone started using setitimer(),
I found that I had to check for EINTR. Because I had isolated all socket
access, it did not take me long to upgrade for all applications. If I had
made direct calls to sockets in the applications, then I would have been off
on a long chase. Something similar occurred when I switched from read/write
to send/recv so that I could use out-of-band data.
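
A minimal sketch of the EINTR check, kept in one place as such a library
would: the wrappers below retry when a call returns -1 with errno set to
EINTR. The names read_retry and write_retry are hypothetical.

```c
#include <errno.h>
#include <unistd.h>
#include <sys/types.h>

/* Like read(), but retry if a signal interrupts the call before any
   data is transferred. */
ssize_t read_retry(int fd, void *buf, size_t n)
{
    ssize_t got;

    do {
        got = read(fd, buf, n);
    } while (got < 0 && errno == EINTR);
    return got;
}

/* Like write(), with the same retry on EINTR. A short count is still
   possible, so callers that need everything sent must loop on the result. */
ssize_t write_retry(int fd, const void *buf, size_t n)
{
    ssize_t put;

    do {
        put = write(fd, buf, n);
    } while (put < 0 && errno == EINTR);
    return put;
}
```

If applications call only these wrappers, a later change such as a switch to
send/recv touches a single file.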
--Jim
Dr. James H. Coombs
Senior Software Engineer, Research
Institute for Research in Information and Scholarship (IRIS)
Brown University, Box 1946
Providence, RI 02912
jazbo at brownvm.bitnet
Acknowledge-To: <JAZBO at BROWNVM>