problems with TCSETAF and rlogin
Andrew H. Marrinson
andy at xwkg.Icom.Com
Sat Nov 3 05:45:49 AEST 1990
Hello,
I originally found this problem using nn when logged in via rlogin
from one Interactive Unix 2.0.2 box to another. The symptom was
missing output in the nn menu.
Further investigation revealed that this was due to a bug which was
exercised whenever nn (or any other program) used TCSETAF under
rlogin. The manual page for TCSETAF states that it ``waits for output
to drain, then flushes the input, then sets the parameters''.
However, when using rlogin it seems that the output gets flushed as
well.
Other nn users have since reported this bug on several systems that
combine System V or POSIX style termio and BSD style networking.
Anyone who maintains such a system (or knows someone who does) should
definitely check whether it affects their kernel.
Essentially, what happens is, a program outputs some data to a
pseudo-tty. Rlogind reads that data and sends it to TCP, which begins
assembling it into a packet. Then the program does a TCSETAF.
Evidently, this results in a pseudo-tty control packet containing
TIOCPKT_FLUSHREAD|TIOCPKT_FLUSHWRITE (03H on most systems). (This
would seem to be the bug right here -- it appears to be in the pty
driver.) Rlogind sends that using the MSG_OOB flag to send(2). The
03H gets appended to the packet being constructed from the previous
(normal) output and the urgent pointer is pointed at it. The packet
then gets sent to the client rlogin process looking like this:
<TCP HEADER><NORMAL OUTPUT FROM NN><03H>
                                    ^
URGENT POINTER POINTS HERE----------+
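
To make the sending side concrete, here is a minimal sketch of how a
server in rlogind's position might forward one packet-mode read from
the pty master. It follows the description above rather than any real
rlogind source; is_control_packet and forward_packet are hypothetical
names, and the fallback TIOCPKT_* values are the traditional BSD ones,
assumed only in case the system headers do not define them.

```c
#include <stddef.h>
#include <sys/types.h>
#include <sys/socket.h>
#include <sys/ioctl.h>

/* Fallbacks with the traditional BSD values (an assumption; check
   your own system headers). */
#ifndef TIOCPKT_DATA
#define TIOCPKT_DATA       0x00
#endif
#ifndef TIOCPKT_FLUSHREAD
#define TIOCPKT_FLUSHREAD  0x01
#endif
#ifndef TIOCPKT_FLUSHWRITE
#define TIOCPKT_FLUSHWRITE 0x02
#endif

/* Hypothetical helper: returns 1 if a packet-mode read from the pty
   master is a control packet, 0 if it carries ordinary data.  The
   first byte of every packet-mode read says which it is. */
int is_control_packet(const unsigned char *pkt, size_t len)
{
    return len > 0 && pkt[0] != TIOCPKT_DATA;
}

/* Hypothetical helper: forward one packet-mode read to the network.
   Ordinary data goes in-band; a control byte goes out as TCP urgent
   data, which is how the 03H ends up behind the normal output. */
ssize_t forward_packet(int sock, const unsigned char *pkt, size_t len)
{
    if (len == 0)
        return 0;
    if (is_control_packet(pkt, len))
        return send(sock, pkt, 1, MSG_OOB);   /* e.g. the 03H byte */
    return send(sock, pkt + 1, len - 1, 0);   /* skip the type byte */
}
```

Nothing in this sketch waits for earlier in-band data to reach the
client before the urgent byte is sent, which is the window the bug
needs.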
The receiving rlogin client then either ignores the normal output
because of the urgent pointer or flushes it because that's what the
03H says to do (I can't remember exactly how OOB data works). Either
way, output disappears that shouldn't have.
As I mentioned above, I believe that the bug lies in the pty driver,
which should not flush the output in this situation. What it should
do is open to conjecture. I'm not sure there is any way to match
exactly the semantics of TCSETAF using the pty packet protocol, but
what it does now clearly loses big.
I urge everyone maintaining a system with the combination of BSD
pseudo-ttys and System V/POSIX termio(s) to check their implementation
for this bug. Below is a short test program that can be used to do
this. It prints some data, does a TCSETAF, then waits for a keypress.
Because TCP may sometimes have already sent the packet containing the
data when the flush is received, it doesn't happen every time, so run
the program several times. If even once it waits for a keypress
without outputting anything, you have the bug!
If you want more information on this, please don't hesitate to email
(andy at icom.icom.com) and I'll try to help you out.
BEGIN TEST PROGRAM
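#include <stdio.h>
#include <termio.h>

int main ()
{
    static struct termio buf;
    static char string[] = "Andy was here. Let's make this really long\n\
and put a lot of separate lines in it. This will more or less\n\
simulate what nn is doing when it screws up...\n";

    /* Fetch the current settings so that TCSETAF changes nothing. */
    ioctl (0, TCGETA, &buf);
    /* Queue some output; sizeof - 1 leaves out the trailing NUL. */
    write (1, string, sizeof (string) - 1);
    /* Should drain the output, then flush only the input. */
    ioctl (0, TCSETAF, &buf);
    /* Wait for a keypress; the text should already be on screen. */
    getchar ();
    return 0;
}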
END TEST PROGRAM
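
On systems that offer POSIX termios rather than System V termio, the
same check can be sketched with tcgetattr() and tcsetattr(), where
TCSAFLUSH is the POSIX counterpart of TCSETAF. This is an adaptation,
not part of the original test, and write_then_setattr_flush is a
hypothetical name.

```c
#include <string.h>
#include <termios.h>
#include <unistd.h>

/* Hypothetical helper: write msg to out_fd, then reapply the current
   settings of tty_fd with TCSAFLUSH (drain output, flush input).
   Returns 0 on success, -1 on error. */
int write_then_setattr_flush(int tty_fd, int out_fd, const char *msg)
{
    struct termios t;

    if (tcgetattr(tty_fd, &t) < 0)
        return -1;                  /* bad fd, or not a terminal */
    if (write(out_fd, msg, strlen(msg)) < 0)
        return -1;
    return tcsetattr(tty_fd, TCSAFLUSH, &t);
}
```

Call it from main as write_then_setattr_flush(0, 1, text) and follow
it with getchar(), as in the program above. If the text ever fails to
appear before the program waits for a keypress, the same bug is
presumably present.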
--
Andrew H. Marrinson
Icom Systems, Inc.
Wheeling, IL, USA
(andy at icom.icom.com)