4.2 BUGLIST, part 8 of 10
Vance Vaughan
vance at mtxinu.UUCP
Sat Nov 10 03:03:25 AEST 1984
4.2 BUGLIST ABSTRACTS from MT XINU, part 8 of 10:
The following is part of the 4.2 buglist abstracts as processed by
Mt Xinu. The initial line of each abstract gives the offending
program or source file and source directory (separated by --),
who submitted the bug, when, and whether or not it contained
a proposed fix. Due to license restrictions, no source is
included in these abstracts.
Important general information and disclaimers about this and
other lists are appended at the end of the list...
sys/kern_time.c--sys salkind at nyu (Lou Salkind) 10 Mar 84 +FIX
The timezone field in the settimeofday system call is ignored.
(I discovered this when I tried to change the PST timezone on our
Pyramid system.)
REPEAT BY:
Run the program below and you will see no difference.
_______________________________________________________________________________
sys/pty.c--sys Spencer W. Thomas <thomas at utah-cs> 26 Jul 83 +FIX
When writing more than TTYHOG characters to the controlling end
of a PTY in cooked mode, characters will be lost. The PTY
should either block or return a partial count (in non-blocking
mode). However, the write completes, but all characters above
TTYHOG have been dumped on the floor by ttyinput. [Note: the
string being written should have several newlines in it.]
REPEAT BY:
This program demonstrates the problem. It writes six 65-character
lines (including newline) in one write into the
controlling end of a pty. A fork reads the lines from the
slave end. It only successfully reads the first 3 lines. The
write returns successfully with a count of 390 bytes written.
When a newline is finally written to the controlling end, the
slave reads one more partial line. The total number of bytes
read by the slave is TTYHOG (+1 for the extra newline). If
TTYHOG is greater than 390 on your system, increase the number
of bytes written by the controller.
================================================================
/*
* tstpty.c - Test pty bug.
*
* Author: Spencer W. Thomas
*/
#include <stdio.h>
char tststring[] =
"xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx\n";
char sendbuf[BUFSIZ];
main()
{
int ptcfd, ptsfd, n;
if ( (ptcfd = open("/dev/ptyqf", 2)) < 0) { perror("ptyqf"); exit(1); }
if ( (ptsfd = open("/dev/ttyqf", 2)) < 0) { perror("ttyqf"); exit(1); }
if (fork() == 0)
{
close(ptcfd);
while ((n = read(ptsfd, sendbuf, BUFSIZ)) > 0)
printf( "%d:%*.*s", n, n, n, sendbuf );
exit(0);
}
strcpy( sendbuf, tststring );
for (n=0; n<5; n++)
strcat( sendbuf, tststring );
printf( "Buflen = %d\n", strlen(sendbuf) );
n = write( ptcfd, sendbuf, strlen(sendbuf) );
printf( "Write returned %d\n", n );
sleep(2);
printf( "Sending newline\n" );
write( ptcfd, "\n", 1 );
close( ptcfd );
wait(0);
}
================================================================
_______________________________________________________________________________
sys/socket--sys Spencer W. Thomas <thomas%UTAH-GR at utah-cs> 16 Aug 83
A write to a pipe with a bad buffer address does not return an
error code (under 4.1a), or returns the wrong error code (under 4.2).
On 4.1a, it appears to write garbage into the pipe "forever" (longer
than I was willing to wait for it).
REPEAT BY:
Compile this program:
main()
{
write( 1, 0xabcde, 512 );
perror("write");
}
Running a.out gives "write: Bad address".
a.out >/dev/null prints "write: Error 0" (another bug, actually).
a.out | see prints a lot of garbage (looks like it's just
running through the buffer pool to me) on 4.1a. On 4.2 it gives
the error "write: No buffer space available", obviously the
wrong error message.
_______________________________________________________________________________
sys/sys_generic.c--sys Marc Shapiro 26 Jul 84 +FIX
The arguments passed to the select system call are 3 longs,
which are copied into an array of 3 ints (ibits), then
back from obits. The manual entry for select specifies those
3 arguments as "int *readfds, *writefds, *exceptfds".
This is non-portable to machines where an int is 2 bytes,
if NOFILES>15.
REPEAT BY:
Reading the code (lines 254, 273-275, etc.)
and manual entry for select(2)
FIX:
declare all the above variables as longs.
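The size limit behind this report can be worked out directly. The following is only a sketch with hypothetical helper names (fds_per_word, fd_bit), not code from sys_generic.c: a mask word holds one bit per descriptor, so a 2-byte word covers 16 descriptors while a 4-byte word covers 32. An unsigned long is used for the bit itself so that the shift is well defined.

```c
#include <limits.h>
#include <stddef.h>

/*
 * Number of descriptors that one mask word of word_bytes bytes can
 * describe: one bit per descriptor.  Hypothetical helper.
 */
int
fds_per_word(size_t word_bytes)
{
	return (int)(word_bytes * CHAR_BIT);
}

/*
 * Bit for descriptor fd in a long-sized mask.  Valid while fd is
 * smaller than the number of bits in an unsigned long.
 */
unsigned long
fd_bit(int fd)
{
	return 1UL << fd;
}
```

With 8-bit bytes, fds_per_word(2) is 16 and fds_per_word(4) is 32, which is why a 2-byte int runs out of room once the descriptor table grows past 16 entries.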
-------
_______________________________________________________________________________
sys/sys_generic.c--sys Mike Braca <mjb%Brown at UDel-Relay> 3 Oct 83 +FIX
I claim that exceeding file size limits does not work as
advertised. According to the man page for getrlimit(2), when
you hit a soft limit you should get a signal (in this case
SIGXFSZ) and when you hit the hard limit things stop working.
Here is what the man page for getrlimit(2) has to say about it:
"A resource limit is specified as a soft limit and a hard
limit. When a soft limit is exceeded a process may receive
a signal (for example, if the cpu time is exceeded), but it
will be allowed to continue execution until it reaches the
hard limit (or modifies its resource limit)....
A file i/o operation which would create a file which is too
large will cause a signal SIGXFSZ to be generated, this
normally terminates the process, but may be caught."
The way I read this is that the write should succeed if the
"soft" limit is exceeded, so if you ignore SIGXFSZ you effectively
ignore the soft limit. However the write fails (with the wrong
error code, but that's another bug report).
The man page for write(2) is no help, here's what it says about it:
"[EFBIG] An attempt was made to write a file that
exceeds the process's file size limit or the
maximum file size."
It doesn't specify "soft" or "hard" limit. I, of course, understood
"hard" limit.
REPEAT BY:
Read the man page for getrlimit(2) and become confused about
whether or not write()s will succeed after you hit the soft
file limit. Write a program that expects that when you ignore
SIGXFSZ, the "soft" limit will be ignored. Set your "soft"
filesize limit to something small, and run the program. Watch
in amazement as the program runs to completion, but the file
it produces is incomplete.
E.g. compile and run this program and watch it fail:
#include <sys/time.h>
#include <sys/resource.h>
#include <stdio.h>
#include <signal.h>
main() {
struct rlimit lims; int fd, rc;
signal(SIGXFSZ, SIG_IGN);
lims.rlim_cur = 0;
lims.rlim_max = RLIM_INFINITY;
setrlimit(RLIMIT_FSIZE, &lims);
fd = creat("/tmp/fsizetest", 0666);
rc = write(fd, "This will not work\n", 19);
if (rc < 0) perror("write");
}
_______________________________________________________________________________
sys/sys_generic.c--sys Mike Braca <mjb%Brown at UDel-Relay> 3 Oct 83 +FIX
When a process exceeds its file size limit, the write
fails with error EMFILE (too many open files).
It should actually fail with error EFBIG (file too big).
REPEAT BY:
Read the manual page for write(2), then compile and run
the following program:
#include <sys/time.h>
#include <sys/resource.h>
#include <stdio.h>
#include <signal.h>
main() {
struct rlimit lims; int fd, rc;
signal(SIGXFSZ, SIG_IGN);
lims.rlim_cur = 0;
lims.rlim_max = RLIM_INFINITY;
setrlimit(RLIMIT_FSIZE, &lims);
fd = creat("/tmp/fsizetest", 0666);
rc = write(fd, "This will not work\n", 19);
if (rc < 0) perror("write");
}
_______________________________________________________________________________
sys/sys_generic.c--sys rws at mit-bold (Robert W. Scheifler) 25 Feb 84 +FIX
If a SIGTSTP is generated on the controlling tty of a process
that is waiting in a select() on that tty, the process will
mysteriously vanish.
REPEAT BY:
Run in foreground the program:
main()
{
int fds = 1;
select(1, &fds, 0, 0, 0);
}
and then generate SIGTSTP from the keyboard. The process will
correctly suspend, but as soon as a character becomes available
for input to the terminal (i.e. as soon as you type CR to the shell),
the process will vanish.
Why: At the select(), the tty t_rsel gets set to the process, but no
chars are available, so the process goes into state SSLEEP on &selwait.
When the suspend character is typed, a psignal() on the process
changes its state to SSTOP and sets p_cursig to SIGTSTP. When input
chars are made available to the tty, a ttwakeup() is performed, which
calls selwakeup() because t_rsel is still set. Since this is the only
process that has done select() on the tty, there are no collisions,
and selwakeup() simply calls setrun() on the process rather than
calling wakeup(). Therein lies the bug, because this bogusly makes
the process runnable, and it will run before the input chars are
gobbled, and so the select() will succeed and try to return. However,
p_cursig is still set to SIGTSTP, and syscall() will see it and call
psig(), which will call exit() and the process will vanish.
Also note another bug (which I don't propose a fix for here): select()
will succeed on a tty even if the process and the tty are in different
process groups. So the process will think there is data to read, and
then hang trying to do the actual read.
_______________________________________________________________________________
sys/sys_xxx.c--sys cbosgd!mark (Mark Horton) 29 Jul 83 +FIX
Accounting gets turned off when there is plenty of space
on the /usr filesystem.
REPEAT BY:
Fill up /usr to where df shows over 92% full.
The console will print "Accounting suspended" and
all accounting is turned off. If your system is
already over 92% full, this happens when /etc/rc
tries to turn on accounting.
The manual claims that when the disk fills up (I read
this to mean 100% full), accounting is turned off.
The code suggests that the intent was that if it gets
less than 2% free, accounting is turned off, and if it
gets over 4% free, it will be turned back on. In reality,
the numbers used are 8% and 16%.
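The suspend/resume behaviour described here is a hysteresis on free space. A minimal sketch follows, assuming hypothetical names (acct_next_state, suspend_below, resume_above) that are not taken from the kernel source. It only illustrates why the choice of thresholds matters: with 7% free, thresholds of 2 and 4 leave accounting running, while thresholds of 8 and 16 suspend it.

```c
/*
 * Hypothetical sketch of free-space hysteresis for accounting.
 * Names and signature are illustrative, not from sys_xxx.c.
 *   active:        nonzero if accounting is currently running
 *   pct_free:      percentage of the filesystem that is free
 *   suspend_below: suspend when free space drops below this value
 *   resume_above:  resume when free space rises above this value
 * Returns the new state (1 = running, 0 = suspended).
 */
int
acct_next_state(int active, int pct_free, int suspend_below, int resume_above)
{
	if (active && pct_free < suspend_below)
		return 0;	/* too little space: suspend */
	if (!active && pct_free > resume_above)
		return 1;	/* enough space again: resume */
	return active;		/* inside the band: no change */
}
```

Calling acct_next_state(1, 7, 2, 4) keeps accounting on, whereas acct_next_state(1, 7, 8, 16) turns it off, which matches the "over 92% full" symptom in the report.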
_______________________________________________________________________________
sys/tty.c--sys Mike Braca <mjb%Brown at UDel-Relay> 27 Sep 83 +FIX
Setting TANDEM when in cooked mode loses big. When the input
queue gets bigger than TTYHOG/2 a STOP char is sent, but a
START char will never be sent because the input queue won't
get smaller until a break character is received. Since the
sender is blocked, it can't send the break character!
REPEAT BY:
Get on a terminal that does ^S ^Q protocol. Type "stty tandem".
Then type enough characters that the terminal locks. Notice
how the terminal never unlocks.
_______________________________________________________________________________
sys/tty.c--sys davec at BERKELEY 19 Aug 83 +FIX
A bug that might be interesting to those using Un*x on machines
other than Vaxen -
In tty.c, the function scanc() (which is replaced by a sed script
on vax and sun versions) returns an incorrect value. As you
can verify by looking up the scanc instruction in the Vax Architecture
handbook, scanc leaves the number of bytes remaining in r0. The
scanc() function incorrectly leaves an index to the character which
fit the mask. So the "return (i);" should be changed to
return (size - i);
It means the difference of the tty working and not working!!
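For readers porting to other machines, here is a sketch of a C routine with the return convention the report describes: the number of bytes remaining, counting the byte that matched, or 0 if nothing matched. The name scanc_sketch, its prototype, and the demo helper are assumptions for illustration and are not the tty.c source.

```c
/*
 * Sketch of a portable scanc() with VAX-like return semantics.
 * Scans up to size bytes at cp and stops at the first byte whose
 * table entry has any of the mask bits set.  Returns the number of
 * bytes remaining, counting the byte that matched, or 0 if no byte
 * matched.
 */
unsigned
scanc_sketch(unsigned size, const unsigned char *cp,
    const unsigned char table[256], unsigned char mask)
{
	unsigned i;

	for (i = 0; i < size; i++)
		if (table[cp[i]] & mask)
			break;
	return size - i;	/* not i: callers expect bytes remaining */
}

/* Demo helper: mark a single character in a table and scan for it. */
unsigned
scanc_demo(const char *s, unsigned size, char match)
{
	unsigned char table[256] = { 0 };

	table[(unsigned char)match] = 1;
	return scanc_sketch(size, (const unsigned char *)s, table, 1);
}
```

Scanning "abcde" for 'c' stops at index 2 and returns 3; returning the index instead would hand the caller 2, which is the error the report points out.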
Hoping it's not too late for 4.2 ...
Dave Cobbley
Engineering Computing Systems
Tektronix, Inc.
tektronix!tekecs!davec
(503) 685-2383
_______________________________________________________________________________
sys/tty.c--sys chris at maryland (Chris Torek) 2 Aug 84
I'm not sure if this is a bug or a feature, but select ignores
process groups when determining whether to return true for a
tty.
REPEAT BY:
Run the following program in the background, then hit RETURN.
#include <sys/types.h>
#include <sys/time.h>
main () {
int in, ex, sel;
in = ex = 1;
sel = select (1, &in, (int *) 0, &ex, (struct timeval *) 0);
printf ("sel=%d in=%d ex=%d\n", sel, in, ex);
exit (0);
}
There seems to be a relationship between this and the fact that
select mysteriously dies if you type a ^Z followed by a return.
(Csh tells you the job is ``stopped'' and then it vanishes from
the list of active jobs.)
Chris
_______________________________________________________________________________
sys/tty.c--sys Michael John Muuss <mike at brl-vgr> 15 Dec 83 +FIX
If the high bit of the "local flags" is set, TIOCLGET smears
that bit across the high halfword of the int by the >>16.
Credit for finding this goes to Doug Gwyn, <Gwyn at BRL>.
REPEAT BY:
Set the bit with TIOCLSET, and read it back with TIOCLGET.
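The smearing is what one would expect from right-shifting a negative signed int, which C leaves implementation-defined and which most machines implement by replicating the sign bit. The sketch below is not the tty.c code. It assumes, as the ">>16" suggests, that the local flags sit in the upper 16 bits of a 32-bit word, and the function name is hypothetical.

```c
#include <stdint.h>

/*
 * Sketch only: extract a 16-bit field from the upper half of a
 * 32-bit flags word without sign extension.
 */
unsigned
local_flags_get(uint32_t flags)
{
	/* Unsigned shift plus mask: the top bit cannot be smeared. */
	return (unsigned)((flags >> 16) & 0xffffu);
}
```

With the high bit set, local_flags_get(0x80000000u) yields 0x8000 rather than a value with the whole upper halfword filled in.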
_______________________________________________________________________________
sys/tty.c,tty_subr.c,vaxuba/dz.c--sys koda at hobgoblin 22 Feb 84 +FIX
3Com interfaces interrupt at IPL 16 while DZ's come in at 15.
The DZ code assumes that spl5 (IPL 15) is good enough to hold
off any pending interrupts.
REPEAT BY:
Combination of moderate dz and ethernet activity will cause
random panics.
FIX:
Replace all spl5 with spl6 in the above-mentioned modules. Actually
spl6 (IPL 18) is a little overkill but there is no other good
pre-defined level unless you edit asm.sed.
_______________________________________________________________________________
sys/tty_pty.c--sys decvax!mcvax!jim (Jim McKie) 6 Apr 84 +FIX
1) When the slave end of a pseudo-tty closes, the controlling
side is not informed if it is already trying to read from
the device.
2) The manual says that the controlling side can send an
end-of-file to the slave side by doing a 0-length write in
TIOCREMOTE mode, but this does not work.
REPEAT BY:
1) The following short program may or may not cause the parent to
wait forever in the read(), depending on whether it gets there
before the child exits. It is always possible to put a sleep()
in the child process before the exit to ensure the parent gets
to read.
main()
{
switch(fork()){
case -1:
perror("fork");
break;
case 0:
child();
/*NOTREACHED*/
default:
parent();
/*NOTREACHED*/
}
exit(1);
}
child()
{
register int fd;
if((fd = open("/dev/ttyp4", 1)) == -1){
perror("/dev/ttyp4");
exit(1);
}
(void) write(fd, "Hello world\n", 12);
exit(0);
}
parent()
{
register int fd, n;
char buf[100];
if((fd = open("/dev/ptyp4", 2)) == -1){
perror("/dev/ptyp4");
exit(1);
}
while((n = read(fd, buf, sizeof(buf))) > 0)
(void) write(1, buf, n);
if(n == -1)
perror("read");
else
printf("EOF");
exit(0);
}
2) Typing EOF to a process expecting input from a shell window in
EMACS - the process is undisturbed.
_______________________________________________________________________________
sys/tty_pty.c--sys Web Dove <dove at sylvester> 21 Feb 84
Using pty's as a link to remote sites means characters read from
the ptc side get sent to the remote terminal. When these characters are
02xx timing characters they are sent directly. This means that if the
user tty program doesn't translate them, the terminal gets broken. Since
the user terminal program generally doesn't know whether the pty is in raw
vs cooked mode, it isn't a simple thing for it to expand those timing
characters properly.
REPEAT BY:
We have seen the problem with a non-translating server for remote
terminals. Because the timing characters are not translated, the terminal
gets broken.
FIX:
Add code in ptcread() to check for cooked mode and if cooked, to
translate the timing characters into an appropriate number of nulls for the
current speed that the terminal is operating at.
_______________________________________________________________________________
sys/tty_pty.c--sys decvax!uthub!thomson (Brian Thomson) 19 Jun 84 +FIX
Oink oink!
That is the sound that data makes as it travels through
4.2BSD's pseudo-tty driver. Even in the high-volume
direction (slave to controller) there is a great deal
of code executed per-character. On our otherwise idle 750 I
measured the maximum pty throughput at 5K chars/sec.;
after applying the following mods it reached 30K chars/sec.
If your machine is often accessed through rlogin(1c) this
can mean considerable savings in system-state CPU time.
REPEAT BY:
Run this program and use iostat(1) to see what your character
rate is.
#include <sys/types.h>
char buf[1024];
int wsize = 1024;
main()
{
int csock, dsock, i;
for(i=0 ; i<wsize; i++)
buf[i] = '0';
csock = getpty(&dsock);
if(csock == -1) {
perror("ptty");
exit(1);
}
if(fork() == 0) {
/* Child, writes on slave. */
close(csock);
while(write(dsock, buf, wsize) != -1)
;
} else {
/* parent reads from controller */
close(dsock);
while(read(csock, buf, wsize) != -1)
;
}
exit(0);
}
getpty(ip)
int *ip;
{
static char name[] = "/dev/ptyp0";
int i;
int res;
for(i=0; i<16; i++) {
name[9] = i+'0';
res = open(name, 2);
if(res != -1) {
name[5] = 't';
*ip = open(name, 2);
if(*ip != -1)
return(res);
name[5] = 'p';
close(res);
}
}
return(-1);
}
_______________________________________________________________________________
sys/tty_tb.c--sys guest at ucbarpa (Guest Account) 19 Jun 84 +FIX
tbioctl() was apparently never converted to operate in a 4.2bsd
environment.
REPEAT BY:
Examine code in tbioctl() in tty_tb.c.
_______________________________________________________________________________
sys/tty_tb.c--sys dagobah!bill (Bill Reeves) 13 Sep 83 +FIX
When a tablet is closed the inuse flag is not cleared.
Thus after a while all tablets are unavailable.
REPEAT BY:
Just use it for a while.
_______________________________________________________________________________
sys/ufs_alloc.c--sys decvax!jmcg (Jim McGinness) 6 Feb 84 +FIX
There is a buffer etiquette bug in the cylinder group resource
allocation routines `alloccg' and `ialloccg'. It causes a
system buffer covering the cylinder group resource counts to be
marked BUSY which eventually causes other processes to be
blocked with inodes locked. If there are sufficiently many
processes trying to create, extend, or remove files from that
cylinder group, the root inode will be locked and the system
will appear to be hung.
REPEAT BY:
A prerequisite for this to happen is that a file system must
have become full or almost full so that the resource counts
in the cylinder groups are zero. The way the problem has
occurred on decvax (and on cbosgd) was for the file system
containing the uucp spool directories to become almost full.
_______________________________________________________________________________
sys/ufs_alloc.c--sys mckusick at ucbmonet (Kirk Mckusick) 1 Oct 84 +FIX
There are two bugs in checking to see if a fragment can be
increased in size. The first bug always causes the check to fail,
forcing a new fragment to be allocated. This failure causes a
minor performance degradation, but is otherwise harmless. The
second bug could potentially cause a system panic, but never
occurs because of the first bug.
REPEAT BY:
Though generating a panic is possible in theory, constructing
an example is difficult.
_______________________________________________________________________________
sys/ufs_mount.c--sys guest at ucbarpa (Guest Account) 19 Jun 84 +FIX
The mountfs() routine in ufs_mount.c fails to validate some
critical data in the superblock before using the data.
This can cause UNIX to crash if you inadvertently (or
purposely) try to mount a disk with garbage on it.
REPEAT BY:
Mount a filesystem whose superblock contains an absurd
fs_sbsize value.
_______________________________________________________________________________
sys/ufs_mount.c--sys guest at ucbarpa (Guest Account) 19 Jun 84 +FIX
getmdev() in ufs_mount.c forgets to iput() on error cases.
This can result in a hung system following a rejected mount request.
REPEAT BY:
The following command sequence will wedge UNIX:
/etc/mount /dev/rhp1a /mnt
ls -l /dev/rhp1a
Any character-special device can be used in lieu of /dev/rhp1a.
_______________________________________________________________________________
sys/ufs_namei.c--sys Jeff Schwab <jrs at Purdue.ARPA> 25 May 84 +FIX
It appears that some versions of the 4.2 kernel have re-implemented
the concept of "sticky" directories. The existing code catches
the case where a user is attempting to delete a file he does not
own, but fails to catch the rename case. Under many conditions,
a rename can cause many of the same problems as a delete.
REPEAT BY:
Create a file in a sticky directory that you don't own. Then try
and rename it and you can!
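For reference, the rule usually documented for sticky directories is that only the file's owner, the directory's owner, or the super-user may remove or rename an entry. The sketch below encodes that rule. It is an assumption that the local 4.2 variants intended the same rule, and the function name and STICKY_BIT macro are hypothetical. The point is that the same check has to guard both the delete path and the rename path.

```c
#define STICKY_BIT 01000	/* same value as S_ISVTX */

/*
 * Returns 1 if uid may remove or rename an entry owned by file_uid
 * in a directory with mode dir_mode owned by dir_uid, as far as the
 * sticky rule is concerned.  Ordinary write permission on the
 * directory is a separate check not modelled here.
 */
int
sticky_may_modify(int uid, int dir_mode, int dir_uid, int file_uid)
{
	if ((dir_mode & STICKY_BIT) == 0)
		return 1;	/* not sticky: no extra restriction */
	return uid == 0 || uid == dir_uid || uid == file_uid;
}
```

Under this rule, user 100 is refused for a file owned by user 200 in a mode 01777 directory owned by root, whether the request is an unlink or a rename.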
_______________________________________________________________________________
sys/ufs_nami.c?--sys Chris Kent <kent at BERKELEY> 28 Jun 83
There seems to be a rather odd behaviour in nami. I will
describe it as best I can, but since I can only cause it to happen in a
particularly small set of circumstances, I don't fully understand it
yet.
It began when we tried to compile uucp; creat() calls to creat the
temporary files which are then linked to were failing with ENOENT. The
person working on the compile tried many things, and became suspicious
of the chdir call. He replaced
chdir(Spool)
with
chdir("/usr/spool/uucp/"); /* note trailing / */
and things began working again. Removing the trailing / causes things
to break. It turns out that changing
chdir(Spool)
to
chdir(Spool); chdir(".");
also fixes things. Thus it would appear that some ending condition in
namei() (?) is munged.
REPEAT BY:
Simple programs work fine; I can't construct a program that
fails. However, all the uucp family programs fail in the same way.
I have looked for aliases to chdir in uucp sources that might cause
this, but have not found anything. Similarly for creat().
_______________________________________________________________________________
sys/ufs_syscalls.c--sys mckusick at ucbmonet (Kirk Mckusick) 30 Jun 84 +FIX
There is a race condition between the `unlink' and `rename'
system calls that can cause the system to leave a reference
in a directory that points to an unallocated inode. The next
time the entry is accessed, the system panics with a "freeing
free inode panic".
REPEAT BY:
When the following two programs are run, the directory entry
for "AA" eventually points to an unallocated inode and the
`rename' system call panics when it tries to delete the previous
file associated with "AA" in preparation for renaming "A".
main()
{
while(1) {
close(creat("A",0666));
rename("A","AA");
}
}
main()
{
while(1) {
unlink("A");
}
}
_______________________________________________________________________________
sys/ufs_syscalls.c--sys watmath!arwhite (Alex White) 8 Feb 84 +FIX
copen doesn't check permissions if FTRUNC is specified but FWRITE
isn't. This means you can truncate files you don't have perms on,
and truncate to zero length DIRECTORIES!!!!
REPEAT BY:
#include <sys/file.h>
main()
{
open("xyz", O_TRUNC|O_RDONLY); /* xyz with no write perms */
open(".", O_TRUNC|O_RDONLY); /* Directory is truncated! */
}
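The rule implied by this report is that a request to truncate should count as a request to write. The following sketch expresses that rule with the POSIX flag names from <fcntl.h> rather than the kernel's FWRITE and FTRUNC. The function name is hypothetical and this is not the copen() source.

```c
#include <fcntl.h>

/*
 * Illustrative rule: treat truncation as a write, so the caller
 * needs write permission even when the access mode is read-only.
 * Returns 1 if the open flags should trigger a write-permission
 * check, 0 otherwise.
 */
int
open_needs_write_check(int flags)
{
	int acc = flags & O_ACCMODE;

	if (acc == O_WRONLY || acc == O_RDWR)
		return 1;
	return (flags & O_TRUNC) != 0;
}
```

With this check, O_TRUNC|O_RDONLY is handled like a write open, so both calls in the test program above would be subject to the permission test.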
_______________________________________________________________________________
sys/ufs_syscalls.c--sys mazama!stew (Stewart Levin) 10 Jan 84 +FIX
Our local software (as well as commands like `tee') rely on the
ESPIPE error from lseek() to determine whether data is coming/going
down a pipe. When converting to 4.2 this failed and we tracked it
down to lseek setting an EINVAL rather than ESPIPE error number.
REPEAT BY:
call lseek on a pipe.
FIX:
change source at line 371 in ufs_syscalls to
if(fp == NULL) {
if(u.u_error == EINVAL) u.u_error = ESPIPE;
return;
}
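The user-level idiom that depends on ESPIPE looks roughly like the sketch below. It uses the POSIX names SEEK_CUR and off_t, and the function names are hypothetical. It shows why a kernel that returns EINVAL instead breaks such programs: the caller can no longer tell a pipe from a genuine error.

```c
#include <errno.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Returns 1 if fd cannot seek because it is a pipe (or FIFO or
 * socket), 0 if it is seekable, and -1 on any other lseek() error.
 */
int
fd_is_pipe(int fd)
{
	if (lseek(fd, (off_t)0, SEEK_CUR) != (off_t)-1)
		return 0;
	return errno == ESPIPE ? 1 : -1;
}

/* Demo helper: create a pipe, test its read end, and clean up. */
int
pipe_check_demo(void)
{
	int p[2], r;

	if (pipe(p) == -1)
		return -1;
	r = fd_is_pipe(p[0]);
	close(p[0]);
	close(p[1]);
	return r;
}
```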
_______________________________________________________________________________
sys/ufs_syscalls.c--sys mazama!stew (Stewart Levin) 3 Sep 84
My program was issuing relative seeks, checking for a -1 return
code, and then issuing a read, again checking for a -1 return code.
The read did return -1 and set EINVAL. The same arguments had been
passed to read() in 20 previous calls. Finally I found that the
file offset had been decremented below zero by the previous lseek.
REPEAT BY:
printf("%d\n",lseek(fd,10L,0));
printf("%d\n",lseek(fd,-30L,0));
printf("%d\n",read(fd,buffer,10));
FIX:
In lseek() copy fp->f_offset into a local variable and operate on
it. If the result is negative, set u.u_error = EINVAL otherwise
store it back into fp->f_offset.
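The proposed fix can be sketched at user level as follows. This is only an illustration with hypothetical names (seek_checked, SK_SET and friends) and a plain long standing in for fp->f_offset. It is not the ufs_syscalls.c code.

```c
#include <errno.h>

#define SK_SET 0	/* absolute */
#define SK_CUR 1	/* relative to current offset */
#define SK_END 2	/* relative to end of file */

/*
 * Computes the new offset in a local variable and stores it into
 * *offset only if it is non-negative.  Returns the new offset, or
 * -1 with errno set to EINVAL, leaving *offset unchanged.
 */
long
seek_checked(long *offset, long file_size, long delta, int whence)
{
	long newoff;

	switch (whence) {
	case SK_SET: newoff = delta; break;
	case SK_CUR: newoff = *offset + delta; break;
	case SK_END: newoff = file_size + delta; break;
	default:
		errno = EINVAL;
		return -1;
	}
	if (newoff < 0) {
		errno = EINVAL;
		return -1;
	}
	*offset = newoff;
	return newoff;
}

/* Demo: seek to 10, then attempt a relative seek of -30. */
long
seek_demo(void)
{
	long off = 0;

	seek_checked(&off, 100, 10, SK_SET);
	seek_checked(&off, 100, -30, SK_CUR);	/* rejected */
	return off;				/* still 10 */
}
```

Because the offset is only stored after the check, a rejected seek leaves the file position where it was and a following read is unaffected.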
_______________________________________________________________________________
sys/ufs_tables.c--sys salkind at nyu (Lou Salkind) 22 May 84 +FIX
Both ufs_subr.c and ufs_tables.c are used by fsck. In
ufs_subr.c, the location of the #include files depends on
#ifdef KERNEL. Not so in ufs_tables.c!
REPEAT BY:
Compile fsck. Have different header files floating around.
FIX:
I have changed in ufs_tables.c the line
#include "../h/param.h"
to read
#ifdef KERNEL
#include "../h/param.h"
#else
#include <sys/param.h>
#endif
The other possible change would be to eliminate the KERNEL
#ifdef's in ufs_tables.c.
_______________________________________________________________________________
sys/uipc_socket.c--sys sdcsvax!sdccsu3!madden at Nosc 7 Nov 83 +FIX
Under 4.2 BSD, termination of a program which has
invoked a listen on a UNIX domain socket will cause an interminable
loop at net interrupt level if there are pending connections which
have not yet been accepted.
REPEAT BY:
Run program A below in the background. Run program B
twice. Kill program A. The result should be a system hang at
net interrupt level.
_______________________________________________________________________________
sys/uipc_socket.c--sys Mike Braca <mjb%Brown at UDel-Relay> 27 Sep 83 +FIX
If you try to write 64K or more to a pipe in a single write()
system call, the system will crash with "panic: sbflush 2"
when the reader closes its end of the pipe.
There is a bug in the socket sending routine (sosend()) whereby
it doesn't ever do partial writes to the socket. So when you
tell it to write 64KB, by golly, it just jams the data in the
pipe without regard for the arbitrary pipe size limit of 4KB.
This in itself is not that bad (after all, we have 6MB of
memory!), but, alas, the size of the data queued in the socket
is kept in a short int. So 64KB of buffers have been allocated,
but the count has wrapped to 0. The read statement doesn't do
anything. But on closing the 'read' half of the pipe, the
system crashes because it can't figure out how to de-allocate
the 64KB worth of memory buffers.
REPEAT BY:
Compile and run the following program:
main()
{
int pipefd[2];
char data[64*1024];
pipe(pipefd);
write(pipefd[1], data, 64*1024);
/* Shouldn't get here, right? WRONG! */
/* (it should hang because the pipe's not that big, */
/* and no one is reading it) */
close(pipefd[1]);
while (read(pipefd[0], data, 1) > 0);
close(pipefd[0]);
/* SURPRISE! your system just crashed. */
}
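The wraparound at the heart of this report is easy to reproduce in isolation. The sketch below uses uint16_t as a stand-in for the "short int" count so that the overflow is well defined in C. The real field's name and signedness are not given in this abstract, and queued_after is a hypothetical name.

```c
#include <stdint.h>

/*
 * Adds nbytes to a 16-bit byte count and returns what the counter
 * would hold afterwards.
 */
uint16_t
queued_after(uint16_t cc, unsigned long nbytes)
{
	return (uint16_t)(cc + nbytes);	/* keeps only the low 16 bits */
}
```

Starting from an empty queue, queued_after(0, 64UL * 1024) is 0, so the counter claims nothing is queued even though 64KB of buffers are allocated.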
_______________________________________________________________________________
sys/uipc_socket.c--sys sun!rusty (Russel Sandberg) 2 Apr 84 +FIX
Send or sendto of zero length udp packet returns with no error but
doesn't send anything.
REPEAT BY:
Write a program to send zero length udp packets.
_______________________________________________________________________________
sys/uipc_socket2.c--sys genji at UCBTOPAZ.CC (Genji Schmeder) 14 Oct 83
large network buffer causing "sbflush 2" panic
_______________________________________________________________________________
sys/uipc_syscalls.c--sys Dave Rosenthal 6 Jul 84 +FIX
The value returned by a successful socketpair() call is the same
as the value in sv[1]. The manual says it should be zero. A
trivial bug.
REPEAT BY:
#include <stdio.h>
#include <sys/types.h>
#include <sys/socket.h>
#include <sys/file.h>
main(argc,argv)
int argc;
char **argv;
{
int sv[2],
res;
printf("socketpair() returns %d\n",
socketpair (AF_UNIX, SOCK_DGRAM, 0, sv));
exit (0);
}
_______________________________________________________________________________
sys/uipc_usrreq.c--sys ralph (Ralph Campbell) 12 Sep 83 +FIX
If you pass more than one file descriptor in a message it
won't work right.
FIX:
Apply following diff to uipc_usrreq.c/unp_externalize()
------- uipc_usrreq.c -------
473c473
< *(int *)rp = f;
---
> *(int *)rp++ = f;
_______________________________________________________________________________
sys/uipc_usrreq.c--sys watmath!arwhite (Alex White) 23 Jan 84 +FIX
Receiving data with MSG_OOB set causes panic in the unix domain.
soreceive() calls pr_usrreq with a newly allocated mbuf,
but the code for PRU_RCVOOB is non-existent, hence it always
frees it, when it returns to soreceive that tries to free it
again and panics.
REPEAT BY:
Just do an recv with the flag MSG_OOB set in the unix domain.
_______________________________________________________________________________
sys/uipc_usrreq.c--sys watmath!arwhite (Alex White) 20 Feb 84 +FIX
Accept -> soaccept -> uipc_usrreq(PRU_ACCEPT) -> bcopy
Bcopy dies as unp->unp_remaddr == 0
Why? Because the connect which this accept refers to,
connect -> soconnect -> uipc_usrreq(PRU_CONNECT) -> unp_connect
-> unp_connect2 -> m_copy; m_copy has run out of mbufs and returns
zero into unp->unp_remaddr.
REPEAT BY:
Ya gotta be kidding, it was after 18,000 requests for memory denied
that we got this one. And there seem to be soooo many bugs that
occur if you run out, it isn't funny; you'll never get this one
a second time! you'll get hit by one of the others.
However, for anybody that wants to try, I enclose changes to
kern_exit.c so that when you run out of mbufs you won't panic the
next time a process exits...
We keep on running out of mbufs - generally ~600 mbufs allocated
to socket structures, ~600 allocated to protocol control blocks,
and ~100 to socket names and addresses. (However, we've also
had similar crashes without the socket name and address mbufs).
We have hundreds of students running a 5-process game communicating
via pipes OR sockets in the unix domain. No, it doesn't seem
to be legitimate running out because of too many pipes.
There don't seem to be enough sitting around after the crash.
I've looked at most students' programmes, and haven't found any
yet which seems to cause any problems.
_______________________________________________________________________________
sys/uipc_usrreq.c--sys spgggm at ucbopal.CC (Greg Minshall) 31 Jan 84
If you open a socket in the Unix domain, using Datagrams, you
should be able to do a connect(), and then just standard writes
(or send()s, or whatever). Instead, after the connect(), a write
gives you an EDESTADDRREQ. This is because connect()
(actually, unp_connect2() in uipc_usrreq.c) never actually sets
SS_ISCONNECTED.
REPEAT BY:
Here are two programs, main()+hintSet and main2()+hintSend
that demonstrate the problem. (also included is makefile)
###main.c
#include <sys/time.h>
#include <signal.h>
main()
{
int hintSet(), hintClear();
int nfds, hintNo;
long readfs;
struct timeval timeout;
timeout.tv_usec = 0; /* micro seconds */
timeout.tv_sec = 10; /* seconds */
hintNo = hintSet("hint");
for (; 1 ;) {
readfs = 1<<hintNo;
nfds = select(hintNo+1, &readfs, (long *) 0,
(long *) 0, &timeout);
if (nfds > 0) {
if ((readfs & (1<<hintNo)) != 0)
hintClear(hintNo);
printf("hinted\n");
} else if (nfds == 0)
printf("timed out\n");
else {
perror("select");
exit(1);
}
}
return(0);
}
###hintSet.c
#include <fcntl.h>
#include <errno.h>
#include <sys/types.h>
#include <sys/ioctl.h>
#include <sys/socket.h>
#include <sys/un.h>
extern int errno;
/* hintSet - make room for someone to send us a hint
path is a unix path name to be used as the address of the
hint.
we return an int (actually, a file descriptor), which should
then be used in a select(2). the following code shows usage...
#include <sys/time.h>
#include <signal.h>
main()
{
int hintSet(), hintClear();
int nfds, hintNo;
long readfs;
struct timeval timeout;
timeout.tv_usec = 0; micro seconds
timeout.tv_sec = 10; seconds
hintNo = hintSet("hint");
for (; 1 ;) {
readfs = 1<<hintNo;
nfds = select(hintNo+1, &readfs, (long *) 0,
(long *) 0, &timeout);
if (nfds > 0) {
if ((readfs & (1<<hintNo)) != 0)
hintClear(hintNo);
printf("hinted\n");
} else if (nfds == 0)
printf("timed out\n");
else {
perror("select");
exit(1);
}
}
}
return(0);
}
Note that after the select, "hintClear" MUST be called, else
any future selects become no-ops.
We take error exits for strange events.
If the file name is already in use as a socket, we attempt to
unlink it (an unfortunate occurrence).
If hintSet is called TWICE (even from two separate users) with
the same pathname, the second caller will pick up all future
hints from "hintSend".
*/
int
hintSet(path)
char *path; /* path name for hint */
{
int s, length, diddle, savedError;
long mypid;
struct sockaddr_un foo;
s = socket(AF_UNIX, SOCK_DGRAM, 0);
if (s == -1) {
perror("hintSet: socket");
exit(1);
}
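bzero((char *) &foo, sizeof foo); /* no stack garbage after the path */
foo.sun_family = AF_UNIX;
length = strlen(path);
if (length > sizeof foo.sun_path - 1)
length = sizeof foo.sun_path - 1; /* leave room for the NUL */
strncpy(foo.sun_path, path, length);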
if (bind(s, &foo, (sizeof foo)-1) == -1) {
if ( ((savedError = errno) == EADDRINUSE) &&
(open(path, O_RDONLY) == -1) &&
(errno == EOPNOTSUPP) &&
(unlink(path) != -1) &&
(bind(s, &foo, (sizeof foo)-1) != -1) )
;
else {
errno = savedError;
perror("hintSet: bind");
exit(1);
}
}
/* set non blocking... */
diddle = fcntl(s, F_GETFL, 0);
if (diddle == -1) {
perror("hintSet: fcntl F_GETFL");
exit(1);
}
diddle = fcntl(s, F_SETFL, diddle | FNDELAY);
if (diddle == -1) {
perror("hintSet: fcntl F_SETFL");
exit(1);
}
hintClear(s);
return(s);
}
/* hintClear - clear any hints outstanding on our area... */
int
hintClear(s)
int s;
{
char buffer[1024];
while (((read(s, buffer, 1024)) != -1) ||
(errno != EWOULDBLOCK))
;
}
###main2.c
#include <sys/time.h>
#include <signal.h>
main()
{
int hintSend();
hintSend("hint");
}
###hintSend2.c
#include <fcntl.h>
#include <errno.h>
#include <sys/types.h>
#include <sys/ioctl.h>
#include <sys/socket.h>
#include <sys/un.h>
#define MESSAGE "hint"
extern int errno;
/* hintSend - send a hint to an address in the Unix domain.
the argument 'path' is the Unix pathname waiting for
the hint.
hintSend takes error exits if anything too strange happens.
If "path" doesn't exist, or if no one is connected to it,
we just quietly return to the caller.
*/
hintSend(path)
char *path; /* path name for hint */
{
int s, length;
struct sockaddr_un foo;
s = socket(AF_UNIX, SOCK_DGRAM, 0);
if (s == -1) {
perror("hintSend: socket");
exit(1);
}
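bzero((char *) &foo, sizeof foo); /* no stack garbage after the path */
foo.sun_family = AF_UNIX;
length = strlen(path);
if (length > sizeof foo.sun_path - 1)
length = sizeof foo.sun_path - 1; /* leave room for the NUL */
strncpy(foo.sun_path, path, length);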
if (connect(s, &foo, (sizeof foo)-1) == -1) {
perror("hintSend: connect");
exit(1);
}
if (write(s, MESSAGE, strlen(MESSAGE)) == -1) {
perror("hintSend: write");
exit(1);
}
}
###makefile
CFLAGS = -g
main: main.o hintSet.o
	$(CC) $(CFLAGS) main.o hintSet.o -o main
main2: main2.o hintSend.o
	$(CC) $(CFLAGS) main2.o hintSend.o -o main2
main3: main2.o hintSend2.o
	$(CC) $(CFLAGS) main2.o hintSend2.o -o main3
main.o: main.c
hintSet.o: hintSet.c
main2.o: main2.c
hintSend.o: hintSend.c
hintSend2.o: hintSend2.c
FIX:
in sys/sys/uipc_usrreq.c, routine unp_connect2, in the switch
under "case SOCK_DGRAM", add the line
soisconnected(so);
to get the stuff set right. (untested fix).
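For comparison, here is a compact modern-C sketch of the same experiment, folded into a single process. It assumes a present-day POSIX system with fd_set-style select(2); the names hint_addr and hint_roundtrip are invented for this illustration and are not part of the programs above. On a kernel where connect(2) on a Unix-domain datagram socket leaves the socket properly connected, the write succeeds and select reports the receiving socket readable.

```c
#include <string.h>
#include <unistd.h>
#include <sys/select.h>
#include <sys/socket.h>
#include <sys/time.h>
#include <sys/un.h>

/* Fill in a Unix-domain address; returns -1 if the path does not fit. */
static int hint_addr(struct sockaddr_un *sa, const char *path)
{
    if (strlen(path) >= sizeof sa->sun_path)
        return -1;
    memset(sa, 0, sizeof *sa);
    sa->sun_family = AF_UNIX;
    strcpy(sa->sun_path, path);
    return 0;
}

/*
 * Bind a datagram socket at "path", connect a second datagram socket
 * to it, write one hint, and select on the receiver.
 * Returns 1 if the hint arrived, 0 if select timed out, -1 on error.
 */
int hint_roundtrip(const char *path)
{
    struct sockaddr_un sa;
    struct timeval timeout;
    fd_set readfds;
    int rx, tx, n, result = -1;

    if (hint_addr(&sa, path) == -1)
        return -1;
    unlink(path);                       /* stale socket from an earlier run */
    rx = socket(AF_UNIX, SOCK_DGRAM, 0);
    tx = socket(AF_UNIX, SOCK_DGRAM, 0);
    if (rx == -1 || tx == -1)
        goto done;
    if (bind(rx, (struct sockaddr *) &sa, sizeof sa) == -1)
        goto done;
    if (connect(tx, (struct sockaddr *) &sa, sizeof sa) == -1)
        goto done;
    if (write(tx, "hint", 4) != 4)      /* needs a connected socket */
        goto done;
    FD_ZERO(&readfds);
    FD_SET(rx, &readfds);
    timeout.tv_sec = 2;
    timeout.tv_usec = 0;
    n = select(rx + 1, &readfds, (fd_set *) 0, (fd_set *) 0, &timeout);
    if (n >= 0)
        result = (n > 0 && FD_ISSET(rx, &readfds)) ? 1 : 0;
done:
    if (rx != -1)
        close(rx);
    if (tx != -1)
        close(tx);
    unlink(path);
    return result;
}
```

Calling hint_roundtrip("/tmp/hint") from a small main() and printing the result is enough to try it; a result other than 1 points at the connect/write path rather than at select.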
_______________________________________________________________________________
sys/various.c--sys kre at ucbmonet (Robert Elz) 21 Oct 83 +FIX
This just might, I say might, 'cause I'm not sure yet, be that
elusive inode bug we've been having. It certainly is an inode bug,
but I've yet to be convinced that all the problems we've been having
ultimately descend from this (which is either 1 or 2 bugs, depending
on how you look at it).
The scenario is something like this ...
Process is closing a char device, most probably a terminal,
close calls closef() and when it returns, sets u_ofile to NULL.
closef() calls ino_close() which does an iput() on the inode, then
calls *(devsw[].d_close)(). Now imagine that the close routine
is going to need to wait for output queues to drain, or whatever,
so it sleeps, and while it's asleep, a signal occurs. The longjmp(u_qsave)
exits back to syscall() which (let's say for simplicity) calls psig()
and then exit(). exit(), noticing a u_ofile that is != NULL (close()
doesn't set it to NULL till closef() returns, which it's not going to do),
then calls closef() again (nb: exit sets u_ofile to NULL before closef()).
closef() again calls ino_close(), which does another iput(), causing
the inode reference count to end up at -1 (and most probably
doing many other nasty things).
To fix that, I have just moved the u.u_ofile[..] = NULL; to
before the close (I did the same thing with u_pofile[] = 0 (*pf = 0)
but I am less sure that that is important).
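The ordering problem can be modelled in user space. What follows is only a toy sketch, not kernel code: every toy_* name is invented for the illustration, setjmp/longjmp stands in for the signal unwinding through u_qsave, and a plain int stands in for the inode reference count.

```c
#include <setjmp.h>
#include <stddef.h>

struct toy_inode { int refs; };
struct toy_file { struct toy_inode *ip; };

static jmp_buf toy_qsave;           /* stands in for u_qsave */
static struct toy_inode toy_ino;
static struct toy_file toy_fil;
static struct toy_file *toy_slot;   /* stands in for one u_ofile[] entry */

/* Drop the inode reference, then "sleep" in the device close. */
static void toy_closef(struct toy_file *fp, int interrupted)
{
    fp->ip->refs--;                 /* the iput() */
    if (interrupted)
        longjmp(toy_qsave, 1);      /* signal arrives; never returns */
}

/* exit(): close whatever is still recorded as open. */
static void toy_exit(void)
{
    if (toy_slot != NULL) {
        struct toy_file *fp = toy_slot;
        toy_slot = NULL;            /* exit clears the slot first */
        toy_closef(fp, 0);
    }
}

/*
 * Run one interrupted close followed by exit and return the final
 * reference count.  clear_first selects the order used by close():
 * 0 = clear the slot after closef() returns, 1 = clear it before.
 */
int toy_close_then_exit(int clear_first)
{
    toy_ino.refs = 1;
    toy_fil.ip = &toy_ino;
    toy_slot = &toy_fil;
    if (setjmp(toy_qsave) == 0) {
        if (clear_first) {
            toy_slot = NULL;
            toy_closef(&toy_fil, 1);
        } else {
            toy_closef(&toy_fil, 1);
            toy_slot = NULL;        /* not reached: closef() longjmps */
        }
    }
    toy_exit();
    return toy_ino.refs;
}
```

With clear_first set to 0 the count ends at -1, because both close and exit drop the reference; with clear_first set to 1 it ends at 0, which is the effect of moving the assignment ahead of the close.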
While I was looking at that, I saw a related, and somewhat messier bug.
The scenario this time starts out the same way, down as far as the
iput() in ino_close(). Just after that, in a line marked with XXX,
f_count is set to 0. Then the routine belts off to the devsw[].d_close
routine (or maybe just doing the itrunc(), ifree(), dqrele() sequence).
Anything that has a sleep() in it. We don't need an interruptible
sleep this time. While process is sleeping, some other proc does
a falloc(), and finds our file slot, that we've generously given away
before we're really finished with it. It then grabs it, carefully
setting f_count to 1 (or whatever) to mark the file in use. Then
our process finishes its sleep, and returns from ino_close to
closef() which then sets f_count to zero. A little later, the second
process does a close, which noticing that f_count < 1, does all the
right things, and no problems ensue. But, if in the meantime, some
third process has found this file slot, with its f_count == 0, and
grabbed it again, there are now 2 refs, & a ref count of 1. As each
of them close it, they both do an iput(), making the ref count go to -1,
and generally stuffing things. My fix to this is a real kludge, and
is explained below. It should be done properly.
REPEAT BY:
Make the load average very high (say 50 to 60 or more)
with large numbers of processes opening and closing files, etc.
(You need that because of the way that falloc() scans the file table).
Run the system like that continuously for a day or so. Do a "pstat -i"
and look for something with a ref count of 255 (which is really 65535
or -1, pstat conveniently masks the i_count with 0377, but that is
another bug entirely).
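The 255 follows directly from the mask. A one-line sketch, assuming a 16-bit count held in a short; masked_count is a made-up name, not anything from pstat:

```c
/* What a display that masks the count with 0377 shows for a given value. */
int masked_count(short i_count)
{
    return i_count & 0377;          /* keeps only the low 8 bits */
}
```

A count of -1 has all bits set, so only the low eight survive the mask and the display shows 255; small positive counts pass through unchanged.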
_______________________________________________________________________________
sys/vm_mem.c--sys rws at mit-bold (Robert W. Scheifler) 7 Nov 83 +FIX
On large partitions (> 2^19 blocks), the block number gets
sign extended, causing panic: munhash. The Berkeley code
should work, but there appears to be a bug in the C compiler.
REPEAT BY:
Try to use lots of a large partition.
FIX:
In /sys/sys/vm_mem.c in memall() the code
swapdev : mount[c->c_mdev].m_dev, (daddr_t)(u_long)c->c_blkno
should be changed to
swapdev : mount[c->c_mdev].m_dev, c->c_blkno
and in /sys/vax/vm_machdep.c in chgprot() the code
munhash(mount[c->c_mdev].m_dev, (daddr_t)(u_long)c->c_blkno);
should be changed to
munhash(mount[c->c_mdev].m_dev, c->c_blkno);
because the C compiler apparently incorrectly folds the (daddr_t) and
(u_long) together and sign extends anyway. Simply taking out the
(daddr_t)(u_long) works, although lint will probably complain about it.
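The arithmetic behind the symptom can be sketched as follows. This assumes, from the 2^19 threshold in the report, that the block number is held in a 20-bit field; the function names are invented for illustration and this is not the kernel or compiler code.

```c
#include <stdint.h>

/* Extract a 20-bit field and sign extend it (the faulty behaviour). */
long extract_signed20(uint32_t raw)
{
    raw &= 0xFFFFF;
    return (raw & 0x80000) ? (long) raw - 0x100000L : (long) raw;
}

/* Extract the same 20-bit field and zero extend it (what is wanted). */
long extract_unsigned20(uint32_t raw)
{
    return (long) (raw & 0xFFFFF);
}
```

Any block number with bit 19 set comes out negative from the first function and unchanged from the second, which is consistent with trouble appearing only on partitions of more than 2^19 blocks.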
_______________________________________________________________________________
sys/vm_mem.c(?)--sys bdh at cit-750 (Brian D. Horn) 29 Jun 84 +FIX
When using a debugger (anything that uses ptrace(2); this is known
to occur with dbx and sdb) it is possible to crash the system with
a "panic: munhash". The traceback suggests that the problem
originates when a ptrace(4,...) call is made (modify the child's text
segment). We are running on a VAX-11/750 with 2 Mbytes of real memory.
REPEAT BY:
Seems to be non-deterministic in nature. Best guess as to how to
repeat this is to debug (using dbx or sdb) a "large" (1M or bigger)
program, set a breakpoint or two, and start it running.
No guarantee that this will cause the panic, however.
_______________________________________________________________________________
sys/vm_swp.c--sys lwa at mit-mrclean (Larry Allen) 30 Oct 84 +FIX
A readv or writev call to or from a raw disk only does the operation
specified by the first element of the io vector.
REPEAT BY:
Try to perform a readv from a raw disk, specifying a two-element io
vector. The number of bytes read will equal the number of bytes
specified in the first element of the io vector.
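For reference, this is the behaviour one expects from a two-element vector. The sketch below uses a scratch file made with mkstemp rather than a raw disk (an assumption made so it can be run anywhere), and readv_two_elements is an invented name.

```c
#include <stdlib.h>
#include <unistd.h>
#include <sys/types.h>
#include <sys/uio.h>

/*
 * Write 8 bytes to a scratch file, then read them back through a
 * two-element io vector of 4 bytes each.  Returns the count from
 * readv(), or -1 on error.  Both buffers must hold at least 4 bytes.
 */
long readv_two_elements(char *first, char *second)
{
    char name[] = "/tmp/readv_demo_XXXXXX";
    struct iovec iov[2];
    long n = -1;
    int fd = mkstemp(name);

    if (fd == -1)
        return -1;
    if (write(fd, "abcdwxyz", 8) == 8 &&
        lseek(fd, (off_t) 0, SEEK_SET) == 0) {
        iov[0].iov_base = first;
        iov[0].iov_len = 4;
        iov[1].iov_base = second;
        iov[1].iov_len = 4;
        n = (long) readv(fd, iov, 2);   /* should cover both elements */
    }
    close(fd);
    unlink(name);
    return n;
}
```

A return value equal to the length of the first element alone, with the second buffer untouched, is the symptom described above.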
_______________________________________________________________________________
sys/vmpage.c--sys allegra!princeton!astro 6 Jun 83 +FIX
<<FIX THIS BUG IF YOU DO LARGE TRANSFERS ON RAW DMA DEVICES!!!>>
There is a paging routine bug in 4.1 BSD that affects the locking of memory
for DMA on raw I/O devices. This bug can cause a process to hang at priority
-24 (PSWP+1). The problem occurs when an attempt is made to lock a page that
is in the process of being swapped out. The call to mlock in pagein() will
block if this is the case. However, anything can happen while it is blocked.
In particular, some other process can have grabbed that page. pagein() really
should restart the processing of that page fault from the beginning.
_______________________________________________________________________________
sys/vmsched.c--sys Spencer W. Thomas <thomas at utah-cs> 4 Aug 83 +FIX
The t_rm and t_vm fields in the vmtotal structure are usually too big
(more than the total real memory on our system in the case of t_rm).
REPEAT BY:
Use adb to examine the field.
_______________________________________________________________________________
sys_inode.c--sys dagobah!efo (Eben Ostby) 17 Nov 83
Processes waiting for a flock lock will hang if someone waiting
for the lock is killed. The ILWAIT bit never gets cleared.
REPEAT BY:
You'd have to set up a messy sequence of people waiting for
shared and exclusive locks, then kill the right guy.
FIX:
Either ILWAIT has to be a count rather than a bit (which could be
decremented when the guy dies) or everyone waiting for any kind
of flock would have to be woken up when the guy dies.
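The first alternative can be sketched in a few lines. This is a toy model only, with invented toy_* names and no real sleeping or waking; it just shows the bookkeeping a count allows and a single bit does not.

```c
struct toy_lock {
    int held;       /* nonzero while someone owns the lock */
    int nwaiting;   /* number of waiters, instead of one ILWAIT-style bit */
};

/* A process starts waiting for the lock. */
void toy_wait_begin(struct toy_lock *lk)
{
    lk->nwaiting++;
}

/* A waiter stops waiting: it got the lock, or it was killed. */
void toy_wait_end(struct toy_lock *lk)
{
    if (lk->nwaiting > 0)
        lk->nwaiting--;
}

/* Release the lock; returns nonzero if a wakeup should be issued. */
int toy_release(struct toy_lock *lk)
{
    lk->held = 0;
    return lk->nwaiting > 0;
}
```

With a count, a killed waiter can remove exactly its own contribution, so the remaining waiters are still accounted for and the state returns to "nobody waiting" once the last one leaves.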
_______________________________________________________________________________
syslog.c--lib Christopher A Kent <cak at Purdue.ARPA> 11 Jan 84 +FIX
Attempts to use syslog(3) fail in programs that perform other
network functions. Output to the log file is garbled, or correct
network system call invocations fail for no apparent reason.
The syslog supplied with sendmail does, however, function
correctly. Inspection of the two versions shows that the C library
version neglects to bind the datagram socket to any address.
Recompiling the supplied libc source and reinstalling
/lib/libc.a causes the problem to go away(!).
REPEAT BY:
Compile and run the following, both with and without an
argument, and inspect the syslog output.
/*
* demonstrate broken syslog
*/
#include <sys/types.h>
#include <sys/socket.h>
#include <netinet/in.h>
#include <netdb.h>
#include <syslog.h>
struct sockaddr_in sin = { AF_INET }; /* socket address */
main(argc, argv)
char **argv;
{
int s;
if(fork())
exit(0);
if(argc > 1)
openlog("stest", LOG_PID);
syslog(LOG_INFO, "starting up");
sin.sin_addr.s_addr = INADDR_ANY;
sin.sin_port = htons(77);
s = socket(AF_INET, SOCK_STREAM, 0);
if(s < 0){
perror("socket");
exit(1);
}
if(bind(s, &sin, sizeof(sin)) < 0){
perror("bind");
exit(1);
}
syslog(LOG_INFO, "message1");
syslog(LOG_INFO, "message2");
}
_______________________________________________________________________________
GENERAL INFORMATION ON THE 4.2 BUGLIST FROM MT XINU
_________________________________________________________________
--IMPORTANT DISCLAIMERS--
Material in this announcement and the accompanying reports
has been edited and organized by MT XINU as a service to the
UNIX community on a non-profit, non-commercial basis. MT
XINU MAKES NO WARRANTY, EXPRESSED OR IMPLIED, ABOUT THE
ACCURACY, COMPLETENESS, OR FITNESS FOR USE FOR ANY PURPOSE
OF ANY MATERIAL INCLUDED IN THESE REPORTS.
MT XINU welcomes comments in writing about the contents of
these reports via uucp or US mail. MT XINU cannot, however,
accept telephone calls or enter into telephone conversations
about this material.
_________________________________________________________________
Legal difficulties which have delayed the distribution of
4.2bsd buglist summaries by MT XINU have been resolved and
three versions of the buglist are now available.
The current buglist has been derived from reports submitted
to 4bsd-bugs at BERKELEY (not from reports submitted only to
net.bugs.4bsd, for example). Reports are integrated into
the buglist as they are received, so that any distributions
are current to within a week or so.
Buglists now being distributed are essentially "raw". No
judgment has been passed as to whether the submitted bug is
real or not or whether it has been fixed. Only minimal editing
has been done to produce a manageable list. Reports
which are complaints (rather than bug reports) have been
eliminated; obscenities and content-free flames have been
eliminated; and duplicates have been combined. The resulting
collection contains over 500 bugs.
Three versions of the buglist are now ready for distribution:
2-Liners:
Two lines per bug, including a concise description, the
affected module, and the submitter. Approximately 55K
bytes, it is being distributed to net.sources concurrently
with this announcement.
All-but-Source:
All material, except that all but the most innocuous of
source material has been removed to meet AT&T license
restrictions. Nearly a megabyte, this will be
distributed to net.sources in several 50K byte pieces
later this week.
A paper listing or mag tape is also available, see
below.
Please note that local usenet size restrictions may
prevent large files from being received and/or
retransmitted. MT XINU will not dump this material on
the net a second time; if your site has not received
material of interest to you within a reasonable time,
please send for a paper or tape copy.
All-with-Source (FOR SOURCE LICENSEES ONLY):
4.2 licensees who also have a suitable AT&T source
license can obtain a tape containing all the material,
including proposed source fixes where such were submitted.
Once again, MT XINU has not evaluated, tested or passed
judgment on proposed fixes; all we have done is organize
the collection and eliminate obvious irrelevancies
and duplications.
A free paper copy of the All-but-Source list can be obtained
by sending mail to:
MT XINU
739 Allston Way
Berkeley CA 94710
attn: buglist
or electronic mail to:
ucbvax!mtxinu!buglist
(Be sure to include your US mail address!)
For a tape, send a check for $110 or a purchase order for
$150 to cover MT XINU's costs to the address given above
(California orders add sales tax). For the All-with-Source
list, mail us a request for the details of license verification
at either of the above addresses.