sys/kern_discrip.c BUG fixes 4.[23]BSD vaxen and non-vaxens +FIX
ryu at kwmux.UUCP
ryu at kwmux.UUCP
Mon Jul 14 09:43:07 AEST 1986
4.2BSD, 4.3BSD, vaxen and non-vaxen fixes in +FIX
sys/sys/kern_descrip.c
sys/sys/sys_inode.c
sys/sys/sys_socket.c
by Ryu Koriyama, Waseda Univ. Sci. & Eng.
It is possible to close a file descriptor more than once,
or otherwise use it after it has been closed, and
possibly after another process has reallocated it or
reallocated the in-core inode it points to.
This has been first reported by
Jeff Mogul <mogul at navajo> 13 Dec 83
and temporaly fix made by
Brian Thomson decvax!uthub!tompson 13 Feb 84
REPEAT BY:
First install Jeff Mogul's firewall panic in
closef() [ref. unix-wizards <14552 at sri-arpa.UUCP> Dec 12 83]
you may have already seen this. If not, then put it in first
Like so:
kern_descrip.c, closef(), before
(*fp->f_ops->fo_close)(fp);
insert
if(fp->f_count < 1)
panic("closef: f_count < 1");
If you run the system heavly, you will soon meet a panic.
FIX:
What is happening is,
1. Usr calls close() via syscall
2. In close() it call closef()
3. Closef() calls (*fp->f_ops->fo_close)() which
may be ether ino_close() or soo_close()
4.1. In ino_close(), after iput(), ino_close() sets
fp->f_count to 0, which means (struct file *)fp is
no longer used and is now an open slot.
This is because we no longer need fp.
Why we can say this? We are still passing fp to
[cb]devsw[major(dev)].d_close(), BUT in ino_close(),
a little after fp->f_count is set to 0, we are using
fp as a variable to check that another inode for the
same device isn't active.
So what we are passing to [cb]devsw[].d_close() is
garbage which will have fileNFILE in it (see the code).
4.1.2. Now suppose we block in [cb]devsw[].d_close().
4.1.3.
Fp is now a free slot.
Someone, via falloc, gets this file structure.
This is because falloc is told that a file structure
which f_count field is 0 is a free slot.
So now someone has this file structure.
4.1.4. Now the blocking [cb]devsw[].d_close() in 4.1.2 above
has ended.
We will now return to closef().
4.1.5. In closef(), as soon as we have returned, closef() will
set the f_count field to 0 since he thinks that
the file structure is now longer needed. BOGUS!!!
Someone is using that file structure!
Sooner or later someone comes around and gets that
file structure via falloc(). HAVOC!!!
Now we have a file structure which is shared by
2 people. We will get a mixed file. DOOMED!!!
4.2. Now then, in soo_close() we only use the f_data
field in the file structure. So we can't give
fp away.
Soo_close() calles soclose() to do the works.
4.2.1. Soon after soclose() returns, soo_close()
will set fp->f_data to 0.
WHY? Why does he have to set fp->f_data to 0?
What happened to fp->f_close?
Why doesn't he free the file structure which
we no longer need?
4.2.2. Well as soon as we return to closef(), closef() will
set the f_count to 0 and free the file structure slot.
Here we are happly ever after.
Now we know the problem.
The problem is WHO is responsible to free the file structure slot.
If closef() is responsible, ino_close() should not free it.
If fileops.fo_close() is responsible, soo_close() should do free it.
Looking at the code, in ino_close(), there is the next line
fp->f_count = 0; /* XXX Should catch */
So, the author is asking fileops.fo_close() to be responsible in
freeing the file structure slot.
Here comes the bug.
Now when we moved to 4.2BSD, we have got the socket.
We got the soo_close() routine.
But the soo_close() routine does not free the file structure slot.
So we had to do it someware else.
So the code went into closef(). THIS IS THE BUG!!!
The FIX at last!
Since the fileops.fo_close() routine should be responsible for
freeing the file structure slot,
1. closef() should not set the f_count field to 0
2. all of the fileops.fo_close() routines should
set the f_count field to 0 as soon as they have
no need of the file structure slot.
So the fix will be,
1. take out the fp->f_count=0; code out of closef();
2. put in the fp->f_count=0; code into the last line
of soo_close(9 routine.
This way we will be freeing the file structure slot as soon as
we are finished with it, and we can have less chance not
able to be getting a file structure slot.
FIX:
file: sys/sys/kern_descrip.c
routine: closef()
modify:
}
(*fp->f_ops->fo_close)(fp);
fp->f_count = 0;
}
to:
}
(*fp->f_ops->fo_close)(fp);
}
file: sys/sys_socket.c
routine: soo_close()
modify:
error = soclose((struct socket *0fp->f_data);
fp->f_data = 0;
return (error);
}
to:
error = soclose((struct socket *0fp->f_data);
fp->f_data = 0; /* XXX Just in case... */
fp->f_count = 0;
return (error);
}
END OF FIX
Ryu Koriyama
Waseda Univ. Sci. & Eng.
To contact:
Foretune Co.
R&D Div.
Towa Building Room 55
3-5-1 Suido
Bunkyo-ku, Tokyo, Japan
ZIP 112
Phone 03-811-7861 (day and night)
Or on the net:
via JUNET
ryu at psycho.juice
!titcca!kwmux!kiwi!psycho!ryu
P.S. This fix was posted via this channel since the vax I am using is
not connected to any net.
You can contact me via this address.
P.P.S. This fix is still needed in 4.3BSD and other codes than vaxen ones.
We have this code running on heavly loaded vaxes but have no
problem. At least this is NOT an ENBUG.
P.P.P.S. Special thanks to
Mr. Yoshihiro Magara yoshih-m at ascii.junet
Mr. Masahiko Kiso masahi-k at ascii.junet
for the support and testing the debug.
More information about the Comp.unix.wizards
mailing list