Cron bug on the SS-1 under 4.0.3c [summary, long]
Dieter Muller
dworkin at solbourne.com
Sun Feb 11 07:33:53 AEST 1990
In article <4887 at brazos.Rice.edu> ajudge at maths.tcd.ie writes:
>X-Sun-Spots-Digest: Volume 9, Issue 36, message 12
>
>Here is a summary of the replies I have received about a cron bug which
>causes some cron jobs to be run twice.
>
>The bug is acknowledged by Sun and a patch is available, but even after
>the patch the problem still recurs.
The `bug' is, technically, a kernel problem. What happens is that cron
does a sleep for N seconds, but wakes up after N-1 seconds. It starts the
next job (the one we're a second early for), and then performs a
reschedule. Since the time for the `next' job (the one we just started)
hasn't arrived yet, cron puts it back at the front of the list, sleeps 1
second, and poof! You just ran the job twice....
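To make the sequence concrete, here is a toy model of it in Python. It is only a sketch of the behaviour described above, not Sun's cron source: the function name, the integer clock, and the assumption that only the first sleep comes up short are all mine.

```python
def simulate_naive(start, job_time, early_by=1):
    """Toy model of a scheduler that trusts its sleep.

    Returns the clock times at which the job gets launched.
    Assumption (hypothetical): only the first sleep returns early.
    """
    now = start
    launches = []
    shorten_next = True
    while True:
        requested = job_time - now           # N seconds until the job is due
        actual = requested - early_by if shorten_next else requested
        shorten_next = False
        now += actual                        # the "sleep"
        launches.append(now)                 # naive: assume the job is due
        if now >= job_time:                  # reschedule: slot has passed
            break
        # Otherwise the job's time is still in the future, so it stays
        # at the front of the list and gets launched again next time.
    return launches
```

With a job due at second 60 and a sleep that returns one second early, `simulate_naive(0, 60)` launches the job at 59 and again at 60. With `early_by=0` it launches once.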
This is a side-effect of the mechanism for user crontabs. Specifically,
while cron is `sleeping', it's really waiting in a select for messages on
a named pipe. If a message comes in (user X's crontab changed, etc), it
handles that and goes back into the select. If the select times out, cron
assumes the full interval has elapsed, and that no other external event
occurred to cut the wait short.
A simple way to demonstrate the problem is to send a SIGALRM to the cron
process. And, as mentioned above, the official Sun fixes don't fix it.
The correct fix is for cron to check the time after a select timeout. If
the desired time hasn't yet arrived, reset the timer and go back to the
select (basically, act like a null message came in on the pipe). I put
this fix into Solbourne's version of cron, and we haven't heard of the
problem recurring since then.
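For anyone who wants the shape of that fix, here is a minimal sketch in Python rather than the C that cron is written in. It is not the Solbourne code. The function name and the `handle_message`, `clock`, and `select_fn` parameters are hypothetical, and the last two exist only so the loop can be exercised with a fake clock.

```python
import select
import time


def wait_until(deadline, fd, handle_message,
               clock=time.time, select_fn=select.select):
    """Block until clock() >= deadline, servicing messages on fd meanwhile.

    A select timeout is never taken as proof that the deadline has
    arrived: the clock is re-read every time around the loop.
    """
    while True:
        remaining = deadline - clock()
        if remaining <= 0:
            return                       # the job really is due now
        readable, _, _ = select_fn([fd], [], [], remaining)
        if readable:
            handle_message(fd)           # e.g. a crontab changed
        # On a timeout, fall through and re-check the time, exactly as
        # if a null message had come in on the pipe.
```

The point of the design is that an early wakeup costs one more trip through the select with a short timeout, instead of a second run of the job.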
Dworkin
boulder!stan!dworkin dworkin%stan at boulder.colorado.edu dworkin at solbourne.com