Sort bug causes data loss
Wm E Davidsen Jr
davidsen at crdos1.crd.ge.COM
Tue Sep 18 01:36:12 AEST 1990
I have discovered what appears to be a serious bug in the sort
routine used in several SysV variants, including Stellar. Since it
causes silent loss of data, I am cross-posting a bit more than I usually
do.
The problem occurs when the options -n (numeric) and -u (discard
duplicates) are used together to sort data whose first key field is a
fixed-width number. The result is a single line of output, regardless
of the input data. I found this by losing 15 months of data (yes, it was
backed up). Since sort is often used in shell scripts run from cron to
do system things, this problem might not be noticed right away.
I have generated the following shell script to test for the problem.
#!/bin/sh
# this tests sort for bugs in -nu option
# as found in SCO Xenix and UNIX
echo ""
echo "Start test of sort error"
echo ""
sort -nu <<XX >x$$.tmp
1: a
3: b
2: c
1: a
10: x
XX
sort -n <<XX | uniq >y$$.tmp
1: a
3: b
2: c
1: a
10: x
XX
echo "Starting check"
# the five input lines contain one duplicate, so 4 lines are expected
if [ `wc -l <x$$.tmp` -ne 4 ]
then
    echo "Error in sort with -nu option."
    echo "Output was:"; cat x$$.tmp
    echo "Should be:"; cat y$$.tmp
else
    echo "Output appears okay unless diff reported below:"
    diff x$$.tmp y$$.tmp
fi
rm [xy]$$.tmp
echo "Test ends"
# ================================================================
Of course someone may tell me it's supposed to work that way, and that
the BSD version is broken.
The suggested workaround is to pipe the output of sort through uniq
rather than use the -u option. The form "sort -n +0nu" also worked. If I
can find disk space to load the source tape, I'll check it further.
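
A minimal sketch of the uniq workaround, wrapped in a shell function so
that scripts can call it in place of "sort -nu". The name sort_nu_safe is
hypothetical and used only for illustration. One difference to keep in
mind: uniq drops only adjacent lines that are identical in full, while -u
treats lines with equal sort keys as duplicates, so the two can differ on
input where lines share a number but not the rest of the line.

```shell
#!/bin/sh
# Workaround sketch: sort numerically, then let uniq remove duplicate
# lines instead of relying on the -u option of sort.
# sort_nu_safe is a hypothetical helper name, not a standard command.
sort_nu_safe() {
    sort -n "$@" | uniq
}

# Same sample data as the test script above: five lines, one duplicate.
printf '%s\n' '1: a' '3: b' '2: c' '1: a' '10: x' | sort_nu_safe
```

On the sample data this should print four lines, in the order 1, 2, 3, 10.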
--
bill davidsen (davidsen at crdos1.crd.GE.COM -or- uunet!crdgw1!crdos1!davidsen)
VMS is a text-only adventure game. If you win you can use unix.