sort question

Andrew Dingwall andrew at root.co.uk
Tue May 23 20:34:24 AEST 1989


In article <810054 at hpsemc.HP.COM> gph at hpsemc.HP.COM (Paul Houtz) writes:
>>In article <810050 at hpsemc.HP.COM> gph at hpsemc.HP.COM (Paul Houtz) writes:
>>>Right.  There is no way to do a true column sort using this utility as you
>>>can on IBM or MPE systems and here is why:   Sort requires a FIELD DELIMITER
>>>character.   That means that there is SOME character that will never be 
>>>sorted.
>>
>>But (as you yourself point out) you can set the field delimiter to
>>newline, effectively making it vanish, then use the 0.n format to
>>specify column n.
>

No. In a previous job, I made extensive use of sort +0.x -0.y ... to do column
sorts on newline-delimited records without needing to specify a field delimiter.
Admittedly, that was on UNIX V7 (and a long time ago!), but I have tried the
same on a System V system and it still seems to work.
The only requirements I found to make the scheme work are a newline at the
end of each record and no nulls or non-ASCII characters in the record body.
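
For readers without an old-style sort to hand, here is a minimal Python
sketch of what I understand sort +0.x -0.y to do: compare records on the
characters from offset x up to (but not including) offset y. The function
name column_sort and the sample records are invented for illustration, and
the sketch does not reproduce sort's fallback comparison of whole lines
when keys are equal.

```python
def column_sort(lines, start, end):
    """Sort newline-terminated records on the characters in columns [start, end).

    Roughly mimics `sort +0.start -0.end`: skip `start` characters from the
    beginning of the line and stop the key at character offset `end`.
    """
    # Strip the trailing newline so it can never become part of the key.
    return sorted(lines, key=lambda rec: rec.rstrip("\n")[start:end])


# Made-up fixed-column records: 4-digit id, 8-character name, city.
records = [
    "0003smith   london\n",
    "0001jones   paris\n",
    "0002adams   berlin\n",
]

# Sort on the name column (offsets 4..11), ignoring the id and the city.
by_name = column_sort(records, 4, 12)
for rec in by_name:
    print(rec, end="")
```

Python's sorted() is stable, so records with equal keys keep their input
order.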

>   The problem with having to set the field delimiter is that you have to
>decide what to set it TO.  Now, if you are reading from a file that has 
>binary data in it, then it is possible that a newline character could appear
>in the binary data.  This seems to me like it might be a problem.   Sort
>would think it found the end of line.
>
>   I can write a sort program that will do this column sorting for me, but
>what a pain.  It's too bad there isn't one for unix, like there is for 
>all the other major operating systems.
>
>   (On the other hand, I'll bet that some third party out there has already
>written a true column sort for unix.  I just haven't found it yet.  Any
>takers?)

Yes, we have written a binary sort called binsort.
It works like the unix sort except that it works on fixed-length binary records
and sorts by column position.
It understands all the usual unix data types (char, int, float, double etc),
together with data types more usually found in the commercial world
(COBOL COMP (natural byte order signed binary) and COMP-3 (BCD)).
The command-line interface is similar to the unix sort (+m.n -m.n etc)
and appropriate options are supported (-d -f -u -c -r -m -o -T).
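
To illustrate the general idea (this is not binsort's code or interface),
here is a rough Python sketch of sorting fixed-length binary records by
column position. The 8-byte record layout, the function names and the
big-endian byte order are all assumptions made for the example. Note that
the first record contains the byte 0x0A twice, which is exactly the case
where a newline-delimited sort would split the record in the wrong place.

```python
import struct


def unpack_comp3(raw):
    """Decode a packed-decimal (COMP-3 style) field.

    Two decimal digits per byte, except the last byte, whose low nibble is
    the sign (0xD taken as negative, anything else as positive here).
    """
    value = 0
    for byte in raw[:-1]:
        value = value * 100 + (byte >> 4) * 10 + (byte & 0x0F)
    value = value * 10 + (raw[-1] >> 4)
    return -value if (raw[-1] & 0x0F) == 0x0D else value


def sort_fixed_records(data, record_len, key):
    """Split `data` into records of `record_len` bytes and sort them by `key`."""
    if len(data) % record_len != 0:
        raise ValueError("data length is not a multiple of the record length")
    records = [data[i:i + record_len] for i in range(0, len(data), record_len)]
    return sorted(records, key=key)


# Hypothetical layout: bytes 0-3 signed big-endian int, bytes 4-6 packed
# decimal, byte 7 an arbitrary tag byte.
def int_key(rec):
    return struct.unpack(">i", rec[0:4])[0]


def comp3_key(rec):
    return unpack_comp3(rec[4:7])


data = (
    struct.pack(">i", 10) + bytes([0x00, 0x12, 0x3C]) + b"\n"   # 10, +123
    + struct.pack(">i", -5) + bytes([0x04, 0x56, 0x7D]) + b"A"  # -5, -4567
    + struct.pack(">i", 7) + bytes([0x00, 0x00, 0x9C]) + b"B"   # 7, +9
)

by_int = sort_fixed_records(data, 8, int_key)
by_comp3 = sort_fixed_records(data, 8, comp3_key)
print([int_key(r) for r in by_int])
print([comp3_key(r) for r in by_comp3])
```

Because records are cut purely by length, no byte value needs to be
reserved as a delimiter.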

I'm not sure under what circumstances it might be made available but, as
UniSoft are a commercial organisation, it would probably cost money!


Andrew Dingwall
andrew at root.co.uk
