Algorithm needed: reading/writing a large file
Yasushi Kuno
kuno at gssm.otsuka.tsukuba.JUNET
Wed Jul 12 18:38:07 AEST 1989
jwp at larry.sal.wisc.edu (Jeffrey W Percival) writes:
>My question was about optimizing the process of rearranging a disk
>file according to a *given* mapping.
>One helpful person suggested reading sequentially and writing randomly,
>rather than vice-versa, and I tried that but it didn't help. I guess
>the benefit gained from using the input stream buffering was cancelled
>out by the effective loss of the output stream buffering.
>The ever-attentive and always-reliable Mr Torek suggested looking
>into disk cache algorithms in OS journals, and verily that I will do,
>with thanks to those who responded.
Oh, that seems quite right, but how about this?  Open N files for
writing, read the input sequentially, and write each record to one of
them so that the N output files form ordered buckets... i.e. every
record in output file 1 precedes every record in output file 2 in the
final order, and so on.  (You can do this because you already have
the mapping in memory.)  Then the original problem reduces to
rearranging N smaller random files.  Apply this recursively until
each file fits in your memory.  N and the number of passes should
depend on your system.  Isn't this simple?  I'm curious whether it
works well or not, so let me know if you try it.
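
Just to make the idea concrete, here is a rough (untested) C sketch of
the first distribution pass.  It assumes fixed-size records and that
the given mapping is already in memory as an array dest[], where
dest[i] is the final position of record i; RECSIZE, NBUCKETS, and the
file names are all made up, so adjust them for your system.  After
this pass each bucket.* file only needs its own records rearranged,
recursively or in memory once a bucket is small enough.

#include <stdio.h>
#include <stdlib.h>

#define RECSIZE  512       /* assumed fixed record length */
#define NBUCKETS 16        /* tune to your system */

int main(void)
{
    FILE *in, *out[NBUCKETS];
    long nrec, i, per_bucket;
    long *dest;            /* dest[i] = final position of record i */
    char rec[RECSIZE], name[64];

    if ((in = fopen("input.dat", "rb")) == NULL) {
        perror("input.dat");
        exit(1);
    }

    /* count the records in the input file */
    fseek(in, 0L, SEEK_END);
    nrec = ftell(in) / RECSIZE;
    rewind(in);

    dest = malloc(nrec * sizeof(long));
    for (i = 0; i < nrec; i++)
        dest[i] = i;       /* placeholder: load your mapping here */

    per_bucket = (nrec + NBUCKETS - 1) / NBUCKETS;

    /* open the N bucket files */
    for (i = 0; i < NBUCKETS; i++) {
        sprintf(name, "bucket.%02ld", i);
        if ((out[i] = fopen(name, "wb")) == NULL) {
            perror(name);
            exit(1);
        }
    }

    /* one distribution pass: read sequentially, append each record
     * to the bucket that covers its final position */
    for (i = 0; i < nrec; i++) {
        long b = dest[i] / per_bucket;
        fread(rec, RECSIZE, 1, in);
        fwrite(rec, RECSIZE, 1, out[b]);
    }

    for (i = 0; i < NBUCKETS; i++)
        fclose(out[i]);
    fclose(in);
    return 0;
}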
Yasushi Kuno