File compression program (compress R3.0) posted to mod.sources

Joe Orost joe at petsd.UUCP
Sat Jan 5 00:37:01 AEST 1985


				EXTENDED ABSTRACT

          Compresses the specified files or standard input.  Each file
          is replaced by a file with the extension .Z, but only if the
          file got smaller.  If no files are specified, the
          compression is applied to the standard input and is written
          to standard output regardless of the results.  Compressed
          files can be restored to their original form by specifying
          the -d option, or by running uncompress (linked to
          compress) on the .Z files or the standard input.

          When file names are given, the ownership (if run by root),
          modes, and access and modification times are preserved between
          the file and its .Z version.  In this respect, compress can
          be used for archival purposes, yet can still be used with
          make(1) after uncompression.
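
          As a rough illustration of the bookkeeping described above,
          the following Python sketch copies modes and access and
          modification times from one file to another, and ownership
          only on request.  The function name copy_metadata and its
          copy_owner parameter are hypothetical; this is not taken
          from the compress source, which is written in C.

```python
import os
import stat


def copy_metadata(src, dst, copy_owner=False):
    """Copy mode bits and access/modification times from src to dst.

    copy_owner is an illustrative switch: changing ownership normally
    succeeds only when the caller is root.
    """
    st = os.stat(src)
    os.chmod(dst, stat.S_IMODE(st.st_mode))      # permission bits only
    if copy_owner:
        os.chown(dst, st.st_uid, st.st_gid)      # Unix only
    # Set the times last so the earlier calls cannot disturb them.
    os.utime(dst, ns=(st.st_atime_ns, st.st_mtime_ns))
```

          Because the modification time survives the round trip,
          make(1) sees the uncompressed file as no newer than it was
          before compression.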

          Compress uses the modified Lempel-Ziv algorithm described in
          "A Technique for High Performance Data Compression", Terry
          A. Welch, IEEE Computer Vol 17, No 6 (June 1984), pp 8-19.
          Common substrings in the file are first replaced by 9-bit
          codes 257 and up.  When code 512 is reached, the algorithm
          switches to 10-bit codes and continues to use more bits
          until the bits limit as specified by the -b flag is reached
          (default 16).  Bits must be between 9 and 16.  The default
          can be changed in the source to allow compress to be run on
          a smaller machine.
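
          The growth of the code width can be sketched in a few lines
          of Python.  This is a simplified illustration under stated
          assumptions, not the compress implementation: the function
          name lzw_codes is hypothetical, the output is a list of
          (code, width) pairs rather than a packed bit stream, and the
          .Z file format is not reproduced.  Code 256 is simply left
          unused so that new codes start at 257 as described above.

```python
def lzw_codes(data: bytes, max_bits: int = 16):
    """Return a list of (code, width_in_bits) pairs for data."""
    if not 9 <= max_bits <= 16:
        raise ValueError("bits must be between 9 and 16")
    table = {bytes([i]): i for i in range(256)}   # codes 0-255: single bytes
    next_code = 257          # first code assigned to a substring
    width = 9                # start with 9-bit codes
    out = []
    current = b""
    for byte in data:
        candidate = current + bytes([byte])
        if candidate in table:
            current = candidate                   # keep extending the match
            continue
        out.append((table[current], width))       # emit longest known prefix
        if next_code < (1 << max_bits):           # room left in the table
            table[candidate] = next_code
            next_code += 1
            # Once the next code no longer fits, use one more bit.
            if next_code >= (1 << width) and width < max_bits:
                width += 1
        current = bytes([byte])
    if current:
        out.append((table[current], width))
    return out
```

          For example, lzw_codes(b"abababab") yields the codes 97, 98,
          257, 259, 98, all 9 bits wide: the substrings "ab" and "aba"
          have been replaced by codes 257 and 259.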

          After the bits limit is reached, compress periodically
          checks the compression ratio.  If it is increasing, compress
          continues to use the codes that were previously found in the
          file.  However, if the compression ratio decreases, compress
          discards the table of substrings and rebuilds it from
          scratch.  This allows the algorithm to adapt to the next
          "block" of the file.
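
          The decision described above can be sketched as a small
          Python class.  The class name RatioMonitor and the method
          should_reset are hypothetical, and how often the check runs
          is left to the caller, since the text says only
          "periodically".  The sketch follows the wording above
          literally: the table is discarded only when the ratio has
          dropped since the previous check.

```python
class RatioMonitor:
    """Tracks the compression ratio seen at successive checkpoints."""

    def __init__(self):
        self.previous_ratio = 0.0

    def should_reset(self, bytes_in: int, bytes_out: int) -> bool:
        """Return True if the substring table should be discarded."""
        ratio = bytes_in / bytes_out if bytes_out else float("inf")
        if ratio < self.previous_ratio:
            # Ratio fell: the table no longer suits the input.
            self.previous_ratio = 0.0    # start measuring afresh
            return True
        self.previous_ratio = ratio      # ratio held or improved
        return False
```

          A caller would invoke should_reset at each checkpoint once
          the bits limit has been reached and, on a True result, clear
          the table and begin assigning codes again from 257.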

          The amount of compression obtained depends on the size of
          the input file, the number of bits per code, and the
          distribution of character substrings.  Typically, text
          files, such as C programs, are reduced by 50-60%.
          Compression is generally much better than that achieved by
          Huffman coding (as used in pack), or adaptive Huffman coding
          (compact), and takes less time to compute.

	  Some practical uses for compress are:

		Saving disk space
		Lowering uucp phone expenses
		Saving space in archive tape storage

					regards,
					joe

--
Full-Name:  Joseph M. Orost
UUCP:       ..!{decvax,ucbvax,ihnp4}!vax135!petsd!joe
US Mail:    MS 313; Perkin-Elmer; 106 Apple St; Tinton Falls, NJ 07724
Phone:      (201) 870-5844
Location:   40 19'49" N / 74 04'37" W


