Compress Speedup Kit for 286
John Silva
jsilva at cogsci.berkeley.edu
Wed Jul 20 10:08:57 AEST 1988
Because I have received a number of requests for these routines, I have
decided to post them. Please keep in mind that they were written for
286-based SCO Xenix V2.2.0g, but they may work on your implementation of Xenix.
The best way to find out is to try them and see if they crash anything.
Hacking around with adb, I have discovered quite a few library routines
that use 32 bit shifts, such as _doprint (the kernel of the printf
routines) and even the 32 bit math functions.
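For anyone who wants to see the idea before unpacking the archive, here is a
rough C model of the technique the Readme describes: treat the long as two
16 bit words and pick one of two code paths depending on whether the shift
count is below 16. This is only a sketch of the intended arithmetic, not the
posted assembly. The names pair16, split32, join32, sar16, model_lshr,
model_lshl and self_check are made up for illustration, and the model assumes
shift counts of 0 through 31.

```c
#include <assert.h>
#include <stdint.h>

/* A long as the 286 code sees it: high word (DX) and low word (AX). */
typedef struct { uint16_t hi, lo; } pair16;

static pair16 split32(uint32_t v) {
    pair16 p;
    p.hi = (uint16_t)(v >> 16);
    p.lo = (uint16_t)(v & 0xFFFFu);
    return p;
}

static uint32_t join32(pair16 p) {
    return ((uint32_t)p.hi << 16) | p.lo;
}

/* Model of a 16 bit SAR: arithmetic right shift, n in 0..15. */
static uint16_t sar16(uint16_t w, unsigned n) {
    uint16_t r = (uint16_t)(w >> n);
    if ((w & 0x8000u) && n > 0)
        r |= (uint16_t)(0xFFFFu << (16 - n));   /* refill with sign bits */
    return r;
}

/* Signed right shift; count is assumed to be 0..31. */
static pair16 model_lshr(pair16 v, unsigned count) {
    pair16 r;
    assert(count < 32);
    if (count < 16) {
        /* bits that fall out of the high word into the low word */
        uint16_t carry = count ? (uint16_t)((uint32_t)v.hi << (16 - count)) : 0;
        r.hi = sar16(v.hi, count);
        r.lo = (uint16_t)((v.lo >> count) | carry);
    } else {
        /* the high word becomes the low word; high fills with the sign */
        r.lo = sar16(v.hi, count - 16);
        r.hi = (v.hi & 0x8000u) ? 0xFFFFu : 0;
    }
    return r;
}

/* Left shift; count is assumed to be 0..31. */
static pair16 model_lshl(pair16 v, unsigned count) {
    pair16 r;
    assert(count < 32);
    if (count < 16) {
        /* bits that fall out of the low word into the high word */
        uint16_t carry = count ? (uint16_t)(v.lo >> (16 - count)) : 0;
        r.lo = (uint16_t)((uint32_t)v.lo << count);
        r.hi = (uint16_t)(((uint32_t)v.hi << count) | carry);
    } else {
        /* the low word becomes the high word; low is zeroed */
        r.hi = (uint16_t)((uint32_t)v.lo << (count - 16));
        r.lo = 0;
    }
    return r;
}

/* Compare the model with plain 32 bit arithmetic; returns the mismatch count. */
static int self_check(void) {
    static const uint32_t samples[] = {
        0x00000000u, 0x00000001u, 0x12345678u, 0x7FFFFFFFu,
        0x80000000u, 0x8000FFFFu, 0xDEADBEEFu, 0xFFFFFFFFu
    };
    int bad = 0;
    unsigned i, n;
    for (i = 0; i < sizeof samples / sizeof samples[0]; i++) {
        for (n = 0; n < 32; n++) {
            uint32_t v = samples[i];
            uint32_t want_l = v << n;
            /* portable arithmetic right shift on an unsigned value */
            uint32_t want_r = (v & 0x80000000u) ? ~(~v >> n) : (v >> n);
            if (join32(model_lshl(split32(v), n)) != want_l) bad++;
            if (join32(model_lshr(split32(v), n)) != want_r) bad++;
        }
    }
    return bad;
}
```

Calling self_check() from a small main() and getting 0 back means the model
agrees with ordinary 32 bit shifts for the sample values. The same kind of
comparison is a reasonable way to test the real routines on your own system.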
John P. Silva
Inova Products
-------------------- Cut Here ----------- Cut Here ----------------------
#! /bin/sh
# This is a shell archive. Remove anything before this line, then unpack
# it by saving it into a file and typing "sh file". To overwrite existing
# files, type "sh file -c". You can also feed this as standard input via
# unshar, or by typing "sh <file", e.g.. If this archive is complete, you
# will see the following message at the end:
# "End of shell archive."
# Contents: Readme Copyright Lflshift.s Sflshift.s
# Wrapped by root at empire on Sun Jul 17 15:28:15 1988
PATH=/bin:/usr/bin:/usr/ucb ; export PATH
if test -f 'Readme' -a "${1}" != "-c" ; then
echo shar: Will not clobber existing file \"'Readme'\"
else
echo shar: Extracting \"'Readme'\" \(2454 characters\)
sed "s/^X//" >'Readme' <<'END_OF_FILE'
XIntroduction
X------------
X
XThe routines in the files Sflshift.s and Lflshift.s are high speed 32 bit
Xshift subroutines intended to replace the library routines _lshr and _lshl.
XThese routines will give a noticeable speed increase in those programs which
Xmake heavy use of bit shifts on long integers.
X
XOn 80x86 Xenix systems using the Microsoft compiler, and on some others,
Xthe standard 32 bit shift routines are optimized for size rather than
Xspeed. The algorithm Microsoft chose is essentially "shift one bit, loop".
XThis makes it dreadfully slow for long bit shifts.
X
XMy routines achieve their speed increase by replacing the loop structure with
Xtwo or three integer shift instructions and some minor calculations. To enable
Xthe use of the 16 bit shift instructions, I had to break up the long into
Xtwo 16 bit chunks and operate on each separately. Each routine consists
Xof two parts: one for shifts of less than 16 bits, and one for shifts of
X16 bits or more.
X
XDue to the longer algorithm required for < 16 bit shifts, you will notice
Xthat these routines are slower than the library routines for small bit
Xshifts, equivalent for about 5 bit shifts, and faster for shifts of 6 bits
Xor more. Past the 16 bit shift mark, the routines really gain in speed since
Xonly two integer shifts are required to achieve the shift, as opposed to
X3 shifts and an OR for < 16 bit shifts.
X
XI had originally written these routines to enhance the speed of compress in
X16 bit mode. They did: I achieved a speed increase of about 24%. (For
Xthose of you who have never hacked the compress sources, compress uses
Xa LOT of 32 bit shifts to get the job done.)
X
XTo install these routines, simply compile your original code and link in the
Xfile that matches your memory model: Sflshift.s for Small, or Lflshift.s for
XMedium, Large and Huge. It's as simple as that.
X
XJohn P. Silva,
XInova Products
X
XUUCP: ucbvax!cogsci!jsilva
XDOMAIN: jsilva at cogsci.berkeley.edu
X
XCopyright Notice
X----------------
X
X This code is NOT in the public domain. It is a copyrighted work,
X and as such is protected by law.
X
X Inova Products places no restrictions on distribution or noncommercial
X use of this product, as long as this notice is left intact.
X Commercial use is prohibited without express written permission of
X the Author.
X
X Inova Products IS NOT RESPONSIBLE for damages incurred through use
X of this package. We make no warranty of fitness for any particular
X application, nor that the code actually works as intended.
X
X Use at your own risk.
X
END_OF_FILE
if test 2454 -ne `wc -c <'Readme'`; then
echo shar: \"'Readme'\" unpacked with wrong size!
fi
# end of 'Readme'
fi
if test -f 'Copyright' -a "${1}" != "-c" ; then
echo shar: Will not clobber existing file \"'Copyright'\"
else
echo shar: Extracting \"'Copyright'\" \(638 characters\)
sed "s/^X//" >'Copyright' <<'END_OF_FILE'
X Copyright Notice
X ----------------
X
X Written by John P. Silva
X Copyright 1988 by Inova Products
X
X This code is NOT in the public domain. It is a copyrighted work,
X and as such is protected by law.
X
X Inova Products places no restrictions on distribution or noncommercial
X use of this product, as long as this notice is left intact.
X Commercial use is prohibited without express written permission of
X the Author.
X
X Inova Products IS NOT RESPONSIBLE for damages incurred through use
X of this package. We make no warranty of fitness for any particular
X application, nor that the code actually works as intended.
X
X Use at your own risk.
X
END_OF_FILE
if test 638 -ne `wc -c <'Copyright'`; then
echo shar: \"'Copyright'\" unpacked with wrong size!
fi
# end of 'Copyright'
fi
if test -f 'Lflshift.s' -a "${1}" != "-c" ; then
echo shar: Will not clobber existing file \"'Lflshift.s'\"
else
echo shar: Extracting \"'Lflshift.s'\" \(2894 characters\)
sed "s/^X//" >'Lflshift.s' <<'END_OF_FILE'
X; Faster 32 bit shift routines for 8086 family processors
X; Written by John P. Silva
X; Copyright 1988 by Inova Products
X;
X; This module is written for use with the Medium, Large and Huge
X; memory models.
X;
X; Copyright Notice
X; ----------------
X;
X; This code is NOT in the public domain. It is a copyrighted work,
X; and as such is protected by law.
X;
X; Inova Products places no restrictions on distribution or noncommercial
X; use of this product, as long as this notice is left intact.
X; Commercial use is prohibited without express written permission of
X; the Author.
X;
X; Inova Products IS NOT RESPONSIBLE for damages incurred through use
X; of this package. We make no warranty of fitness for any particular
X; application, nor that the code actually works as intended.
X;
X; Use at your own risk.
X;
X TITLE flshift
X
XFLSHIFT_TEXT SEGMENT BYTE PUBLIC 'CODE'
XFLSHIFT_TEXT ENDS
X_DATA SEGMENT WORD PUBLIC 'DATA'
X_DATA ENDS
XCONST SEGMENT WORD PUBLIC 'CONST'
XCONST ENDS
X_BSS SEGMENT WORD PUBLIC 'BSS'
X_BSS ENDS
XDGROUP GROUP CONST, _BSS, _DATA
X ASSUME CS: FLSHIFT_TEXT, DS: DGROUP, SS: DGROUP, ES: DGROUP
X_DATA SEGMENT
X_DATA ENDS
X_BSS SEGMENT
X_BSS ENDS
XFLSHIFT_TEXT SEGMENT
X
X; register di = general temporary
X; register si = shift count save
X
X PUBLIC __lshr
X__lshr PROC FAR
X push di
X push si
X xor ch,ch ;Clear hi byte of cx register
X mov si,cx ;Save shift count in si reg
X cmp cx,16 ;Should we use the 32bit shifter?
X jge SHORT LSHR_32
X mov di,dx ;Figure bits to be moved to low word
X mov cx,16
X sub cx,si
X shl di,cl ;di now contains bits to be ored later
X mov cx,si
X sar dx,cl ;Perform arithmetic shift on high word
X shr ax,cl ;Perform logical shift on low word
X or ax,di ;Merge saved bits into low word
X jmp SHORT LSHR_ex
XLSHR_32: ;Shift routine for >=16 bit shifts
X xor di,di ;Calculate artificial sign extension
X test dh,80h ;If dx is non-negative, di stays 0
X je SHORT LSHR_32a
X dec di ;dx is negative: make di -1
XLSHR_32a:
X lea cx,[si-16] ;Calculate amount to shift
X sar dx,cl ;Arithmetically shift high word
X mov ax,dx ;Place freshly shifted high word into low
X mov dx,di ;And place artificial sign extension into high
XLSHR_ex:
X pop si
X pop di
X ret
X__lshr ENDP
X
X PUBLIC __lshl
X__lshl PROC FAR
X push di
X push si
X xor ch,ch ;Clear hi byte of cx register
X mov si,cx ;Save shift count in si reg
X cmp cx,16 ;Should we use the 32bit shifter?
X jge SHORT LSHL_32
X mov di,ax ;Figure bits to be moved to high word
X mov cx,16
X sub cx,si
X shr di,cl ;di now contains bits to be ored later
X mov cx,si
X shl ax,cl ;Shift low word
X shl dx,cl ;Shift high word
X or dx,di ;Merge saved bits into high word
X jmp SHORT LSHL_ex
XLSHL_32: ;Shift routine for >=16 bit shifts
X lea cx,[si-16] ;Calculate amount to shift
X shl ax,cl ;Shift low word
X mov dx,ax ;Place freshly shifted low word into high
X xor ax,ax ;And zero low word
XLSHL_ex:
X pop si
X pop di
X ret
X__lshl ENDP
X
XFLSHIFT_TEXT ENDS
XEND
END_OF_FILE
if test 2894 -ne `wc -c <'Lflshift.s'`; then
echo shar: \"'Lflshift.s'\" unpacked with wrong size!
fi
# end of 'Lflshift.s'
fi
if test -f 'Sflshift.s' -a "${1}" != "-c" ; then
echo shar: Will not clobber existing file \"'Sflshift.s'\"
else
echo shar: Extracting \"'Sflshift.s'\" \(2881 characters\)
sed "s/^X//" >'Sflshift.s' <<'END_OF_FILE'
X; Faster 32 bit shift routines for 8086 family processors
X; Written by John P. Silva
X; Copyright 1988 by Inova Products
X;
X; This module is written to be used only in the Small memory model.
X;
X; Copyright Notice
X; ----------------
X;
X; This code is NOT in the public domain. It is a copyrighted work,
X; and as such is protected by law.
X;
X; Inova Products places no restrictions on distribution or noncommercial
X; use of this product, as long as this notice is left intact.
X; Commercial use is prohibited without express written permission of
X; the Author.
X;
X; Inova Products IS NOT RESPONSIBLE for damages incurred through use
X; of this package. We make no warranty of fitness for any particular
X; application, nor that the code actually works as intended.
X;
X; Use at your own risk.
X;
X TITLE flshift
X
XFLSHIFT_TEXT SEGMENT BYTE PUBLIC 'CODE'
XFLSHIFT_TEXT ENDS
X_DATA SEGMENT WORD PUBLIC 'DATA'
X_DATA ENDS
XCONST SEGMENT WORD PUBLIC 'CONST'
XCONST ENDS
X_BSS SEGMENT WORD PUBLIC 'BSS'
X_BSS ENDS
XDGROUP GROUP CONST, _BSS, _DATA
X ASSUME CS: FLSHIFT_TEXT, DS: DGROUP, SS: DGROUP, ES: DGROUP
X_DATA SEGMENT
X_DATA ENDS
X_BSS SEGMENT
X_BSS ENDS
XFLSHIFT_TEXT SEGMENT
X
X; register di = general temporary
X; register si = shift count save
X
X PUBLIC __lshr
X__lshr PROC NEAR
X push di
X push si
X xor ch,ch ;Clear hi byte of cx register
X mov si,cx ;Save shift count in si reg
X cmp cx,16 ;Should we use the 32bit shifter?
X jge SHORT LSHR_32
X mov di,dx ;Figure bits to be moved to low word
X mov cx,16
X sub cx,si
X shl di,cl ;di now contains bits to be ored later
X mov cx,si
X sar dx,cl ;Perform arithmetic shift on high word
X shr ax,cl ;Perform logical shift on low word
X or ax,di ;Merge saved bits into low word
X jmp SHORT LSHR_ex
XLSHR_32: ;Shift routine for >=16 bit shifts
X xor di,di ;Calculate artificial sign extension
X test dh,80h ;If dx is non-negative, di stays 0
X je SHORT LSHR_32a
X dec di ;dx is negative: make di -1
XLSHR_32a:
X lea cx,[si-16] ;Calculate amount to shift
X sar dx,cl ;Arithmetically shift high word
X mov ax,dx ;Place freshly shifted high word into low
X mov dx,di ;And place artificial sign extension into high
XLSHR_ex:
X pop si
X pop di
X ret
X__lshr ENDP
X
X PUBLIC __lshl
X__lshl PROC NEAR
X push di
X push si
X xor ch,ch ;Clear hi byte of cx register
X mov si,cx ;Save shift count in si reg
X cmp cx,16 ;Should we use the 32bit shifter?
X jge SHORT LSHL_32
X mov di,ax ;Figure bits to be moved to high word
X mov cx,16
X sub cx,si
X shr di,cl ;di now contains bits to be ored later
X mov cx,si
X shl ax,cl ;Shift low word
X shl dx,cl ;Shift high word
X or dx,di ;Merge saved bits into high word
X jmp SHORT LSHL_ex
XLSHL_32: ;Shift routine for >=16 bit shifts
X lea cx,[si-16] ;Calculate amount to shift
X shl ax,cl ;Shift low word
X mov dx,ax ;Place freshly shifted low word into high
X xor ax,ax ;And zero low word
XLSHL_ex:
X pop si
X pop di
X ret
X__lshl ENDP
X
XFLSHIFT_TEXT ENDS
XEND
END_OF_FILE
if test 2881 -ne `wc -c <'Sflshift.s'`; then
echo shar: \"'Sflshift.s'\" unpacked with wrong size!
fi
# end of 'Sflshift.s'
fi
echo shar: End of shell archive.
exit 0
---
UUCP: ucbvax!cogsci!jsilva
DOMAIN: jsilva at cogsci.berkeley.edu