count of bits in a long
Gene H. Olson
gene at zeno.mn.org
Sat Oct 13 01:41:37 AEST 1990
peter at neccan.oz (Peter Miller) writes:
>/*
> * A while back I posted a solution to the "count of bits set in a
> * long" question, suggesting that HACKMEM 169 was a very fast way to do it.
> * This program demonstrates the technique.
> *
> * The HACKMEM 169 method uses a modulo to sum the digits.
> * A number of people have asked me "how can a division be faster than a loop?"
> * This program demonstrates how. The modulo is a special case,
> * and can be unwound into a tight loop if your machine does not
> * have a fast divide opcode (although many modern machines have a 1 cycle
> * divide opcode, except for we-really-believe-our-press-hype risc
> * manufacturers).
> *
> * This program was run on a 386/20 under unix with the following results:
> * Func sec/iter scaled
> * nbits 0.00018310547 1.000 hackmem 169
> * nbits1 0.0006980896 3.812 hackmem 169 no % operator
> * nbits2 0.0016822815 9.188 count lsb's loop
> * nbits3 0.0037384033 20.417
> * nbits4 0.0034370422 18.771
> * nbits5 0.0037765503 20.625
> * nbits6 0.0035247803 19.250
> */
> ..... A program follows this.
To this program I have added a bug fix (the function table
was wrong) and another function (nbit7) which is MY FAVORITE
WAY of counting bits. The patches are at the end of this
posting....
Anyway, I got much better results on both the 386 (which
has a multiply) and on a SPARC (which doesn't) than any of
the functions in the original program.
The results were:
386 SPARC
------------------------------ ------------------------------
nbits 0.00020980835 1.774 nbits 0.00096893311 15.875
nbits1 0.00038909912 3.290 nbits1 0.00012969971 2.125
nbits2 0.00075149536 6.355 nbits2 0.00031280518 5.125
nbits3 0.0015411377 13.032 nbits3 0.00067138672 11.000
nbits4 0.0016479492 13.935 nbits4 0.00065231323 10.688
nbits5 0.0015525818 13.129 nbits5 0.00067520142 11.062
nbits6 0.0021286011 18.000 nbits6 0.00086212158 14.125
nbits7 0.00011825562 1.000 nbits7 6.1035156e-05 1.000
The nbit7 technique is a simple routine which shifts and sums
bits in place until all bits have summed in the lower 8 bits
of the word. The implementation here requires 32 bit longs,
but it is easily extensible to 64 bits with a mask change and
the additional shift shown.
Enjoy,
------------- Remainder of Posting is a Cdiff Patchfile ----------
*** orig.c Fri Oct 12 08:56:22 1990
--- gene.c Fri Oct 12 09:23:51 1990
***************
*** 215,220 ****
--- 215,236 ----
return count;
}
+ int
+ nbits7(n)
+ register unsigned long n ;
+ {
+ n = ((n >> 1) & 0x55555555) + (n & 0x55555555) ;
+ n = ((n >> 2) & 0x33333333) + (n & 0x33333333) ;
+ n = ((n >> 4) + n) & 0x0F0F0F0F ;
+ n = ((n >> 8) + n) ;
+ #if LONG64
+ n = (n >> 16) + n ;
+ return ((n >> 32) + n) & 0xFF ;
+ #else
+ return ((n >> 16) + n) & 0xFF ;
+ #endif
+ }
+
/*
* build a long random number.
* The standard rand() returns a 15bit number.
***************
*** 275,282 ****
{ "nbits2", nbits2 },
{ "nbits3", nbits3 },
{ "nbits4", nbits4 },
! { "nbits5", nbits3 },
! { "nbits6", nbits4 },
};
main()
--- 291,299 ----
{ "nbits2", nbits2 },
{ "nbits3", nbits3 },
{ "nbits4", nbits4 },
! { "nbits5", nbits5 },
! { "nbits6", nbits6 },
! { "nbits7", nbits7 },
};
main()
_________________________________________________________________________
__
/ ) Gene H. Olson uunet!digibd!zeno!gene
/ __ _ __ _ Smartware Consulting
(__/ _(/_//_(/_ Minneapolis, MN (612) 824-9108
More information about the Comp.lang.c
mailing list