Bit counting
Rich Neitzel
thor at stout.ucar.edu
Wed Jan 11 04:52:32 AEST 1989
Recently jim at athsys.uucp (Jim Becker) wrote:
> The problem is the most efficient method to simply count the
>number of on bits in a sixteen bit short. This is a really easy
>exercise, but one that has a variety of different approaches. I would
>be interested in the opinions of programmers out there, an efficient
>algorithm that is understandable.
>
> Assume that this is a basic sample algorithm:
>
> int
> BitCount( num )
> short num;
> {
> int count = 0;
>
> while( num ) {
>
> count += (num & 1);
> num = (num >> 1);
> }
>
> return count;
> }
>
> Is this efficient? Is this understandable? What is a better
>algorithm?
First a comment on this, what I call the 'classical shift' version. As
listed above, it will not function on many cpus, since it uses right
shifts. Since many cpus propagate the high bit, the loop will never
terminate.
Second, I do not consider this to be an efficient algorithm. While it
appears to be good, since it terminates when there are no more set
bits, it does not function well if there are many set bits or the only
set bit is in the last position tested.
A common solution is to use a lookup table and parse the word into
nibbles or bytes. This is faster, but you must be certain that the
table is correct. I do not relish the thought of entering 256 table
entries. Efficiency suffers from the fact that the table lookup
requires pointer indexing.
I have found that the most efficient algorithm is one involving 'bit
pair' addition. Attached are two routines that implement this
algorithm. They rely on 'parallel' addition. The word involved is
added as successively larger bit pairs. Both a 16 bit and 32 bit form
are provided. I have chosen not to attempt to apply it to any other
word lengths. This is because the algorithm is not truely 'portable',
that is, the idea is portable, but the implemenation is dependent on
the word size. The larger the word size the more pairs are involved.
These routines have been tested in two environments. The first was on
a PDP-11 under RSX-11M-Plus and the second on a Motorola MV133 (20 MHz
68020) under VxWorks. In both cases the test routine was the only job.
They easily out performed the other algorithms tested. The completeing
algorithms were the 'classical' shift and nibble lookup tables. The
results were:
Classical shift lut pair addition
RSX 12.3 7.4 4.6
VxWorks 25 12 8
RSX times are seconds/65K iterations, VxWorks times are microseconds
per iterations. The lookup table code would increase in speed if a
byte table were used, but I doubt it would increase enough to beat the
pair addition time.
Supplied here are both the 16 and 32 bit routines, as well as the
classic and lut routines.
---------------------------------------------------------------------
/*
Module: s_bitcnt.c
Author: Richard E. K. Neitzel
revision history
----------------
1.0,9jan89,rekn written
Explanation:
This routine takes an integer and returns the number of bits it contains
which are set. It does this via 'parallel' addition, turning the integer
into sets of bit pairs of increasing size. WARNING! This assumes that
shorts are 16 bits!
*/
int s_bitcnt(word)
short word;
{
register short i, j, k; /* our work area */
if (!word) /* Why bother? */
return(0);
j = word;
k = j & 0x5555; /* Clear every other bit */
j >>= 1; /* Repeat for other bits */
j &= 0x5555;
j += k; /* 1st pairs */
k = j & 0x3333; /* Clear every two bits */
j >>= 2; /* shift and repeat */
j &= 0x3333;
j += k; /* 2nd pairs */
k = j & 0xff; /* Get lower pairs */
j &= 0xff00; /* Get upper pairs */
j >>= 8;
j += k; /* 3rd pairs */
k = j & 0xf; /* Get last pairs */
j >>= 4;
j += k;
return(j); /* bye */
}
---------------------------------------------------------------------
/*
Module: l_bitcnt.c
Author: Richard E. K. Neitzel
revision history
----------------
1.0,9jan89,rekn written
Explanation:
This routine takes an integer and returns the number of bits it contains
which are set. It does this via 'parallel' addition, turning the integer
into sets of bit pairs of increasing size. WARNING! This assumes that
integers are 32 bits!
*/
int l_bitcnt(word)
int word;
{
register int j, k; /* Our work area */
if (!word) /* Why bother? */
return(0);
j = word;
k = j & 0x55555555; /* Clear every other bit */
j >>= 1; /* Do again for cleared bits */
j &= 0x55555555;
j += k; /* 1st pairs done */
k = j & 0x33333333; /* Clear every two bits */
j >>= 2;
j &= 0x33333333;
j += k; /* 2nd pairs done */
k = j & 0xffff; /* Only need last 4 sets */
j &= 0xffff0000; /* and top 4 here */
j = j >> 16; /* ready - */
j += k; /* add 3rd pairs */
k = j & 0xf0f; /* Clear bits */
j >>= 4; /* Repeat */
j &= 0xf0f;
j += k; /* add 4th pairs */
k = j & 0xff; /* Now add the 8 bit pairs */
j >>= 8;
j += k;
return(j); /* bye */
}
---------------------------------------------------------------------
int bitCount( num )
short num;
{
register short count = 0;
register short tmp;
tmp = num;
while(tmp) /* Note the fix to avoid looping always */
{
count += (tmp & 0x8000);
tmp <<= 1;
}
return(count);
}
---------------------------------------------------------------------
short lut[] = {0,1,1,2,1,2,2,3,1,2,2,3,2,3,3,4};
int luts(word)
short word;
{
register short j, k,l;
if (!word)
return(0);
j = word;
k = j & 0xf;
l += lut[k];
j >>= 4;
k = j & 0xf;
l += lut[k];
j >>= 4;
k = j & 0xf;
l += lut[k];
j >>= 4;
k = j & 0xf;
l += lut[k];
return(l);
}
-------------------------------------------------------------------------------
Richard Neitzel
National Center For Atmospheric Research
Box 3000
Boulder, CO 80307-3000
303-497-2057
thor at thor.ucar.edu
Torren med sitt skjegg Thor with the beard
lokkar borni under sole-vegg calls the children to the sunny wall
Gjo'i med sitt shinn Gjo with the pelts
jagar borni inn. chases the children in.
-------------------------------------------------------------------------------
More information about the Comp.unix.questions
mailing list