Bits


Moved to new blog

In a previous article, I described the basics of binary arithmetic and gave a function to display the binary representation of a number. Here, we’ll look at several ways to count the set (1) bits in a number.

First of all, why would you want to count bits? Bitsets. If you use bitsets as a fast set implementation, you might want to find out how many elements there are in the set. I used this in a sudoku programme to memorise which digits can’t be placed in a particular cell. Bit counting functions are also used extensively in graphics (bitmap manipulations).

Here’s 22 in binary:
00010110

From the binary representation, it’s obvious that there are 3 set bits and 5 unset bits. How do you count them? I’ll give three methods.

Classic, let’s iterate through every bit, solution

The idea behind this is simple: take every bit in the number (n) and check if it’s 1. To do this, you can simply use a variable (i):

  1. initialise i with 1
  2. check if n AND i is greater than zero and if so, increment a counter
  3. multiply i by 2
  4. if i < n, go to 2

Here’s the code:

/* Count the ON bits in n using an iterative algorithm */
int bitcount(int n) {
	int tot = 0;

	int i;
	for (i = 1; i <= n; i = i<<1)
		if (n & i)
			++tot;

	return tot;
}

This isn’t bad and works in O(lg(n)) time, but if you know (you probably will) if the number is made up mostly of ones or zeros, use one of the following algorithms.

Sparse ones algorithm

This solution relies on the following observation:
2210 = 000101102
22 - 1 = 2110 = 000101012
22 AND 21 = 00010100

Notice what happened: by logically ANDing 22 and 21, you get a number whose binary representation is the same as 22 but with the last 1 flipped to 0.

The idea behind this algorithm is to logically AND n and n – 1 until n is 0. The number of times necessary to do this will the the number of 1 bits.

Here’s the code:

/* Counts the ON bits in n. Use this if you know n is mostly 0s */
int bitcount_sparse_ones(int n) {
	int tot = 0;

	while (n) {
		++tot;
		n &= n - 1;
	}

	return tot;
}

Why call it sparse ones? Look at the algorithm, and apply it to a few numbers on a piece of paper. You’ll notice that you go through the inner loop the x times, where x is the number of 1s in the number. So the time is O(x), and it’s best to use it if there are few ones in the number.

Dense ones algorithm

But what if your number is mostly 1? The solution is obvious: flip every bit in n, then apply the sparse ones algorithm:

/* Counts the ON bits in n. Use this if you know n is mostly 1s */
int bitcount_dense_ones(int n) {
	int tot = 0;

	n ^= (unsigned int)-1;

	while (n) {
		++tot;
		n &= n - 1;
	}

	return sizeof(int) * 8 - tot;
}
Full source

Here’s the full C source to the programme (bit.c):
NOTE: Source is mangled by WordPress. Don’t copy-paste; download the file.

#include <stdio.h>

/* Print n as a binary number */
void printbits(int n) {
	unsigned int i, step;

	if (0 == n) { /* For simplicity's sake, I treat 0 as a special case*/
		printf("0000");
		return;
	}

	i = 1<<(sizeof(n) * 8 - 1);

	step = -1; /* Only print the relevant digits */
	step >>= 4; /* In groups of 4 */
	while (step >= n) {
		i >>= 4;
		step >>= 4;
	}

	/* At this point, i is the smallest power of two larger or equal to n */
	while (i > 0) {
		if (n & i)
			printf("1");
		else
			printf("0");
		i >>= 1;
	}
}

/* Count the ON bits in n using an iterative algorithm */
int bitcount(int n) {
	int tot = 0;

	int i;
	for (i = 1; i <= n; i = i<<1)
		if (n & i)
			++tot;

	return tot;
}

/* Counts the ON bits in n. Use this if you know n is mostly 0s */
int bitcount_sparse_ones(int n) {
	int tot = 0;

	while (n) {
		++tot;
		n &= n - 1;
	}

	return tot;
}

/* Counts the ON bits in n. Use this if you know n is mostly 1s */
int bitcount_dense_ones(int n) {
	int tot = 0;

	n ^= (unsigned int)-1;

	while (n) {
		++tot;
		n &= n - 1;
	}

	return sizeof(int) * 8 - tot;
}

int main(int argc, char *argv[]) {
	int i;
	for (i = 0; i < 23; ++i) {
		printf("%d = ", i);
		printbits(i);
		printf("\tON bits: %d %d %d", bitcount(i), bitcount_sparse_ones(i), bitcount_dense_ones(i));
		printf("\n");
	}

	return 0;
}

That’s it. If you’re intrested in a few more (slightly esoteric) algorithms, see this article.

Good luck. Always open to comments.

Moved to new blog

In this article I’ll begin by defining binary numbers and describe the basic operations. Afterwords, I’ll show you several of the most common uses of binary numbers.

Humans work with digits in base 10, and we’re quite good at it. Computers, on the other hand, use base 2 and it shouldn’t come as a surprise that they’re extremely good at it. Today’s high level languages do a lot to hide this fact from us, but, still, a fairly good grasp of binary numbers is essential to working with computers, especially programming them.

Binary numbers

Mathworld defines binary as:

The base 2 method of counting in which only the digits 0 and 1 are used.

Basically, the binary representation of a number is the way of writing that number, uniquely, using only the digits 0 and 1. These are the binary representations of the first 16 natural numbers:

0 = 0000
1 = 0001
2 = 0010
3 = 0011
4 = 0100
5 = 0101
6 = 0110
7 = 0111
8 = 1000
9 = 1001
10 = 1010
11 = 1011
12 = 1100
13 = 1101
14 = 1110
15 = 1111

Given a number in base 10, how do you find the representation in base 2? You take the number and divide it repeatedly by 2 taking note of the remainder. After you’ve reached 0, write the remainders in reverse order. That’s the binary number! Here’s an example for 13:
Converting_13_to_binary

So,
13_in_binary

Now, how do you get a decimal form a binary? This is even easier. Take the number in binary form and start from the right side. Multiply the first digit (from the right side) with 20. Multiply the second digit with 21 and add it to the result from the first multiplication. Multiply the third digit with 22 and add it to what you have so far. … Do this for all the digits and you’ll get the decimal representation of the binary number.
13_binary_to_decimal

Logical operations

All of the following operations work in the same way. Take every bit form the first number and the corresponding bit from the second number (if it exists) and compute the resulting bit using the operation’s truth table.

NOT
NOT truth table

So, take each bit and reverse it.

0000 becomes 1111
0101 becomes 1010
1111 becomes 0000

It’s worthwhile to note that applying NOT twice, you get the original number unaltered.

AND
AND truth table

So, if both digits are 1 then the result is 1, otherwise it is 0.

OR
OR truth table

So, if either digit is 1, then the result is 1, otherwise it is 0.

XOR (eXclusive OR)
XOR truth table

So, if one of the digits it 1, but not both, then the result is 1, otherwise, the result is 0.

SHL (SHift Left) and SHR (SHift Right)

These aren’t like the other operations in that they don’t rely on truth tables. To apply them, simply drop the left-most (right-most) digit and add a 0 digit to the right (left).

After successive SHL operations,

0011 becomes 0110,
0110 becomes 1100,
1100 becomes 1000,
1000 becomes 0000

After successive SHR operations,

1100 becomes 0110,
0110 becomes 0011,
0011 becomes 0001,
0001 becomes 0000

Common use: Fast multiplication/division by 2

To quickly multiply an integer by 2, simply shift left the number.

To quickly divide an integer by 2, simply shift right the number.

In C, this would be:

a = a <> 1; /* Divide by 2 */
a = a >> 1; /* Multiply by 2*/

Setting/Checking a certain bit

To check whether the (n + 1)th bit is set (1) or not (0), use this:

a & 1<<n = 0 if the bit is NOT set
a & 1< 0 if the bit is set

To set the (n + 1)th bit, use this:

a = a | 1<<n;

What am I doing here? In both cases I first compute 1<<n. This is a binary number in which all digits are 0 but the nth digit is 1. Then, to check, I logically AND the two numbers. If the bit is set, then the result should be anything but 0. On the other hand, to set the bit, I logically OR the two numbers. If the bit was set, than this will have no effect. But if the bit was not set, it will be at the end.

Common use: bitsets

Chances are, at one time or another, you’ve had to use an array of boolean values. You’ve probably used something like this:

char foo[100];
foo[42] = 0; /* Equivalent of false */
foo[42] = 1; /* Equivalent of true*/

OR

bool bar[100];
bar[23] = false;
bar[23] = true;

This isn’t bad. Actually, in most cases it’s quite good. But if the number of elements is relatively small (less than 64), there’s a better way, that is using bitsets.

Instead of an array use an int, then use the methods described above to set and check the bits.

int a;
a = 0; /* The entire ``array" is set to false */
a |= 1<<3; /* The fourth bit is set */
if (a & 1<<3) /* Is the fourth bit set? */
        a ^= 1<<3; /* If it is, flip it. */

Flipping bits

To flip a bit, XOR it by 1.

a ^= 1<<5; /* Flip the sixth bit */

Putting it all together: Printing a binary number

This small C programme prints out the binary representation of 0, 1, …, 16. (bit.c)

#include <stdio.h>

/* Print n as a binary number */
void printbitssimple(int n) {
	unsigned int i;
	i = 1<<(sizeof(n) * 8 - 1);

	while (i > 0) {
		if (n & i)
			printf("1");
		else
			printf("0");
		i >>= 1;
	}
}

/* Print n as a binary number */
void printbits(int n) {
	unsigned int i, step;

	if (0 == n) { /* For simplicity's sake, I treat 0 as a special case*/
		printf("0000");
		return;
	}

	i = 1<<(sizeof(n) * 8 - 1);

	step = -1; /* Only print the relevant digits */
	step >>= 4; /* In groups of 4 */
	while (step >= n) {
		i >>= 4;
		step >>= 4;
	}

	/* At this point, i is the smallest power of two larger or equal to n */
	while (i > 0) {
		if (n & i)
			printf("1");
		else
			printf("0");
		i >>= 1;
	}
}

int main(int argc, char *argv[]) {
	int i;
	for (i = 0; i < 16; ++i) {
		printf("%d = ", i);
		printbits(i);
		printf("\n");
	}

	return 0;
}

Code should be fairly easy to understand. Good luck. Always open to comments. Update: The next article in this series is Counting Bits.