Practical


Langton’s Ant is a turmite governed by simple rules whose outcome is both unpredictable and intresting. The path taken by the ant generates some surprising shapes, never appearing when you would expect them to, but a seemingly random moments. This article describes the rules behind Langton’s Ant, shows some of the images formed and provides a Python programme to simulate the ant.

Langton’s Ant

Chris Langton is a biologist, and the founder of the field of artificial life. One of his simplest and most intresting creations is Langton’s Ant. It is quite important theoretically, but it also has some intresting practical applications.

The ant’s world consists of a grid, possibly infinite, but limited in our example:
Step1.png

The ant always moves through this grid one step at a time. It’s direction and it’s effect on the grid is defined by 3 simple rules:

  1. If the ant is on a black square, it turns right 90 degrees.
  2. If the ant is on a white square, it turns left 90 degrees.
  3. When the ant leaves a square, it inverts colour.

It is fairly obvious that the ant’s movement will leave a coloured trail. When asked, most people would guess that either no patterns show or that some basic symmetric images appear. What actually happens is this.

For the first 50 or so steps, it just seems to move around randomly, and then at step 52, you get a heart:
Step53.png

For the next 300 or so steps, the ant seems to draw random shapes, ofter erasing what it had drawn before. At about step 383 you get something that looks like this:
Step383.png

Next you get what’s technically known as a mess. The ant’s movement degenerates into chaos. No more patterns are observed:
Step2615.png

By now, you are probably thinking: “Well yes. Simple rules, so at first you get simple patterns, but as the board gets more and more complex, the simple rules can’t handle it, so the ant moves randomly”. Well that’s right, up until about step 10000; that is when you start getting highways.

The highways the ant draws are intresting shapes. You can see one at the bottom of the next image. It is the diagonal bit going down and right. To build then, the ant seems to move in circles, the last row forcing it to draw the next. Thus, the ant continues to draw until something gets in its way. In this simulation, because the screen wraps around, the ant eventually gets back to the mess, where it starts moving randomly again.
Step10711.png

What’s intresting is that, after some time, the ant starts building highways again. Actually, there is no known way to stop it. The only hindrance here is that the board is finite so it has a tendency to fill up. Anyway, here’s what the board looks like at step 97049:
Step97049.png

The Programme

The following is a simple programme written in Python and using PyGame.

The variables that control the display and the board are at the top of the programme. They are heavily commented so there shouldn’t be any problems.

Here’s the code in Python (langton.py):

#************************************************
# Rules of the game
#	1. If the ant is on a black square, it turns
#		right 90 and moves forward one unit
#	2. If the ant is on a white square, it turns
# 		left 90 and moves forward one unit
#	3. When the ant leaves a square, it inverts
#		colour
#
# SEE: http://mathworld.wolfram.com/LangtonsAnt.html
#************************************************

import sys, pygame
from pygame.locals import *
import time

dirs = (
		(-1, 0),
		(0, 1),
		(1, 0),
		(0, -1)
		)

cellSize = 12 # size in pixels of the board (4 pixels are used to draw the grid)
numCells = 64 # length of the side of the board
background = 0, 0, 0 # background colour; black here
foreground = 23, 23, 23 # foreground colour; the grid's colour; dark gray here
textcol = 177, 177, 177 # the colour of the step display in the upper left of the screen
antwalk = 44, 88, 44 # the ant's trail; greenish here
antant = 222, 44, 44 # the ant's colour; red here
fps = 1.0 / 40 # time between steps; 1.0 / 40 means 40 steps per second

def main():
	pygame.init()

	size = width, height = numCells * cellSize, numCells * cellSize

	pygame.display.set_caption("Langton's Ant")

	screen = pygame.display.set_mode(size) # Screen is now an object representing the window in which we paint
	screen.fill(background)
	pygame.display.flip() # IMPORTANT: No changes are displayed until this function gets called

	for i in xrange(1, numCells):
		pygame.draw.line(screen, foreground, (i * cellSize, 1), (i * cellSize, numCells * cellSize), 2)
		pygame.draw.line(screen, foreground, (1, i * cellSize), (numCells * cellSize, i * cellSize), 2)
	pygame.display.flip() # IMPORTANT: No changes are displayed until this function gets called

	font = pygame.font.Font(None, 36)

	antx, anty = numCells / 2, numCells / 2
	antdir = 0
	board = [[False] * numCells for e in xrange(numCells)]

	step = 1
	pause = False
	while True:
		for event in pygame.event.get():
				if event.type == QUIT:
					return
				elif event.type == KEYUP:
					if event.key == 32: # If space pressed, pause or unpause
						pause = not pause
					elif event.key == 115:
						pygame.image.save(screen, "Step%d.tga" % (step))

		if pause:
			time.sleep(fps)
			continue

		text = font.render("%d " % (step), True, textcol, background)
		screen.blit(text, (10, 10))
		
		if board[antx][anty]:
			board[antx][anty] = False # See rule 3
			screen.fill(background, pygame.Rect(antx * cellSize + 1, anty * cellSize + 1, cellSize - 2, cellSize - 2))
			antdir = (antdir + 1) % 4 # See rule 1
		else:
			board[antx][anty] = True # See rule 3
			screen.fill(antwalk, pygame.Rect(antx * cellSize + 1, anty * cellSize + 1, cellSize - 2, cellSize - 2))
			antdir = (antdir + 3) % 4 # See rule 2

		antx = (antx + dirs[antdir][0]) % numCells
		anty = (anty + dirs[antdir][1]) % numCells

		# The current square (i.e. the ant) is painted a different colour
		screen.fill(antant, pygame.Rect(antx * cellSize + 1, anty * cellSize + 1, cellSize -2, cellSize -2))

		pygame.display.flip() # IMPORTANT: No changes are displayed until this function gets called

		step += 1
		time.sleep(fps)

if __name__ == "__main__":
	main()

It is intresting to note that although you know all the rules that govern the ant’s universe, you cannot predict its movement without resorting to simulation. This just goes to show that knowing the rules of the components at the lowest level might not help you understand the system as a whole.

That’s it. Have fun with the code. Always open to comments.

In this article, I describe a simple (adds less than 1min of work) way to speed up Dijkstra’s Algorithm for finding the single source shortest path to every node in a graph.

In a previous post I described the simple O(n2) implementation of the algorithm. Here, I focus on a method that will probably speed up the algorithm.

Why Bother

The previous implementation of the algorithm ran in O(n2) time, where n is the number of nodes in the graph. This means that for a graph of, say 100 nodes, it would do about 100 * 100 = 100000 calculations. Considering that computers nowadays are said to be able to do about 100000000 (a hundred million) calculations per second, we’re fine, and the programme will finish in well under a second. But what if we have a graph with 100000 nodes? This might take 100 seconds to run. Now we’re in trouble. We need a faster algorithm.

The two most common ways to speed up Dijkstra’s Algorithm are to implement the finding of the closest node not yet visited as priority queues. Usually heaps or Fibonacci Heaps are used for this purpose (Fibonacci Heaps were actually invented for this).

Heaps are somewhat difficult to implement and Fibonacci Heaps are horror to implement. Incidentally, there’s a very easy of speeding it up.

Just Use Queues

The idea is to simply use queues instead of priority queues. This way provides nowhere near the same level of speedup (the algorithm is still O(n2)), but it makes it run faster, on average, by a factor of 4.

Some bad news: a carefully crafted graph could slow this algorithm down to O(n3). As a rule, graphs in real life are never like this, and, as the method isn’t widely known, test sets for contests are not written to catch this optimisation.

Now for the good news: it’s shockingly easy to write. Compare the old dijkstra1() with the new dijkstra2().

void dijkstra1(int s) {
	int i, k, mini;
	int visited[GRAPHSIZE];

	for (i = 1; i <= n; ++i) {
		d1&#91;i&#93; = INFINITY;
		visited&#91;i&#93; = 0; /* the i-th element has not yet been visited */
	}

	d1&#91;s&#93; = 0;

	for (k = 1; k <= n; ++k) {
		mini = -1;
		for (i = 1; i <= n; ++i)
			if (!visited&#91;i&#93; && ((mini == -1) || (d1&#91;i&#93; < d1&#91;mini&#93;)))
				mini = i;

		visited&#91;mini&#93; = 1;

		for (i = 1; i <= n; ++i)
			if (dist&#91;mini&#93;&#91;i&#93;)
				if (d1&#91;mini&#93; + dist&#91;mini&#93;&#91;i&#93; < d1&#91;i&#93;) 
					d1&#91;i&#93; = d1&#91;mini&#93; + dist&#91;mini&#93;&#91;i&#93;;
	}
}

void dijkstra2(int s) {
	int queue&#91;GRAPHSIZE&#93;;
	char inQueue&#91;GRAPHSIZE&#93;;
	int begq = 0,
	    endq = 0;
	int i, mini;
	int visited&#91;GRAPHSIZE&#93;;

	for (i = 1; i <= n; ++i) {
		d2&#91;i&#93; = INFINITY;
		visited&#91;i&#93; = 0; /* the i-th element has not yet been visited */
		inQueue&#91;i&#93; = 0;
	}

	d2&#91;s&#93; = 0;
	queue&#91;endq&#93; = s;
	endq = (endq + 1) % GRAPHSIZE;


	while (begq != endq) {
		mini = queue&#91;begq&#93;;
		begq = (begq + 1) % GRAPHSIZE;
		inQueue&#91;mini&#93; = 0;

		visited&#91;mini&#93; = 1;

		for (i = 1; i <= n; ++i)
			if (dist&#91;mini&#93;&#91;i&#93;)
				if (d2&#91;mini&#93; + dist&#91;mini&#93;&#91;i&#93; < d2&#91;i&#93;) { 
					d2&#91;i&#93; = d2&#91;mini&#93; + dist&#91;mini&#93;&#91;i&#93;;
					if (!inQueue&#91;i&#93;) {
						queue&#91;endq&#93; = i;
						endq = (endq + 1) % GRAPHSIZE;
						inQueue&#91;i&#93; = 1;
					}
				}
	}
}
&#91;/sourcecode&#93;

What's changed? First, we <strong>define several new variables</strong>. These, together, make up the queue:

int queue[GRAPHSIZE];
char inQueue[GRAPHSIZE];
int begq = 0,
    endq = 0;

Next, during the initialisation part of the function, we mark all nodes as not being in the queue.

for (i = 1; i <= n; ++i) {
	/* OTHER INITIALISATIONS (look at the programme) */
	inQueue&#91;i&#93; = 0;
}
&#91;/sourcecode&#93;

Now, add the source node to the queue.
&#91;sourcecode language='cpp'&#93;
queue&#91;endq&#93; = s;
endq = (endq + 1) % GRAPHSIZE;
&#91;/sourcecode&#93;

What does this do? The first line add <strong>s</strong> to the end of the queue. The second line moves the end of the queue one step to the right (I'll explain a few paragraphs down). The modulo operation here is not really necessary, but I like to be consistent.

At this point we'll start looping. When do we stop? The idea here is that a node is in the queue when its neighbours need to be updated (i.e. when a new shortest path might be found leading to them). So, we stop when the queue is empty. Note that this occurs when <strong>begq == endq</strong> and <strong>not</strong> when <strong>!(begq &lt; endq)</strong>. So, <strong>while (begq &lt; endq)</strong> is incorrect because, in one case, <strong>begq</strong> will be greater then <strong>endq</strong>.

What was the first thing we did in the loop? We were supposed to find the closest node not yet visited. Now, we merely take the first node from the queue.

mini = queue[begq];
begq = (begq + 1) % GRAPHSIZE;
inQueue[mini] = 0;

Here, the first element is pop’d out of the queue, the head of the queue is moved one step to the right and the element is marked as not being in the queue. The problem with queues in general, and this one in particular is that the part of them that actually hold the elements tends to move around. Here, every insert moves the tail one step to the right and every pop moves the head one step to the right. Consider the following sequence of operations:

queue.png

Moves 1 through 8 clearly show that, while the size of the information content of the queue changes erratically, it constantly moves to the right. What happened at 9? The queue got to the end of available memory and wrapped around to the beginning. This is the purpose of (begq + 1) % GRAPHSIZE and (endq + 1) % GRAPHSIZE. It turns 7, 8, 9, etc. into 1, 2, 3, etc. But won’t endq overrun begq? No, the use of inQueue guarantees that no element will be inserted in the queue more than once. And as the queue is of size GRAPHSIZE, no overrun is possible.

So far, so good. One last modification: when we update the distance to a node, we add it to the queue (if it’s not already in it).

for (i = 1; i <= n; ++i) if (dist[mini][i]) if (d2[mini] + dist[mini][i] < d2[i]) { d2[i] = d2[mini] + dist[mini][i]; if (!inQueue[i]) { queue[endq] = i; endq = (endq + 1) % GRAPHSIZE; inQueue[i] = 1; } } [/sourcecode]

Comparing the speed

When I first wrote this, I wanted to be able to check that it outputs correct results and I wanted to see how much faster it is. The following programme does both. The function cmpd() checks the output against that given by the simple implementation and the various clock() calls littered through the code time the two functions.

Here’s the code in C (dijkstra2.c):
Note: Source might be mangled by WordPress, consider downloading the file.

#include
#include

#define GRAPHSIZE 2048
#define INFINITY GRAPHSIZE*GRAPHSIZE
#define MAX(a, b) ((a > b) ? (a) : (b))

int e; /* The number of nonzero edges in the graph */
int n; /* The number of nodes in the graph */
long dist[GRAPHSIZE][GRAPHSIZE]; /* dist[i][j] is the distance between node i and j; or 0 if there is no direct connection */
long d1[GRAPHSIZE], d2[GRAPHSIZE]; /* d[i] is the length of the shortest path between the source (s) and node i */

void dijkstra1(int s) {
int i, k, mini;
int visited[GRAPHSIZE];

for (i = 1; i <= n; ++i) { d1[i] = INFINITY; visited[i] = 0; /* the i-th element has not yet been visited */ } d1[s] = 0; for (k = 1; k <= n; ++k) { mini = -1; for (i = 1; i <= n; ++i) if (!visited[i] && ((mini == -1) || (d1[i] < d1[mini]))) mini = i; visited[mini] = 1; for (i = 1; i <= n; ++i) if (dist[mini][i]) if (d1[mini] + dist[mini][i] < d1[i]) d1[i] = d1[mini] + dist[mini][i]; } } void dijkstra2(int s) { int queue[GRAPHSIZE]; char inQueue[GRAPHSIZE]; int begq = 0, endq = 0; int i, mini; int visited[GRAPHSIZE]; for (i = 1; i <= n; ++i) { d2[i] = INFINITY; visited[i] = 0; /* the i-th element has not yet been visited */ inQueue[i] = 0; } d2[s] = 0; queue[endq] = s; endq = (endq + 1) % GRAPHSIZE; while (begq != endq) { mini = queue[begq]; begq = (begq + 1) % GRAPHSIZE; inQueue[mini] = 0; visited[mini] = 1; for (i = 1; i <= n; ++i) if (dist[mini][i]) if (d2[mini] + dist[mini][i] < d2[i]) { d2[i] = d2[mini] + dist[mini][i]; if (!inQueue[i]) { queue[endq] = i; endq = (endq + 1) % GRAPHSIZE; inQueue[i] = 1; } } } } int cmpd() { int i; for (i = 0; i < n; ++i) if (d1[i] != d2[i]) return 0; return 1; } int main(int argc, char *argv[]) { int i, j; int u, v, w; long t1 = 0, t2 = 0; FILE *fin = fopen("dist2.txt", "r"); fscanf(fin, "%d", &e); for (i = 0; i < e; ++i) for (j = 0; j < e; ++j) dist[i][j] = 0; n = -1; for (i = 0; i < e; ++i) { fscanf(fin, "%d%d%d", &u, &v, &w); dist[u][v] = w; n = MAX(u, MAX(v, n)); } fclose(fin); for (i = 1; i <= n; ++i) { long aux = clock(); dijkstra1(i); t1 += clock() - aux; aux = clock(); dijkstra2(i); t2 += clock() - aux; if (i % 10 == 0) { printf("%d / %d\n", i, n); fflush(stdout); } if (!cmpd()) { printf("\nResults for %d do NOT match\n", i); break; } } printf("\n"); printf("Dijkstra O(N^2):\t\t%ld\n", t1); printf("Dijkstra unstable:\t\t%ld\n", t2); printf("Ratio:\t\t\t\t%.2f\n", (float)t1/t2); /* printD(); */ return 0; } [/sourcecode] And here's a big 1200 node graph: dist2.txt.

Have fun and good luck. Always open to comments.

In this article, I describe Gauss’ algorithm for solving n linear equations with n unknowns. I also give a sample implementation in C.

The Problem

Let’s say you want to solve the following system of 3 equations with 3 unknowns:
eqset.png

Humans learn that there a two ways to solve this system. Reduction and substitution. Unfortunately, neither of these methods is suitable for a computer.

A simple algorithm (and the one used everywhere even today), was discovered by Gauss more than two hundred years ago. Since then, some refinements have been found, but the basic procedure remains unchanged.

Gaussian Elimination

Start by writing the system in matrix form:
s1.png

If you recall how matrix multiplication works, you’ll see that’s true. If not, it’s enough to notice how the matrix is written: the coefficients of x, y and z are written, side by side, as the rows of a 3×3 matrix; x, y and z are then written as rows of a 3×1 matrix; finally, what’s left of the equality sign is written one under the other as a 3×1 matrix.

So far, this doesn’t actually help, but it does make the following process easier to write. The goal is, through simple transformations, to reach the system, where a, b and c are known.
s2.png

How do you transform the initial system into the above one? Here’s Gauss’ idea.

Start with the initial system, then perform some operations to get 0s on the first column, on every row but the first.
s3.png

The operations mentioned are multiplying the first rows by -3/2 and substracting it from the second. Then multiplying the first rows by -1 and substracting it from the third. What is -3/2? It’s first element of the second row divided by the first element of the first row. And -1? It’s the first element of the third row divided by the first element of the first row. NOTE The changes to row 1 are never actually written back into the matrix.

For now we’re done with row 1, so we move on to row 2. The goal here is to get every row on the second column under row 2 to 0. We do this by multiplying the second rows by 4 (i.e. 2 / (1 / 2)) and substracting it from the third rows.
s4.png

Now it’s easy to find the value of z. Just multiply the third column by -1 (i.e. -1/1).
s5.png

ERRATA: The 7 in the above matrix should be an 8.

Knowing the value of z, we can now eliminate it from the other two equations.
s6.png

Now, we can find the value of y and eliminate y from the first equation.
s7.png

s8.png

And, finally, the value of x is:
s9.png

And with that, we’re done.

The Programme

Unfortunately, this is easier said than done. The actual computer programme has to take into account divisions by zero and numerical instabilities. This adds to the complexity of forwardSubstitution().

Here’s the code in C (gauss.c):

#include

int n;
float a[10][11];

void forwardSubstitution() {
int i, j, k, max;
float t;
for (i = 0; i < n; ++i) { max = i; for (j = i + 1; j < n; ++j) if (a[j][i] > a[max][i])
max = j;

for (j = 0; j < n + 1; ++j) { t = a[max][j]; a[max][j] = a[i][j]; a[i][j] = t; } for (j = n; j >= i; –j)
for (k = i + 1; k < n; ++k) a[k][j] -= a[k][i]/a[i][i] * a[i][j]; /* for (k = 0; k < n; ++k) { for (j = 0; j < n + 1; ++j) printf("%.2f\t", a[k][j]); printf("\n"); }*/ } } void reverseElimination() { int i, j; for (i = n - 1; i >= 0; –i) {
a[i][n] = a[i][n] / a[i][i];
a[i][i] = 1;
for (j = i – 1; j >= 0; –j) {
a[j][n] -= a[j][i] * a[i][n];
a[j][i] = 0;
}
}
}

void gauss() {
int i, j;

forwardSubstitution();
reverseElimination();

for (i = 0; i < n; ++i) { for (j = 0; j < n + 1; ++j) printf("%.2f\t", a[i][j]); printf("\n"); } } int main(int argc, char *argv[]) { int i, j; FILE *fin = fopen("gauss.in", "r"); fscanf(fin, "%d", &n); for (i = 0; i < n; ++i) for (j = 0; j < n + 1; ++j) fscanf(fin, "%f", &a[i][j]); fclose(fin); gauss(); return 0; } [/sourcecode] In the above code, the first two for-s of forwardSubstitution(), just swap two rows so as to diminish the possibilities of some bad rounding errors. Also, if it exists with a division by zero error, that probably means the system cannot be solved.

And here’s the input file for the example (gauss.in) (save it as gauss.in):
3
2 1 -1 8
-3 -1 2 -11
-2 1 2 -3

That’s it. Good luck. Always open to comments.

In a previous article, I described the basics of binary arithmetic and gave a function to display the binary representation of a number. Here, we’ll look at several ways to count the set (1) bits in a number.

First of all, why would you want to count bits? Bitsets. If you use bitsets as a fast set implementation, you might want to find out how many elements there are in the set. I used this in a sudoku programme to memorise which digits can’t be placed in a particular cell. Bit counting functions are also used extensively in graphics (bitmap manipulations).

Here’s 22 in binary:
00010110

From the binary representation, it’s obvious that there are 3 set bits and 5 unset bits. How do you count them? I’ll give three methods.

Classic, let’s iterate through every bit, solution

The idea behind this is simple: take every bit in the number (n) and check if it’s 1. To do this, you can simply use a variable (i):

  1. initialise i with 1
  2. check if n AND i is greater than zero and if so, increment a counter
  3. multiply i by 2
  4. if i < n, go to 2

Here’s the code:

/* Count the ON bits in n using an iterative algorithm */
int bitcount(int n) {
	int tot = 0;

	int i;
	for (i = 1; i <= n; i = i<<1)
		if (n & i)
			++tot;

	return tot;
}
&#91;/sourcecode&#93;

This isn't bad and works in <strong>O(lg(n))</strong> time, but if you know (you probably will) if the number is made up mostly of ones or zeros, use one of the following algorithms.

<h5>Sparse ones algorithm</h5>
This solution relies on the following observation:
<code>22<sub>10</sub> = 00010110<sub>2</sub>
22 - 1 = 21<sub>10</sub> = 00010101<sub>2</sub>
22 AND 21 = 00010100
</code>

Notice what happened: by logically ANDing <em>22</em> and <em>21</em>, you get a number whose binary representation is the same as <em>22</em> but with the last <em>1</em> flipped to <em>0</em>.

The idea behind this algorithm is to logically AND <em>n</em> and <em>n - 1</em> until <em>n</em> is <em>0</em>. The number of times necessary to do this will the the number of <em>1</em> bits.

Here's the code:

/* Counts the ON bits in n. Use this if you know n is mostly 0s */
int bitcount_sparse_ones(int n) {
	int tot = 0;

	while (n) {
		++tot;
		n &= n - 1;
	}

	return tot;
}

Why call it sparse ones? Look at the algorithm, and apply it to a few numbers on a piece of paper. You’ll notice that you go through the inner loop the x times, where x is the number of 1s in the number. So the time is O(x), and it’s best to use it if there are few ones in the number.

Dense ones algorithm

But what if your number is mostly 1? The solution is obvious: flip every bit in n, then apply the sparse ones algorithm:

/* Counts the ON bits in n. Use this if you know n is mostly 1s */
int bitcount_dense_ones(int n) {
	int tot = 0;

	n ^= (unsigned int)-1;

	while (n) {
		++tot;
		n &= n - 1;
	}

	return sizeof(int) * 8 - tot;
}
Full source

Here’s the full C source to the programme (bit.c):
NOTE: Source is mangled by WordPress. Don’t copy-paste; download the file.

#include

/* Print n as a binary number */
void printbits(int n) {
unsigned int i, step;

if (0 == n) { /* For simplicity’s sake, I treat 0 as a special case*/
printf(“0000”);
return;
}

i = 1<>= 4; /* In groups of 4 */
while (step >= n) {
i >>= 4;
step >>= 4;
}

/* At this point, i is the smallest power of two larger or equal to n */
while (i > 0) {
if (n & i)
printf(“1”);
else
printf(“0”);
i >>= 1;
}
}

/* Count the ON bits in n using an iterative algorithm */
int bitcount(int n) {
int tot = 0;

int i;
for (i = 1; i <= n; i = i<<1) if (n & i) ++tot; return tot; } /* Counts the ON bits in n. Use this if you know n is mostly 0s */ int bitcount_sparse_ones(int n) { int tot = 0; while (n) { ++tot; n &= n - 1; } return tot; } /* Counts the ON bits in n. Use this if you know n is mostly 1s */ int bitcount_dense_ones(int n) { int tot = 0; n ^= (unsigned int)-1; while (n) { ++tot; n &= n - 1; } return sizeof(int) * 8 - tot; } int main(int argc, char *argv[]) { int i; for (i = 0; i < 23; ++i) { printf("%d = ", i); printbits(i); printf("\tON bits: %d %d %d", bitcount(i), bitcount_sparse_ones(i), bitcount_dense_ones(i)); printf("\n"); } return 0; } [/sourcecode] That's it. If you're intrested in a few more (slightly esoteric) algorithms, see this article.

Good luck. Always open to comments.

Sudoku is that Japanese puzzle that requires you to fill in a grid of numbers. Here, I describe a general algorithm to solve these puzzles. Also provided is the source code in C++.

You can see a Sudoku board below. The goal is to fill in every empty cell with a number between 1 and 9 so that no number repeats itself on any line, column of zone (three-by-three marked square).

A sudoku board

If you’ve never solved a Sudoku before, now would be the time to do it. Wikipedia describes some useful strategies.

How do you approach a problem like this? The simplest way is from input to output. What’s your input? It’s a 9×9 matrix of numbers like this one:
5 3 0 0 7 0 0 0 0
6 0 0 1 9 5 0 0 0
0 9 8 0 0 0 0 6 0
8 0 0 0 6 0 0 0 3
4 0 0 8 0 3 0 0 1
7 0 0 0 2 0 0 0 6
0 6 0 0 0 0 2 8 0
0 0 0 4 1 9 0 0 5
0 0 0 0 8 0 0 7 9

How do you store it in the programme? If you said as a matrix, you were close. A matrix is obvious, it’s easy to understand, but it’s a pain to code. Believe me when I say, it’s a lot easier to store it as an array of 9*9 elements. What else do you need? A variable to keep track of how many cells have been filled in (0 means empty board; 81 means full board). An array of bitsets to keep track of what digits can’t be used in each cell (I’ll explain a little later), and the setter and getter functions. As it happens, it’s also easier if you encapsulate it in a C++ class. Here’s the full code to the programme (sudoku.cpp). I’ll explain it all in a bit.

#include <iostream>
#include <fstream>

using namespace std;

class SudokuBoard;
void printB(SudokuBoard sb);

typedef unsigned int uint;

const uint MAXVAL = 9;
const uint L = 9;
const uint C = 9;
const uint S = L * C;
const uint ZONEL = 3;
const uint ZONEC = 3;
const uint ZONES = ZONEL * ZONEC;

const uint lineElements[L][C] = {
    { 0,  1,  2,  3,  4,  5,  6,  7,  8},
    { 9, 10, 11, 12, 13, 14, 15, 16, 17},
    {18, 19, 20, 21, 22, 23, 24, 25, 26},
    {27, 28, 29, 30, 31, 32, 33, 34, 35},
    {36, 37, 38, 39, 40, 41, 42, 43, 44},
    {45, 46, 47, 48, 49, 50, 51, 52, 53},
    {54, 55, 56, 57, 58, 59, 60, 61, 62},
    {63, 64, 65, 66, 67, 68, 68, 70, 71},
    {72, 73, 74, 75, 76, 77, 78, 79, 80}
};

const uint columnElements[C][L] = {
    { 0,  9, 18, 27, 36, 45, 54, 63, 72},
    { 1, 10, 19, 28, 37, 46, 55, 64, 73},
    { 2, 11, 20, 29, 38, 47, 56, 65, 74},
    { 3, 12, 21, 30, 39, 48, 57, 66, 75},
    { 4, 13, 22, 31, 40, 49, 58, 67, 76},
    { 5, 14, 23, 32, 41, 50, 59, 68, 77},
    { 6, 15, 24, 33, 42, 51, 60, 69, 78},
    { 7, 16, 25, 34, 43, 52, 61, 70, 79},
    { 8, 17, 26, 35, 44, 53, 62, 71, 80}
};

const uint zoneElements[S / ZONES][ZONES] = {
    { 0,  1,  2,  9, 10, 11, 18, 19, 20},
    { 3,  4,  5, 12, 13, 14, 21, 22, 23},
    { 6,  7,  8, 15, 16, 17, 24, 25, 26},
    {27, 28, 29, 36, 37, 38, 45, 46, 47},
    {30, 31, 32, 39, 40, 41, 48, 49, 50},
    {33, 34, 35, 42, 43, 44, 51, 52, 53},
    {54, 55, 56, 63, 64, 65, 72, 73, 74},
    {57, 58, 59, 66, 67, 68, 75, 76, 77},
    {60, 61, 62, 68, 70, 71, 78, 79, 80}
};

class SudokuBoard {
public:
    SudokuBoard() :
        filledIn(0)
    {
        for (uint i(0); i < S; ++i)
            table&#91;i&#93; = usedDigits&#91;i&#93; = 0;
    }

    virtual ~SudokuBoard() {
    }

    int const at(uint l, uint c) { // Returns the value at line l and row c
        if (isValidPos(l, c))
            return table&#91;l * L + c&#93;;
        else
            return -1;
    }

    void set(uint l, uint c, uint val) { // Sets the cell at line l and row c to hold the value val
        if (isValidPos(l, c) && ((0 < val) && (val <= MAXVAL))) {
            if (table&#91;l * C + c&#93; == 0)
                ++filledIn;
            table&#91;l * C + c&#93; = val;
            for (uint i = 0; i < C; ++i) // Update lines
                usedDigits&#91;lineElements&#91;l&#93;&#91;i&#93;&#93; |= 1<<val;
            for (uint i = 0; i < L; ++i) // Update columns
                usedDigits&#91;columnElements&#91;c&#93;&#91;i&#93;&#93; |= 1<<val;
            int z = findZone(l * C + c);
            for (uint i = 0; i < ZONES; ++i) // Update columns
                usedDigits&#91;zoneElements&#91;z&#93;&#91;i&#93;&#93; |= 1<<val;
        }
    }

    void solve() {
        try { // This is just a speed boost
            scanAndSet(); // Logic approach
            goBruteForce(); // Brute force approach
        } catch (int e) { // This is just a speed boost
        }
    }

    void scanAndSet() {
        int b;
        bool changed(true);
        while (changed) {
            changed = false;
            for (uint i(0); i < S; ++i)
                if (0 == table&#91;i&#93;) // Is there a digit already written?
                    if ((b = bitcount(usedDigits&#91;i&#93;)) == MAXVAL - 1) { // If there's only one digit I can place in this cell, do
                        int d(1); // Find the digit
                        while ((usedDigits&#91;i&#93; & 1<<d) > 0)
                            ++d;
                        set(i / C, i % C, d); // Fill it in
                        changed = true; // The board has been changed so this step must be rerun
                    } else if (bitcount(usedDigits[i]) == MAXVAL)
                        throw 666; // Speed boost
        }
    }

    void goBruteForce() {
        int max(-1); // Find the cell with the _minimum_ number of posibilities (i.e. the one with the largest number of /used/ digits)
        for (uint i(0); i < S; ++i)
            if (table&#91;i&#93; == 0) // Is there a digit already written?
                if ((max == -1) || (bitcount(usedDigits&#91;i&#93;) > bitcount(usedDigits[max])))
                    max = i;

        if (max != -1) {
            for (uint i(1); i <= MAXVAL; ++i) // Go through each possible digit
                if ((usedDigits&#91;max&#93; & 1<<i) == 0) { // If it can be placed in this cell, do
                    SudokuBoard temp(*this); // Create a new board
                    temp.set(max / C, max % C, i); // Complete the attempt
                    temp.solve(); // Solve it
                    if (temp.getFilledIn() == S) { // If the board was completely solved (i.e. the number of filled in cells is S)
                        for (uint j(0); j < S; ++j) // Copy the board into this one
                            set(j / C, j % C, temp.at(j / C, j % C));
                        return; // Break the recursive cascade
                    }
                }
        }
    }

    uint getFilledIn() {
        return filledIn;
    }

private:
    uint table&#91;S&#93;;
    uint usedDigits&#91;S&#93;;
    uint filledIn;

    bool const inline isValidPos(int l, int c) {
        return ((0 <= l) && (l < (int)L) && (0 <= c) && (c < (int)C));
    }

    uint const inline findZone(uint off) {
        return ((off / C / ZONEL) * (C / ZONEC) + (off % C / ZONEC));
    }

    uint const inline bitcount(uint x) {
        uint count(0);
        for (; x; ++count, x &= (x - 1));
        return count;
    }
};

void printB(SudokuBoard sb) {
    cout << "  |  -------------------------------  |" << endl;
    for (uint i(0); i < S; ++i) {
        if (i % 3 == 0)
            cout << "  |";
        cout << "  " << sb.at(i / L, i % L);
        if (i % C == C - 1) {
            if (i / C % 3 == 2)
                cout << "  |" << endl << "  |  -------------------------------";
            cout << "  |" << endl;
        }
    }
    cout << endl;
}

int main(int argc, char *argv&#91;&#93;) {
    SudokuBoard sb;

    ifstream fin("sudoku4.in");
    int aux;
    for (uint i(0); i < S; ++i) {
        fin >> aux;
        sb.set(i / L, i % L, aux);
    }
    fin.close();

    printB(sb);
    sb.solve();
    printB(sb);

    return 0;
}

Look at the main function. It first opens a file and then reads S ints from it. S is just the number of columns (C) multiplied by the number of lines (L). It reads the value into an auxiliary variable and then sets it in the SudokuBoard.

How does it set a cell? The relevant code is in set. The first line just checks if the parameters are valid (if the value’s not too large, if the specified cell does not exist, etc.). Then it checks if there’s a value already in the cell (there shouldn’t be). If not, it increments the number of filled-in cells.

Now things get intresting. If a certain cell contains the number n, it should be obvious that none of the cells on the same line, column or zone as the cell can contain n. Look at the board above: because there’s a 5 in cell 1,1, there can’t be any more fives in any of the cells on the first line, on the first column or in the upper-left zone. This is what the remainder of set does. It sets the nth bit in every bitset in whose corresponding cell the number n cannot appear.

Note: For a given cell, it’s trivial to find the line, column and zone in which it happens to be. What’s hard is to find the other cells in the same line, column or zone. To keep things simple, use three arrays of arrays that contain the number of the cells on a certain line, column or zone.

The next function of intrest is solve. If you’ll look at it, you’ll notice that it contains a tryexcept statement. As the comments clearly note, it’s just a speed boost. If you comment it out, the programme will still work (but in some cases a lot slower).

Solve calls two other functions: scanAndSet and goBruteForce. These are both algorithms to determine or guess what value should be placed in which cell.

scanAndSet
void scanAndSet() {
    int b;
    bool changed(true);
    while (changed) {
        changed = false;
        for (uint i(0); i < S; ++i)
            if (0 == table&#91;i&#93;) // Is there a digit already written?
                if ((b = bitcount(usedDigits&#91;i&#93;)) == MAXVAL - 1) { // If there's only one digit I can place in this cell, do
                    int d(1); // Find the digit
                    while ((usedDigits&#91;i&#93; & 1<<d) > 0)
                         ++d;
                    set(i / C, i % C, d); // Fill it in
                    changed = true; // The board has been changed so this step must be rerun
                } else if (bitcount(usedDigits[i]) == MAXVAL)
                    throw 666; // Speed boost
    }
}

The basic idea is this: we have a list of cells that need to be completed (those whose value is 0) and list of digits that cannot be placed in each cell. Go through the list of used digits, searching for a cell in which 8 digits cannot be placed (i.e. there’s only one digit that can be placed), and place it.

Now, every time you place a digit, you change the board a bit, restricting the digits that can be placed in other cells. So, you have to do the previous step until you don’t change anything any more.

There’s also a check in there if the number of used digits for any cell is 9 (i.e. no digit can be placed in the cell). If such a cell exists then the board is clearly wrong, so throw an exception (which is caught in the solve routine).

goBruteForce

void goBruteForce() {
int max(-1); // Find the cell with the _minimum_ number of posibilities (i.e. the one with the largest number of /used/ digits)
for (uint i(0); i < S; ++i) if (table[i] == 0) // Is there a digit already written? if ((max == -1) || (bitcount(usedDigits[i]) > bitcount(usedDigits[max])))
max = i;
if (max != -1) {
for (uint i(1); i <= MAXVAL; ++i) // Go through each possible digit if ((usedDigits[max] & 1<solve you’ll notice that there are some boards that are not completed by scanAndSet. Why? Because there are some boards that can’t be completed through logic alone (and ours isn’t particularly thorough either).

To get over this, you have to use a brute-force algorithm. The idea is simple enough: given the list of which digits cannot be placed in each cell, find the cell in which the minimum number of digits can be placed. For this cell, for every possible digit, write it down and try to solve the board.

This is where it becomes apparent why a C++ object-oriented approach is a smart move. Instead of writing the try in the current board and then having to keep track of what changes are made, simply clone the current board, fill in a cell and let it solve itself.

That’s it. You might want to try some of the other algorithms suggested on the Net. Good luck. Always open to comments.

In this article I’ll begin by defining binary numbers and describe the basic operations. Afterwords, I’ll show you several of the most common uses of binary numbers.

Humans work with digits in base 10, and we’re quite good at it. Computers, on the other hand, use base 2 and it shouldn’t come as a surprise that they’re extremely good at it. Today’s high level languages do a lot to hide this fact from us, but, still, a fairly good grasp of binary numbers is essential to working with computers, especially programming them.

Binary numbers

Mathworld defines binary as:

The base 2 method of counting in which only the digits 0 and 1 are used.

Basically, the binary representation of a number is the way of writing that number, uniquely, using only the digits 0 and 1. These are the binary representations of the first 16 natural numbers:
0 = 0000
1 = 0001
2 = 0010
3 = 0011
4 = 0100
5 = 0101
6 = 0110
7 = 0111
8 = 1000
9 = 1001
10 = 1010
11 = 1011
12 = 1100
13 = 1101
14 = 1110
15 = 1111

Given a number in base 10, how do you find the representation in base 2? You take the number and divide it repeatedly by 2 taking note of the remainder. After you’ve reached 0, write the remainders in reverse order. That’s the binary number! Here’s an example for 13:
Converting_13_to_binary

So,
13_in_binary

Now, how do you get a decimal form a binary? This is even easier. Take the number in binary form and start from the right side. Multiply the first digit (from the right side) with 20. Multiply the second digit with 21 and add it to the result from the first multiplication. Multiply the third digit with 22 and add it to what you have so far. … Do this for all the digits and you’ll get the decimal representation of the binary number.
13_binary_to_decimal

Logical operations

All of the following operations work in the same way. Take every bit form the first number and the corresponding bit from the second number (if it exists) and compute the resulting bit using the operation’s truth table.

NOT

NOT truth table

So, take each bit and reverse it.
0000 becomes 1111
0101 becomes 1010
1111 becomes 0000

It’s worthwhile to note that applying NOT twice, you get the original number unaltered.

AND

AND truth table

So, if both digits are 1 then the result is 1, otherwise it is 0.

OR

OR truth table

So, if either digit is 1, then the result is 1, otherwise it is 0.

XOR (eXclusive OR)

XOR truth table

So, if one of the digits it 1, but not both, then the result is 1, otherwise, the result is 0.

SHL (SHift Left) and SHR (SHift Right)

These aren’t like the other operations in that they don’t rely on truth tables. To apply them, simply drop the left-most (right-most) digit and add a 0 digit to the right (left).
After successive SHL operations,
0011 becomes 0110,
0110 becomes 1100,
1100 becomes 1000,
1000 becomes 0000

After successive SHR operations,
1100 becomes 0110,
0110 becomes 0011,
0011 becomes 0001,
0001 becomes 0000

Common use: Fast multiplication/division by 2

To quickly multiply an integer by 2, simply shift left the number.

To quickly divide an integer by 2, simply shift right the number.

In C, this would be:
a = a <> 1; /* Divide by 2 */
a = a >> 1; /* Multiply by 2*/

Setting/Checking a certain bit

To check whether the (n + 1)th bit is set (1) or not (0), use this:
a & 1<<n = 0 if the bit is NOT set
a & 1< 0 if the bit is set

To set the (n + 1)th bit, use this:
a = a | 1<<n;

What am I doing here? In both cases I first compute 1<<n. This is a binary number in which all digits are 0 but the nth digit is 1. Then, to check, I logically AND the two numbers. If the bit is set, then the result should be anything but 0. On the other hand, to set the bit, I logically OR the two numbers. If the bit was set, than this will have no effect. But if the bit was not set, it will be at the end.

Common use: bitsets

Chances are, at one time or another, you’ve had to use an array of boolean values. You’ve probably used something like this:
char foo[100];
foo[42] = 0; /* Equivalent of false */
foo[42] = 1; /* Equivalent of true*/

OR

bool bar[100];
bar[23] = false;
bar[23] = true;

This isn’t bad. Actually, in most cases it’s quite good. But if the number of elements is relatively small (less than 64), there’s a better way, that is using bitsets.

Instead of an array use an int, then use the methods described above to set and check the bits.
int a;
a = 0; /* The entire ``array" is set to false */
a |= 1<<3; /* The fourth bit is set */
if (a & 1<<3) /* Is the fourth bit set? */
a ^= 1<<3; /* If it is, flip it. */

Flipping bits

To flip a bit, XOR it by 1.
a ^= 1<<5; /* Flip the sixth bit */

Putting it all together: Printing a binary number

This small C programme prints out the binary representation of 0, 1, …, 16. (bit.c)

#include

/* Print n as a binary number */
void printbitssimple(int n) {
unsigned int i;
i = 1<<(sizeof(n) * 8 - 1); while (i > 0) {
if (n & i)
printf(“1”);
else
printf(“0”);
i >>= 1;
}
}

/* Print n as a binary number */
void printbits(int n) {
unsigned int i, step;

if (0 == n) { /* For simplicity’s sake, I treat 0 as a special case*/
printf(“0000”);
return;
}

i = 1<<(sizeof(n) * 8 - 1); step = -1; /* Only print the relevant digits */ step >>= 4; /* In groups of 4 */
while (step >= n) {
i >>= 4;
step >>= 4;
}

/* At this point, i is the smallest power of two larger or equal to n */
while (i > 0) {
if (n & i)
printf(“1”);
else
printf(“0”);
i >>= 1;
}
}

int main(int argc, char *argv[]) {
int i;
for (i = 0; i < 16; ++i) { printf("%d = ", i); printbits(i); printf("\n"); } return 0; } [/sourcecode] Code should be fairly easy to understand. Good luck. Always open to comments. Update: The next article in this series is Counting Bits.