Langton’s Ant is a turmite governed by simple rules whose outcome is both unpredictable and intresting. The path taken by the ant generates some surprising shapes, never appearing when you would expect them to, but a seemingly random moments. This article describes the rules behind Langton’s Ant, shows some of the images formed and provides a Python programme to simulate the ant.

Langton’s Ant

Chris Langton is a biologist, and the founder of the field of artificial life. One of his simplest and most intresting creations is Langton’s Ant. It is quite important theoretically, but it also has some intresting practical applications.

The ant’s world consists of a grid, possibly infinite, but limited in our example:
Step1.png

The ant always moves through this grid one step at a time. It’s direction and it’s effect on the grid is defined by 3 simple rules:

  1. If the ant is on a black square, it turns right 90 degrees.
  2. If the ant is on a white square, it turns left 90 degrees.
  3. When the ant leaves a square, it inverts colour.

It is fairly obvious that the ant’s movement will leave a coloured trail. When asked, most people would guess that either no patterns show or that some basic symmetric images appear. What actually happens is this.

For the first 50 or so steps, it just seems to move around randomly, and then at step 52, you get a heart:
Step53.png

For the next 300 or so steps, the ant seems to draw random shapes, ofter erasing what it had drawn before. At about step 383 you get something that looks like this:
Step383.png

Next you get what’s technically known as a mess. The ant’s movement degenerates into chaos. No more patterns are observed:
Step2615.png

By now, you are probably thinking: “Well yes. Simple rules, so at first you get simple patterns, but as the board gets more and more complex, the simple rules can’t handle it, so the ant moves randomly”. Well that’s right, up until about step 10000; that is when you start getting highways.

The highways the ant draws are intresting shapes. You can see one at the bottom of the next image. It is the diagonal bit going down and right. To build then, the ant seems to move in circles, the last row forcing it to draw the next. Thus, the ant continues to draw until something gets in its way. In this simulation, because the screen wraps around, the ant eventually gets back to the mess, where it starts moving randomly again.
Step10711.png

What’s intresting is that, after some time, the ant starts building highways again. Actually, there is no known way to stop it. The only hindrance here is that the board is finite so it has a tendency to fill up. Anyway, here’s what the board looks like at step 97049:
Step97049.png

The Programme

The following is a simple programme written in Python and using PyGame.

The variables that control the display and the board are at the top of the programme. They are heavily commented so there shouldn’t be any problems.

Here’s the code in Python (langton.py):

#************************************************
# Rules of the game
#	1. If the ant is on a black square, it turns
#		right 90 and moves forward one unit
#	2. If the ant is on a white square, it turns
# 		left 90 and moves forward one unit
#	3. When the ant leaves a square, it inverts
#		colour
#
# SEE: http://mathworld.wolfram.com/LangtonsAnt.html
#************************************************

import sys, pygame
from pygame.locals import *
import time

dirs = (
		(-1, 0),
		(0, 1),
		(1, 0),
		(0, -1)
		)

cellSize = 12 # size in pixels of the board (4 pixels are used to draw the grid)
numCells = 64 # length of the side of the board
background = 0, 0, 0 # background colour; black here
foreground = 23, 23, 23 # foreground colour; the grid's colour; dark gray here
textcol = 177, 177, 177 # the colour of the step display in the upper left of the screen
antwalk = 44, 88, 44 # the ant's trail; greenish here
antant = 222, 44, 44 # the ant's colour; red here
fps = 1.0 / 40 # time between steps; 1.0 / 40 means 40 steps per second

def main():
	pygame.init()

	size = width, height = numCells * cellSize, numCells * cellSize

	pygame.display.set_caption("Langton's Ant")

	screen = pygame.display.set_mode(size) # Screen is now an object representing the window in which we paint
	screen.fill(background)
	pygame.display.flip() # IMPORTANT: No changes are displayed until this function gets called

	for i in xrange(1, numCells):
		pygame.draw.line(screen, foreground, (i * cellSize, 1), (i * cellSize, numCells * cellSize), 2)
		pygame.draw.line(screen, foreground, (1, i * cellSize), (numCells * cellSize, i * cellSize), 2)
	pygame.display.flip() # IMPORTANT: No changes are displayed until this function gets called

	font = pygame.font.Font(None, 36)

	antx, anty = numCells / 2, numCells / 2
	antdir = 0
	board = [[False] * numCells for e in xrange(numCells)]

	step = 1
	pause = False
	while True:
		for event in pygame.event.get():
				if event.type == QUIT:
					return
				elif event.type == KEYUP:
					if event.key == 32: # If space pressed, pause or unpause
						pause = not pause
					elif event.key == 115:
						pygame.image.save(screen, "Step%d.tga" % (step))

		if pause:
			time.sleep(fps)
			continue

		text = font.render("%d " % (step), True, textcol, background)
		screen.blit(text, (10, 10))
		
		if board[antx][anty]:
			board[antx][anty] = False # See rule 3
			screen.fill(background, pygame.Rect(antx * cellSize + 1, anty * cellSize + 1, cellSize - 2, cellSize - 2))
			antdir = (antdir + 1) % 4 # See rule 1
		else:
			board[antx][anty] = True # See rule 3
			screen.fill(antwalk, pygame.Rect(antx * cellSize + 1, anty * cellSize + 1, cellSize - 2, cellSize - 2))
			antdir = (antdir + 3) % 4 # See rule 2

		antx = (antx + dirs[antdir][0]) % numCells
		anty = (anty + dirs[antdir][1]) % numCells

		# The current square (i.e. the ant) is painted a different colour
		screen.fill(antant, pygame.Rect(antx * cellSize + 1, anty * cellSize + 1, cellSize -2, cellSize -2))

		pygame.display.flip() # IMPORTANT: No changes are displayed until this function gets called

		step += 1
		time.sleep(fps)

if __name__ == "__main__":
	main()

It is intresting to note that although you know all the rules that govern the ant’s universe, you cannot predict its movement without resorting to simulation. This just goes to show that knowing the rules of the components at the lowest level might not help you understand the system as a whole.

That’s it. Have fun with the code. Always open to comments.

In this article I describe a way of modifying Dijkstra’s Alogrithm in order to find all the shortest path from a source to a node.

This article assumes you know how Dijkstra’s Algorithm works. If you don’t, see my previous post or the Wikipedia article.

The Problem

You know how to use Dijkstra’s algorithm to find the length of the shortest path to a node. You’ve even figured out how to record the path to each node. But you what you really need are all the shortest paths leading to a node.

The Idea

I can help, but to be honest, this is obvious.

In order to record the path to each node, I used an array to record which node comes before each other node in the shortest path. That is to say: prev[i] was the node that comes just before node i in the shortest path from the source to node i.

To record all the shortest paths that lead to a node, I just turned prev into a matrix with the following meaning: prev[i][0] is the number of nodes that could come before node i on a path of minimum length; prev[i][1..] are the nodes that could come before node i on path of minimum length.

The Programme

Here’s the code in C (dijkstraAll.c):

#include

#define GRAPHSIZE 2048
#define INFINITY GRAPHSIZE*GRAPHSIZE
#define MAX(a, b) ((a > b) ? (a) : (b))

int e; /* The number of nonzero edges in the graph */
int n; /* The number of nodes in the graph */
long dist[GRAPHSIZE][GRAPHSIZE]; /* dist[i][j] is the distance between node i and j; or 0 if there is no direct connection */
long d[GRAPHSIZE]; /* d[i] is the length of the shortest path between the source (s) and node i */
int prev[GRAPHSIZE][GRAPHSIZE + 1]; /* prev[i] holds the nodes that could comes right before i in the shortest path from the source to i;
prev[i][0] is the number of nodes and prev[i][1..] are the nodes */

void printD() {
int i;

printf(“Distances:\n”);
for (i = 1; i <= n; ++i) printf("%10d", i); printf("\n"); for (i = 1; i <= n; ++i) { printf("%10ld", d[i]); } printf("\n"); } /* * Prints the shortest path from the source to dest. * * dijkstra(int) MUST be run at least once BEFORE * this is called */ void printPath(int dest, int depth) { int i, j; printf("-%d\n", dest); for (i = 1; i <= prev[dest][0]; ++i) { for (j = 0; j <= depth; ++j) printf(" |"); printPath(prev[dest][i], depth + 1); } } void dijkstra(int s) { int i, k, mini; int visited[GRAPHSIZE]; for (i = 1; i <= n; ++i) { d[i] = INFINITY; prev[i][0] = 0; /* no path has yet been found to i */ visited[i] = 0; /* the i-th element has not yet been visited */ } d[s] = 0; for (k = 1; k <= n; ++k) { mini = -1; for (i = 1; i <= n; ++i) if (!visited[i] && ((mini == -1) || (d[i] < d[mini]))) mini = i; visited[mini] = 1; for (i = 1; i <= n; ++i) if (dist[mini][i]) { if (d[mini] + dist[mini][i] < d[i]) { /* a shorter path has been found */ d[i] = d[mini] + dist[mini][i]; prev[i][0] = 1; prev[i][1] = mini; } else if (d[mini] + dist[mini][i] == d[i]) { /* a path of the same length has been found */ ++prev[i][0]; prev[i][prev[i][0]] = mini; } } } } int main(int argc, char *argv[]) { int i, j; int u, v, w; FILE *fin = fopen("dist.txt", "r"); fscanf(fin, "%d", &e); for (i = 0; i < e; ++i) for (j = 0; j < e; ++j) dist[i][j] = 0; n = -1; for (i = 0; i < e; ++i) { fscanf(fin, "%d%d%d", &u, &v, &w); dist[u][v] = w; n = MAX(u, MAX(v, n)); } fclose(fin); dijkstra(1); printD(); printf("\n"); for (i = 1; i <= n; ++i) { printf("Path to %d:\n", i); printPath(i, 0); printf("\n"); } return 0; } [/sourcecode] And here's an input file: dist.txt.
10
1 2 5
1 4 3
2 3 1
2 4 3
3 5 6
4 2 2
4 3 9
4 5 2
5 1 7
5 3 4

The input file describes this graph:
djAll.png

As you can see, there are two paths from node 1 to node 3: 1 -> 2 -> 3 and 1 -> 4 -> 2 -> 3 both of length 6.

Now, what does the programme output?
Distances:
1 2 3 4 5
0 5 6 3 5

Path to 1:
-1

Path to 2:
-2
|-1
|-4
| |-1

Path to 3:
-3
|-2
| |-1
| |-4
| | |-1

Path to 4:
-4
|-1

Path to 5:
-5
|-4
| |-1

It first outputs the distances, and… yes! They’re correct.

Next, it prints those ASCII art drawings. They not drawings. They’re trees with the destination as root and the leafs as the source. To read a path from such a tree, start at a leaf (always 1) and go left, reading the first numbers you can see above.

Let’s find the paths to node 3. There are two leafs, so there are two paths of minimal length. The first one is 1 -> 4 -> 2 -> 3. The second one is 1 -> 2 -> 3. Check on the graph.

That’s it. If you’re up to a challenge, implement prev as an array of linked lists.

Good luck. Always open to comments.

In this article, I describe a simple (adds less than 1min of work) way to speed up Dijkstra’s Algorithm for finding the single source shortest path to every node in a graph.

In a previous post I described the simple O(n2) implementation of the algorithm. Here, I focus on a method that will probably speed up the algorithm.

Why Bother

The previous implementation of the algorithm ran in O(n2) time, where n is the number of nodes in the graph. This means that for a graph of, say 100 nodes, it would do about 100 * 100 = 100000 calculations. Considering that computers nowadays are said to be able to do about 100000000 (a hundred million) calculations per second, we’re fine, and the programme will finish in well under a second. But what if we have a graph with 100000 nodes? This might take 100 seconds to run. Now we’re in trouble. We need a faster algorithm.

The two most common ways to speed up Dijkstra’s Algorithm are to implement the finding of the closest node not yet visited as priority queues. Usually heaps or Fibonacci Heaps are used for this purpose (Fibonacci Heaps were actually invented for this).

Heaps are somewhat difficult to implement and Fibonacci Heaps are horror to implement. Incidentally, there’s a very easy of speeding it up.

Just Use Queues

The idea is to simply use queues instead of priority queues. This way provides nowhere near the same level of speedup (the algorithm is still O(n2)), but it makes it run faster, on average, by a factor of 4.

Some bad news: a carefully crafted graph could slow this algorithm down to O(n3). As a rule, graphs in real life are never like this, and, as the method isn’t widely known, test sets for contests are not written to catch this optimisation.

Now for the good news: it’s shockingly easy to write. Compare the old dijkstra1() with the new dijkstra2().

void dijkstra1(int s) {
	int i, k, mini;
	int visited[GRAPHSIZE];

	for (i = 1; i <= n; ++i) {
		d1&#91;i&#93; = INFINITY;
		visited&#91;i&#93; = 0; /* the i-th element has not yet been visited */
	}

	d1&#91;s&#93; = 0;

	for (k = 1; k <= n; ++k) {
		mini = -1;
		for (i = 1; i <= n; ++i)
			if (!visited&#91;i&#93; && ((mini == -1) || (d1&#91;i&#93; < d1&#91;mini&#93;)))
				mini = i;

		visited&#91;mini&#93; = 1;

		for (i = 1; i <= n; ++i)
			if (dist&#91;mini&#93;&#91;i&#93;)
				if (d1&#91;mini&#93; + dist&#91;mini&#93;&#91;i&#93; < d1&#91;i&#93;) 
					d1&#91;i&#93; = d1&#91;mini&#93; + dist&#91;mini&#93;&#91;i&#93;;
	}
}

void dijkstra2(int s) {
	int queue&#91;GRAPHSIZE&#93;;
	char inQueue&#91;GRAPHSIZE&#93;;
	int begq = 0,
	    endq = 0;
	int i, mini;
	int visited&#91;GRAPHSIZE&#93;;

	for (i = 1; i <= n; ++i) {
		d2&#91;i&#93; = INFINITY;
		visited&#91;i&#93; = 0; /* the i-th element has not yet been visited */
		inQueue&#91;i&#93; = 0;
	}

	d2&#91;s&#93; = 0;
	queue&#91;endq&#93; = s;
	endq = (endq + 1) % GRAPHSIZE;


	while (begq != endq) {
		mini = queue&#91;begq&#93;;
		begq = (begq + 1) % GRAPHSIZE;
		inQueue&#91;mini&#93; = 0;

		visited&#91;mini&#93; = 1;

		for (i = 1; i <= n; ++i)
			if (dist&#91;mini&#93;&#91;i&#93;)
				if (d2&#91;mini&#93; + dist&#91;mini&#93;&#91;i&#93; < d2&#91;i&#93;) { 
					d2&#91;i&#93; = d2&#91;mini&#93; + dist&#91;mini&#93;&#91;i&#93;;
					if (!inQueue&#91;i&#93;) {
						queue&#91;endq&#93; = i;
						endq = (endq + 1) % GRAPHSIZE;
						inQueue&#91;i&#93; = 1;
					}
				}
	}
}
&#91;/sourcecode&#93;

What's changed? First, we <strong>define several new variables</strong>. These, together, make up the queue:

int queue[GRAPHSIZE];
char inQueue[GRAPHSIZE];
int begq = 0,
    endq = 0;

Next, during the initialisation part of the function, we mark all nodes as not being in the queue.

for (i = 1; i <= n; ++i) {
	/* OTHER INITIALISATIONS (look at the programme) */
	inQueue&#91;i&#93; = 0;
}
&#91;/sourcecode&#93;

Now, add the source node to the queue.
&#91;sourcecode language='cpp'&#93;
queue&#91;endq&#93; = s;
endq = (endq + 1) % GRAPHSIZE;
&#91;/sourcecode&#93;

What does this do? The first line add <strong>s</strong> to the end of the queue. The second line moves the end of the queue one step to the right (I'll explain a few paragraphs down). The modulo operation here is not really necessary, but I like to be consistent.

At this point we'll start looping. When do we stop? The idea here is that a node is in the queue when its neighbours need to be updated (i.e. when a new shortest path might be found leading to them). So, we stop when the queue is empty. Note that this occurs when <strong>begq == endq</strong> and <strong>not</strong> when <strong>!(begq &lt; endq)</strong>. So, <strong>while (begq &lt; endq)</strong> is incorrect because, in one case, <strong>begq</strong> will be greater then <strong>endq</strong>.

What was the first thing we did in the loop? We were supposed to find the closest node not yet visited. Now, we merely take the first node from the queue.

mini = queue[begq];
begq = (begq + 1) % GRAPHSIZE;
inQueue[mini] = 0;

Here, the first element is pop’d out of the queue, the head of the queue is moved one step to the right and the element is marked as not being in the queue. The problem with queues in general, and this one in particular is that the part of them that actually hold the elements tends to move around. Here, every insert moves the tail one step to the right and every pop moves the head one step to the right. Consider the following sequence of operations:

queue.png

Moves 1 through 8 clearly show that, while the size of the information content of the queue changes erratically, it constantly moves to the right. What happened at 9? The queue got to the end of available memory and wrapped around to the beginning. This is the purpose of (begq + 1) % GRAPHSIZE and (endq + 1) % GRAPHSIZE. It turns 7, 8, 9, etc. into 1, 2, 3, etc. But won’t endq overrun begq? No, the use of inQueue guarantees that no element will be inserted in the queue more than once. And as the queue is of size GRAPHSIZE, no overrun is possible.

So far, so good. One last modification: when we update the distance to a node, we add it to the queue (if it’s not already in it).

for (i = 1; i <= n; ++i) if (dist[mini][i]) if (d2[mini] + dist[mini][i] < d2[i]) { d2[i] = d2[mini] + dist[mini][i]; if (!inQueue[i]) { queue[endq] = i; endq = (endq + 1) % GRAPHSIZE; inQueue[i] = 1; } } [/sourcecode]

Comparing the speed

When I first wrote this, I wanted to be able to check that it outputs correct results and I wanted to see how much faster it is. The following programme does both. The function cmpd() checks the output against that given by the simple implementation and the various clock() calls littered through the code time the two functions.

Here’s the code in C (dijkstra2.c):
Note: Source might be mangled by WordPress, consider downloading the file.

#include
#include

#define GRAPHSIZE 2048
#define INFINITY GRAPHSIZE*GRAPHSIZE
#define MAX(a, b) ((a > b) ? (a) : (b))

int e; /* The number of nonzero edges in the graph */
int n; /* The number of nodes in the graph */
long dist[GRAPHSIZE][GRAPHSIZE]; /* dist[i][j] is the distance between node i and j; or 0 if there is no direct connection */
long d1[GRAPHSIZE], d2[GRAPHSIZE]; /* d[i] is the length of the shortest path between the source (s) and node i */

void dijkstra1(int s) {
int i, k, mini;
int visited[GRAPHSIZE];

for (i = 1; i <= n; ++i) { d1[i] = INFINITY; visited[i] = 0; /* the i-th element has not yet been visited */ } d1[s] = 0; for (k = 1; k <= n; ++k) { mini = -1; for (i = 1; i <= n; ++i) if (!visited[i] && ((mini == -1) || (d1[i] < d1[mini]))) mini = i; visited[mini] = 1; for (i = 1; i <= n; ++i) if (dist[mini][i]) if (d1[mini] + dist[mini][i] < d1[i]) d1[i] = d1[mini] + dist[mini][i]; } } void dijkstra2(int s) { int queue[GRAPHSIZE]; char inQueue[GRAPHSIZE]; int begq = 0, endq = 0; int i, mini; int visited[GRAPHSIZE]; for (i = 1; i <= n; ++i) { d2[i] = INFINITY; visited[i] = 0; /* the i-th element has not yet been visited */ inQueue[i] = 0; } d2[s] = 0; queue[endq] = s; endq = (endq + 1) % GRAPHSIZE; while (begq != endq) { mini = queue[begq]; begq = (begq + 1) % GRAPHSIZE; inQueue[mini] = 0; visited[mini] = 1; for (i = 1; i <= n; ++i) if (dist[mini][i]) if (d2[mini] + dist[mini][i] < d2[i]) { d2[i] = d2[mini] + dist[mini][i]; if (!inQueue[i]) { queue[endq] = i; endq = (endq + 1) % GRAPHSIZE; inQueue[i] = 1; } } } } int cmpd() { int i; for (i = 0; i < n; ++i) if (d1[i] != d2[i]) return 0; return 1; } int main(int argc, char *argv[]) { int i, j; int u, v, w; long t1 = 0, t2 = 0; FILE *fin = fopen("dist2.txt", "r"); fscanf(fin, "%d", &e); for (i = 0; i < e; ++i) for (j = 0; j < e; ++j) dist[i][j] = 0; n = -1; for (i = 0; i < e; ++i) { fscanf(fin, "%d%d%d", &u, &v, &w); dist[u][v] = w; n = MAX(u, MAX(v, n)); } fclose(fin); for (i = 1; i <= n; ++i) { long aux = clock(); dijkstra1(i); t1 += clock() - aux; aux = clock(); dijkstra2(i); t2 += clock() - aux; if (i % 10 == 0) { printf("%d / %d\n", i, n); fflush(stdout); } if (!cmpd()) { printf("\nResults for %d do NOT match\n", i); break; } } printf("\n"); printf("Dijkstra O(N^2):\t\t%ld\n", t1); printf("Dijkstra unstable:\t\t%ld\n", t2); printf("Ratio:\t\t\t\t%.2f\n", (float)t1/t2); /* printD(); */ return 0; } [/sourcecode] And here's a big 1200 node graph: dist2.txt.

Have fun and good luck. Always open to comments.

In this article I describe Dijkstra’s algorithm for finding the shortest path from one source to all the other vertexes in a graph. Afterwards, I provide the source code in C of a simple implementation.

To understand this you should know what a graph is, and how to store one in memory. If in doubt check this and this.

Another solution for this problem is the Bellman-Ford algorithm.

The Problem

Given the following graph calculate the length of the shortest path from node 1 to every other node.
dj1.png

Lets take the nodes 1 and 3. There are several paths (1 -> 4 -> 3, 1 -> 2 -> 3, etc.), but the shortest of them is 1 -> 4 -> 2 -> 3 of length 9. Our job is to find it.

The Algorithm

Dijkstra’s algorithm is one of the most common solutions to this problem. Even so, it only works on graphs which have no edges of negative weight, and the actual speed of the algorithm can vary from O(n*lg(lg(n))) to O(n2).

The idea is somewhat simple:

Take the length of the shortest path to all nodes to be infinity. Mark the length of the shortest path to the source as 0.

dj2.png

Now, we already know that the graph has no edges of negative weight so the a path of length 0 is the best we can come up with. The path to the source is 0, so it’s optimal.

This algorithm works by making the paths to one more node optimal at each step. So, at the kth step, you know for sure that there are at least k nodes to which you know the shortest path.

At each step, choose the node, which is not yet optimal, but which is closest to the source; i.e. the node to which the current calculated shortest path is smallest. Then, from it, try to optimise the path to every node connected to it. Finally, mark the said node as optimal (visited, if you prefer). In the previous example, the node which is closest to the source and is not yet optimal is the source. From it, you can optimise the path to nodes 2 and 4.
dj3_1.png

At this point, the only visited/optimal node is 0. Now we have to redo this step 4 more times (to ensure that all nodes are optimal).

The next node to consider is 4:
dj4.png

It’s worthwhile to note that at this step, we’ve also found a better path to node 2.
Next is node 2:
dj5.png

Finally, we look at nodes 5 and 3 (none of which offer any optimisations):
dj5.png

dj5.png

The actual code in C looks something like this:

void dijkstra(int s) {
        int i, k, mini;
        int visited[GRAPHSIZE];

        for (i = 1; i <= n; ++i) {
                d&#91;i&#93; = INFINITY;
                visited&#91;i&#93; = 0; /* the i-th element has not yet been visited */
        }

        d&#91;s&#93; = 0;

        for (k = 1; k <= n; ++k) {
                mini = -1;
                for (i = 1; i <= n; ++i)
                        if (!visited&#91;i&#93; && ((mini == -1) || (d&#91;i&#93; < d&#91;mini&#93;)))
                                mini = i;

                visited&#91;mini&#93; = 1;

                for (i = 1; i <= n; ++i)
                        if (dist&#91;mini&#93;&#91;i&#93;)
                                if (d&#91;mini&#93; + dist&#91;mini&#93;&#91;i&#93; < d&#91;i&#93;)
                                        d&#91;i&#93; = d&#91;mini&#93; + dist&#91;mini&#93;&#91;i&#93;;
        }
}
&#91;/sourcecode&#93;

<h4>The Programme</h4>
Putting the above into context, we get the <strong>O(n<sup>2</sup>)</strong> implementation. This works well for most graphs (it will <strong>not</strong> work for graphs with negative weight edges), and it's quite fast.

Here's the source code in C (<a href='https://compprog.files.wordpress.com/2007/12/dijkstra.c' title='dijkstra.c'>dijkstra.c</a>):

#include <stdio.h>

#define GRAPHSIZE 2048
#define INFINITY GRAPHSIZE*GRAPHSIZE
#define MAX(a, b) ((a > b) ? (a) : (b))

int e; /* The number of nonzero edges in the graph */
int n; /* The number of nodes in the graph */
long dist[GRAPHSIZE][GRAPHSIZE]; /* dist[i][j] is the distance between node i and j; or 0 if there is no direct connection */
long d[GRAPHSIZE]; /* d[i] is the length of the shortest path between the source (s) and node i */

void printD() {
	int i;
	for (i = 1; i <= n; ++i)
		printf("%10d", i);
	printf("\n");
	for (i = 1; i <= n; ++i) {
		printf("%10ld", d&#91;i&#93;);
	}
	printf("\n");
}

void dijkstra(int s) {
	int i, k, mini;
	int visited&#91;GRAPHSIZE&#93;;

	for (i = 1; i <= n; ++i) {
		d&#91;i&#93; = INFINITY;
		visited&#91;i&#93; = 0; /* the i-th element has not yet been visited */
	}

	d&#91;s&#93; = 0;

	for (k = 1; k <= n; ++k) {
		mini = -1;
		for (i = 1; i <= n; ++i)
			if (!visited&#91;i&#93; && ((mini == -1) || (d&#91;i&#93; < d&#91;mini&#93;)))
				mini = i;

		visited&#91;mini&#93; = 1;

		for (i = 1; i <= n; ++i)
			if (dist&#91;mini&#93;&#91;i&#93;)
				if (d&#91;mini&#93; + dist&#91;mini&#93;&#91;i&#93; < d&#91;i&#93;) 
					d&#91;i&#93; = d&#91;mini&#93; + dist&#91;mini&#93;&#91;i&#93;;
	}
}

int main(int argc, char *argv&#91;&#93;) {
	int i, j;
	int u, v, w;

	FILE *fin = fopen("dist.txt", "r");
	fscanf(fin, "%d", &e);
	for (i = 0; i < e; ++i)
		for (j = 0; j < e; ++j)
			dist&#91;i&#93;&#91;j&#93; = 0;
	n = -1;
	for (i = 0; i < e; ++i) {
		fscanf(fin, "%d%d%d", &u, &v, &w);
		dist&#91;u&#93;&#91;v&#93; = w;
		n = MAX(u, MAX(v, n));
	}
	fclose(fin);

	dijkstra(1);

	printD();

	return 0;
}
&#91;/sourcecode&#93;

And here's a sample input file(<a href='https://compprog.files.wordpress.com/2007/12/dist.txt' title='dist.txt'>dist.txt</a>):

<code>10
1 2 10
1 4 5
2 3 1
2 4 3
3 5 6
4 2 2
4 3 9
4 5 2
5 1 7
5 3 4
</code>

The graph is given as an edge list:
<ul>
	<li>the first line contains <em>e</em>, the number of edges</li>
	<li>the following <em>e</em> lines contain <em>3</em> numbers: <em>u</em>, <em>v</em> and <em>w</em> signifying that there's an edge from <em>u</em> to <em>v</em> of weight <em>w</em></li>
</ul>

That's it. Good luck and have fun. Always open to comments.

<h4>Finding the shortest path</h4>
<strong>UPDATE</strong> In response to <strong>campOs</strong>' comment.

Now we know the distance between the source node and any other node (the distance to the ith node is remembered in <strong>d[i]</strong>). But suppose we also need the path (which nodes make up the path).

Look at the above code. Where is <strong>d</strong> modified? Where is the recorded distance between the source and a node modified? In two places:

Firstly, <strong>d[s]</strong> is initialised to be <em>0</em>.

d[s] = 0;

And then, when a new shortest path is found, d[i] is updated accordingly:

for (i = 1; i <= n; ++i)
	if (dist&#91;mini&#93;&#91;i&#93;)
		if (d&#91;mini&#93; + dist&#91;mini&#93;&#91;i&#93; < d&#91;i&#93;) 
			d&#91;i&#93; = d&#91;mini&#93; + dist&#91;mini&#93;&#91;i&#93;;
&#91;/sourcecode&#93;

The important thing to notice here is that <strong>when you update the shortest distance to node i, you know the previous node in the path to i</strong>. This is, of course, <strong>mini</strong>. This suggests the solution to our problem.

For every node <strong>i</strong> other than the source, remember not only the distance to it, but also the previous node in the path to it. Thus we have a new array, <strong>prev</strong>.

Now, we need to make to modifications.
First, we initialise the value of <strong>prev[i]</strong> to something impossible (say <em>-1</em>) at the start of <strong>dijkstra()</strong>.

for (i = 1; i <= n; ++i) {
	d&#91;i&#93; = INFINITY;
	prev&#91;i&#93; = -1; /* no path has yet been found to i */
	visited&#91;i&#93; = 0; /* the i-th element has not yet been visited */
}
&#91;/sourcecode&#93;

Secondly, we update the value of <strong>prev[i]</strong> every time a new shortest path is found to i.

for (i = 1; i <= n; ++i)
	if (dist&#91;mini&#93;&#91;i&#93;)
		if (d&#91;mini&#93; + dist&#91;mini&#93;&#91;i&#93; < d&#91;i&#93;) {
			d&#91;i&#93; = d&#91;mini&#93; + dist&#91;mini&#93;&#91;i&#93;;
			prev&#91;i&#93; = mini;
		}
&#91;/sourcecode&#93;

Good. For every node reachable from the source we know which node is just before it in the shortest path. For the above example, we would have the following array:
<code>i - prev[i]
1 - -1
2 - 4
3 - 2
4 - 1
5 - 4
</code>

Using this, how do you get the path? Let's say you want to get to <em>3</em>. Which node comes right before <em>3</em>? Node <em>2</em>. Which node comes right before node <em>2</em>? Node <em>4</em>. Which node comes before <em>4</em>? Node <em>1</em>. We've reached the source, so we're done. Go through this list backwards and you get the path: <em>1 -&gt; 4 -&gt; 2 -&gt; 3</em>. This is easily implemented with <a href="http://en.wikipedia.org/wiki/Recursion">recursion</a>.


void printPath(int dest) {
	if (prev[dest] != -1)
		printPath(prev[dest]);
	printf("%d ", dest);
}

Here is the updated source: dijkstraWithPath.c.

Good luck.

In this article, I describe the Bellman-Ford algorithm for finding the one-source shortest paths in a graph, give an informal proof and provide the source code in C for a simple implementation.

To understand this you should know what a graph is, and how to store one in memory. If in doubt check this and this.

Another solution to this problem is Dijkstra’s algorithm.

The Problem

Given the following graph, calculate the length of the shortest path from node 1 to node 2.
bf1.png

It’s obvious that there’s a direct route of length 6, but take a look at path: 1 -> 4 -> 3 -> 2. The length of the path is 7 – 3 – 2 = 2, which is less than 6. BTW, you don’t need negative edge weights to get such a situation, but they do clarify the problem.

This also suggests a property of shortest path algorithms: to find the shortest path form x to y, you need to know, beforehand, the shortest paths to y‘s neighbours. For this, you need to know the paths to y‘s neighbours’ neighbours… In the end, you must calculate the shortest path to the connected component of the graph in which x and y are found.

That said, you usually calculate the shortest path to all nodes and then pick the ones you’re intrested in.

The Algorithm

The Bellman-Ford algorithm is one of the classic solutions to this problem. It calculates the shortest path to all nodes in the graph from a single source.

The basic idea is simple:
Start by considering that the shortest path to all nodes, less the source, is infinity. Mark the length of the path to the source as 0:
bf2.png

Take every edge and try to relax it:
bf3.png

Relaxing an edge means checking to see if the path to the node the edge is pointing to can’t be shortened, and if so, doing it. In the above graph, by checking the edge 1 -> 2 of length 6, you find that the length of the shortest path to node 1 plus the length of the edge 1 -> 2 is less then infinity. So, you replace infinity in node 2 with 6. The same can be said for edge 1 -> 4 of length 7. It’s also worth noting that, practically, you can’t relax the edges whose start has the shortest path of length infinity to it.

Now, you apply the previous step n – 1 times, where n is the number of nodes in the graph. In this example, you have to apply it 4 times (that’s 3 more times).
bf4.png

bf5.png

bf6.png

That’s it, here’s the algorithm in a condensed form:

void bellman_ford(int s) {
int i, j;

for (i = 0; i < n; ++i) d[i] = INFINITY; d[s] = 0; for (i = 0; i < n - 1; ++i) for (j = 0; j < e; ++j) if (d[edges[j].u] + edges[j].w < d[edges[j].v]) d[edges[j].v] = d[edges[j].u] + edges[j].w; } [/sourcecode] Here, d[i] is the shortest path to node i, e is the number of edges and edges[i] is the i-th edge.

It may not be obvious why this works, but take a look at what is certain after each step. After the first step, any path made up of at most 2 nodes will be optimal. After the step 2, any path made up of at most 3 nodes will be optimal… After the (n – 1)-th step, any path made up of at most n nodes will be optimal.

The Programme

The following programme just puts the bellman_ford function into context. It runs in O(VE) time, so for the example graph it will do something on the lines of 5 * 9 = 45 relaxations. Keep in mind that this algorithm works quite well on graphs with few edges, but is very slow for dense graphs (graphs with almost n2 edges). For graphs with lots of edges, you’re better off with Dijkstra’s algorithm.

Here’s the source code in C (bellmanford.c):

#include

typedef struct {
int u, v, w;
} Edge;

int n; /* the number of nodes */
int e; /* the number of edges */
Edge edges[1024]; /* large enough for n <= 2^5=32 */ int d[32]; /* d[i] is the minimum distance from node s to node i */ #define INFINITY 10000 void printDist() { int i; printf("Distances:\n"); for (i = 0; i < n; ++i) printf("to %d\t", i + 1); printf("\n"); for (i = 0; i < n; ++i) printf("%d\t", d[i]); printf("\n\n"); } void bellman_ford(int s) { int i, j; for (i = 0; i < n; ++i) d[i] = INFINITY; d[s] = 0; for (i = 0; i < n - 1; ++i) for (j = 0; j < e; ++j) if (d[edges[j].u] + edges[j].w < d[edges[j].v]) d[edges[j].v] = d[edges[j].u] + edges[j].w; } int main(int argc, char *argv[]) { int i, j; int w; FILE *fin = fopen("dist.txt", "r"); fscanf(fin, "%d", &n); e = 0; for (i = 0; i < n; ++i) for (j = 0; j < n; ++j) { fscanf(fin, "%d", &w); if (w != 0) { edges[e].u = i; edges[e].v = j; edges[e].w = w; ++e; } } fclose(fin); /* printDist(); */ bellman_ford(0); printDist(); return 0; } [/sourcecode] And here's the input file used in the example (dist.txt):
5
0 6 0 7 0
0 0 5 8 -4
0 -2 0 0 0
0 0 -3 9 0
2 0 7 0 0

That’s an adjacency matrix.

That’s it. Have fun. Always open to comments.

In this article, I describe the greedy algorithm for solving the Fractional Knapsack Problem and give an implementation in C.

The Problem

The Fractional Knapsack Problem usually sounds like this:

Ted Thief has just broken into the Fort Knox! He sees himself in a room with n piles of gold dust. Because the each pile has a different purity, each pile also has a different value (v[i]) and a different weight (c[i]). Ted has a knapsack that can only hold W kilograms.

Given n, v, c and W, calculate which piles Ted should completely put into his knapsack and which he should put only a fraction of.

So, for this input:
n = 5
c = {12, 1, 2, 1, 4}
v = {4, 2, 2, 1, 10}
W = 15

Ted should take piles 2, 3, 4 and 5 completely and about 58% of pile 1.

You’re usually dealling with a knapsack problem when you’re give the cost and the benefits of certain objects and asked to obtain the maximum benefit so that the sum of the costs is smaller than a given value. You’ve got the fractional knapsack problem when you can take fractions (as opposed to all or nothing) of the objects.

The Algorithm

This is a standard greedy algorithm. In fact, it’s one of the classic examples.

The idea is to calculate for each object the ratio of value/cost, and sort them according to this ratio. Then you take the objects with the highest ratios and add them until you can’t add the next object as whole. Finally add as much as you can of the next object.

So, for our example:
v = {4, 2, 2, 1, 10}
c = {12, 1, 2, 1, 4}
r = {1/3, 2, 1, 1, 5/2}

From this it’s obvious that you should add the objects: 5, 2, 3, 4 and then as much as possible of 1.
The output of my programme is this:
Added object 5 (10$, 4Kg) completly in the bag. Space left: 11.
Added object 2 (2$, 1Kg) completly in the bag. Space left: 10.
Added object 3 (2$, 2Kg) completly in the bag. Space left: 8.
Added object 4 (1$, 1Kg) completly in the bag. Space left: 7.
Added 58% (4$, 12Kg) of object 1 in the bag.
Filled the bag with objects worth 15.48$.

The Programme

Now, you could implement the algorithm as stated, but for practical reasons you may wish to trade speed for simplicity. That’s what I’ve done here: instead of sorting the objects, I simply go through them every time searching for the best ratio. This modification turns an O(n*lg(n)) algorithm into an O(n2) one. For small values of n, this doesn’t matter and n is usually small.

Here’s the code in C (fractional_knapsack.c):

#include <stdio.h>

int n = 5; /* The number of objects */
int c[10] = {12, 1, 2, 1, 4}; /* c[i] is the *COST* of the ith object; i.e. what
				YOU PAY to take the object */
int v[10] = {4, 2, 2, 1, 10}; /* v[i] is the *VALUE* of the ith object; i.e.
				what YOU GET for taking the object */
int W = 15; /* The maximum weight you can take */

void simple_fill() {
	int cur_w;
	float tot_v;
	int i, maxi;
	int used[10];

	for (i = 0; i < n; ++i)
		used&#91;i&#93; = 0; /* I have not used the ith object yet */

	cur_w = W;
	while (cur_w > 0) { /* while there's still room*/
		/* Find the best object */
		maxi = -1;
		for (i = 0; i < n; ++i)
			if ((used&#91;i&#93; == 0) &&
				((maxi == -1) || ((float)v&#91;i&#93;/c&#91;i&#93; > (float)v[maxi]/c[maxi])))
				maxi = i;

		used[maxi] = 1; /* mark the maxi-th object as used */
		cur_w -= c[maxi]; /* with the object in the bag, I can carry less */
		tot_v += v[maxi];
		if (cur_w >= 0)
			printf("Added object %d (%d$, %dKg) completly in the bag. Space left: %d.\n", maxi + 1, v[maxi], c[maxi], cur_w);
		else {
			printf("Added %d%% (%d$, %dKg) of object %d in the bag.\n", (int)((1 + (float)cur_w/c[maxi]) * 100), v[maxi], c[maxi], maxi + 1);
			tot_v -= v[maxi];
			tot_v += (1 + (float)cur_w/c[maxi]) * v[maxi];
		}
	}

	printf("Filled the bag with objects worth %.2f$.\n", tot_v);
}

int main(int argc, char *argv[]) {
	simple_fill();

	return 0;
}

That’s it. Good luck.

Always open to comments.

Update: The next article in this series is The 0-1 Knapsack Problem.

In this article I describe the Floyd-Warshall algorithm for finding the shortest path between all nodes in a graph. I give an informal proof and provide an implementation in C.

Shortest paths

The shortest path between two nodes of a graph is a sequence of connected nodes so that the sum of the edges that inter-connect them is minimal.

Take this graph,
p2.png

There are several paths between A and E:
Path 1: A -> B -> E 20
Path 2: A -> D -> E 25
Path 3: A -> B -> D -> E 35
Path 4: A -> D -> B -> E 20

There are several things to notice here:

  1. There can be more then one route between two nodes
  2. The number of nodes in the route isn’t important (Path 4 has 4 nodes but is shorter than Path 2, which has 3 nodes)
  3. There can be more than one path of minimal length

Something else that should be obvious from the graph is that any path worth considering is simple. That is, you only go through each node once.

Unfortunately, this is not always the case. The problem appears when you allow negative weight edges. This isn’t by itself bad. But if a loop of negative weight appears, then there is no shortest path. Look at this example:
A graph containing a negative weight loop

Look at the path B -> E -> D -> B. This is a loop, because the starting node is the also the end. What’s the cost? It’s 10 – 20 + 5 = -5. This means that adding this loop to a path once lowers the cost of the path by 5. Adding it twice would lower the cost by 2 * 5 = 10. So, whatever shortest path you may have come up with, you can make it smaller by going through the loop one more time. BTW there’s no problem with a negative cost path.

The Floyd-Warshall Algorithm

This algorithm calculates the length of the shortest path between all nodes of a graph in O(V3) time. Note that it doesn’t actually find the paths, only their lengths.

Let’s say you have the adjacency matrix of a graph. Assuming no loop of negative values, at this point you have the minimum distance between any two nodes which are connected by an edge.
A B C D E
A 0 10 0 5 0
B 10 0 5 5 10
C 0 5 0 0 0
D 5 5 0 0 20
E 0 10 0 20 0

The graph is the one shown above (the first one).

The idea is to try to interspace A between any two nodes in hopes of finding a shorter path.
A B C D E
A 0 10 0 5 0
B 10 0 5 5 10
C 0 5 0 0 0
D 5 5 0 0 20
E 0 10 0 20 0

Then try to interspace B between any two nodes:
A B C D E
A 0 10 15 5 20
B 10 0 5 5 10
C 15 5 0 10 15
D 5 5 10 0 15
E 20 10 15 15 0

Do the same for C:
A B C D E
A 0 10 15 5 20
B 10 0 5 5 10
C 15 5 0 10 15
D 5 5 10 0 15
E 20 10 15 15 0

Do the same for D:
A B C D E
A 0 10 15 5 20
B 10 0 5 5 10
C 15 5 0 10 15
D 5 5 10 0 15
E 20 10 15 15 0

And for E:
A B C D E
A 0 10 15 5 20
B 10 0 5 5 10
C 15 5 0 10 15
D 5 5 10 0 15
E 20 10 15 15 0

This is the actual algorithm:

# dist(i,j) is "best" distance so far from vertex i to vertex j 

# Start with all single edge paths.
 For i = 1 to n do
     For j = 1 to n do
         dist(i,j) = weight(i,j) 

 For k = 1 to n do # k is the `intermediate' vertex
     For i = 1 to n do
         For j = 1 to n do
             if (dist(i,k) + dist(k,j) < dist(i,j)) then # shorter path?
                 dist(i,j) = dist(i,k) + dist(k,j)

The Programme

Here’s the code in C(floyd_warshall.c):

#include

int n; /* Then number of nodes */
int dist[16][16]; /* dist[i][j] is the length of the edge between i and j if
it exists, or 0 if it does not */

void printDist() {
int i, j;
printf(” “);
for (i = 0; i < n; ++i) printf("%4c", 'A' + i); printf("\n"); for (i = 0; i < n; ++i) { printf("%4c", 'A' + i); for (j = 0; j < n; ++j) printf("%4d", dist[i][j]); printf("\n"); } printf("\n"); } /* floyd_warshall() after calling this function dist[i][j] will the the minimum distance between i and j if it exists (i.e. if there's a path between i and j) or 0, otherwise */ void floyd_warshall() { int i, j, k; for (k = 0; k < n; ++k) { printDist(); for (i = 0; i < n; ++i) for (j = 0; j < n; ++j) /* If i and j are different nodes and if the paths between i and k and between k and j exist, do */ if ((dist[i][k] * dist[k][j] != 0) && (i != j)) /* See if you can't get a shorter path between i and j by interspacing k somewhere along the current path */ if ((dist[i][k] + dist[k][j] < dist[i][j]) || (dist[i][j] == 0)) dist[i][j] = dist[i][k] + dist[k][j]; } printDist(); } int main(int argc, char *argv[]) { FILE *fin = fopen("dist.txt", "r"); fscanf(fin, "%d", &n); int i, j; for (i = 0; i < n; ++i) for (j = 0; j < n; ++j) fscanf(fin, "%d", &dist[i][j]); fclose(fin); floyd_warshall(); return 0; } [/sourcecode] Note that of the above programme, all the work is done by only five lines (30-48). That's it. Good luck. Always open to comments.