In this article I describe a way of modifying Dijkstra’s Algorithm in order to find all the shortest paths from a source to a node.

This article assumes you know how Dijkstra’s Algorithm works. If you don’t, see my previous post or the Wikipedia article.

#### The Problem

You know how to use Dijkstra’s algorithm to find the length of the shortest path to a node. You’ve even figured out how to record the path to each node. But what you really need are all the shortest paths leading to a node.

#### The Idea

I can help, but to be honest, this is obvious.

In order to record the path to each node, I used an array to record which node comes before each other node in the shortest path. That is to say: prev[i] was the node that comes just before node i in the shortest path from the source to node i.

To record all the shortest paths that lead to a node, I just turned prev into a matrix with the following meaning: prev[i][0] is the number of nodes that could come before node i on a path of minimum length; prev[i][1..] are the nodes that could come before node i on a path of minimum length.
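As a minimal sketch of that bookkeeping, this is what has to happen every time a path to node i through some node `via` is examined. The names `MAXN`, `prev_nodes` and `record_pred` are made up for this illustration and do not appear in the programme below.

```c
#include <assert.h>

#define MAXN 16 /* hypothetical bound, just for this sketch */

/* prev_nodes[i][0] is the number of predecessors recorded for node i;
   prev_nodes[i][1..] are those predecessors */
int prev_nodes[MAXN][MAXN + 1];

/* A path to node i that arrives through node via has length len;
   best[i] is the shortest length known so far.
   Returns the number of predecessors recorded for i afterwards. */
int record_pred(long best[], int i, int via, long len) {
    assert(i >= 0 && i < MAXN && via >= 0 && via < MAXN);
    if (len < best[i]) {
        /* strictly shorter: forget the old predecessors */
        best[i] = len;
        prev_nodes[i][0] = 1;
        prev_nodes[i][1] = via;
    } else if (len == best[i]) {
        /* a tie: keep the old predecessors and add this one */
        assert(prev_nodes[i][0] < MAXN);
        ++prev_nodes[i][0];
        prev_nodes[i][prev_nodes[i][0]] = via;
    }
    return prev_nodes[i][0];
}
```

The only difference from recording a single path is the second branch: a tie no longer gets ignored.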

#### The Programme

Here’s the code in C (dijkstraAll.c):

```
#include <stdio.h>

#define GRAPHSIZE 2048
#define INFINITY (GRAPHSIZE * GRAPHSIZE)
#define MAX(a, b) (((a) > (b)) ? (a) : (b))

int e; /* The number of nonzero edges in the graph */
int n; /* The number of nodes in the graph */
long dist[GRAPHSIZE][GRAPHSIZE]; /* dist[i][j] is the distance between node i and j; or 0 if there is no direct connection */
long d[GRAPHSIZE]; /* d[i] is the length of the shortest path between the source (s) and node i */
int prev[GRAPHSIZE][GRAPHSIZE + 1]; /* prev[i] holds the nodes that could come right before i in the shortest path from the source to i;
                                       prev[i][0] is the number of nodes and prev[i][1..] are the nodes */

void printD() {
    int i;

    printf("Distances:\n");
    for (i = 1; i <= n; ++i)
        printf("%10d", i);
    printf("\n");
    for (i = 1; i <= n; ++i) {
        printf("%10ld", d[i]);
    }
    printf("\n");
}

/*
 * Prints the shortest paths from the source to dest.
 *
 * dijkstra(int) MUST be run at least once BEFORE
 * this is called
 */
void printPath(int dest, int depth) {
    int i, j;

    printf("-%d\n", dest);
    for (i = 1; i <= prev[dest][0]; ++i) {
        for (j = 0; j <= depth; ++j)
            printf(" |");
        printPath(prev[dest][i], depth + 1);
    }
}

void dijkstra(int s) {
    int i, k, mini;
    int visited[GRAPHSIZE];

    for (i = 1; i <= n; ++i) {
        d[i] = INFINITY;
        prev[i][0] = 0; /* no path has yet been found to i */
        visited[i] = 0; /* the i-th element has not yet been visited */
    }

    d[s] = 0;

    for (k = 1; k <= n; ++k) {
        mini = -1;
        for (i = 1; i <= n; ++i)
            if (!visited[i] && ((mini == -1) || (d[i] < d[mini])))
                mini = i;

        visited[mini] = 1;

        for (i = 1; i <= n; ++i)
            if (dist[mini][i]) {
                if (d[mini] + dist[mini][i] < d[i]) { /* a shorter path has been found */
                    d[i] = d[mini] + dist[mini][i];
                    prev[i][0] = 1;
                    prev[i][1] = mini;
                } else if (d[mini] + dist[mini][i] == d[i]) { /* a path of the same length has been found */
                    ++prev[i][0];
                    prev[i][prev[i][0]] = mini;
                }
            }
    }
}

int main(int argc, char *argv[]) {
    int i, j;
    int u, v, w;

    FILE *fin = fopen("dist.txt", "r");
    fscanf(fin, "%d", &e);
    /* clear the whole matrix, not just the first e rows and columns */
    for (i = 0; i < GRAPHSIZE; ++i)
        for (j = 0; j < GRAPHSIZE; ++j)
            dist[i][j] = 0;
    n = -1;
    for (i = 0; i < e; ++i) {
        fscanf(fin, "%d%d%d", &u, &v, &w);
        dist[u][v] = w;
        n = MAX(u, MAX(v, n));
    }
    fclose(fin);

    dijkstra(1);

    printD();

    printf("\n");
    for (i = 1; i <= n; ++i) {
        printf("Path to %d:\n", i);
        printPath(i, 0);
        printf("\n");
    }

    return 0;
}
```

And here's an input file: dist.txt.
```
10
1 2 5
1 4 3
2 3 1
2 4 3
3 5 6
4 2 2
4 3 9
4 5 2
5 1 7
5 3 4
```

The input file describes this graph:

As you can see, there are two paths from node 1 to node 3: 1 -> 2 -> 3 and 1 -> 4 -> 2 -> 3 both of length 6.

Now, what does the programme output?
```
Distances:
         1         2         3         4         5
         0         5         6         3         5
```

```
Path to 1:
-1

Path to 2:
-2
 |-1
 |-4
 | |-1

Path to 3:
-3
 |-2
 | |-1
 | |-4
 | | |-1

Path to 4:
-4
 |-1

Path to 5:
-5
 |-4
 | |-1
```

It first outputs the distances, and… yes! They’re correct.

Next, it prints those ASCII art drawings. They're not just drawings. They’re trees with the destination as the root and the source at the leaves. To read a path from such a tree, start at a leaf (always 1) and go left, reading the first numbers you can see above.

Let’s find the paths to node 3. There are two leaves, so there are two paths of minimal length. The first one is 1 -> 4 -> 2 -> 3. The second one is 1 -> 2 -> 3. Check on the graph.
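If reading trees is not your thing, the same matrix can be walked to count the paths, or to print them in the usual left-to-right order. This is only a sketch: `pred`, `MAXN`, `count_paths` and `print_paths` are names I made up for it, and `pred` is assumed to have the same layout as prev in the programme above.

```c
#include <assert.h>
#include <stdio.h>

#define MAXN 16 /* hypothetical bound, just for this sketch */

/* Same layout as prev: pred[i][0] is the number of predecessors of
   node i, pred[i][1..] are the predecessors themselves. */
int pred[MAXN][MAXN + 1];

/* Number of distinct shortest paths from src to dest
   (0 if dest cannot be reached at all). */
long count_paths(int src, int dest) {
    long total = 0;
    int i;
    if (dest == src)
        return 1;
    for (i = 1; i <= pred[dest][0]; ++i)
        total += count_paths(src, pred[dest][i]);
    return total;
}

/* Prints every shortest path from src to dest, one per line.
   stack[0..depth-1] holds the nodes chosen so far, dest first. */
void print_paths(int src, int dest, int stack[], int depth) {
    int i;
    assert(depth < MAXN);
    stack[depth++] = dest;
    if (dest == src) {
        /* the stack was filled backwards, so print it in reverse */
        for (i = depth - 1; i >= 0; --i)
            printf("%d%s", stack[i], i ? " -> " : "\n");
        return;
    }
    for (i = 1; i <= pred[dest][0]; ++i)
        print_paths(src, pred[dest][i], stack, depth);
}
```

With the predecessors of the example graph filled in, `count_paths(1, 3)` returns 2 and `print_paths(1, 3, stack, 0)` prints `1 -> 2 -> 3` followed by `1 -> 4 -> 2 -> 3`.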

That’s it. If you’re up to a challenge, implement prev as an array of linked lists.

Good luck. Always open to comments.

In this article, I describe the Bellman-Ford algorithm for finding the one-source shortest paths in a graph, give an informal proof and provide the source code in C for a simple implementation.

To understand this you should know what a graph is, and how to store one in memory. If in doubt check this and this.

Another solution to this problem is Dijkstra’s algorithm.

#### The Problem

Given the following graph, calculate the length of the shortest path from node 1 to node 2.

It’s obvious that there’s a direct route of length 6, but take a look at path: 1 -> 4 -> 3 -> 2. The length of the path is 7 – 3 – 2 = 2, which is less than 6. BTW, you don’t need negative edge weights to get such a situation, but they do clarify the problem.

This also suggests a property of shortest path algorithms: to find the shortest path from x to y, you need to know, beforehand, the shortest paths to y’s neighbours. For this, you need to know the paths to y’s neighbours’ neighbours… In the end, you must calculate the shortest paths to all the nodes of the connected component of the graph in which x and y are found.

That said, you usually calculate the shortest path to all nodes and then pick the ones you’re interested in.

#### The Algorithm

The Bellman-Ford algorithm is one of the classic solutions to this problem. It calculates the shortest path to all nodes in the graph from a single source.

The basic idea is simple:
Start by considering that the shortest path to all nodes, except the source, is infinity. Mark the length of the path to the source as 0:

Take every edge and try to relax it:

Relaxing an edge means checking whether the path to the node the edge points to can be shortened by going through that edge, and if so, shortening it. In the above graph, by checking the edge 1 -> 2 of length 6, you find that the length of the shortest path to node 1 plus the length of the edge 1 -> 2 is less than infinity. So, you replace infinity in node 2 with 6. The same can be said for edge 1 -> 4 of length 7. It’s also worth noting that, practically, you can’t relax an edge whose start node still has a path of length infinity to it.
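A single relaxation fits in a few lines of C. This is a sketch, not a piece of the programme further down: `relax` and `INF` are names chosen for the illustration, and `INF` is just a number large enough to stand in for infinity.

```c
#include <assert.h>

#define INF 1000000 /* stands in for infinity; an assumption of this sketch */

typedef struct {
    int u, v, w; /* an edge from u to v of weight w */
} Edge;

/* Tries to relax edge e against the distances in d.
   Returns 1 if d[e->v] was lowered, 0 otherwise. */
int relax(const Edge *e, int d[]) {
    assert(e != 0);
    if (d[e->u] == INF)
        return 0; /* the start of the edge hasn't been reached yet */
    if (d[e->u] + e->w < d[e->v]) {
        d[e->v] = d[e->u] + e->w;
        return 1;
    }
    return 0;
}
```

Starting from d[1] = 0 and everything else at `INF`, relaxing 1 -> 2 (length 6) sets d[2] to 6 and relaxing 1 -> 4 (length 7) sets d[4] to 7, while an edge leaving node 3 does nothing yet.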

Now, you apply the previous step n – 1 times, where n is the number of nodes in the graph. In this example, you have to apply it 4 times (that’s 3 more times).

That’s it, here’s the algorithm in a condensed form:

```
void bellman_ford(int s) {
    int i, j;

    for (i = 0; i < n; ++i)
        d[i] = INFINITY;

    d[s] = 0;

    for (i = 0; i < n - 1; ++i)
        for (j = 0; j < e; ++j)
            /* an edge whose start is still at infinity can't be relaxed */
            if (d[edges[j].u] != INFINITY &&
                d[edges[j].u] + edges[j].w < d[edges[j].v])
                d[edges[j].v] = d[edges[j].u] + edges[j].w;
}
```

Here, d[i] is the length of the shortest path to node i, e is the number of edges and edges[i] is the i-th edge.

It may not be obvious why this works, but take a look at what is certain after each step. After the first step, any path made up of at most 2 nodes will be optimal. After the step 2, any path made up of at most 3 nodes will be optimal… After the (n – 1)-th step, any path made up of at most n nodes will be optimal.
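One consequence of this argument: if a whole pass over the edges changes nothing, no later pass can change anything either, so you may stop early. The sketch below adds that check. It is a common refinement and not what the programme in this article does; `bellman_ford_passes` and `INF` are names made up for the illustration.

```c
#include <assert.h>

#define INF 1000000 /* stands in for infinity; an assumption of this sketch */

typedef struct {
    int u, v, w; /* an edge from u to v of weight w */
} Edge;

/* Fills d[0..n-1] with the distances from s and returns the number
   of passes over the edges that were actually made (at most n - 1). */
int bellman_ford_passes(int n, int e, const Edge edges[], int s, int d[]) {
    int i, j, pass = 0, changed = 1;

    assert(n > 0 && s >= 0 && s < n);
    for (i = 0; i < n; ++i)
        d[i] = INF;
    d[s] = 0;

    while (changed && pass < n - 1) {
        changed = 0;
        for (j = 0; j < e; ++j) {
            int u = edges[j].u, v = edges[j].v;
            if (d[u] != INF && d[u] + edges[j].w < d[v]) {
                d[v] = d[u] + edges[j].w;
                changed = 1;
            }
        }
        ++pass;
    }
    return pass;
}
```

How many passes are needed depends on the order of the edges: a chain 0 -> 1 -> 2 -> 3 listed front to back is finished after one pass (plus one to notice that nothing changed), while the same chain listed back to front needs all n - 1 = 3 passes.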

#### The Programme

The following programme just puts the bellman_ford function into context. It runs in O(VE) time, so for the example graph it will do something along the lines of 5 * 9 = 45 relaxations. Keep in mind that this algorithm works quite well on graphs with few edges, but is very slow for dense graphs (graphs with almost n^2 edges). For graphs with lots of edges, you’re better off with Dijkstra’s algorithm.

Here’s the source code in C (bellmanford.c):

```
#include <stdio.h>

typedef struct {
    int u, v, w;
} Edge;

int n; /* the number of nodes */
int e; /* the number of edges */
Edge edges[1024]; /* large enough for n <= 2^5=32 */
int d[32]; /* d[i] is the minimum distance from node s to node i */

#define INFINITY 10000

void printDist() {
    int i;

    printf("Distances:\n");

    for (i = 0; i < n; ++i)
        printf("to %d\t", i + 1);
    printf("\n");

    for (i = 0; i < n; ++i)
        printf("%d\t", d[i]);

    printf("\n\n");
}

void bellman_ford(int s) {
    int i, j;

    for (i = 0; i < n; ++i)
        d[i] = INFINITY;

    d[s] = 0;

    for (i = 0; i < n - 1; ++i)
        for (j = 0; j < e; ++j)
            /* an edge whose start is still at infinity can't be relaxed */
            if (d[edges[j].u] != INFINITY &&
                d[edges[j].u] + edges[j].w < d[edges[j].v])
                d[edges[j].v] = d[edges[j].u] + edges[j].w;
}

int main(int argc, char *argv[]) {
    int i, j;
    int w;

    FILE *fin = fopen("dist.txt", "r");
    fscanf(fin, "%d", &n);
    e = 0;

    for (i = 0; i < n; ++i)
        for (j = 0; j < n; ++j) {
            fscanf(fin, "%d", &w);
            if (w != 0) {
                edges[e].u = i;
                edges[e].v = j;
                edges[e].w = w;
                ++e;
            }
        }
    fclose(fin);

    /* printDist(); */

    bellman_ford(0);

    printDist();

    return 0;
}
```

And here's the input file used in the example (dist.txt):
```
5
0 6 0 7 0
0 0 5 8 -4
0 -2 0 0 0
0 0 -3 9 0
2 0 7 0 0
```

That’s it. Have fun. Always open to comments.

In this article I describe the Floyd-Warshall algorithm for finding the shortest path between all nodes in a graph. I give an informal proof and provide an implementation in C.

#### Shortest paths

The shortest path between two nodes of a graph is a sequence of connected nodes so that the sum of the edges that inter-connect them is minimal.

Take this graph,

There are several paths between A and E:
```
Path 1: A -> B -> E         20
Path 2: A -> D -> E         25
Path 3: A -> B -> D -> E    35
Path 4: A -> D -> B -> E    20
```

There are several things to notice here:

1. There can be more than one route between two nodes
2. The number of nodes in the route isn’t important (Path 4 has 4 nodes but is shorter than Path 2, which has 3 nodes)
3. There can be more than one path of minimal length
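The lengths in the listing are just sums of edge weights, which is easy to check in code. The sketch below hard-codes the weights of this graph, taken from the adjacency matrix given later in this article (A is node 0, E is node 4); `weight`, `N` and `path_length` are names made up for the illustration.

```c
#include <assert.h>

#define N 5 /* nodes A..E are numbered 0..4 */

/* weight[i][j] is the length of the edge between i and j, 0 if there is none */
static const int weight[N][N] = {
    { 0, 10,  0,  5,  0},
    {10,  0,  5,  5, 10},
    { 0,  5,  0,  0,  0},
    { 5,  5,  0,  0, 20},
    { 0, 10,  0, 20,  0}
};

/* Returns the length of the path that visits path[0], path[1], ...,
   path[len - 1] in this order, or -1 if two consecutive nodes are
   not connected by an edge. */
int path_length(const int path[], int len) {
    int i, total = 0;
    assert(len >= 1);
    for (i = 0; i + 1 < len; ++i) {
        int w = weight[path[i]][path[i + 1]];
        if (w == 0)
            return -1;
        total += w;
    }
    return total;
}
```

For the four paths above it gives 20, 25, 35 and 20, so paths 1 and 4 tie for the minimum.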

Something else that should be obvious from the graph is that any path worth considering is simple. That is, you only go through each node once.

Unfortunately, this is not always the case. The problem appears when you allow negative weight edges. This isn’t by itself bad. But if a loop of negative weight appears, then there is no shortest path. Look at this example:

Look at the path B -> E -> D -> B. This is a loop, because the starting node is also the end. What’s the cost? It’s 10 – 20 + 5 = -5. This means that adding this loop to a path once lowers the cost of the path by 5. Adding it twice would lower the cost by 2 * 5 = 10. So, whatever shortest path you may have come up with, you can make it smaller by going through the loop one more time. BTW there’s no problem with a negative cost path.

#### The Floyd-Warshall Algorithm

This algorithm calculates the length of the shortest path between all nodes of a graph in O(V^3) time. Note that it doesn’t actually find the paths, only their lengths.

Let’s say you have the adjacency matrix of a graph. Assuming no loop of negative values, at this point you have the minimum distance between any two nodes which are connected by an edge.
```
       A   B   C   D   E
   A   0  10   0   5   0
   B  10   0   5   5  10
   C   0   5   0   0   0
   D   5   5   0   0  20
   E   0  10   0  20   0
```

The graph is the one shown above (the first one).

The idea is to try to interspace A between any two nodes in hopes of finding a shorter path.
```
       A   B   C   D   E
   A   0  10   0   5   0
   B  10   0   5   5  10
   C   0   5   0   0   0
   D   5   5   0   0  20
   E   0  10   0  20   0
```

Then try to interspace B between any two nodes:
```
       A   B   C   D   E
   A   0  10  15   5  20
   B  10   0   5   5  10
   C  15   5   0  10  15
   D   5   5  10   0  15
   E  20  10  15  15   0
```

Do the same for C:
```
       A   B   C   D   E
   A   0  10  15   5  20
   B  10   0   5   5  10
   C  15   5   0  10  15
   D   5   5  10   0  15
   E  20  10  15  15   0
```

Do the same for D:
```
       A   B   C   D   E
   A   0  10  15   5  20
   B  10   0   5   5  10
   C  15   5   0  10  15
   D   5   5  10   0  15
   E  20  10  15  15   0
```

And for E:
```
       A   B   C   D   E
   A   0  10  15   5  20
   B  10   0   5   5  10
   C  15   5   0  10  15
   D   5   5  10   0  15
   E  20  10  15  15   0
```

This is the actual algorithm:

```
# dist(i,j) is "best" distance so far from vertex i to vertex j

# Start with all single edge paths.
For i = 1 to n do
    For j = 1 to n do
        dist(i,j) = weight(i,j)

For k = 1 to n do # k is the `intermediate' vertex
    For i = 1 to n do
        For j = 1 to n do
            if (dist(i,k) + dist(k,j) < dist(i,j)) then # shorter path?
                dist(i,j) = dist(i,k) + dist(k,j)
```
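The pseudocode assumes that a missing edge has a very large weight, so that it never wins the comparison. Here is a sketch of that variant in C, run on the 5-node example graph. It is not the programme from this article (which uses 0 for "no edge"); `INF`, `mark_missing_edges` and `floyd_warshall_inf` are names made up for the illustration.

```c
#include <assert.h>

#define N 5          /* number of nodes in the example graph */
#define INF 1000000  /* "no path yet"; an assumption of this sketch */

/* Turns a matrix in which 0 means "no edge" into one that uses INF. */
void mark_missing_edges(int dist[N][N]) {
    int i, j;
    for (i = 0; i < N; ++i)
        for (j = 0; j < N; ++j)
            if (i != j && dist[i][j] == 0)
                dist[i][j] = INF;
}

/* After the call, dist[i][j] is the length of the shortest path
   between i and j, or INF if there is no path at all. */
void floyd_warshall_inf(int dist[N][N]) {
    int i, j, k;
    assert(dist != 0);
    for (k = 0; k < N; ++k)          /* k is the intermediate node */
        for (i = 0; i < N; ++i)
            for (j = 0; j < N; ++j) {
                if (dist[i][k] == INF || dist[k][j] == INF)
                    continue;        /* can't go through k */
                if (dist[i][k] + dist[k][j] < dist[i][j])
                    dist[i][j] = dist[i][k] + dist[k][j];
            }
}
```

Fed the adjacency matrix of the example (after `mark_missing_edges`), it ends with the same numbers as the last matrix above, for instance 20 between A and E and 10 between C and D.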

#### The Programme

Here’s the code in C (floyd_warshall.c):

```
#include <stdio.h>

int n; /* The number of nodes */
int dist[16][16]; /* dist[i][j] is the length of the edge between i and j if
                     it exists, or 0 if it does not */

void printDist() {
    int i, j;
    printf("    ");
    for (i = 0; i < n; ++i)
        printf("%4c", 'A' + i);
    printf("\n");
    for (i = 0; i < n; ++i) {
        printf("%4c", 'A' + i);
        for (j = 0; j < n; ++j)
            printf("%4d", dist[i][j]);
        printf("\n");
    }
    printf("\n");
}

/*
    floyd_warshall()

    after calling this function dist[i][j] will be the minimum distance
    between i and j if it exists (i.e. if there's a path between i and j)
    or 0, otherwise
*/
void floyd_warshall() {
    int i, j, k;
    for (k = 0; k < n; ++k) {
        printDist();
        for (i = 0; i < n; ++i)
            for (j = 0; j < n; ++j)
                /* If i and j are different nodes and if
                   the paths between i and k and between
                   k and j exist, do */
                if ((dist[i][k] * dist[k][j] != 0) && (i != j))
                    /* See if you can't get a shorter path
                       between i and j by interspacing
                       k somewhere along the current
                       path */
                    if ((dist[i][k] + dist[k][j] < dist[i][j]) ||
                        (dist[i][j] == 0))
                        dist[i][j] = dist[i][k] + dist[k][j];
    }
    printDist();
}

int main(int argc, char *argv[]) {
    FILE *fin = fopen("dist.txt", "r");
    fscanf(fin, "%d", &n);
    int i, j;
    for (i = 0; i < n; ++i)
        for (j = 0; j < n; ++j)
            fscanf(fin, "%d", &dist[i][j]);
    fclose(fin);

    floyd_warshall();

    return 0;
}
```

Note that, of the above programme, all the work is done by the three nested loops in floyd_warshall().

That's it. Good luck. Always open to comments.

In this article I give an informal definition of a graph and of the minimum spanning tree. Afterwards I describe Prim’s algorithm and then follow its execution on an example. Finally, the code in C is provided.

#### Graphs

Wikipedia gives one of the common definitions of a graph:

In computer science, a graph is a kind of data structure, specifically an abstract data type (ADT), that consists of a set of nodes and a set of edges that establish relationships (connections) between the nodes. The graph ADT follows directly from the graph concept from mathematics.
Informally, G=(V,E) consists of vertices, the elements of V, which are connected by edges, the elements of E. Formally, a graph, G, is defined as an ordered pair, G=(V,E), where V is a finite set and E is a set consisting of two element subsets of V.

This is a graph:

It’s a set of nodes (A, B, C, D and E) and the edges (lines) that interconnect them.

An important thing to note about this graph is that the edges are bidirectional, i.e. if A is connected to B, then B is connected to A. This makes it an undirected graph.

A common extension is to attribute weights to the edges. This is what I’ve done with the previous graph:

#### Minimum spanning trees

Basically a minimum spanning tree is a subset of the edges of the graph, so that there’s a path from any node to any other node and that the sum of the weights of the edges is minimum.

Here’s the minimum spanning tree of the example:

Look at the above image closely. It contains all of the initial nodes and some of the initial edges. Actually it contains exactly n – 1 edges, where n is the number of nodes. It’s called a tree because there are no cycles.

You can think of the graph as a map, with the nodes being cities, the edges passable terrain, and the weights the distance between the cities.

It’s worth mentioning that a graph can have several minimum spanning trees. Think of the above example, but replace all the weights with 1. The resulting graph will have 8 minimum spanning trees.

Given a graph, find one of its minimum spanning trees.

#### Prim’s Algorithm

One of the classic algorithms for this problem is that found by Robert C. Prim. It’s a greedy style algorithm and it’s guaranteed to produce a correct result.

In the following discussion, let the distance from each node not in the tree to the tree be the edge of minimal weight between that node and some node in the tree. If there is no such edge, assume the distance is infinity (this shouldn’t happen).

The algorithm (greedily) builds the minimal spanning tree by iteratively adding nodes into a working tree:

1. Start with a tree that contains a single node (any node will do).
2. Identify a node (outside the tree) which is closest to the tree and add the minimum weight edge from that node to some node in the tree and incorporate the additional node as a part of the tree.
3. If there are fewer than n – 1 edges in the tree, go to 2

For the example graph, here’s how it would run:

Find the closest node to the tree, and add it.

Repeat until there are n – 1 edges in the tree.

#### The Programme

The following programme just follows the algorithm. It runs in O(n^2) time.

Here’s the code in C (prim.c):

```
#include <stdio.h>

/*
The input file (dist.txt) looks something like this
4
0 0 0 21
0 0 8 17
0 8 0 16
21 17 16 0

The first line contains n, the number of nodes.
Next is an nxn matrix containing the distances between the nodes
NOTE: The distance between a node and itself should be 0
*/

int n; /* The number of nodes in the graph */

int weight[100][100]; /* weight[i][j] is the distance between node i and node j;
if there is no path between i and j, weight[i][j] should
be 0 */

char inTree[100]; /* inTree[i] is 1 if the node i is already in the minimum
spanning tree; 0 otherwise*/

int d[100]; /* d[i] is the distance between node i and the minimum spanning
tree; this is initially infinity (100000); if i is already in
the tree, then d[i] is undefined;
this is just a temporary variable. It's not necessary but speeds
up execution considerably (by a factor of n) */

int whoTo[100]; /* whoTo[i] holds the index of the node i would have to be
linked to in order to get a distance of d[i] */

/* updateDistances(int target)
should be called immediately after target is added to the tree;
updates d so that the values are correct (goes through target's
neighbours making sure that the distances between them and the tree
are indeed minimum)
*/
void updateDistances(int target) {
int i;
for (i = 0; i < n; ++i)
if ((weight[target][i] != 0) && (d[i] > weight[target][i])) {
d[i] = weight[target][i];
whoTo[i] = target;
}
}

int main(int argc, char *argv[]) {
FILE *f = fopen("dist.txt", "r");
fscanf(f, "%d", &n);
int i, j;
for (i = 0; i < n; ++i)
for (j = 0; j < n; ++j)
fscanf(f, "%d", &weight&#91;i&#93;&#91;j&#93;);
fclose(f);

/* Initialise d with infinity */
for (i = 0; i < n; ++i)
d[i] = 100000;

/* Mark all nodes as NOT being in the minimum spanning tree */
for (i = 0; i < n; ++i)
inTree[i] = 0;

/* Add the first node to the tree */
printf("Adding node %c\n", 0 + 'A');
inTree[0] = 1;
updateDistances(0);

int total = 0;
int treeSize;
for (treeSize = 1; treeSize < n; ++treeSize) {
/* Find the node with the smallest distance to the tree */
int min = -1;
for (i = 0; i < n; ++i)
if (!inTree[i])
if ((min == -1) || (d[min] > d[i]))
min = i;

printf("Adding edge %c-%c\n", whoTo[min] + 'A', min + 'A');
inTree[min] = 1;
total += d[min];

updateDistances(min);
}

printf("Total distance: %d\n", total);

return 0;
}
```

And here’s a sample input file (dist.txt). It’s the example graph:
```5 0 10 0 5 0 10 0 5 5 10 0 5 0 0 0 5 5 0 0 20 0 10 0 20 0 ```

The code’s commented and there shouldn’t be any problems.

Good luck. Always open to comments.

Wikipedia defines combinations as:

In combinatorial mathematics, a combination is an un-ordered collection of unique elements. (An ordered collection is called a permutation.) Given S, the set of all possible unique elements, a combination is a subset of the elements of S. The order of the elements in a combination is not important (two lists with the same elements in different orders are considered to be the same combination). Also, the elements cannot be repeated in a combination (every element appears uniquely once); this is often referred to as “without replacement/repetition”. This is because combinations are defined by the elements contained in them, so the set {1, 1, 1} is the same as {1}. For example, from a 52-card deck any 5 cards can form a valid combination (a hand). The order of the cards doesn’t matter and there can be no repetition of cards.

Mathworld provides a more terse definition:

The number of ways of picking k unordered outcomes from n possibilities.

The combinations of n elements chosen as k is the number of unique ways of selecting k elements from a set of n.

From now on, by set of n I always mean one of the form {1, 2, 3, …, n}.

So, what are the ways of choosing 2 elements from a set of 4, {1, 2, 3, 4}?
```
{1, 2}
{1, 3}
{1, 4}
{2, 3}
{2, 4}
{3, 4}
```

That’s 6 ways, but what is the general formula?

C(n, k) = n * (n – 1) * … * (n – k + 1) / k! = n! / (k! * (n – k)!)

This is easily proved: for a set of n, there are n ways of choosing the first element, n * (n – 1) ways of choosing the first two elements, …, n * (n – 1) * … * (n – k + 1) ways of choosing the first k elements. Unfortunately, this will generate duplicate subsets: for every subset of k elements, this will generate all the k! permutations of the subset. So, we have to divide the total number of subsets (n * (n – 1) * … * (n – k + 1)) by the number of repetitions (k!). This yields exactly the formula noted above.

Combinations are an astoundingly wide-spread concept, and are used in every branch of mathematics and especially in the analysis of algorithms. This said, there’s only one thing you really need to know: how to apply the formula.

Look at the formula above, notice that there are exactly k factors in the numerator and k factors in the denominator. So, to remember the formula and easily apply it:
```
P1. Draw the fraction line.
P2. Above the line, write k terms of the form: n, n - 1, n - 2, ...
P3. Below the line, write k terms of the form: 1, 2, 3, ...
```
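The three steps above translate into a short loop. The sketch below multiplies by one term from above the line and divides by one term from below the line at each step; after i steps the running value is the number of ways of choosing i elements out of n, so the division is always exact. The function name `choose` is my own choice.

```c
#include <assert.h>
#include <limits.h>

/* The number of ways of choosing k elements from a set of n. */
unsigned long long choose(unsigned n, unsigned k) {
    unsigned long long r = 1;
    unsigned i;

    if (k > n)
        return 0; /* you can't pick more elements than there are */

    for (i = 1; i <= k; ++i) {
        unsigned long long factor = n - i + 1; /* i-th term above the line */
        assert(r <= ULLONG_MAX / factor);      /* the product must fit */
        r = r * factor / i;                    /* i-th term below the line */
    }
    return r;
}
```

For example, `choose(4, 2)` is 6, which matches the listing above, and `choose(52, 5)` is 2598960, the number of 5-card hands.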

Here are a few examples:

And now for the fun part. How do you generate combinations? Look closely at the example above. First thing to note is that every combination is an array of k elements. Next, the first digit in every set is, basically, every digit between 1 and n. What about the other digits? They’re always between 1 and n and they’re always in ascending order. Now it should be obvious what the algorithm is:
```
P1. Start off with (1, 2, ..., k); this is the first combination.
P2. Print it.
P3. Given the combination (c0, c1, ..., ck-1), start from the back:
    increment the last element and, for as long as the element ci just
    incremented is larger than n - k + 1 + i, move on to the previous
    index and increment that element instead. After this, if
    c0 > n - k + 1, then this is not a valid combination so we stop.
    Otherwise give ci+1, ci+2, ... the values ci + 1, ci + 2, ....
    Jump to P2.
```

Here’s the source code in C (comb1.c):

```
#include <stdio.h>

/* Prints out a combination like {1, 2} */
void printc(int comb[], int k) {
    printf("{");
    int i;
    for (i = 0; i < k; ++i)
        printf("%d, ", comb[i] + 1);
    printf("\b\b}\n");
}

/*
    next_comb(int comb[], int k, int n)
        Generates the next combination of n elements as k after comb

    comb => the previous combination ( use (0, 1, 2, ..., k - 1) for first)
    k => the size of the subsets to generate
    n => the size of the original set

    Returns: 1 if a valid combination was found
        0, otherwise
*/
int next_comb(int comb[], int k, int n) {
    int i = k - 1;
    ++comb[i];
    /* stop at i == 0 so that comb[-1] is never touched */
    while ((i > 0) && (comb[i] >= n - k + 1 + i)) {
        --i;
        ++comb[i];
    }

    if (comb[0] > n - k) /* Combination (n-k, n-k+1, ..., n-1) reached */
        return 0; /* No more combinations can be generated */

    /* comb now looks like (..., x, n, n, n, ..., n).
       Turn it into (..., x, x + 1, x + 2, ...) */
    for (i = i + 1; i < k; ++i)
        comb[i] = comb[i - 1] + 1;

    return 1;
}

int main(int argc, char *argv[]) {
    int n = 5; /* The size of the set; for {1, 2, 3, 4} it's 4 */
    int k = 3; /* The size of the subsets; for {1, 2}, {1, 3}, ... it's 2 */
    int comb[16]; /* comb[i] is the index of the i-th element in the
                     combination */

    /* Setup comb for the initial combination */
    int i;
    for (i = 0; i < k; ++i)
        comb[i] = i;

    /* Print the first combination */
    printc(comb, k);

    /* Generate and print all the other combinations */
    while (next_comb(comb, k, n))
        printc(comb, k);

    return 0;
}

```

Always open to comments. Have fun.

Wikipedia defines the partition of a set as:

In mathematics, a partition of a set X is a division of X into non-overlapping “parts” or “blocks” or “cells” that cover all of X. More formally, these “cells” are both collectively exhaustive and mutually exclusive with respect to the set being partitioned.

A more succinct definition is given by Mathworld:

A set partition of a set S is a collection of disjoint subsets of S whose union is S.

Simply put, the partitions of a set S are all the ways in which you can choose disjoint, non-empty subsets of S whose union is S.

From now on, when I say a set of n elements, I mean {1, 2, …, n}. So, what are the partitions of {1, 2, 3}?
```
{1, 2, 3}
{2, 3} {1}
{1, 3} {2}
{3} {1, 2}
{3} {2} {1}
```

It’s obvious that these satisfy the definition: {1}, {2}, {3}, {1, 2}, {1, 3}, {2, 3} and {1, 2, 3} are all subsets of {1, 2, 3}. They’re all non-empty and, in any partition, the same element never appears twice. Finally, in a partitioning, the union of the subsets is the original set.

In how many ways can you partition a set of n elements? The answer is given by the Bell numbers: 1, 1, 2, 5, 15, 52, … for n = 0, 1, 2, 3, 4, 5, …

If you check for n = 3 you’ll see that it does give the correct answer: 5.

An earlier version of this article used Catalan numbers here. A reader pointed out that what we need are not Catalan numbers, but Bell numbers, and Wikipedia’s definition agrees with him. The two sequences coincide up to n = 3, which is why the check above didn’t catch the mistake.
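One simple way to compute Bell numbers is the Bell triangle: every row starts with the last number of the previous row, and every other number is the sum of the number to its left and the number above that one. The first number of row n is the n-th Bell number. The sketch below uses this method; `bell` and `MAXB` are names made up for the illustration, and the bound of 24 is there only to keep everything inside 64 bits.

```c
#include <assert.h>

#define MAXB 24 /* largest n this sketch accepts */

/* Returns the n-th Bell number: the number of ways of
   partitioning a set of n elements. */
unsigned long long bell(int n) {
    unsigned long long row[MAXB + 1] = {1}; /* row 0 of the triangle */
    unsigned long long next[MAXB + 1];
    int r, j;

    assert(n >= 0 && n <= MAXB);
    for (r = 1; r <= n; ++r) {
        /* row r has r + 1 entries; it starts with the last entry of row r - 1 */
        next[0] = row[r - 1];
        for (j = 1; j <= r; ++j)
            next[j] = next[j - 1] + row[j - 1];
        for (j = 0; j <= r; ++j)
            row[j] = next[j];
    }
    return row[0];
}
```

`bell(3)` is 5, as counted above, while `bell(4)` is 15 (the corresponding Catalan number would be 14).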

Ok. We know what a partitioning is, we know how many there are, but how do you generate them? This is the first algorithm I could think of. It may not be clear from the explanation why it works but try it on a piece of paper for n=3 and it will become obvious. Here’s how I came up with it:

First of all, how do you represent a partitioning of a set of n elements? The straight-forward way would be using a vector of n integers, each integer representing the number of the subset in which the corresponding element is in. If the corresponding element of 3 is 2, that means that 3 is in the 2nd subset. So, given the set {1, 2, 3}:
```
Partitioning    -> Encoding
{1, 2, 3}       -> (1, 1, 1)
{1} {2, 3}      -> (2, 1, 1)
{2} {1, 3}      -> (1, 2, 1)
{1, 2} {3}      -> (2, 2, 1)
{1} {2} {3}     -> (3, 2, 1)
```

Notice that the encodings, written backwards are: 111, 112, 121, 122 and 123. From this you can guess how the generator works: more or less, generate all the numbers between 111 and 123 using only the digits 1, 2 and 3:
```
111
112
113
121
122
123
```

That’s almost right. The numbers 112 and 113 (that is, the encodings (2, 1, 1) and (3, 1, 1)) translate into the same partitioning: {1} {2, 3}. If you do the same thing for a larger n you’ll notice this happening again and again. Fortunately, there’s an easy solution: never use a digit that’s more than 1 larger than the largest digit before it in the number. i.e. You can’t use 113 because 3 is larger by 2 than the digits before it (1 and 1).

To do this, I use another vector m with the following significance: m[i] is the largest of the digits that come before s[i] in the number as written above (that is, the largest of s[i + 1], …, s[n – 1] in the encoding). This makes it very easy not to generate any duplicate partitionings.

Here’s the code in C (part.c):

```
#include <stdio.h>

/*
    printp
        - print out the partitioning scheme s of n elements
        as: {1, 2, 4} {3}
*/
void printp(int *s, int n) {
    /* Get the total number of partitions. In the example above, 2.*/
    int part_num = 1;
    int i;
    for (i = 0; i < n; ++i)
        if (s[i] > part_num)
            part_num = s[i];

    /* Print the p partitions. */
    int p;
    for (p = part_num; p >= 1; --p) {
        printf("{");
        /* If s[i] == p, then i + 1 is part of the pth partition. */
        for (i = 0; i < n; ++i)
            if (s[i] == p)
                printf("%d, ", i + 1);
        printf("\b\b} ");
    }
    printf("\n");
}

/*
    next
        - given the partitioning scheme represented by s and m, generate
        the next

    Returns: 1, if a valid partitioning was found
        0, otherwise
*/
int next(int *s, int *m, int n) {
    /* Update s: 1 1 1 1 -> 2 1 1 1 -> 1 2 1 1 -> 2 2 1 1 -> 3 2 1 1 ->
       1 1 2 1 ... */
    /*int j;
    printf(" -> (");
    for (j = 0; j < n; ++j)
        printf("%d, ", s[j]);
    printf("\b\b)\n");*/
    int i = 0;
    ++s[i];
    while ((i < n - 1) && (s[i] > m[i] + 1)) {
        s[i] = 1;
        ++i;
        ++s[i];
    }

    /* If i has reached the n-1 th element, then the last unique
       partitioning has been found */
    if (i == n - 1)
        return 0;

    /* All the first i elements are now 1, so the largest element from
       position i onwards is the larger of s[i] and m[i] (the largest
       element after position i). Copy it to the first i positions
       in m. */
    int max = (s[i] > m[i]) ? s[i] : m[i];
    for (i = i - 1; i >= 0; --i)
        m[i] = max;

    /*	for (i = 0; i < n; ++i)
        printf("%d ", m[i]);
    getchar();*/
    return 1;
}

int main(int argc, char *argv[]) {
    int s[16]; /* s[i] is the number of the set in which the ith element
                  should go */
    int m[16]; /* m[i] is the largest of the elements after s[i],
                  i.e. of s[i + 1 .. n - 1] */

    int n = 3;
    int i;
    /* The first way to partition a set is to put all the elements in the same
       subset. */
    for (i = 0; i < n; ++i) {
        s[i] = 1;
        m[i] = 1;
    }

    /* Print the first partitioning. */
    printp(s, n);

    /* Print the other partitioning schemes. */
    while (next(s, m, n))
        printp(s, n);

    return 0;
}

```

The code is heavily commented, but I’ll happily respond to any questions. This is also what I used to generate all the above listings. Try uncommenting some of the code to see how the programme works. Good luck!

P.S. Every encoding after (3, 2, 1) yields a duplicate partitioning. For fun, try proving this mathematically.

There are quite a few definitions of what a set is, but it all boils down to this:

A set is defined as a collection of distinct elements, in which order is not important.

So {1, 2, 3}, {3, 4}, {} and {5, 99, -1} are all sets. Because the order of the elements is ignored, {1, 2, 3} and {3, 2, 1} are the same set. In case you’re wondering, there are exactly n! different ways to write a set of n elements.

For the rest of the discussion, I’ll use sets of the form {1, 2, …, n}, so when I say a set of 3 elements, I mean {1, 2, 3}. Just remember that this is not a property of sets. They can contain anything as elements, not necessarily consecutive numbers.

The set S1 is said to be the subset of the set S2, if all the elements of S1 also belong to S2.

Knowing this, it’s easy to figure out the subsets of {1, 2, 3}:
```
{ }
{ 1 }
{ 2 }
{ 1, 2 }
{ 3 }
{ 1, 3 }
{ 2, 3 }
{ 1, 2, 3 }
```

How many subsets are there? For a set of one element, there are 2 subsets: {} and {1}. For a set of 2 elements, there are 4 subsets: {}, {1}, {2}, {1, 2}. For a set of 3 elements, there are 8 subsets. Notice the pattern?
```
n = 1: 2^1
n = 2: 2^2
n = 3: 2^3
```
For a set of n there are 2^n subsets. This is easily proved: Any subset of the set can either contain or not contain an element; so, for a subset, there are 2 states for the first element, 2 for the second element, …, 2 for the nth element; so, there are 2 states for the first element, 2 * 2 = 2^2 states for the first two, 2 * 2 * 2 = 2^3 states for the first three, …, 2 * 2 * 2 * … * 2 = 2^n states for all the n elements.

The problem here is how to generate all the subsets of a given set. There are a few algorithms for doing this, but in the end, only two are worth considering.

The first is this: given all the subsets of S and the element y, you can generate all the subsets of S U {y} by taking each subset of S, once adding to it y and once leaving it as it is. i.e. Knowing that {1, 3} is a subset of S, you obtain the following two subsets of S U {y}: {1, 3, y} and {1, 3}.

This does what it’s supposed to – it generates all the subsets of S, and it wastes no time. It can also be used as another way to prove that there are 2^n subsets for any set of n elements. The only problem is that you need the subsets from the previous step to generate those of this step. This means that just before the end, you must have 2^(n – 1) subsets in memory. Considering how much memory computers have these days, it’s not particularly wasteful, but still, there’s a better way.
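Here is a sketch of this first method. To keep it short, each subset is stored as the bits of an unsigned integer (bit e – 1 is set if e is in the subset); that representation, and the names `all_subsets` and `MAXN`, are my own choices for the illustration.

```c
#include <assert.h>

#define MAXN 16 /* largest n this sketch accepts */

/* Fills out[] with all the subsets of {1, 2, ..., n} and returns how
   many there are (2^n). out[] must have room for 2^n entries. */
int all_subsets(int n, unsigned out[]) {
    int count = 1; /* number of subsets generated so far */
    int e, i;

    assert(n >= 0 && n <= MAXN);
    out[0] = 0; /* the empty set is the only subset of the empty set */

    for (e = 0; e < n; ++e) {
        /* every subset found so far is kept as it is (out[i]) and
           also copied with element e + 1 added (out[count + i]) */
        for (i = 0; i < count; ++i)
            out[count + i] = out[i] | (1u << e);
        count *= 2;
    }
    return count;
}
```

The array holds all the subsets at once, which is exactly the memory cost described above.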

The better way involves using a mask. If you have a set of n elements, a valid mask would be an array of n boolean (true/false; 1/0) elements. When you apply a mask to a set, you check each element (e) in the set and the corresponding one in the mask (m): if m is true (1), you add e to the result, otherwise, you ignore it. After applying the mask (0, 1, 0, 0, 1) to {1, 2, 3, 4, 5}, you get {2, 5}.
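Applying a mask takes one loop. In this sketch the set and the mask are plain arrays of n ints; the function name `apply_mask` is made up for the illustration.

```c
#include <assert.h>

/* Copies to out[] the elements of set[] whose mask entry is nonzero.
   Returns the number of elements copied. */
int apply_mask(const int set[], const int mask[], int n, int out[]) {
    int i, count = 0;
    assert(n >= 0);
    for (i = 0; i < n; ++i)
        if (mask[i])
            out[count++] = set[i];
    return count;
}
```

With the set {1, 2, 3, 4, 5} and the mask (0, 1, 0, 0, 1) it copies 2 and 5 and returns 2.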

So, to generate all the subsets of a set of n elements, you first have to generate all the possible 2^n masks of the set and then apply them.

Generating the masks is a simple problem. Basically, you just have to implement a binary counter, i.e. something that generates:
```
000
001
010
011
100
101
110
111
```

Here’s the code in C (sub.c):

```
#include <stdio.h>

/* Applies the mask to a set like {1, 2, ..., n} and prints it */
void printv(int mask[], int n) {
    int i;
    printf("{ ");
    for (i = 0; i < n; ++i)
        if (mask[i])
            printf("%d ", i + 1); /*i+1 is part of the subset*/
    printf("\b }\n");
}

/* Generates the next mask; returns 1 if there was one, 0 otherwise */
int next(int mask[], int n) {
    int i;
    for (i = 0; (i < n) && mask[i]; ++i)
        mask[i] = 0;

    if (i < n) {
        mask[i] = 1;
        return 1;
    }
    return 0;
}

int main(int argc, char *argv[]) {
    int n = 3;

    int mask[16]; /* Guess what this is */
    int i;
    for (i = 0; i < n; ++i)
        mask[i] = 0;

    /* Print the first set */
    printv(mask, n);

    /* Print all the others */
    while (next(mask, n))
        printv(mask, n);

    return 0;
}
```

Note: The `next()` function generates the bits in reverse order.