In this article, I describe the Bellman-Ford algorithm for finding the one-source shortest paths in a graph, give an informal proof and provide the source code in C for a simple implementation.

To understand this you should know what a graph is, and how to store one in memory. If in doubt check this and this.

Another solution to this problem is Dijkstra’s algorithm.

The Problem

Given the following graph, calculate the length of the shortest path from node 1 to node 2.
bf1.png

It’s obvious that there’s a direct route of length 6, but take a look at path: 1 -> 4 -> 3 -> 2. The length of the path is 7 – 3 – 2 = 2, which is less than 6. BTW, you don’t need negative edge weights to get such a situation, but they do clarify the problem.

This also suggests a property of shortest path algorithms: to find the shortest path form x to y, you need to know, beforehand, the shortest paths to y‘s neighbours. For this, you need to know the paths to y‘s neighbours’ neighbours… In the end, you must calculate the shortest path to the connected component of the graph in which x and y are found.

That said, you usually calculate the shortest path to all nodes and then pick the ones you’re intrested in.

The Algorithm

The Bellman-Ford algorithm is one of the classic solutions to this problem. It calculates the shortest path to all nodes in the graph from a single source.

The basic idea is simple:
Start by considering that the shortest path to all nodes, less the source, is infinity. Mark the length of the path to the source as 0:
bf2.png

Take every edge and try to relax it:
bf3.png

Relaxing an edge means checking to see if the path to the node the edge is pointing to can’t be shortened, and if so, doing it. In the above graph, by checking the edge 1 -> 2 of length 6, you find that the length of the shortest path to node 1 plus the length of the edge 1 -> 2 is less then infinity. So, you replace infinity in node 2 with 6. The same can be said for edge 1 -> 4 of length 7. It’s also worth noting that, practically, you can’t relax the edges whose start has the shortest path of length infinity to it.

Now, you apply the previous step n – 1 times, where n is the number of nodes in the graph. In this example, you have to apply it 4 times (that’s 3 more times).
bf4.png

bf5.png

bf6.png

That’s it, here’s the algorithm in a condensed form:

void bellman_ford(int s) {
int i, j;

for (i = 0; i < n; ++i) d[i] = INFINITY; d[s] = 0; for (i = 0; i < n - 1; ++i) for (j = 0; j < e; ++j) if (d[edges[j].u] + edges[j].w < d[edges[j].v]) d[edges[j].v] = d[edges[j].u] + edges[j].w; } [/sourcecode] Here, d[i] is the shortest path to node i, e is the number of edges and edges[i] is the i-th edge.

It may not be obvious why this works, but take a look at what is certain after each step. After the first step, any path made up of at most 2 nodes will be optimal. After the step 2, any path made up of at most 3 nodes will be optimal… After the (n – 1)-th step, any path made up of at most n nodes will be optimal.

The Programme

The following programme just puts the bellman_ford function into context. It runs in O(VE) time, so for the example graph it will do something on the lines of 5 * 9 = 45 relaxations. Keep in mind that this algorithm works quite well on graphs with few edges, but is very slow for dense graphs (graphs with almost n2 edges). For graphs with lots of edges, you’re better off with Dijkstra’s algorithm.

Here’s the source code in C (bellmanford.c):

#include

typedef struct {
int u, v, w;
} Edge;

int n; /* the number of nodes */
int e; /* the number of edges */
Edge edges[1024]; /* large enough for n <= 2^5=32 */ int d[32]; /* d[i] is the minimum distance from node s to node i */ #define INFINITY 10000 void printDist() { int i; printf("Distances:\n"); for (i = 0; i < n; ++i) printf("to %d\t", i + 1); printf("\n"); for (i = 0; i < n; ++i) printf("%d\t", d[i]); printf("\n\n"); } void bellman_ford(int s) { int i, j; for (i = 0; i < n; ++i) d[i] = INFINITY; d[s] = 0; for (i = 0; i < n - 1; ++i) for (j = 0; j < e; ++j) if (d[edges[j].u] + edges[j].w < d[edges[j].v]) d[edges[j].v] = d[edges[j].u] + edges[j].w; } int main(int argc, char *argv[]) { int i, j; int w; FILE *fin = fopen("dist.txt", "r"); fscanf(fin, "%d", &n); e = 0; for (i = 0; i < n; ++i) for (j = 0; j < n; ++j) { fscanf(fin, "%d", &w); if (w != 0) { edges[e].u = i; edges[e].v = j; edges[e].w = w; ++e; } } fclose(fin); /* printDist(); */ bellman_ford(0); printDist(); return 0; } [/sourcecode] And here's the input file used in the example (dist.txt):
5
0 6 0 7 0
0 0 5 8 -4
0 -2 0 0 0
0 0 -3 9 0
2 0 7 0 0

That’s an adjacency matrix.

That’s it. Have fun. Always open to comments.

Advertisements

In this article I describe the Floyd-Warshall algorithm for finding the shortest path between all nodes in a graph. I give an informal proof and provide an implementation in C.

Shortest paths

The shortest path between two nodes of a graph is a sequence of connected nodes so that the sum of the edges that inter-connect them is minimal.

Take this graph,
p2.png

There are several paths between A and E:
Path 1: A -> B -> E 20
Path 2: A -> D -> E 25
Path 3: A -> B -> D -> E 35
Path 4: A -> D -> B -> E 20

There are several things to notice here:

  1. There can be more then one route between two nodes
  2. The number of nodes in the route isn’t important (Path 4 has 4 nodes but is shorter than Path 2, which has 3 nodes)
  3. There can be more than one path of minimal length

Something else that should be obvious from the graph is that any path worth considering is simple. That is, you only go through each node once.

Unfortunately, this is not always the case. The problem appears when you allow negative weight edges. This isn’t by itself bad. But if a loop of negative weight appears, then there is no shortest path. Look at this example:
A graph containing a negative weight loop

Look at the path B -> E -> D -> B. This is a loop, because the starting node is the also the end. What’s the cost? It’s 10 – 20 + 5 = -5. This means that adding this loop to a path once lowers the cost of the path by 5. Adding it twice would lower the cost by 2 * 5 = 10. So, whatever shortest path you may have come up with, you can make it smaller by going through the loop one more time. BTW there’s no problem with a negative cost path.

The Floyd-Warshall Algorithm

This algorithm calculates the length of the shortest path between all nodes of a graph in O(V3) time. Note that it doesn’t actually find the paths, only their lengths.

Let’s say you have the adjacency matrix of a graph. Assuming no loop of negative values, at this point you have the minimum distance between any two nodes which are connected by an edge.
A B C D E
A 0 10 0 5 0
B 10 0 5 5 10
C 0 5 0 0 0
D 5 5 0 0 20
E 0 10 0 20 0

The graph is the one shown above (the first one).

The idea is to try to interspace A between any two nodes in hopes of finding a shorter path.
A B C D E
A 0 10 0 5 0
B 10 0 5 5 10
C 0 5 0 0 0
D 5 5 0 0 20
E 0 10 0 20 0

Then try to interspace B between any two nodes:
A B C D E
A 0 10 15 5 20
B 10 0 5 5 10
C 15 5 0 10 15
D 5 5 10 0 15
E 20 10 15 15 0

Do the same for C:
A B C D E
A 0 10 15 5 20
B 10 0 5 5 10
C 15 5 0 10 15
D 5 5 10 0 15
E 20 10 15 15 0

Do the same for D:
A B C D E
A 0 10 15 5 20
B 10 0 5 5 10
C 15 5 0 10 15
D 5 5 10 0 15
E 20 10 15 15 0

And for E:
A B C D E
A 0 10 15 5 20
B 10 0 5 5 10
C 15 5 0 10 15
D 5 5 10 0 15
E 20 10 15 15 0

This is the actual algorithm:

# dist(i,j) is "best" distance so far from vertex i to vertex j 

# Start with all single edge paths.
 For i = 1 to n do
     For j = 1 to n do
         dist(i,j) = weight(i,j) 

 For k = 1 to n do # k is the `intermediate' vertex
     For i = 1 to n do
         For j = 1 to n do
             if (dist(i,k) + dist(k,j) < dist(i,j)) then # shorter path?
                 dist(i,j) = dist(i,k) + dist(k,j)

The Programme

Here’s the code in C(floyd_warshall.c):

#include

int n; /* Then number of nodes */
int dist[16][16]; /* dist[i][j] is the length of the edge between i and j if
it exists, or 0 if it does not */

void printDist() {
int i, j;
printf(” “);
for (i = 0; i < n; ++i) printf("%4c", 'A' + i); printf("\n"); for (i = 0; i < n; ++i) { printf("%4c", 'A' + i); for (j = 0; j < n; ++j) printf("%4d", dist[i][j]); printf("\n"); } printf("\n"); } /* floyd_warshall() after calling this function dist[i][j] will the the minimum distance between i and j if it exists (i.e. if there's a path between i and j) or 0, otherwise */ void floyd_warshall() { int i, j, k; for (k = 0; k < n; ++k) { printDist(); for (i = 0; i < n; ++i) for (j = 0; j < n; ++j) /* If i and j are different nodes and if the paths between i and k and between k and j exist, do */ if ((dist[i][k] * dist[k][j] != 0) && (i != j)) /* See if you can't get a shorter path between i and j by interspacing k somewhere along the current path */ if ((dist[i][k] + dist[k][j] < dist[i][j]) || (dist[i][j] == 0)) dist[i][j] = dist[i][k] + dist[k][j]; } printDist(); } int main(int argc, char *argv[]) { FILE *fin = fopen("dist.txt", "r"); fscanf(fin, "%d", &n); int i, j; for (i = 0; i < n; ++i) for (j = 0; j < n; ++j) fscanf(fin, "%d", &dist[i][j]); fclose(fin); floyd_warshall(); return 0; } [/sourcecode] Note that of the above programme, all the work is done by only five lines (30-48). That's it. Good luck. Always open to comments.