Wikipedia defines the partition of a set as:

In mathematics, a partition of a set X is a division of X into non-overlapping “parts” or “blocks” or “cells” that cover all of X. More formally, these “cells” are both collectively exhaustive and mutually exclusive with respect to the set being partitioned.

A more succinct definition is given by Mathworld:

A set partition of a set S is a collection of disjoint subsets of S whose union is S.

Simply put, the partitions of a set S are all the ways in which you can choose disjoint, non-empty subsets of S that unioned result in S.

From now on, when I say a set of n elements, I mean {1, 2, …, n}. So, what are the subsets of {1, 2, 3}?
{1, 2, 3}
{2, 3} {1}
{1, 3} {2}
{3} {1, 2}
{3} {2} {1}

It’s obvious that these verify the definition: {1}, {2}, {3}, {1, 2}, {1, 3}, {2, 3} and {1, 2, 3} are all subsets of {1, 2, 3}. They’re all non-empty and, in any partition, the same element never appears twice. Finally, in a partitioning, the union of the partitions is the original set.

In how many ways can you partition a set of n elements? There are many ways to calculate this, but as far as I can tell, the easiest is using Catalan numbers:
Formula for the nth Catalan Number

If you check the formula for 3 you’ll see that it does give the correct answer: 5.

A reader pointed out that what we may need here are not Catalan numbers, but Bell numbers. Wikipedia’s definition seems to agree with him.

Ok. We know what a partitioning is, we know how many there are, but how do you generate them? This is the first algorithm I could think of. It may not be clear from the explanation why it works but try it on a piece of paper for n=3 and it will become obvious. Here’s how I came up with it:

First of all, how do you represent a partitioning of a set of n elements? The straight-forward way would be using a vector of n integers, each integer representing the number of the subset in which the corresponding element is in. If the corresponding element of 3 is 2, that means that 3 is in the 2nd subset. So, given the set {1, 2, 3}:
Partitioning -> Encoding
{1, 2, 3} -> (1, 1, 1)
{1} {2, 3} -> (2, 1, 1)
{2} {1, 3} -> (1, 2, 1)
{1, 2} {3} -> (2, 2, 1)
{1} {2} {3} -> (3, 2, 1)

Notice that the encodings, written backwards are: 111, 112, 121, 122 and 123. From this you can guess how the generator works: more or less, generate all the numbers between 111 and 123 using only the digits 1, 2 and 3:

111
112
113
121
122
123

That’s almost right. The encodings (1, 1, 2) and (1, 1, 3) translate into the same partitioning: {1} {2, 3}. If you do the same thing for a larger n you’ll notice this happening again and again. Fortunately, there’s an easy solution: never use a digit that’s more than 1 larger than any other digit in the encoding. i.e. You can’t use (1, 1, 3) because 3 is larger by 2 than the other digits in the encoding (1 and 1).

To do this, I use another vector m with the following significance: m[i] is the largest of the first i elements in the encoding. This makes it very easy not to generate any duplicate partitionings.

Here’s the code in C (part.c):

#include <stdio.h>

/*
	printp
		- print out the partitioning scheme s of n elements 
		as: {1, 2, 4} {3}
*/
void printp(int *s, int n) {
	/* Get the total number of partitions. In the exemple above, 2.*/
	int part_num = 1;
	int i;
	for (i = 0; i < n; ++i)
		if (s[i] > part_num)
			part_num = s[i];

	/* Print the p partitions. */
	int p;
	for (p = part_num; p >= 1; --p) {
		printf("{");
		/* If s[i] == p, then i + 1 is part of the pth partition. */
		for (i = 0; i < n; ++i)
			if (s[i] == p)
				printf("%d, ", i + 1);
		printf("\\b\\b} ");
	}
	printf("\\n");
}

/*
	next
		- given the partitioning scheme represented by s and m, generate
		the next

	Returns: 1, if a valid partitioning was found
		0, otherwise
*/
int next(int *s, int *m, int n) {
	/* Update s: 1 1 1 1 -> 2 1 1 1 -> 1 2 1 1 -> 2 2 1 1 -> 3 2 1 1 ->
	1 1 2 1 ... */
	/*int j;
	printf(" -> (");
	for (j = 0; j &lt; n; ++j)
		printf("%d, ", s[j]);
	printf("\\b\\b)\\n");*/
	int i = 0;
	++s[i];
	while ((i < n - 1) &amp;&amp; (s[i] > m[i] + 1)) {
		s[i] = 1;
		++i;
		++s[i];
	}

	/* If i is has reached n-1 th element, then the last unique partitiong
	has been found*/
	if (i == n - 1)
		return 0;

	/* Because all the first i elements are now 1, s[i] (i + 1 th element)
	is the largest. So we update max by copying it to all the first i
	positions in m.*/
	int max = s[i];
	for (i = i - 1; i >= 0; --i)
		m[i] = max;

/*	for (i = 0; i &lt; n; ++i)
		printf("%d ", m[i]);
	getchar();*/
	return 1;
}

int main(int argc, char *argv[]) {
	int s[16]; /* s[i] is the number of the set in which the ith element
			should go */
	int m[16]; /* m[i] is the largest of the first i elements in s*/

	int n = 3;
	int i;
	/* The first way to partition a set is to put all the elements in the same
	   subset. */
	for (i = 0; i &lt; n; ++i) {
		s[i] = 1;
		m[i] = 1;
	}

	/* Print the first partitioning. */
	printp(s, n);

	/* Print the other partitioning schemes. */
	while (next(s, m, n))
		printp(s, n);

	return 0;
}

The code is heavily commented, but I’ll happily respond to any questions. This is also what I used to generate all the above listings. Try decommenting some of the code to see how the programme works. Good luck!

P.S. Every encoding after (3, 2, 1) yields a duplicate partitioning. For fun, try proving this mathematically.

About these ads