Wikipedia defines the **partition of a set** as:

In mathematics, a partition of a set X is a division of X into non-overlapping “parts” or “blocks” or “cells” that cover all of X. More formally, these “cells” are both collectively exhaustive and mutually exclusive with respect to the set being partitioned.

A more succinct definition is given by Mathworld:

A set partition of a set S is a collection of disjoint subsets of S whose union is S.

Simply put, the **partitions of a set S** are all the ways in which you can choose disjoint, non-empty subsets of S that unioned result in S.

From now on, when I say a set of *n* elements, I mean *{1, 2, …, n}*. So, what are the subsets of *{1, 2, 3}*?

*{1, 2, 3}
{2, 3} {1}
{1, 3} {2}
{3} {1, 2}
{3} {2} {1}*

It’s obvious that these verify the definition: *{1}*, *{2}*, *{3}*, *{1, 2}*, *{1, 3}*, *{2, 3}* and *{1, 2, 3}* are all subsets of *{1, 2, 3}*. They’re all non-empty and, in any partition, the same element never appears twice. Finally, in a partitioning, the union of the partitions is the original set.

In how many ways can you partition a set of *n* elements? There are many ways to calculate this, but as far as I can tell, the easiest is using Catalan numbers:

If you check the formula for *3* you’ll see that it does give the correct answer: *5*.

A reader pointed out that what we may need here are not Catalan numbers, but Bell numbers. Wikipedia’s definition seems to agree with him.

Ok. We know what a partitioning is, we know how many there are, but how do you generate them? This is the first algorithm I could think of. It may not be clear from the explanation why it works but try it on a piece of paper for *n=3* and it will become obvious. Here’s how I came up with it:

First of all, how do you represent a partitioning of a set of *n* elements? The straight-forward way would be using a vector of n integers, each integer representing the number of the subset in which the corresponding element is in. If the corresponding element of *3* is *2*, that means that *3* is in the *2 ^{nd}* subset. So, given the set {1, 2, 3}:

`Partitioning -> Encoding`

{1, 2, 3} -> (1, 1, 1)

{1} {2, 3} -> (2, 1, 1)

{2} {1, 3} -> (1, 2, 1)

{1, 2} {3} -> (2, 2, 1)

{1} {2} {3} -> (3, 2, 1)

Notice that the encodings, written backwards are: *111*, *112*, *121*, *122* and *123*. From this you can guess how the generator works: more or less, **generate all the numbers between 111 and 123 using only the digits 1, 2 and 3**:

111

112

113

121

122

123

That’s almost right. The encodings *(1, 1, 2)* and *(1, 1, 3)* translate into the same partitioning: *{1} {2, 3}*. If you do the same thing for a larger *n* you’ll notice this happening again and again. Fortunately, there’s an easy solution: **never use a digit that’s more than 1 larger than any other digit in the encoding**. i.e. You can’t use

*(1, 1, 3)*because

*3*is larger by

*2*than the other digits in the encoding (

*1*and

*1*).

To do this, I use another vector *m* with the following significance: *m[i]* is the largest of the first *i* elements in the encoding. This makes it very easy not to generate any duplicate partitionings.

Here’s the code in C (part.c):

#include <stdio.h> /* printp - print out the partitioning scheme s of n elements as: {1, 2, 4} {3} */ void printp(int *s, int n) { /* Get the total number of partitions. In the exemple above, 2.*/ int part_num = 1; int i; for (i = 0; i < n; ++i) if (s[i] > part_num) part_num = s[i]; /* Print the p partitions. */ int p; for (p = part_num; p >= 1; --p) { printf("{"); /* If s[i] == p, then i + 1 is part of the pth partition. */ for (i = 0; i < n; ++i) if (s[i] == p) printf("%d, ", i + 1); printf("\\b\\b} "); } printf("\\n"); } /* next - given the partitioning scheme represented by s and m, generate the next Returns: 1, if a valid partitioning was found 0, otherwise */ int next(int *s, int *m, int n) { /* Update s: 1 1 1 1 -> 2 1 1 1 -> 1 2 1 1 -> 2 2 1 1 -> 3 2 1 1 -> 1 1 2 1 ... */ /*int j; printf(" -> ("); for (j = 0; j < n; ++j) printf("%d, ", s[j]); printf("\\b\\b)\\n");*/ int i = 0; ++s[i]; while ((i < n - 1) && (s[i] > m[i] + 1)) { s[i] = 1; ++i; ++s[i]; } /* If i is has reached n-1 th element, then the last unique partitiong has been found*/ if (i == n - 1) return 0; /* Because all the first i elements are now 1, s[i] (i + 1 th element) is the largest. So we update max by copying it to all the first i positions in m.*/ int max = s[i]; for (i = i - 1; i >= 0; --i) m[i] = max; /* for (i = 0; i < n; ++i) printf("%d ", m[i]); getchar();*/ return 1; } int main(int argc, char *argv[]) { int s[16]; /* s[i] is the number of the set in which the ith element should go */ int m[16]; /* m[i] is the largest of the first i elements in s*/ int n = 3; int i; /* The first way to partition a set is to put all the elements in the same subset. */ for (i = 0; i < n; ++i) { s[i] = 1; m[i] = 1; } /* Print the first partitioning. */ printp(s, n); /* Print the other partitioning schemes. */ while (next(s, m, n)) printp(s, n); return 0; }

The code is heavily commented, but I’ll happily respond to any questions. This is also what I used to generate all the above listings. Try decommenting some of the code to see how the programme works. Good luck!

P.S. Every encoding after *(3, 2, 1)* yields a duplicate partitioning. For fun, try proving this mathematically.

2007-12-22 at 5:06

I came across this web site while doing a google search on “set partition”. Nice writeup, but please correct the typo in the table with the heading that reads “Partitioning -> Encoding”. The encoding given there corresponds to the list of partitions given in the earlier paragraph, not the present paragraph.

Again, nice clear writeup, and good example code. Thanks.

– Pete

2008-02-02 at 13:09

You mean Bell Numbers, not Catalan Numbers.

http://en.wikipedia.org/wiki/Bell_number

http://mathworld.wolfram.com/BellNumber.html

[scvalex] Somehow, I think you’re right

2008-02-08 at 8:14

what is the complexity of above algorithm

2008-02-15 at 14:15

Complexity is

O(n2).It might be possible to show through amortised analysis that it is actually

O(n), but, somehow, I doubt it.2008-03-27 at 19:13

Great post and great program, but I think there might be a flaw: for n=6 it generates only 188 partitions instead of the expected 203.

I’ve looked a bit into it and found one missing partition (I think):

{1} {2, 4} {3} {5, 6}

AFAIU the corresponding encoding would be:

4 2 3 2 1 1

and it looks like it gets rejected because there’s a difference of two between the first two numbers.

I’d be interested in learning if this is indeed the problem and how to fix it.

2008-05-22 at 20:33

I think this a flaw in the reasoning here, which becomes apparent

when you try to generate all partitions of a set with 4 elements.

According to the approach above, you generate all integers

with digits 1,2,3,4, between 1111 and 1234,

and then prune out any integers

where one digit is greater than any other by more than 1. This leaves

us with the following 16 integers:

1111

1112

1121

1122

1123

1132

1211

1212

1213

1221

1222

1223

1231

1232

1233

1234

However, there are only 15 partitions of a set of 4 elements.

The problem is that encodings 1123 and 1132 represent the same partition, i.e., 1123 = {{1,2}, {3}, {4}} and 1132 = {{1,2}, {4}, {3}}.

Therefore, I think one of 1123 or 1132 needs to be pruned out.

Currently the one post-processing step (i.e., `never use a digit that’s more than 1 larger than any other digit in the encoding’)

does not handle this.

2009-11-20 at 2:30

1) Each digit should be not more than 1 above any EARLIER digit, not any other. So, 1132 is ineligible.

2) The method is correct but implementation contains a bug which prevents getting full set for n>4. The routine produces 51 partitions for n=5 (correct is 52, 13134 is missing), for n=6 only 188 instead of 203 etc.

3) The bug is that max (line 61) should be the larger of s[i] and m[i].

Just insert “if (m[i] > max) max = m[i]” after line 61 and you’re done.

2010-01-04 at 23:28

I found a good way to do this using recursion.

Instead of generating all partitions at once, I split the set in two, keeps the first part, and recur on the second.

It looks like this in python:

2010-01-04 at 23:38

Oops, there was a misspelling in the cleaned up version. The function name should of course be partitions.

You’ll notice the code generates the Bell numbers just fine:

2010-10-08 at 17:30

This is really useful!

Does anyone know a way to extend this to set a limit on the largest possible subgroup… I am doing some analysis on all possible subsets of say 7 objects – but 877 is too many. I would like to cap largest subgroup size as 4 to reduce the total number of partitions, but I’m not even sure how to calculate the number of such partitions.

I guess maybe with some more pruning – I have only just found this page so I need to read it with more care, but I wondered if this was a problem others have had.

2012-03-18 at 19:41

Hi all,

Just adding “if (m[i] > max) max = m[i]” does not make the code work.

To get a working code, do following modifications to the original code :

line 47 : while ((i m[i+1] + 1)) { // compare s[i] to m[i+1] and not to m[i]

replace lines 61, 62, 63 with :

if( s[i] > m[i] )

m[i] = s[i]

for ( int j = i – 1; j >= 0; –j)

m[j] = m[i];

2012-03-18 at 19:51

now with nice source code :

replace line 47 with :

replace lines 61, 62, 63 with :

2012-03-18 at 23:38

well adding “if (m[i] > max) max = m[i]” seems actually to make the code work. My version above ist just another way to use the m function, and works as well.

2012-04-06 at 3:51

Reblogged this on peladascomamaonabolsa.

2012-06-18 at 18:14

I have a simpler algorithm using a similar idea.

2012-07-04 at 9:30

Here’s a class that iteratively generates all partitions up to n=15.

http://code.google.com/p/consciousness/wiki/AlgorithmForGeneratingPartitions