There quite a few definitions of what a set is, but it all boils down to this:
A set defined as a collection of distinct elements, in which order is not important.
So {1, 2, 3}, {3, 4}, {} and {5, 99, -1} are all sets. Because the order of the elements is ignored, {1, 2, 3} and {3, 2, 1} is the same set. In case you’re wandering, there are exactly n! diffrent ways to write a set of n elements.
For the rest of the discussion, I’ll use sets of the form {1, 2, …, n}, so when I say a set of 3 elements, I mean {1, 2, 3}. Just remember that is not a property of sets. They can contain anything as elements, not necessarily consecutive numbers.
The set S1 is said to be the subset of the set S2, if all the elements of S1 also belong to S2.
Knowing this, it’s easy to figure out the subsets of {1, 2, 3}:
{ }
{ 1 }
{ 2 }
{ 1, 2 }
{ 3 }
{ 1, 3 }
{ 2, 3 }
{ 1, 2, 3 }
How many subsets are there? For a set of one element, there are 2 subsets: {} and {1}. For a set of 2 elements, there are 4 subsets: {}, {1}, {2}, {1, 2}. For a set of 3 elements, there are 8 subsets. Notice the pattern?
n = 1: 21
n = 2: 22
n = 3: 23
For a set of n there are 2n subsets. This is easily proved: Any subset of the set can either contain or not contain an element; so, for a subset, there are 2 states for the first element, 2 for the second element, …, 2 for the nth element; so, there are 2 states for the first element, 2 * 2 = 22 states for the first two, 2 * 2 * 2= 23 states for the first three, …, 2 * 2 * 2 * … * 2 = 2n states for all the n elements.
The problem here is how to generate all the subsets of a given set. There are a few algorithms for doing this, but in the end, only two are worth considering.
The first is this: given all the subsets of S and the element y, you can generate all the subsets of S U {y} by taking each subset of S, once adding to it y and once leaving it as it is. i.e. Knowing that {1, 3} is a subset of S, you obtain the following two subsets of S U {y}: {1, 3, y} and {1, 3}.
This does what it’s supposed to – it generates all the subsets of S, and it wastes no time. It can also be used as another way to prove that there are 2n subsets for any set of n elements. The only problem is that you need the subsets from the previous step to generate those of this step. This means that just before the end, you must have 2n – 1 subsets in memory. Considering how much memory computers have this days, it’s not particularly wasteful, but still, there’s a better way.
The better way involves using a mask. If you have the a set of n elements, a valid mask would be an array of n boolean (true/false; 1/0) elements. When you apply a mask to a set, you check each element (e) in the set and the corresponding one in the mask (m): if m is true(1), you add e to the result, otherwise, you ignore it. After applying the mask (0, 1, 0, 0, 1) to {1, 2, 3, 4, 5}, you get {2, 5}.
So, to generate all the subsets of a set of n elements, you first have to generate all the possible 2n masks of the set and then apply them.
Generating the masks is a simple problem. Basically, you just have to implement a binary counter, i.e. something that generates:
000
001
010
011
100
101
110
111
Here’s the code in C (sub.c):
#include <stdio.h>
/* Applies the mask to a set like {1, 2, ..., n} and prints it */
void printv(int mask[], int n) {
int i;
printf("{ ");
for (i = 0; i < n; ++i)
if (mask[i])
printf("%d ", i + 1); /*i+1 is part of the subset*/
printf("\\b }\\n");
}
/* Generates the next mask*/
int next(int mask[], int n) {
int i;
for (i = 0; (i < n) && mask[i]; ++i)
mask[i] = 0;
if (i < n) {
mask[i] = 1;
return 1;
}
return 0;
}
int main(int argc, char *argv[]) {
int n = 3;
int mask[16]; /* Guess what this is */
int i;
for (i = 0; i < n; ++i)
mask[i] = 0;
/* Print the first set */
printv(mask, n);
/* Print all the others */
while (next(mask, n))
printv(mask, n);
return 0;
}
Note: The next() function generates the bits in reverse order.
Always open to comments.
2007-10-12 at 11:58
Thank you so much. I was trying to catch a deadline for a project. I implemented your code and it worked perfect. Compared to other source in the internet simple and very well writen! Thanks!
UC Davis, CA
2007-10-28 at 4:41
[...] story here This entry was posted on Wednesday, October 10th, 2007 at 8:45 am and is filed under computer [...]
2007-12-18 at 16:21
What complexity does this algorithm have?
2007-12-31 at 19:42
Complexity is O(N). This can easily be proven using Amortized Analysis.
2008-05-30 at 6:51
thnx man its awesome .the concept is wowwwww
2008-09-13 at 10:48
thank you
2008-10-04 at 1:00
O(n)? oh dear, sure it isn’t…
2009-01-12 at 20:03
Is there a code that will generate the following patterns, given {a,b,c,d,e}
a,b,c,d,e
ab,c,d,e
a,bc,d,e
a,b,cd,e
a,b,c,de
abc,d,e etc.?
Note that all these patterns use all the components and furthermore the order is always preserved ie ac,b,d,e is not allowed.
2009-02-18 at 18:18
Thanks
it done well
2009-03-06 at 19:17
It isnt O(n), since you are generating 2^n subsets
hence it is O( 2^n )
2009-03-18 at 19:23
Depending on what you want to do with the subsets, a Gray code (http://en.wikipedia.org/wiki/Gray_code) can be a significant improvement. The idea is that it is possible to construct the masks so that at each step, the mask only changes in one spot.
For example, if you are trying to find the largest set without x,y,z such that x+z=2y (i.e., a set without 3 evenly space numbers), then a Gray code would allow you to only check for triples involving the new element.
In my experience, unfortunately, the overhead of managing the more complicated ordering of masks outweighs the advantage of Gray codes for reasonable values of n.
2009-04-12 at 8:26
thank you
2009-07-19 at 2:14
nice! i’m gonna make my own blog
2009-09-17 at 11:19
it is not a c code i need a c code so can u sub mit c code
2009-09-17 at 11:25
what does < mean in u r program
2009-09-17 at 15:17
what does for(i = 0; (i < n) && mask[i]; ++i)mean
2009-11-06 at 18:58
[...] Generating subsets Filed under: Uncategorized — tuxdna @ 3:52 pm http://compprog.wordpress.com/2007/10/10/generating-subsets/ [...]