patterncsharpModerate
Get distinct combinations of numbers
Viewed 0 times
distinctnumbersgetcombinations
Problem
The below code returns all distinct combinations based on the logic that 1,2,3 = 3,2,1 = 2,3,1, so it only returns 1 instance of that set of numbers.
This is the end result I am trying to achieve: (no duplicate rows of combinations: duplicate = 3,2,1 and 3,2,1 are the same thing. but 1,2,3 and 3,2,1 are NOT the same thing and both should be in the end result)
```
1
2
3
1,2
1,3
2,1
2,3
2,2
3,1
3,2
3,3
1,2,3
1,2,2
1,3,2
1,3,3
2,1,3
2,1,2
2,3,1
2,3,2
2,3,3
2,2,1
2,2,3
3,1,2
3,1,3
3,2,1
3,2,2
3,2,3
3,3,1
3,3,2
1,2,3,2
1,2,3,3
1,2,2,3
1,3,2,2
1,3,2,3
1,3,3,2
2,1,3,2
2,1,3,3
2,1,2,3
2,3,1,2
2,3,1,3
2,3,2,1
2,3,2,3
2,3,3,1
2,3,3,2
2,2,1,3
2,2,3,1
2,2,3,3
3,1,2,2
3,1,2,3
3,1,3,2
3,2,1,2
3,2,1,3
3,2,2,1
3,2,2,3
3,2,3,1
3,2,3,2
3,3,1,2
3,3,2,1
3,3,2,2
1,2,3,2,3
1,2,3,3,2
1,2,2,3,3
1,3,2,2,3
1,3,2,3,2
1,3,3,2,2
2,1,3,2,3
2,1,3,3,2
2,1,2,3,3
2,3,1,2,3
2,3,1,3,2
2,3,2,1,3
2,3,2,3,1
2,3,3,1,2
2,3,3,2,1
2,2,1,3,3
2,2,3,1,3
2,2,3,3,1
3,1,2,2,3
3,1,2,3,2
3,1,3,2,2
3,2,1,2,3
3,2,1,3,2
3,2,2,1,3
3,2,2,3,1
3,2,3,1,2
3,2,3,2,1
3,3,1,2,2
3,3,2,1,
public void GetPowersets()
{
List ints = new List()
{
1,2,2,3,3
};
var results = GetPowerSet(ints);
List combinations = new List();
foreach (var result in results)
{
StringBuilder sb = new StringBuilder();
foreach (var intValue in result.OrderBy(x => x))
{
sb.Append(intValue + ",");
}
combinations.Add(sb.ToString());
}
string c1 = string.Join("|", combinations.ToArray()).Replace(",|", "|");
//c1 = "|1|2|1,2|2|1,2|2,2|1,2,2|3|1,3|2,3|1,2,3|2,3|1,2,3|2,2,3|1,2,2,3|3|1,3|2,3|1,2,3|2,3|1,2,3|2,2,3|1,2,2,3|3,3|1,3,3|2,3,3|1,2,3,3|2,3,3|1,2,3,3|2,2,3,3|1,2,2,3,3,"
}
public IEnumerable> GetPowerSet(List list)
{
return from m in Enumerable.Range(0, 1 << list.Count)
select
from i in Enumerable.Range(0, list.Count)
where (m & (1 << i)) != 0
select list[i];
}This is the end result I am trying to achieve: (no duplicate rows of combinations: duplicate = 3,2,1 and 3,2,1 are the same thing. but 1,2,3 and 3,2,1 are NOT the same thing and both should be in the end result)
```
1
2
3
1,2
1,3
2,1
2,3
2,2
3,1
3,2
3,3
1,2,3
1,2,2
1,3,2
1,3,3
2,1,3
2,1,2
2,3,1
2,3,2
2,3,3
2,2,1
2,2,3
3,1,2
3,1,3
3,2,1
3,2,2
3,2,3
3,3,1
3,3,2
1,2,3,2
1,2,3,3
1,2,2,3
1,3,2,2
1,3,2,3
1,3,3,2
2,1,3,2
2,1,3,3
2,1,2,3
2,3,1,2
2,3,1,3
2,3,2,1
2,3,2,3
2,3,3,1
2,3,3,2
2,2,1,3
2,2,3,1
2,2,3,3
3,1,2,2
3,1,2,3
3,1,3,2
3,2,1,2
3,2,1,3
3,2,2,1
3,2,2,3
3,2,3,1
3,2,3,2
3,3,1,2
3,3,2,1
3,3,2,2
1,2,3,2,3
1,2,3,3,2
1,2,2,3,3
1,3,2,2,3
1,3,2,3,2
1,3,3,2,2
2,1,3,2,3
2,1,3,3,2
2,1,2,3,3
2,3,1,2,3
2,3,1,3,2
2,3,2,1,3
2,3,2,3,1
2,3,3,1,2
2,3,3,2,1
2,2,1,3,3
2,2,3,1,3
2,2,3,3,1
3,1,2,2,3
3,1,2,3,2
3,1,3,2,2
3,2,1,2,3
3,2,1,3,2
3,2,2,1,3
3,2,2,3,1
3,2,3,1,2
3,2,3,2,1
3,3,1,2,2
3,3,2,1,
Solution
Let's talk algorithms, and complexity..... and what the actual problem is that you are trying to solve.
Background
Three concepts: distinct, permutation, and combination
Distinct
Distinct values are each unique within their peers. In your case, you have the input values {1, 2, 2, 3, 3}. The distinct set is just {1, 2, 3}. There are 5 input values, there are 3 distinct values.
Now, the 'original' problem extended this concept of distinct to apply to ordered sets as well. So, with the combinations for {1, 2, 2, 3, 3}, two of the possible combinations are where just the second member is in the result, and just the third member is in the result... these are the combinations {2} and {2}, but they are different 2's. The original code discards the second combination as a duplicate of the first.
Your goal is to keep both of them, since they are 'different' 2 values.
Permutations
Permutations are about order. What ways are there to order a set of data. There is no concept of what the data is, only where the data is. With the input data {1, 2, 3}, what are the permutations?
There can be a lot of permutations, and the numbers get large fast.
The actual formula is a factorial, as in \$n!\$ where \$n\$ is the number of values in the set. If there is one member, there is only one permutation.
If there are two, then there are two permutations. Now, the way to visualize it, is that, we know there are two ways to arrange two members.... with three members, we start building permutations... take the first member from the set, and there are 2 ways to arrange the remaining 2 members. Then take the second member from the set, and there are two ways to arrange the remaining 2. Finally, take the last member from the set, and there are two ways to arrange the remaining 2. In other words, there are \$3 \times 2\$ permutations. If we reduce the logic down to just one member in the set, then the actual sequence is \$3 \times 2 \times 1\$. Then, if we add a fourth member, it becomes \$4 \times 3 \times 2 \times 1\$
Factorial numbers get large, fast. So, the following factorials are 'realistic' (for an arbitrary definition of realistic):
Calculating the permutations for even a moderate (say 10) sized dataset is a large undertaking.... (>3.5million permutations)
Combinations
Combinations are about whether data is in the set, or not. Combinations consist of a set of conditions: either the data is in, or it is not. Where the data is, in the output set, is not relevant for combinations, only that it is there. So, If you have the input values {1, 2, 3}, then the combinations are:
The math behind combinations is relatively simple. There are \$n\$ members in an input set, each member is either in the output set, or not in the output set. Now, because there is a 'binary' condition (either in, or out), you can think of the problem as follows:
If there is just one input member, then that member is either in the set, or not. There are 2 combinations, the empty set, and the set with that one member.
If we add a second input member, then we can have the same two combinations of the first member, but then we can also add the second member to the output, and get another two combinations. As follows (for the input data {a, b}):
If we add a third input member, we can have the same 4 combinations above, but then we can also have it with the third member.
There is a pattern here, each time we add a member, we double the number of combinations. Now, in computer science, it happens that there is a very convenient system we can use to help with this... it's called an 'integer'. Consider an input dataset with 3 members.... Now, consider a 3-bit integer. Let's count with a 3-bit integer:
This bit representation of our three-bit number shows the possible combinations! If we take an input data set of {a, b, c}, and then use the bits in the above counting to indicate whether the data member is 'in' the combination, or 'out', then you can do the following:
To read the above table, you need to say, 'if the bit in the same position as the input data is set, then include the value in the input data in the combination'.
This leads on to the number of combinations that are possible, and we can borrow from the binary numbers again. The number is \$2^n\$ where \$n\$ is the number of members in the input set.
So, with 10 input members, there`s \$2^{10}\$ combinations, or 1024. With 16 input values,
Background
Three concepts: distinct, permutation, and combination
Distinct
Distinct values are each unique within their peers. In your case, you have the input values {1, 2, 2, 3, 3}. The distinct set is just {1, 2, 3}. There are 5 input values, there are 3 distinct values.
Now, the 'original' problem extended this concept of distinct to apply to ordered sets as well. So, with the combinations for {1, 2, 2, 3, 3}, two of the possible combinations are where just the second member is in the result, and just the third member is in the result... these are the combinations {2} and {2}, but they are different 2's. The original code discards the second combination as a duplicate of the first.
Your goal is to keep both of them, since they are 'different' 2 values.
Permutations
Permutations are about order. What ways are there to order a set of data. There is no concept of what the data is, only where the data is. With the input data {1, 2, 3}, what are the permutations?
- [1, 2, 3]
- [1, 3, 2]
- [2, 1, 3]
- [2, 3, 1]
- [3, 1, 2]
- [3, 2, 1]
There can be a lot of permutations, and the numbers get large fast.
The actual formula is a factorial, as in \$n!\$ where \$n\$ is the number of values in the set. If there is one member, there is only one permutation.
If there are two, then there are two permutations. Now, the way to visualize it, is that, we know there are two ways to arrange two members.... with three members, we start building permutations... take the first member from the set, and there are 2 ways to arrange the remaining 2 members. Then take the second member from the set, and there are two ways to arrange the remaining 2. Finally, take the last member from the set, and there are two ways to arrange the remaining 2. In other words, there are \$3 \times 2\$ permutations. If we reduce the logic down to just one member in the set, then the actual sequence is \$3 \times 2 \times 1\$. Then, if we add a fourth member, it becomes \$4 \times 3 \times 2 \times 1\$
Factorial numbers get large, fast. So, the following factorials are 'realistic' (for an arbitrary definition of realistic):
- 1
- 2
- 6
- 24
- 120
- 720
- 5040
- 40320
- 362880
- 3628800
- 39916800
- 479001600
- 6227020800
- 87178291200
- 1307674368000
- 20922789888000
- 355687428096000
- 6402373705728000
- 121645100408832000
- 2432902008176640000
Calculating the permutations for even a moderate (say 10) sized dataset is a large undertaking.... (>3.5million permutations)
Combinations
Combinations are about whether data is in the set, or not. Combinations consist of a set of conditions: either the data is in, or it is not. Where the data is, in the output set, is not relevant for combinations, only that it is there. So, If you have the input values {1, 2, 3}, then the combinations are:
- {}
- {1}
- {2}
- {3}
- {1, 2}
- {1, 3}
- {2, 3}
- {1, 2, 3}
The math behind combinations is relatively simple. There are \$n\$ members in an input set, each member is either in the output set, or not in the output set. Now, because there is a 'binary' condition (either in, or out), you can think of the problem as follows:
If there is just one input member, then that member is either in the set, or not. There are 2 combinations, the empty set, and the set with that one member.
If we add a second input member, then we can have the same two combinations of the first member, but then we can also add the second member to the output, and get another two combinations. As follows (for the input data {a, b}):
[ ]
[ b]
[a ]
[a b]
If we add a third input member, we can have the same 4 combinations above, but then we can also have it with the third member.
There is a pattern here, each time we add a member, we double the number of combinations. Now, in computer science, it happens that there is a very convenient system we can use to help with this... it's called an 'integer'. Consider an input dataset with 3 members.... Now, consider a 3-bit integer. Let's count with a 3-bit integer:
000
001
010
011
100
101
110
111This bit representation of our three-bit number shows the possible combinations! If we take an input data set of {a, b, c}, and then use the bits in the above counting to indicate whether the data member is 'in' the combination, or 'out', then you can do the following:
000
001 c
010 b
011 bc
100 a
101 a c
110 ab
111 abcTo read the above table, you need to say, 'if the bit in the same position as the input data is set, then include the value in the input data in the combination'.
This leads on to the number of combinations that are possible, and we can borrow from the binary numbers again. The number is \$2^n\$ where \$n\$ is the number of members in the input set.
So, with 10 input members, there`s \$2^{10}\$ combinations, or 1024. With 16 input values,
Code Snippets
000
001
010
011
100
101
110
111000
001 c
010 b
011 bc
100 a
101 a c
110 ab
111 abc5 * 1 (5 combinations have 1 permutation each)
10 * 2 (10 combinations have 2 permutations each)
10 * 6
5 * 24
1 * 120 (1 combination has 120 permutations).Context
StackExchange Code Review Q#51938, answer score: 12
Revisions (0)
No revisions yet.