patternjavaMinor
Expected value (and distribution) of sum of six balls labeled 1-49, no replacement
Viewed 0 times
sixlabeledexpectedvalueballssumanddistributionreplacement
Problem
This code finds expectation and standard deviation of sum(x) of 6 digited base-49 numbers with all digits distinct.
The expectation is: \$\mu={\rm E[X]}=\frac1n\sum_{k=0}^{n}x_i=\frac{x_1+x_2+\ldots+x_n}{n}\$
whereas standard deviation is: \${\rm \sigma[X]}=\sqrt{{\rm E[(X-\mu)^2]}}=\sqrt{\frac1n[(x_1-\mu)^2+(x_2-\mu)^2+\ldots+(x_n-\mu)^2]}\$
```
package test;
import java.util.ArrayList;
import java.util.Arrays;
public class BallsSum {
final static int BALLS_DRAWN = 3;
final static int TOTAL_BALLS = 49;
public static void main(String args[]) {
long startTime = System.nanoTime();
int mNumberOfWays = 0;
int mTotalSum = 0;
ArrayList mOneDrawPartialSums = new ArrayList();
BallCombination mBallCombination = new BallsSum().new BallCombination();
int MAX_POSSIBLE_COMB = 1;
for (int i = 0; i mFormedCombinations;
public BallCombination() {
this.mCombination = new int[BALLS_DRAWN];
for (int i = 0; i ();
this.mFormedCombinations.add(this.mCombination);
}
public void print() {
System.out.print("(");
for (int i = 0; i < BALLS_DRAWN; i++) {
System.out.print(this.mCombination[i]);
if (i != BALLS_DRAWN - 1) {
System.out.print(",");
}
}
System.out.println(")");
}
public boolean getNewCombination() {
for (int i = 0; i < BALLS_DRAWN; i++) {
if (this.mCombination[i] < TOTAL_BALLS) {
++this.mCombination[i];
// Debug printing System.out.print("newComb:");
print();
if (isUnique(this.mCombination)) {
mFormedCombinations.add(mCombination);
return true;
}
} else {
this.mCombination[i] = 1;
}
}
return false;
}
private boolean isUnique(int[] pCombination) {
if (containsDuplicate(pCombination)) {
return
The expectation is: \$\mu={\rm E[X]}=\frac1n\sum_{k=0}^{n}x_i=\frac{x_1+x_2+\ldots+x_n}{n}\$
whereas standard deviation is: \${\rm \sigma[X]}=\sqrt{{\rm E[(X-\mu)^2]}}=\sqrt{\frac1n[(x_1-\mu)^2+(x_2-\mu)^2+\ldots+(x_n-\mu)^2]}\$
```
package test;
import java.util.ArrayList;
import java.util.Arrays;
public class BallsSum {
final static int BALLS_DRAWN = 3;
final static int TOTAL_BALLS = 49;
public static void main(String args[]) {
long startTime = System.nanoTime();
int mNumberOfWays = 0;
int mTotalSum = 0;
ArrayList mOneDrawPartialSums = new ArrayList();
BallCombination mBallCombination = new BallsSum().new BallCombination();
int MAX_POSSIBLE_COMB = 1;
for (int i = 0; i mFormedCombinations;
public BallCombination() {
this.mCombination = new int[BALLS_DRAWN];
for (int i = 0; i ();
this.mFormedCombinations.add(this.mCombination);
}
public void print() {
System.out.print("(");
for (int i = 0; i < BALLS_DRAWN; i++) {
System.out.print(this.mCombination[i]);
if (i != BALLS_DRAWN - 1) {
System.out.print(",");
}
}
System.out.println(")");
}
public boolean getNewCombination() {
for (int i = 0; i < BALLS_DRAWN; i++) {
if (this.mCombination[i] < TOTAL_BALLS) {
++this.mCombination[i];
// Debug printing System.out.print("newComb:");
print();
if (isUnique(this.mCombination)) {
mFormedCombinations.add(mCombination);
return true;
}
} else {
this.mCombination[i] = 1;
}
}
return false;
}
private boolean isUnique(int[] pCombination) {
if (containsDuplicate(pCombination)) {
return
Solution
I am sure there is a better way to compute the value mathematically, but brute-forcing the solution is quite a lot faster than the 6 hours you suggest.... I have 49/6 down to less than 2 seconds on my laptop... how is this done?
Firstly, math is your friend. There are a fixed number of combinations that are possible:
\$Cn = 49 \times 48 \times 47 \times 46 \times 45 \times 44\$
This computes to
To brute-force efficiently, use primitives, no objects (and garbage collection)! Here's how to brute it - count the number of times each sum-of-balls happens.
-
record that sum in the count:
-
'increment' the ball combination using some intelligence (in this case, increment to
At this point, you have an array of counts... where the index is the sum, and the value in the array is the number of times that sum was encountered.
For example, for 2 balls of maximum 6, the counts are:
at sum 3, that would be ball 1 and 2 (sum to 3). ball 1 and 3 sum to 4, balls 1 and 4, as well as 2, and 3 sum to 5, and so on.
Now, with this data, it is easy to calculate the mean, and standard deviations.... I used some Java 8 streams to do it...
The standard deviation required a little helper function which takes a deviation, and a count, and squares the deviation, and multiplies by the count:
We can use that function to calculate the total deviations:
The, with that total deviation, the Std. deviation is just:
Putting all of this logic together, you get:
For me, this produces output similar to:
Though, having done all this, I am pretty sure there's a mathematical way to do the same process much faster sti
Firstly, math is your friend. There are a fixed number of combinations that are possible:
\$Cn = 49 \times 48 \times 47 \times 46 \times 45 \times 44\$
This computes to
10,068,347,520. These 10 billion combinations do not take in to account that the ball order of selection is not significant. Since there are 720 ways (6 5 4 3 2 * 1) to order 6 balls, there are really only 10bil / 720 sequences, (about 14 million - 13,983,816) of balls that are possible. These can actually be brute-forced.To brute-force efficiently, use primitives, no objects (and garbage collection)! Here's how to brute it - count the number of times each sum-of-balls happens.
- the maximum sum will come from the combination
[44,45,46,47,48,49]
- create an array with space to count sum instances up to that magnitude
- start with a collection of N=6 balls
[1,2,3,4,5,6]with an upper limit of value 49.
-
record that sum in the count:
sum = 6 + 5 + 4 + 3 + 2 + 1;
counts[sum]++;-
'increment' the ball combination using some intelligence (in this case, increment to
[1, 3, 4, 5, 6, 7] - and then to [2, 3, 4, 5, 6, 7] and so on)- increment the counter for the sum of each combination as you go.
At this point, you have an array of counts... where the index is the sum, and the value in the array is the number of times that sum was encountered.
For example, for 2 balls of maximum 6, the counts are:
sum -> 0 1 2 3 4 5 6 7 8 9 10 11
count -> 0, 0, 0, 1, 1, 2, 2, 3, 2, 2, 1, 1at sum 3, that would be ball 1 and 2 (sum to 3). ball 1 and 3 sum to 4, balls 1 and 4, as well as 2, and 3 sum to 5, and so on.
Now, with this data, it is easy to calculate the mean, and standard deviations.... I used some Java 8 streams to do it...
final long count = LongStream.of(counts).sum();
final long total = IntStream.range(0, counts.length)
.mapToLong(index -> index * counts[index])
.sum();
final double mean = total / (double)count;The standard deviation required a little helper function which takes a deviation, and a count, and squares the deviation, and multiplies by the count:
private static double valDev(final double diff, final long count) {
return diff * diff * count;
}We can use that function to calculate the total deviations:
final double dev = IntStream
.range(0, counts.length)
.mapToDouble(index -> valDev(index - mean, counts[index]))
.sum();The, with that total deviation, the Std. deviation is just:
final double sd = Math.sqrt(dev / count);Putting all of this logic together, you get:
private static final long[] countComboSums(final int balls, final int base) {
int size = 1;
int[] state = new int[balls];
for (int i = 0; i max) {
return false;
}
int c = 1;
while (c = 0) {
state[c] = c + 1;
}
return true;
}
private static void logState(int[] state, int balls, long[] ret) {
ret[IntStream.of(state).sum()]++;
}
private static void processCombo(int base, int balls) {
long nanos = System.nanoTime();
final long[] counts = countComboSums(balls, base);
final long count = LongStream.of(counts).sum();
final long total = IntStream.range(0, counts.length)
.mapToLong(index -> index * counts[index]).sum();
final double mean = total / (double) count;
final double dev = IntStream
.range(0, counts.length)
.mapToDouble(index -> valDev(index - mean, counts[index]))
.sum();
final double sd = Math.sqrt(dev / count);
nanos = System.nanoTime() - nanos;
System.out.printf("For base %d with %d balls in %.3fs, mean is %.3f (sd %.3f) counts are %s\n",
base, balls, nanos / 1000000000.0, mean, sd, Arrays.toString(counts));
}
private static double valDev(final double diff, final long count) {
return diff * diff * count;
}
public static void main(String[] args) {
for (int i = 6; i <= 49; i++) {
for (int b = 1; b <= 6; b++) {
processCombo(i, b);
}
}
}For me, this produces output similar to:
.....
For base 48 with 6 balls in 1.709s, mean is 147.000 (sd 32.078) counts are
For base 49 with 1 balls in 0.000s, mean is 25.000 (sd 14.142) counts are
For base 49 with 2 balls in 0.000s, mean is 50.000 (sd 19.791) counts are
For base 49 with 3 balls in 0.002s, mean is 75.000 (sd 23.979) counts are
For base 49 with 4 balls in 0.029s, mean is 100.000 (sd 27.386) counts are
For base 49 with 5 balls in 0.264s, mean is 125.000 (sd 30.277) counts are
For base 49 with 6 balls in 1.933s, mean is 150.000 (sd 32.787) counts are
Though, having done all this, I am pretty sure there's a mathematical way to do the same process much faster sti
Code Snippets
sum = 6 + 5 + 4 + 3 + 2 + 1;
counts[sum]++;sum -> 0 1 2 3 4 5 6 7 8 9 10 11
count -> 0, 0, 0, 1, 1, 2, 2, 3, 2, 2, 1, 1final long count = LongStream.of(counts).sum();
final long total = IntStream.range(0, counts.length)
.mapToLong(index -> index * counts[index])
.sum();
final double mean = total / (double)count;private static double valDev(final double diff, final long count) {
return diff * diff * count;
}final double dev = IntStream
.range(0, counts.length)
.mapToDouble(index -> valDev(index - mean, counts[index]))
.sum();Context
StackExchange Code Review Q#86186, answer score: 8
Revisions (0)
No revisions yet.