patternjavaMinor
Comparing three data structures for dealing with probability distributions in Java
Viewed 0 times
threecomparingwithjavadealingdistributionsforprobabilitydatastructures
Problem
Introduction
Suppose you are given three elements \$a, b, c\$ with respective weights \$1, 1, 3\$. Now, a probability distribution data structures will return upon request \$a\$ with probability 20%, \$b\$ with probability 20%, and \$c\$ with probability 60%.
The API for my probability distribution data structures is defined by the following abstract class:
```
package net.coderodde.stat;
import java.util.Objects;
import java.util.Random;
/**
* This class implements an abstract base class for probability distributions.
* Elements are added with strictly positive weights and whenever asking this
* data structure for a random element, their respective weights are taken into
* account. For example, if this data structure contains three different
* elements (a, b, c with respective weights
* 1.0, 1.0, 3.0), whenever asking for a random
* element, there is 20 percent chance of obtaining a, 20 percent
* chance of obtaining b, and 60 percent chance of obtaining
* c.
*
* @author Rodion "rodde" Efremov
* @version 1.6 (Jun 11, 2016)
*/
public abstract class AbstractProbabilityDistribution {
/**
* The amount of elements in this probability distribution.
*/
protected int size;
/**
* The sum of all weights.
*/
protected double totalWeight;
/**
* The random number generator of this probability distribution.
*/
protected final Random random;
/**
* Constructs this probability distribution.
*/
protected AbstractProbabilityDistribution() {
this(new Random());
}
/**
* Constructs this probability distribution using the input random number
* generator.
*
* @param random the random number generator.
*/
protected AbstractProbabilityDistribution(final Random random) {
this.random =
Objects.requireNonNull(random,
"The random number generator is null.");
}
public boolea
Suppose you are given three elements \$a, b, c\$ with respective weights \$1, 1, 3\$. Now, a probability distribution data structures will return upon request \$a\$ with probability 20%, \$b\$ with probability 20%, and \$c\$ with probability 60%.
The API for my probability distribution data structures is defined by the following abstract class:
```
package net.coderodde.stat;
import java.util.Objects;
import java.util.Random;
/**
* This class implements an abstract base class for probability distributions.
* Elements are added with strictly positive weights and whenever asking this
* data structure for a random element, their respective weights are taken into
* account. For example, if this data structure contains three different
* elements (a, b, c with respective weights
* 1.0, 1.0, 3.0), whenever asking for a random
* element, there is 20 percent chance of obtaining a, 20 percent
* chance of obtaining b, and 60 percent chance of obtaining
* c.
*
* @author Rodion "rodde" Efremov
* @version 1.6 (Jun 11, 2016)
*/
public abstract class AbstractProbabilityDistribution {
/**
* The amount of elements in this probability distribution.
*/
protected int size;
/**
* The sum of all weights.
*/
protected double totalWeight;
/**
* The random number generator of this probability distribution.
*/
protected final Random random;
/**
* Constructs this probability distribution.
*/
protected AbstractProbabilityDistribution() {
this(new Random());
}
/**
* Constructs this probability distribution using the input random number
* generator.
*
* @param random the random number generator.
*/
protected AbstractProbabilityDistribution(final Random random) {
this.random =
Objects.requireNonNull(random,
"The random number generator is null.");
}
public boolea
Solution
Inconsistent use of
I'm not sure why you used a
Array implementation suggestion
You could modify your Array implementation to do the following:
-
Adding an element remains constant time because you can compute the cumulative weight of the new element in constant time like this:
This implemenation would potentially be superior to the binary tree implementation if removals were expected to be rare.
thisI'm not sure why you used a
this prefix for many of your variables, when it was optional to do so. I would have omitted it myself. But assuming you had a reason for doing so, you didn't do it consistently. It seems like around 70% of the places use it and 30% don't.Array implementation suggestion
You could modify your Array implementation to do the following:
- In addition to storing an array of objects and an array of weights, you could make a third array which stores cumulative weights. In other words,
cumulativeWeight[n]would be the sum ofweightStorageArray[i]fori = 0..n.
- With the cumulative weight array,
sampleElement()could now be done using a binary search through the cumulative weight array. This reduces sample time to \$O(\log n)\$ instead of \$O(n)\$.
-
Adding an element remains constant time because you can compute the cumulative weight of the new element in constant time like this:
cumulativeWeight[this.size] = weight +
(this.size == 0) ? 0 : cumulativeWeight[this.size-1];- Removing an element still takes linear time, and you will need to adjust each element of
cumulativeWeightas you shift it downwards.
This implemenation would potentially be superior to the binary tree implementation if removals were expected to be rare.
Code Snippets
cumulativeWeight[this.size] = weight +
(this.size == 0) ? 0 : cumulativeWeight[this.size-1];Context
StackExchange Code Review Q#132088, answer score: 3
Revisions (0)
No revisions yet.