HiveBrain v1.2.0
Get Started
← Back to all entries
patternjavaMinor

Weighted probabilistic sampling

Submitted by: @import:stackexchange-codereview··
0
Viewed 0 times
weightedprobabilisticsampling

Problem

Problem description.
Returns strings with probability determined by the frequency
of each of the strings.
eg: if "foo" has weight of 50% and "bar" has weight of another 50% then both foo and
bar should be 5 and 5 times each given that 10 tries were made.

If question is unclear let me know I will reply asap.
Looking for code review, optimizations and best practice.

Also verifying that complexity of next() is undefined due to probabilistic nature.

```
public final class WeightedRandom {

private final List> itemDataList;
private final int size;
private int count;

/**
* Takes in a list of items.
*
* @param items the list of items
* @throws NullPointerException if items is null.
* @throws IllegalArgumentException if the input list size is zero.
*/
public WeightedRandom (List items) {
if (items.size() == 0) throw new IllegalArgumentException("The size of list should be greater than zero.");
size = items.size();
itemDataList = new ArrayList>();
addAll(items);
}

private void addAll(List items) {
final Map map = new HashMap();
for (T item : items) {
int val = 0;
if (map.containsKey(item)) {
val = map.get(item);
}
map.put(item, val + 1);
}

for (Entry entry : map.entrySet()) {
itemDataList.add(new ItemData(entry.getKey(), entry.getValue(), 0));
}
}

private static class ItemData {
T item;
int frequency;
int usedCount;

ItemData (T item, int frequency, int usedCount) {
this.item = item;
this.frequency = frequency;
this.usedCount = usedCount;
}
}

/**
* Returns strings with probability determined by the frequency
* of each of the strings.
* eg: if "foo" has weight of 50% and "bar" has weight of another 50% then both foo and
* bar should be 5 and 5 times

Solution

The code looks nice. A few minor improvements or ideas:

-
You need only two simple methods to make it an Iterator:

public final class WeightedRandom implements Iterator {

    ...

    @Override
    public boolean hasNext() {
        return true;
    }

    @Override
    public void remove() {
        throw new UnsupportedOperationException();
    }
}


It could be useful/convenient for clients.

-

if (items.size() == 0) {
    throw new IllegalArgumentException("The size of list should be greater than zero.");
}


items.isEmpty() would be a little bit readable and you could use Guava's Preconditons too to make it more fluent and explicit:

checkNotNull(items, "items cannot be null");
checkArgument(!items.isEmpty(), "The size of list should be greater than zero.");


-

itemDataList = new ArrayList>();


This could be in the declaration:

import static com.google.common.collect.Lists.newArrayList;

...

    private final List> itemDataList = newArrayList();


-
The constructor of ItemData is always called with zero usedCount. Consider setting a default value for that parameter and remove it from the signature:

private static class ItemData {
    T item;
    int frequency;
    int usedCount = 0;

    ItemData(final T item, final int frequency) {
        this.item = item;
        this.frequency = frequency;
    }
}


Currently the number 0 is a magic number in the constructor call:

itemDataList.add(new ItemData(entry.getKey(), entry.getValue(), 0));


Readers have to check the constructor of ItemData to figure out what it means. An explanatory local variable be an improvement. (Clean Code by Robert C. Martin, G19: Use Explanatory Variables; Refactoring: Improving the Design of Existing Code by Martin Fowler, Introduce Explaining Variable)

-
addAll would be a little bit shorter with a Multiset:

private void addAll(final List items) {
    final Multiset multiset = HashMultiset.create();
    multiset.addAll(items);

    for (final Entry entry: multiset.entrySet()) {
        itemDataList.add(new ItemData(entry.getElement(), entry.getCount()));
    }
}


-
Javadoc of next speaks about strings but it could be any type (T). It's a little bit confusing.

-
Javadoc typo: exptected (Eclipse has spell check.)

Code Snippets

public final class WeightedRandom<T> implements Iterator<T> {

    ...

    @Override
    public boolean hasNext() {
        return true;
    }

    @Override
    public void remove() {
        throw new UnsupportedOperationException();
    }
}
if (items.size() == 0) {
    throw new IllegalArgumentException("The size of list should be greater than zero.");
}
checkNotNull(items, "items cannot be null");
checkArgument(!items.isEmpty(), "The size of list should be greater than zero.");
itemDataList = new ArrayList<ItemData<T>>();
import static com.google.common.collect.Lists.newArrayList;

...

    private final List<ItemData<T>> itemDataList = newArrayList();

Context

StackExchange Code Review Q#40823, answer score: 3

Revisions (0)

No revisions yet.