patternjavaMinor
Java class for simple statistics calculations
Viewed 0 times
simplejavacalculationsstatisticsforclass
Problem
I've written a class for a project of mine whose responsibility is to perform basic statistic calculations like mean, median and mode. The aim is for it to be a helper class, so it should not be instantiable. I later want to extend it to include weighted mean and range/interquartile range as well as variance.
Here is the code:
```
import java.util.Arrays;
import java.util.HashMap;
public class BasicStatistics {
//Prevent instantiation...
//throws UnsupportedOperationException to defend against reflection
private BasicStatistics() throws UnsupportedOperationException{
throw new UnsupportedOperationException();
}
public static class Mode {
private static int mode;
private static HashMap hm = new HashMap();
private Mode() throws UnsupportedOperationException{
throw new UnsupportedOperationException();
}
public static int calculateMode(int ints[]){
int max = 0;
int temp = 0;
for (int i = 0; i max){
max = count;
temp = ints[i];
}
} else {
hm.put(ints[i], 1);
}
}
if (temp == 0){
mode = ints[0];
} else {
mode = temp;
}
return mode;
}
}
public static class Mean {
private static float mean;
private Mean() throws UnsupportedOperationException{
throw new UnsupportedOperationException();
}
public static float calculateMean(int ints[]){
for (int element : ints){
mean += element;
}
return mean / ints.length;
}
}
public static class Median {
private static float median;
private Median() throws UnsupportedOperationException{
throw new UnsupportedOperationException();
}
Here is the code:
```
import java.util.Arrays;
import java.util.HashMap;
public class BasicStatistics {
//Prevent instantiation...
//throws UnsupportedOperationException to defend against reflection
private BasicStatistics() throws UnsupportedOperationException{
throw new UnsupportedOperationException();
}
public static class Mode {
private static int mode;
private static HashMap hm = new HashMap();
private Mode() throws UnsupportedOperationException{
throw new UnsupportedOperationException();
}
public static int calculateMode(int ints[]){
int max = 0;
int temp = 0;
for (int i = 0; i max){
max = count;
temp = ints[i];
}
} else {
hm.put(ints[i], 1);
}
}
if (temp == 0){
mode = ints[0];
} else {
mode = temp;
}
return mode;
}
}
public static class Mean {
private static float mean;
private Mean() throws UnsupportedOperationException{
throw new UnsupportedOperationException();
}
public static float calculateMean(int ints[]){
for (int element : ints){
mean += element;
}
return mean / ints.length;
}
}
public static class Median {
private static float median;
private Median() throws UnsupportedOperationException{
throw new UnsupportedOperationException();
}
Solution
Don't overdo Single Responsibility Principle
You have four classes, three of which calculate the mean, median, and mode respectively. The thing is though that their responsibility doesn't match their names. They have responsibility over the data, which they then express as the statistics. But if all the
Consider instead making one class with multiple methods. Then the class can have the data as a member.
Don't overuse
This means that every single instance of this class has the same
I also changed
I changed
The
If you need access to fields, then you should instantiate the class and access the fields through normal methods. E.g.
Now you can create a second object with different data without losing the first. Or worse, merging with it.
The presumption here is that
The
First, you don't need to make the
Second, since you sort the data to get the median anyway, you might as well use the sorted data to get the mode as well. With sorted data, you can simply track the mode of the data that you've seen.
This version returns a
This method is
I renamed
Because sorting clumps all the like values together, we don't need to store anything but the current and largest counts.
Saying
I don't verify that there is at least one element in the data. You might want to
What if there's a 0 in the data?
If the correct answer is a 0 that is not in the first position, this will return the wrong answer.
Consider making a
If you had a
You have four classes, three of which calculate the mean, median, and mode respectively. The thing is though that their responsibility doesn't match their names. They have responsibility over the data, which they then express as the statistics. But if all the
Mode is responsible for is the mode, then it shouldn't know about the data. All it should know about is the Mode itself. Consider instead making one class with multiple methods. Then the class can have the data as a member.
Don't overuse
staticprivate static int mode;
private static HashMap hm = new HashMap();This means that every single instance of this class has the same
mode and hm. private int mode;
private Map counts = new HashMap<>();I also changed
HashMap to Map on the left. Usually you make types the interface rather than the implementation. That ensures that you can change the implementation easily later if desired. I changed
hm to counts as being more descriptive. The
<> notation will only work in newer versions of Java but saves typing when available. If you need access to fields, then you should instantiate the class and access the fields through normal methods. E.g.
StatisticsCalculator statCalc = new StatisticsCalculator(data);
statCalc.calculate();
Set modes = getModes();Now you can create a second object with different data without losing the first. Or worse, merging with it.
The presumption here is that
calculate would calculate all the stats and you could just access them afterwards. If you always use them together, this is more efficient. You could also calculate them on a just-in-time basis. The
Map is unnecessaryFirst, you don't need to make the
Map a class field. It would be fine as a local variable. Second, since you sort the data to get the median anyway, you might as well use the sorted data to get the mode as well. With sorted data, you can simply track the mode of the data that you've seen.
private Set calculateMode() {
Set modes = new HashSet<>();
int largestCount = 0;
int count = 0;
int previous = sortedData[0];
for (int datum : sortedData) {
if (previous == datum) {
count++
} else {
count = 1;
}
if (count > largestCount) {
largestCount = count;
modes.clear();
modes.add(datum);
} else if (count == largestCount) {
modes.add(datum);
}
}
return modes;
}This version returns a
Set, since there is no guarantee that there will be a unique mode. This method is
private because it requires sorted data. A public version could check if the data was sorted prior to calling the private version. I renamed
temp to previous as being more descriptive. Because sorting clumps all the like values together, we don't need to store anything but the current and largest counts.
Saying
count++ is more idiomatic than count += 1. I don't verify that there is at least one element in the data. You might want to
throw an exception in that case. As is, both your original version and this version would crash, as they attempt to access the first element. It's possible that the correct place to do that validation is outside this method. What if there's a 0 in the data?
if (temp == 0){
mode = ints[0];If the correct answer is a 0 that is not in the first position, this will return the wrong answer.
Consider making a
calculateSum methodIf you had a
calculateSum method, you could make calculateMean a one liner. And report the sum of the data if you wanted.Code Snippets
private static int mode;
private static HashMap<Integer, Integer> hm = new HashMap<Integer, Integer>();private int mode;
private Map<Integer, Integer> counts = new HashMap<>();StatisticsCalculator statCalc = new StatisticsCalculator(data);
statCalc.calculate();
Set<Integer> modes = getModes();private Set<Integer> calculateMode() {
Set<Integer> modes = new HashSet<>();
int largestCount = 0;
int count = 0;
int previous = sortedData[0];
for (int datum : sortedData) {
if (previous == datum) {
count++
} else {
count = 1;
}
if (count > largestCount) {
largestCount = count;
modes.clear();
modes.add(datum);
} else if (count == largestCount) {
modes.add(datum);
}
}
return modes;
}if (temp == 0){
mode = ints[0];Context
StackExchange Code Review Q#139196, answer score: 6
Revisions (0)
No revisions yet.