patternjavaMinor
Calculate conditional probabilities and perform naive Bayes classification on a given data set
Viewed 0 times
naivebayesconditionalperformprobabilitiescalculateandclassificationdatagiven
Problem
I wrote a class that I'm using to calculate conditional probabilities of a given distribution as well as perform naive Bayes classification. I'd like to get a code review done to tell me if there is anything that I can do to make what I wrote neater or perform better.
```
package edu.uba.filters;
import java.util.*;
import org.apache.commons.math3.util.Pair;
public class Probability {
protected Frequency iFrequency = new Frequency();
protected Frequency rFrequency = new Frequency();
private String[] targetClassKeys;
private HashMap priors = new HashMap();
private LinkedList> predictions = new LinkedList>();
public Probability(){
super();
}
public void setInterestedFrequency(List interestedFrequency){
for(String s: interestedFrequency){
this.iFrequency.addValue(s);
}
}
public void setReducingFrequency(List reducingFrequency){
for(String s:reducingFrequency){
this.rFrequency.addValue(s);
}
}
public LinkedList> getPredictions(){
return this.predictions;
}
/*
*return conditional probability of P(interestedClass|reducingClass)
*/
public double conditionalProbability(List interestedSet,
List reducingSet,
String interestedClass,
String reducingClass){
List conditionalData = new LinkedList();
double returnProb = 0;
iFrequency.clear();
rFrequency.clear();
this.setInterestedFrequency(interestedSet);
this.setReducingFrequency(reducingSet);
for(int i = 0;i targetClass, BayesOption bayesOption){
//intialize variables
int numOfClasses = data.getHeaders().size();
Object[] keyNames = data.getHeaders().toArray();
double conditionalProb = 1.0;
double prob = 1.0;
String[] rClass;
String priorName;
iFrequency.clear();
rFrequency.clear();
if(bayesOption.compareTo(BayesOption.TRAIN) == 0){
this.setInterestedFrequency(target
Probability class:```
package edu.uba.filters;
import java.util.*;
import org.apache.commons.math3.util.Pair;
public class Probability {
protected Frequency iFrequency = new Frequency();
protected Frequency rFrequency = new Frequency();
private String[] targetClassKeys;
private HashMap priors = new HashMap();
private LinkedList> predictions = new LinkedList>();
public Probability(){
super();
}
public void setInterestedFrequency(List interestedFrequency){
for(String s: interestedFrequency){
this.iFrequency.addValue(s);
}
}
public void setReducingFrequency(List reducingFrequency){
for(String s:reducingFrequency){
this.rFrequency.addValue(s);
}
}
public LinkedList> getPredictions(){
return this.predictions;
}
/*
*return conditional probability of P(interestedClass|reducingClass)
*/
public double conditionalProbability(List interestedSet,
List reducingSet,
String interestedClass,
String reducingClass){
List conditionalData = new LinkedList();
double returnProb = 0;
iFrequency.clear();
rFrequency.clear();
this.setInterestedFrequency(interestedSet);
this.setReducingFrequency(reducingSet);
for(int i = 0;i targetClass, BayesOption bayesOption){
//intialize variables
int numOfClasses = data.getHeaders().size();
Object[] keyNames = data.getHeaders().toArray();
double conditionalProb = 1.0;
double prob = 1.0;
String[] rClass;
String priorName;
iFrequency.clear();
rFrequency.clear();
if(bayesOption.compareTo(BayesOption.TRAIN) == 0){
this.setInterestedFrequency(target
Solution
public class Frequency> {
private Multiset event = HashMultiset.create();
private Multimap event2 = LinkedListMultimap.create();The first problem that I'd note is that the indent is off. Consider adding a level of indent inside the class like this
public class Frequency> {
private Multiset event = HashMultiset.create();
private Multimap event2 = LinkedListMultimap.create();Sometimes this is a copy/paste problem. If you select an entire block of code that contains at least one line with no indent, using Ctrl+K will add indent to the entire block. If all the lines already have indent, it decreases the indent.
Another issue in this section of code is that
Multiset and Multimap should be parameterized. Probabilityimport java.util.*;It's generally better to import each class separately. This imports everything in
java.util whether you use it or not. Many modern IDEs will handle imports for you, so it may not even be more work. protected Frequency iFrequency = new Frequency();
protected Frequency rFrequency = new Frequency();If you are using an up-to-date Java, you don't need to write out the parameters:
protected Frequency iFrequency = new Frequency<>();
protected Frequency rFrequency = new Frequency<>();It will figure out the matching type for you.
private HashMap priors = new HashMap();
private LinkedList> predictions = new LinkedList>();It's more common to define variables as interfaces rather than implementations, so
private Map priors = new HashMap<>();
private List> predictions = new LinkedList<>();It's possible that you are dealing with one of the exceptions to this with
predictions, but I don't see it in this code. With priors, you aren't using any HashMap specific functionality and you don't allow access to it directly. The point of this is to make it easier to change the implementations later. Will it ever matter in this code? Maybe not. But you will eventually write some code where changing the implementation makes sense. If you develop the habit now, that will be easier later.
public Probability(){
super();
}You don't have to do this. Java automatically calls the no argument
super for you. Also, Java will create a default no argument constructor for you. You only have to make a constructor if you want a different behavior from the default. public LinkedList> getPredictions(){
return this.predictions;
}Again, you don't need to write out
LinkedList. List is sufficient. public List> getPredictions(){
return predictions;
}You also don't need to say
this. unless you are resolving a conflict. I didn't notice any usages of this. that were actually needed. iFrequency.clear();
rFrequency.clear();
this.setInterestedFrequency(interestedSet);
this.setReducingFrequency(reducingSet);This is a weird pattern. You normally set object fields once, in the constructor. They contain data that is a characteristic of the object and allow you to access that data in multiple methods without passing it around as parameters. But you're not doing that here. You declare these as object fields but then you clear and initialize them inside a method. Then you clear them again at the end. Why not just declare them as local variables?
Frequency iFrequency = new Frequency<>();
Frequency rFrequency = new Frequency<>();Then they will automatically disappear at the end of the method. This also makes concurrency easier, but that may not matter here. As a general rule, you should always define variables at the smallest scope required. That can save accidental conflicts later.
You can then change your two
setFrequency methods to one addAll method on Frequency. So you can say iFrequency.addAll(interestedSet);
rFrequency.addAll(reducingSet);No need to maintain two identical methods.
Code Snippets
public class Frequency<T extends Comparable<T>> {
private Multiset event = HashMultiset.create();
private Multimap event2 = LinkedListMultimap.create();public class Frequency<T extends Comparable<T>> {
private Multiset<T> event = HashMultiset.create();
private Multimap<T> event2 = LinkedListMultimap.create();import java.util.*;protected Frequency<String> iFrequency = new Frequency<String>();
protected Frequency<String> rFrequency = new Frequency<String>();protected Frequency<String> iFrequency = new Frequency<>();
protected Frequency<String> rFrequency = new Frequency<>();Context
StackExchange Code Review Q#110295, answer score: 4
Revisions (0)
No revisions yet.