patternjavaMinor
Baseline tagger
Viewed 0 times
taggerbaselinestackoverflow
Problem
This is a baseline tagger coed I have written. How can I optimize this code?
Confusion matrix (Part of AccuracyChecker class):
```
public ArrayList createConfusionMatrix(WordChecker checkedTagsList, ArrayList listOfWords){
//BeanComparator fieldComparator = new BeanComparator();
//Collections.sort(checkedTagsList, fieldComparator);
int distinctTagsSize = correctTags.size();
for(int i=0; i li
public class Tagger {
private ArrayList copyTagsList = new ArrayList();
public Tagger(WordChecker tagsList){
for(int i =0; i trainList){
int freq = 0;
ArrayList tagFreqOfWord = new ArrayList();
System.out.println(word);
for(ArrayList wordState: trainList)
{
if (word.equals(wordState.get(1)))
{
freq+=1;
if (tagFreqOfWord.size() == 0)
{
for (int i = 0 ; i<copyTagsList.size();i++)
{
if (((String)wordState.get(4)).equalsIgnoreCase((String)copyTagsList.get(i).get(0)))
{
tagFreqOfWord.add(copyTagsList.get(i));
tagFreqOfWord.get(0).set(1, (int)1);
break;
}
}
}
else
{
Boolean exist= false;
for (List j :tagFreqOfWord)
{
if (((String)wordState.get(4)).equalsIgnoreCase((String)j.get(0)))
{
int var = (int)tagFreqOfWord.get(tagFreqOfWord.indexOf(j)).get(1);
var+=1;
tagFreqOfWord.get(tagFreqOfWord.indexOf(j)).set(1,var);
exist = true;
break;
}
}
if (!exist)
{
List newTag = new ArrayList();
newTag.add(wordState.get(4));
newTag.add(1, 1);;
tagFreqOfWord.add(newTag);
}
}
}
}
System.out.println(tagFreqOfWord);
}
}Confusion matrix (Part of AccuracyChecker class):
```
public ArrayList createConfusionMatrix(WordChecker checkedTagsList, ArrayList listOfWords){
//BeanComparator fieldComparator = new BeanComparator();
//Collections.sort(checkedTagsList, fieldComparator);
int distinctTagsSize = correctTags.size();
for(int i=0; i li
Solution
private ArrayList copyTagsList = new ArrayList();You should always use generic version (use raw type only if you have a good reason!)
Same inside
extractTag using generic version you will not need anymore String cast.The same will be for other collection.
I usually tend to cache the
.size() specially when using it in a for to avoid method call overhead, but i don't remember if JVM could optimizate the call.Example here:
for (int i = 0 ; i<copyTagsList.size();i++)could be
int size = copyTagsList.size();
for (int i = 0 ; i<size;i++)Avoid Wrapper objects if not needed (only in generic!)
tagFreqOfWord.get(0).set(1, (int)1);No need for cast here.
Boolean exist= false;Here
Boolean refer to the wrapper class, not to the native type.. with this line you cause a boxing-unboxing everytime. Replace it with boolean exist = false;.Use better names
for (List j :tagFreqOfWord)while reading
for (int i = 0; i < x; i++) i is OK in most cases, reading this line could lead to confusion because someone could think: "well, it's j it could be the second index in a inner-for!"Or, it could lead to confusion yourself, use one-char-name only if you are working with indexs in for/while/etc. (outside you could use
idx!)int var = (int)tagFreqOfWord.get(tagFreqOfWord.indexOf(j)).get(1);Do you really need the cast here? I don't think. If it's a
Integer it will be read as int, so cast needed.int var = tagFreqOfWord.get(tagFreqOfWord.indexOf(j)).get(1);
var+=1;
tagFreqOfWord.get(tagFreqOfWord.indexOf(j)).set(1,var);You do
indexOf() two times, which could be a problem (it checks two times the same thing) just do it one time and save the result in a variable.int indexOf = tagFreqOfWord.indexOf(j);
int var = tagFreqOfWord.get(indexOf).get(1);
var+=1;
tagFreqOfWord.get(indexOf).set(1,var);Remember, indexOf:
Returns the index of the first occurrence of the specified element in this list, or -1 if this list does not contain the element. More formally, returns the lowest index i such that (o==null ? get(i)==null : o.equals(get(i))), or -1 if there is no such index.
You chould check if it returns
-1Seems like
wordState.get(4) can be saved and avoid .get() call N times (In a LinkedList, call .get everytime could be a problem)So it could be
Boolean exist= false;
String specialWord = (String)wordState.get(4);
for (List j :tagFreqOfWord)
{
if (specialWord.equalsIgnoreCase((String)j.get(0)))
{
int var = (int)tagFreqOfWord.get(tagFreqOfWord.indexOf(j)).get(1);
var+=1;
tagFreqOfWord.get(tagFreqOfWord.indexOf(j)).set(1,var);
exist = true;
break;
}
}
if (!exist)
{
List newTag = new ArrayList();
newTag.add(specialWord);
newTag.add(1, 1);;
tagFreqOfWord.add(newTag);
}So you avoid do a Cast everytime!
int distinctTagsSize = correctTags.size();
for(int i=0; i list = new ArrayList();
confusionMatrix.add(i, list);;
for(int j=0;j<distinctTagsSize; j++ ){
confusionMatrix.get(i).add(j, 0);
if (i == j)
confusionMatrix.get(i).set(j, checkedTagsList.getCheckedTags().get(i).get(1));
}
}Here you create a new list everytime, it's what you want? Ok i know it's Code review so you already know the code is correct, i just want to be sure you know what happens here.
For now i can't see anything wrong in this block.
Here
for(int i=0; i <distinctTagsSize;i++){
for(int j=0;j<listOfWords.size();j++){
if (((String)correctTags.get(i).get(0)).equalsIgnoreCase((String)listOfWords.get(j).get(4))){
if(!((String)listOfWords.get(j).get(5)).equalsIgnoreCase((String)listOfWords.get(j).get(4))){
for (List m: correctTags){
if(((String)listOfWords.get(j).get(5)).equalsIgnoreCase((String)m.get(0))){
int freq= (int)confusionMatrix.get(i).get(correctTags.indexOf(m));
freq+=1;
confusionMatrix.get(i).set(correctTags.indexOf(m),freq);
}
}
}
}
}
}(String)correctTags.get(i).get(0) will not change while for(..; j; ..) so you could dofor(int i=0; i <distinctTagsSize;i++){
String cache1 = (String)correctTags.get(i).get(0);
for(int j=0;j<listOfWords.size();j++){and then use
cache1. (example name!!!!)You could do the same thing for
(String)listOfWords.get(j).get(4) and (String)listOfWords.get(j).get(5) but inside for (..; j; ..) scope.Again, as above
int freq= (int)confusionMatrix.get(i).get(correctTags.indexOf(m));You don't need a cast here, and you can save
correctTags.indexOf(m) return value inside a variable and reuse it inside confusionMatrix.get(i).set(correctTags.indexOf(m),freq);too
```
confusionMatrix.get(i).set(indexO
Code Snippets
private ArrayList<List> copyTagsList = new ArrayList<List>();for (int i = 0 ; i<copyTagsList.size();i++)int size = copyTagsList.size();
for (int i = 0 ; i<size;i++)tagFreqOfWord.get(0).set(1, (int)1);Boolean exist= false;Context
StackExchange Code Review Q#45640, answer score: 5
Revisions (0)
No revisions yet.