Dictionary application

Submitted by: @import:stackexchange-codereview

Problem

I am creating an application which needs to validate user input to ensure the words being entered are real words. What improvements, if any, can I make?

class MyDictionary {

    Vector words = new Vector<>();

    public MyDictionary() {
        URL url;
        try {
            url = new URL("http://www.example.com/hugewordlist.txt");
            URLConnection uc = url.openConnection();
            BufferedReader br = new BufferedReader(new InputStreamReader(uc.getInputStream()));
            char[] buffer = new char[1024];
            int i = 0;
            StringBuffer b = new StringBuffer();
            int readBytes = 0;
            while ((readBytes = br.read(buffer, i, 1024)) != -1) {
                i += 1024;
                b.append(buffer, 0, readBytes);
            }
            for (String w : b.toString().split("\n")) {
                words.add(w);
            }
            br.close();
        } catch (MalformedURLException e) {
            e.printStackTrace();
        } catch (IOException e) {
            e.printStackTrace();
        }
    }

    public boolean hasWord(String word) {
        return words.contains(word);
    }

    public int size() {
        return words.size();
    }
}

Solution


  • What do you want to do when there are no words? The current solution isn't satisfactory at all. If loading fails, you simply fail silently, and hasWord will always return false. By the way, if there's an exception in the middle of the loop, will your stream be closed?



  • Searching in a Vector takes O(n) time. This means that if you double the size of the vector, a lookup will take about twice as long. Other containers such as TreeSet have O(log n) lookup time: doubling their size adds only about one more comparison per lookup. HashSet has O(1) expected lookup time. With either of them, your lookups will be much faster.



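  • Did you know that in a standard text, unknown words (words missing from the dictionary) account for ~10% of the text? Rejecting every word that isn't in your list may therefore reject a fair amount of legitimate input.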
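
The first two points can be sketched roughly as follows. This is only one possible shape, not the asker's code: the class name WordDictionary and the methods fromReader and fromUrl are hypothetical, and trimming lines, skipping blank ones, and reading the list as UTF-8 are assumptions about the word list format. The sketch stores words in a HashSet, closes the reader with try-with-resources, and throws instead of leaving an empty dictionary behind. Reading with readLine() also sidesteps the manual buffer arithmetic in the original, where the growing offset i passed to read() would run past the 1024-char buffer on the second read.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStreamReader;
import java.io.Reader;
import java.net.URL;
import java.nio.charset.StandardCharsets;
import java.util.HashSet;
import java.util.Set;

// Hypothetical rewrite of MyDictionary; all names here are illustrative.
final class WordDictionary {

    // HashSet gives expected O(1) lookups; duplicate words are stored once.
    private final Set<String> words;

    private WordDictionary(Set<String> words) {
        this.words = words;
    }

    // Loads one word per line from any Reader.
    // Throws IOException rather than failing silently with an empty dictionary.
    static WordDictionary fromReader(Reader source) throws IOException {
        Set<String> loaded = new HashSet<>();
        // try-with-resources closes the reader even if readLine() throws.
        try (BufferedReader br = new BufferedReader(source)) {
            String line;
            while ((line = br.readLine()) != null) {
                String word = line.trim(); // assumption: surrounding whitespace is noise
                if (!word.isEmpty()) {
                    loaded.add(word);
                }
            }
        }
        if (loaded.isEmpty()) {
            throw new IOException("word list is empty");
        }
        return new WordDictionary(loaded);
    }

    // Convenience loader for a remote list; UTF-8 is an assumption about the file.
    static WordDictionary fromUrl(String address) throws IOException {
        URL url = new URL(address);
        return fromReader(new InputStreamReader(url.openStream(), StandardCharsets.UTF_8));
    }

    boolean hasWord(String word) {
        return words.contains(word);
    }

    // Number of distinct words loaded.
    int size() {
        return words.size();
    }
}
```

A caller would then build the dictionary with something like WordDictionary.fromUrl("http://www.example.com/hugewordlist.txt") and decide at that point how to react to an IOException, for example by refusing to start or by falling back to a bundled list. Accepting a plain Reader in fromReader also makes the class easy to test with a StringReader, without any network access.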

Context

StackExchange Code Review Q#9362, answer score: 7