HiveBrain v1.2.0
Get Started
← Back to all entries
patterncppMinor

Checking for spelling errors with a trie

Submitted by: @import:stackexchange-codereview··
0
Viewed 0 times
spellingwithcheckingfortrieerrors

Problem

I have been lately working in my C++, and I finished up this trie, and I want to know how my C++ is looking so far.

Node.hpp

#pragma once

#include 

static const int ALPHABETSIZE = 26;

struct Node {
    bool isEnd;
    int prefixCount;
    std::shared_ptr child[ALPHABETSIZE];
};


Trie.hpp

#pragma once

#include "Node.hpp"
#include 
#include 

class Trie {
    std::shared_ptr head;

public:
    Trie();

    void insert(const std::string& word) const;
    bool search(const std::string& word) const;

    unsigned int wordsWithPrefix(const std::string& prefix) const;
};


Trie.cpp

#include "Trie.hpp"

Trie::Trie() {
    head = std::make_unique();
    head->isEnd = false;
    head->prefixCount = 0;
    for (int i = 0; i child[i] = nullptr;
    }
}

void Trie::insert(const std::string& word) const{
    auto current = head;
    for (int i = 0; i child[letter])
            current->child[letter] = std::make_shared();
        current->child[letter]->prefixCount++;
        current = current->child[letter];
    }

    current->isEnd = true;
}

bool Trie::search(const std::string& word) const{
    auto current = head;
    for (int i = 0; i child[letter])
            return false;
        current = current->child[letter];
    }

    return current->isEnd;
}

unsigned int Trie::wordsWithPrefix(const std::string& prefix) const{
    auto current = head;
    for (int i = 0; i child[letter])
            return 0;
        current = current->child[letter];
    }
    return current->prefixCount;
}


Source.cpp

```
#include
#include
#include
#include "Trie.hpp"

int main() {
Trie trie;
std::ifstream file;
std::string word = " ", input;

file.open("dictionary.txt");

//Ask for input
std::cin >> input;
std::transform(input.begin(), input.end(), input.begin(), ::tolower);

auto beenHere = false;
//Add every word to the trie.
while (std::getline(file, word)) {
std::transform(word.begin(), word.end(), word.begin()

Solution

Node

Raw arrays:

std::shared_ptr child[ALPHABETSIZE];


are really awkward. There's little reason to prefer to use one here. Use a std::array instead:

std::array, ALPHABETSIZE> children;


Also I pluralized your array name. Seems to make more sense. You also want to make sure that a default-constructed Node is valid, so I find it clearer to just add data member initializers:

struct Node {
    bool isEnd = false;
    int prefixCount = 0;
    std::array, ALPHABETSIZE> children;
};


Now you don't have to worry about default/value initialization.

You current construct your head via:

head = std::make_unique();


But head is a shared_ptr so you should do:

head = std::make_shared();


And with the added data member initializers, that's all we need to do in our constructor. Done!

insertion

Prefer a range-based for loop anytime you want to loop over everything in a container without needing the iterator. It expresses intent better:

for (char c : word) {
    int letter = c - 'a';
    ...
}


Also, what if word contains anything other than a lower-case letter? What if it's a capital letter or a space or a hyphen? This does no error checking here, so leaves open the possibility of undefined behavior due to invalid memory access.

Choice of data structure

Does having 26 children for each node make sense? That could be very memory intensive if our data set is sparse! I'd expect something more like:

std::unordered_map>


with which this would just look like:

auto& next = current->children[letter];
if (!next) next = std::make_shared();
next->prefixCount++;
current = next;


searching

I would rename search() to contains(), and same points apply about sanity checking your inputs and using a range-based for loop.

wordsWithPrefix

Note that your search and wordsWithPrefix do the same thing: we walk the length of the string, break early if we run out of children, and do something with the result. That suggests adding a helper function:

std::shared_ptr atPrefix(const std::string& prefix) const;


which does that work for us. With that function, we can implement:

bool contains(const std::string& word) const
{
    auto node = atPrefix(word);
    return node && node->isEnd;
}

unsigned int wordsWithPrefix(const std::string& prefix) const
{
    auto node = atPrefix(prefix);
    return node ? node->prefixCount : 0;
}


Trie construction

You may want to add a constructor that will iterate over and insert a bunch of words. Maybe take two iterators, same as the other standard containers.

cin and getline

Be careful about mixing those two. It doesn't always work.

Also, don't do this:

std::string word = " ", input;


Space isn't at a premium, you can declare your variables on separate lines.

main()

What is input? I don't understand your usage here. Are you mandating a sorted dictionary.txt for this to work? Some comments indicating this would be helpful.

final loop

at() vs operator[] just does range-checking. You don't need that here, so simply use operator[]. Also if what you want to do is divide input, you don't need a second argument to substr() - just the one will automatically print the remainder of the string from there on out. Besides, your length argument is incorrect anyway.

Prefer:

std::string x;
for (char c : input) {
    x += c;
    if (trie.wordsWithPrefix(x) == 0) {
        std::cout << x << '<' << input.substr(x.size());
        break;
    }
}


file.close()

Unnecessary. The ifstream destructor will take care of itself. RAII for the win.

Code Snippets

std::shared_ptr<Node> child[ALPHABETSIZE];
std::array<std::shared_ptr<Node>, ALPHABETSIZE> children;
struct Node {
    bool isEnd = false;
    int prefixCount = 0;
    std::array<std::shared_ptr<Node>, ALPHABETSIZE> children;
};
head = std::make_unique<Node>();
head = std::make_shared<Node>();

Context

StackExchange Code Review Q#107153, answer score: 6

Revisions (0)

No revisions yet.