patterncppMinor
Checking for spelling errors with a trie
Viewed 0 times
spellingwithcheckingfortrieerrors
Problem
I have been lately working in my C++, and I finished up this trie, and I want to know how my C++ is looking so far.
Node.hpp
Trie.hpp
Trie.cpp
Source.cpp
```
#include
#include
#include
#include "Trie.hpp"
int main() {
Trie trie;
std::ifstream file;
std::string word = " ", input;
file.open("dictionary.txt");
//Ask for input
std::cin >> input;
std::transform(input.begin(), input.end(), input.begin(), ::tolower);
auto beenHere = false;
//Add every word to the trie.
while (std::getline(file, word)) {
std::transform(word.begin(), word.end(), word.begin()
Node.hpp
#pragma once
#include
static const int ALPHABETSIZE = 26;
struct Node {
bool isEnd;
int prefixCount;
std::shared_ptr child[ALPHABETSIZE];
};Trie.hpp
#pragma once
#include "Node.hpp"
#include
#include
class Trie {
std::shared_ptr head;
public:
Trie();
void insert(const std::string& word) const;
bool search(const std::string& word) const;
unsigned int wordsWithPrefix(const std::string& prefix) const;
};Trie.cpp
#include "Trie.hpp"
Trie::Trie() {
head = std::make_unique();
head->isEnd = false;
head->prefixCount = 0;
for (int i = 0; i child[i] = nullptr;
}
}
void Trie::insert(const std::string& word) const{
auto current = head;
for (int i = 0; i child[letter])
current->child[letter] = std::make_shared();
current->child[letter]->prefixCount++;
current = current->child[letter];
}
current->isEnd = true;
}
bool Trie::search(const std::string& word) const{
auto current = head;
for (int i = 0; i child[letter])
return false;
current = current->child[letter];
}
return current->isEnd;
}
unsigned int Trie::wordsWithPrefix(const std::string& prefix) const{
auto current = head;
for (int i = 0; i child[letter])
return 0;
current = current->child[letter];
}
return current->prefixCount;
}Source.cpp
```
#include
#include
#include
#include "Trie.hpp"
int main() {
Trie trie;
std::ifstream file;
std::string word = " ", input;
file.open("dictionary.txt");
//Ask for input
std::cin >> input;
std::transform(input.begin(), input.end(), input.begin(), ::tolower);
auto beenHere = false;
//Add every word to the trie.
while (std::getline(file, word)) {
std::transform(word.begin(), word.end(), word.begin()
Solution
Node
Raw arrays:
are really awkward. There's little reason to prefer to use one here. Use a
Also I pluralized your array name. Seems to make more sense. You also want to make sure that a default-constructed
Now you don't have to worry about default/value initialization.
You current construct your
But
And with the added data member initializers, that's all we need to do in our constructor. Done!
insertion
Prefer a range-based for loop anytime you want to loop over everything in a container without needing the iterator. It expresses intent better:
Also, what if
Choice of data structure
Does having 26 children for each node make sense? That could be very memory intensive if our data set is sparse! I'd expect something more like:
with which this would just look like:
searching
I would rename
wordsWithPrefix
Note that your
which does that work for us. With that function, we can implement:
Trie construction
You may want to add a constructor that will iterate over and insert a bunch of words. Maybe take two iterators, same as the other standard containers.
cin and getline
Be careful about mixing those two. It doesn't always work.
Also, don't do this:
Space isn't at a premium, you can declare your variables on separate lines.
main()
What is
final loop
Prefer:
file.close()
Unnecessary. The
Raw arrays:
std::shared_ptr child[ALPHABETSIZE];are really awkward. There's little reason to prefer to use one here. Use a
std::array instead:std::array, ALPHABETSIZE> children;Also I pluralized your array name. Seems to make more sense. You also want to make sure that a default-constructed
Node is valid, so I find it clearer to just add data member initializers:struct Node {
bool isEnd = false;
int prefixCount = 0;
std::array, ALPHABETSIZE> children;
};Now you don't have to worry about default/value initialization.
You current construct your
head via:head = std::make_unique();But
head is a shared_ptr so you should do:head = std::make_shared();And with the added data member initializers, that's all we need to do in our constructor. Done!
insertion
Prefer a range-based for loop anytime you want to loop over everything in a container without needing the iterator. It expresses intent better:
for (char c : word) {
int letter = c - 'a';
...
}Also, what if
word contains anything other than a lower-case letter? What if it's a capital letter or a space or a hyphen? This does no error checking here, so leaves open the possibility of undefined behavior due to invalid memory access.Choice of data structure
Does having 26 children for each node make sense? That could be very memory intensive if our data set is sparse! I'd expect something more like:
std::unordered_map>with which this would just look like:
auto& next = current->children[letter];
if (!next) next = std::make_shared();
next->prefixCount++;
current = next;searching
I would rename
search() to contains(), and same points apply about sanity checking your inputs and using a range-based for loop. wordsWithPrefix
Note that your
search and wordsWithPrefix do the same thing: we walk the length of the string, break early if we run out of children, and do something with the result. That suggests adding a helper function:std::shared_ptr atPrefix(const std::string& prefix) const;which does that work for us. With that function, we can implement:
bool contains(const std::string& word) const
{
auto node = atPrefix(word);
return node && node->isEnd;
}
unsigned int wordsWithPrefix(const std::string& prefix) const
{
auto node = atPrefix(prefix);
return node ? node->prefixCount : 0;
}Trie construction
You may want to add a constructor that will iterate over and insert a bunch of words. Maybe take two iterators, same as the other standard containers.
cin and getline
Be careful about mixing those two. It doesn't always work.
Also, don't do this:
std::string word = " ", input;Space isn't at a premium, you can declare your variables on separate lines.
main()
What is
input? I don't understand your usage here. Are you mandating a sorted dictionary.txt for this to work? Some comments indicating this would be helpful. final loop
at() vs operator[] just does range-checking. You don't need that here, so simply use operator[]. Also if what you want to do is divide input, you don't need a second argument to substr() - just the one will automatically print the remainder of the string from there on out. Besides, your length argument is incorrect anyway. Prefer:
std::string x;
for (char c : input) {
x += c;
if (trie.wordsWithPrefix(x) == 0) {
std::cout << x << '<' << input.substr(x.size());
break;
}
}file.close()
Unnecessary. The
ifstream destructor will take care of itself. RAII for the win.Code Snippets
std::shared_ptr<Node> child[ALPHABETSIZE];std::array<std::shared_ptr<Node>, ALPHABETSIZE> children;struct Node {
bool isEnd = false;
int prefixCount = 0;
std::array<std::shared_ptr<Node>, ALPHABETSIZE> children;
};head = std::make_unique<Node>();head = std::make_shared<Node>();Context
StackExchange Code Review Q#107153, answer score: 6
Revisions (0)
No revisions yet.