patterncppMinor
Dictionary tree optimization
Viewed 0 times
dictionaryoptimizationtree
Problem
I was wondering if anyone could give me a little bit of help with optimizing a tree-structure in C++. I'm currently developing a dictionary program for a school project but want to take it a bit further in my spare time.
A little bit of info: The dictionary is a simple class which contains a node. Each node in the class contains an array of 26 pointers to sub-nodes. Each node can contain a single letter (
Basically, if I want to search for the definition of "programming," the dictionary class will look at each letter in the requested word and go directly to a sub-node representing the letter (a=0, b=1, c=2, and so on...). When it finally reaches the final letter "g" in "programming" then it returns a user-defined definition.
What I was wondering was if I could get help in optimizing any of the subroutines in this class. Currently, looking up words is in \$O(n)\$ time (where n is the number of letters in the requested word). Copying words in the dictionary is in \$O(m+n)\$ where 'm' is the number of preexisting words in the destination dictionary and 'n' is the number of words in the source dictionary.
Hopefully the source is easy enough to understand (I'm not used to sharing my code and this site seems to mess up my formatting).
```
#ifndef DICTIONARY_H
#define DICTIONARY_H
#include
using namespace std;
typedef const char* cstr;
int toLower(char c) {
//Return a lowercase letter if its within A-Z (65-90) on the ASCII table;
//Otherwise return 'c' as it is (don't bother with non-alphabetical characters).
return (c > 64 && c copy(*src.alphabet[ iter ]); //begin recursive awesomeness!
}
}
}
//-----------------------------------------------------------------------------
// Utilities
//-----------------------------------------------------------------------------
dictionary& dictionary::operator =(con
A little bit of info: The dictionary is a simple class which contains a node. Each node in the class contains an array of 26 pointers to sub-nodes. Each node can contain a single letter (
char) and a single definition (of std::string-type). Traversal is based solely on what letter of the alphabet needs to be looked up.Basically, if I want to search for the definition of "programming," the dictionary class will look at each letter in the requested word and go directly to a sub-node representing the letter (a=0, b=1, c=2, and so on...). When it finally reaches the final letter "g" in "programming" then it returns a user-defined definition.
What I was wondering was if I could get help in optimizing any of the subroutines in this class. Currently, looking up words is in \$O(n)\$ time (where n is the number of letters in the requested word). Copying words in the dictionary is in \$O(m+n)\$ where 'm' is the number of preexisting words in the destination dictionary and 'n' is the number of words in the source dictionary.
Hopefully the source is easy enough to understand (I'm not used to sharing my code and this site seems to mess up my formatting).
```
#ifndef DICTIONARY_H
#define DICTIONARY_H
#include
using namespace std;
typedef const char* cstr;
int toLower(char c) {
//Return a lowercase letter if its within A-Z (65-90) on the ASCII table;
//Otherwise return 'c' as it is (don't bother with non-alphabetical characters).
return (c > 64 && c copy(*src.alphabet[ iter ]); //begin recursive awesomeness!
}
}
}
//-----------------------------------------------------------------------------
// Utilities
//-----------------------------------------------------------------------------
dictionary& dictionary::operator =(con
Solution
Dictionary is a very common word. I am thinking that there may be high possibility of clashing with existing software. I tend to try an make my include guards unique by including the namespace as part of the guard (My base namespace is also my website name so it is unique).
Please stop doing this:
Never do this in a header file you will pollute the global namespace (potentially without the user knowing) thus causing all sorts of side-affects). Even in source files you should probably not do it so that you don't pollute the global namespace for yourself. The reason it is called
OK. Not a real problem:
But I would prefer not to see it as it implies you are using C-Strings (rather than C++ strings). Whose real type is actually
There is already a std::tolower() that does exactly what you expect.
And is implemented in an optimal way for your platform.
Trying to be too clever and shot your self in the foot.
It works. But it is very awkward way of writing it. Thus every maintainer that comes across this code will go "what?". and need to validate that it actually works
Your copy constructor calls
The reason for copy seems that you put common code from the copy constructor and assignment operator in a single place. I would change this up. I would put all the copy code in the copy constructor without the call to
Use copy and swap idiom to implement the assignment operator:
This implementation is not exception safe (and does not handle self assignment). When doing assignment you need to do in three stages:
These three stages are automatically achieved using the copy and swap idiom.
No need to check for NULL before deleting.
Just use the copy constructor here it is much clearer:
I will be back with more:
#ifndef DICTIONARY_H
#define DICTIONARY_HPlease stop doing this:
using namespace std;Never do this in a header file you will pollute the global namespace (potentially without the user knowing) thus causing all sorts of side-affects). Even in source files you should probably not do it so that you don't pollute the global namespace for yourself. The reason it is called
std rather than standard is so that it would not be painful to prefix things with it.std::string x; // A string from the standard namespace.OK. Not a real problem:
typedef const char* cstr;But I would prefer not to see it as it implies you are using C-Strings (rather than C++ strings). Whose real type is actually
char const* if you get them from literals so this can be a potentially dangerous typedef. Also once the data goes into your dictionary do you expect to mutate it? If not then I would like to see the const in the typedef.There is already a std::tolower() that does exactly what you expect.
int toLower(char c) {
//Return a lowercase letter if its within A-Z (65-90) on the ASCII table;
//Otherwise return 'c' as it is (don't bother with non-alphabetical characters).
return (c > 64 && c < 91) ? c+26 : c;
}And is implemented in an optimal way for your platform.
Trying to be too clever and shot your self in the foot.
int iter(26);
while (iter--) alphabet[iter] = 0;It works. But it is very awkward way of writing it. Thus every maintainer that comes across this code will go "what?". and need to validate that it actually works
Your copy constructor calls
copy() which calls clear() which relies on the alphabet array being already initialized with NULL. This is not the case. Thus this will result in undefined behavior.dictionary::node::node(const node& nodeCopy) {
copy(nodeCopy);
}The reason for copy seems that you put common code from the copy constructor and assignment operator in a single place. I would change this up. I would put all the copy code in the copy constructor without the call to
clear(). Thenuse the copy and swap idiom to implement the assignment operator (now there is no reason for the copy() method).Use copy and swap idiom to implement the assignment operator:
dictionary::node& dictionary::node::operator = (const node& nodeCopy) {
copy(nodeCopy);
return *this;
}This implementation is not exception safe (and does not handle self assignment). When doing assignment you need to do in three stages:
1) Copy src (nodeCopy) (without altering the current object).
If you generate an exception during the copy you need to be able
to leave the object in a valid state preferably unchanged.
By doing the copy without altering the current object makes this
easier.
2) Swap the current data with the copies data.
`swapping` is an exception safe activity so you can atomically change
the state of the object without any exceptions being generated.
3) Destroy the old data.
It does not matter if you generate exceptions now the object is in
a consistent state.These three stages are automatically achieved using the copy and swap idiom.
No need to check for NULL before deleting.
if (alphabet[iter]) {
delete alphabet[iter];
alphabet[ iter ] = 0;
}Just use the copy constructor here it is much clearer:
alphabet[ iter ] = new node; //memory-safe operation. The destination node has already been cleared
alphabet[ iter ]->copy(*src.alphabet[ iter ]); //begin recursive awesomeness!
// can be written a:
alphabet[ iter ] = new node(*src.alphabet[iter]);I will be back with more:
Code Snippets
#ifndef DICTIONARY_H
#define DICTIONARY_Husing namespace std;std::string x; // A string from the standard namespace.typedef const char* cstr;int toLower(char c) {
//Return a lowercase letter if its within A-Z (65-90) on the ASCII table;
//Otherwise return 'c' as it is (don't bother with non-alphabetical characters).
return (c > 64 && c < 91) ? c+26 : c;
}Context
StackExchange Code Review Q#15229, answer score: 5
Revisions (0)
No revisions yet.