patterncMinor
Dictionary implementation using hash table in C
Viewed 0 times
implementationhashusingdictionarytable
Problem
I have written the below code which loads the dictionary and checks if the given word is present or not.
The implementation is using a hash table with a chained linked list.
In regards to the hash function, I have kept it simple as I was not very concerned about collisions.
Can someone please review it and let me know of anything I should improve? Like:
`#include
#include
#include
#include
#include
#define SPELL_CHECKER
#ifdef SPELL_CHECKER
#define SIZE 10000 / Number of elements in table /
#define WORD_SIZE 100 / Max size of word /
typedef struct CHAINELEMENTS
{
char word[100]; / word to be saved in list /
struct CHAINELEMENTS next; / pointer to next element */
}chainelements;
typedef struct TBLELEMENTS
{
int total_elements; / total elements in chain - Not used here so far /
chainelements baseelement; / Pointer to chained linked list of words */
}tblelements;
typedef struct HASHTABLE
{
int size; / Number of table elements in hashtable /
tblelements base; / Pointer to first table element */
}hashtable;
/*
Below functions accomplish task as listed
1. Create HASHTABLE
2. For given string find its key and hash index
3. Search the word in hashtable
4. if word doesnt exist insert the word
*/
hashtable* createHashTable(int size);
int getKey(char *string);
void insertWord(hashtable phashtable, char str);
bool searchWord(hashtable phashtable, char str);
int main(void)
{
FILE *fp1;
char oneword[WORD_SIZE];
char c;
char *searchword = "abash";
bool ispresent;
hashtable *phashtable = createHashTable(SIZE);
fp1 = fopen("snippet.txt", "r");
do
{
c = fscanf(fp1, "%s", oneword); / got one word from the file /
printf("%s \n", oneword); / display it on the monitor /
ins
The implementation is using a hash table with a chained linked list.
In regards to the hash function, I have kept it simple as I was not very concerned about collisions.
Can someone please review it and let me know of anything I should improve? Like:
- Optimizations
- Standard practices
- how to cover boundary/error condition?
`#include
#include
#include
#include
#include
#define SPELL_CHECKER
#ifdef SPELL_CHECKER
#define SIZE 10000 / Number of elements in table /
#define WORD_SIZE 100 / Max size of word /
typedef struct CHAINELEMENTS
{
char word[100]; / word to be saved in list /
struct CHAINELEMENTS next; / pointer to next element */
}chainelements;
typedef struct TBLELEMENTS
{
int total_elements; / total elements in chain - Not used here so far /
chainelements baseelement; / Pointer to chained linked list of words */
}tblelements;
typedef struct HASHTABLE
{
int size; / Number of table elements in hashtable /
tblelements base; / Pointer to first table element */
}hashtable;
/*
Below functions accomplish task as listed
1. Create HASHTABLE
2. For given string find its key and hash index
3. Search the word in hashtable
4. if word doesnt exist insert the word
*/
hashtable* createHashTable(int size);
int getKey(char *string);
void insertWord(hashtable phashtable, char str);
bool searchWord(hashtable phashtable, char str);
int main(void)
{
FILE *fp1;
char oneword[WORD_SIZE];
char c;
char *searchword = "abash";
bool ispresent;
hashtable *phashtable = createHashTable(SIZE);
fp1 = fopen("snippet.txt", "r");
do
{
c = fscanf(fp1, "%s", oneword); / got one word from the file /
printf("%s \n", oneword); / display it on the monitor /
ins
Solution
-
Exceptional conditions
It is infeasible to
-
Hash recomputation
and call it from both
-
Cast
Don't do it
-
Memory usage
The word in the table occupies 100 bytes no matter what. It is possible (even though quite unlikely) to corrupt heap with very long words. It is also very wasteful. You should either allocate just right memory with
-
What is a better strategy?
Since you don't care about an order in which collided words appear in the cain, always insert at the beginning. You wouldn't need to special case the empty chain, and the insertion code becomes
Exceptional conditions
It is infeasible to
hashindex > SIZE ever hold. If it ever happen, the flow of the program execution is broken so badly that the only reasonable action is to abort and dump core. As coded, searchWord returning false simply lies to the rest of the program.-
Hash recomputation
insertWord computes the hash, and calls searchWord which also computes the hash. I recommend to have a search helper with signaturebool doSearchWord(phashtable * table, char * str, int hash);and call it from both
searchWord and insertWord with precomputed hash.-
Cast
mallocDon't do it
-
Memory usage
The word in the table occupies 100 bytes no matter what. It is possible (even though quite unlikely) to corrupt heap with very long words. It is also very wasteful. You should either allocate just right memory with
strdup(), or memory map your file and use pointers into the mapped data.-
What is a better strategy?
Since you don't care about an order in which collided words appear in the cain, always insert at the beginning. You wouldn't need to special case the empty chain, and the insertion code becomes
ispresent = searchWord(phashtable, str);
if (ispresent == 0)
{
printf("insert word is not present - so inserting %s \n", str);
pchainelements = malloc(sizeof(chainelements));
if (pchainelements == NULL)
{
fprintf(stderr, " -- memory allocation failed for the word -- \n");
}
base = phashtable[hashindex].baseelement;
phashtable->base[hashindex].baseelement = pchainelements;
strcpy(pchainelements->word, str);
pchainelements->next = base;
}Code Snippets
bool doSearchWord(phashtable * table, char * str, int hash);ispresent = searchWord(phashtable, str);
if (ispresent == 0)
{
printf("insert word is not present - so inserting %s \n", str);
pchainelements = malloc(sizeof(chainelements));
if (pchainelements == NULL)
{
fprintf(stderr, " -- memory allocation failed for the word -- \n");
}
base = phashtable[hashindex].baseelement;
phashtable->base[hashindex].baseelement = pchainelements;
strcpy(pchainelements->word, str);
pchainelements->next = base;
}Context
StackExchange Code Review Q#115843, answer score: 2
Revisions (0)
No revisions yet.