checkmate - C spelling corrector 2.0

Problem

Since I posted my first version of the spelling corrector here, I've been improving it a little in my free time. I've also put the project up on GitHub so that others can contribute if they wish.

checkmate.c:

```
#include
#include
#include
#include
#include
#include
#include
#include
#include
#include

#define TABLE_SIZE 5013
#define ALPHABET_SIZE (sizeof(alphabet) - 1)

char *dictionary = "5k.txt";
const char alphabet[] = "ABCDEFGHIJKLMNOPQRSTUVWXYZ"
"abcdefghijklmnopqrstuvwxyz"
" '";

void *checkedMalloc(size_t len)
{
void *ret = malloc(len);
if (!ret)
{
fputs("Out of memory!", stderr);
exit(0);
}
return ret;
}

int arrayExist(char **array, int rows, char *word)
{
for (int i = 0; i 0)
{
memcpy(&dst[*dstLen], &src[srcBegin], length);
*dstLen += length;
}
dst[*dstLen] = 0;
}

int deletion(char *word, char **array, int start)
{
int i = 0;
size_t length = strlen(word);

for (; i = resMax)
{
// initially allocate 50 entries, after double the size
if (resMax == 0) resMax = 50;
else resMax *= 2;
}
res = realloc(res, sizeof(char *) * resMax);
res[resSize++] = e1[j];
}
}
}

*e2_rows = resSize;

return res;
}

char *bestMatch(char **array, int rows)
{
char *maxWord = NULL;
int maxSize = TABLE_SIZE;
ENTRY *e;
for (int i = 0; i data data;
maxWord = e->key;
}
}
return maxWord;
}

char *correct(char *word)
{
char **e1 = NULL;
char **e2 = NULL;
char *e1_word = NULL;
char *e2_word = NULL;
char *resWord = word;
int e1_rows = 0;
char e2_rows = 0;

if (find(word)) return word;

e1_rows = (unsigned) totalEdits(word);
if (e1_rows)
{
e1 =
```

Solution

Program exit code

The checkedMalloc function calls exit(0) when it runs out of memory.
Exit code 0 conventionally means success, so a non-zero code such as EXIT_FAILURE would be better.

The main function returns -1 if a problem happens while reading the dictionary.

The readDictionary function calls exit(-1) if it cannot add a word to the hash table.

It's confusing to have multiple exit points scattered around the program.
Exit codes written as magic numbers are also hard to keep track of.

As a first step, it would be good to put the exit codes in well-named constants.
As a second step, it would be good to centralize the exit points if possible.
(For out-of-memory it's probably not practical,
but for the others it might be, especially considering that readDictionary sometimes returns on errors instead of exiting.)
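
A minimal sketch of what named exit codes could look like. The enum and its values, `checked_malloc`, and `exit_code_for` are hypothetical names chosen for illustration, not taken from the project:

```c
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical names and values: the original uses bare 0 and -1. */
enum status_code {
    STATUS_OK               = EXIT_SUCCESS,
    STATUS_OUT_OF_MEMORY    = 2,
    STATUS_DICTIONARY_ERROR = 3
};

/* Same role as checkedMalloc in the question, but fails with a non-zero code. */
void *checked_malloc(size_t len)
{
    void *ret = malloc(len);
    if (!ret) {
        fputs("Out of memory!\n", stderr);
        exit(STATUS_OUT_OF_MEMORY);
    }
    return ret;
}

/* One place that maps "did the dictionary load?" to a process exit code. */
int exit_code_for(int dictionary_loaded)
{
    return dictionary_loaded ? STATUS_OK : STATUS_DICTIONARY_ERROR;
}
```

With something like this, main could end with `return exit_code_for(loaded);`, and the out-of-memory path would be the only place that still calls exit directly.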

Error handling and function return values

readDictionary behaves very confusingly:

  • Return 0 if opening dictionary file failed
  • Return 0 if getting stats on dictionary file failed
  • Return -1 if mmap failed
  • Exit program with -1 if adding entry to hash table failed
  • Return 1 on success

The caller checks the return value with !readDictionary(...).
Because -1 is non-zero, !(-1) evaluates to 0, so an mmap failure is treated as success.
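
One way to make the convention consistent is to return a status enum where 0 is the only success value. The sketch below is a simplified stand-in, not the project's function: it reads the file with fgets and only counts words instead of using mmap and a hash table, and `dict_status` and `read_dictionary` are hypothetical names:

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical status codes: 0 is the only success value. */
enum dict_status {
    DICT_OK = 0,
    DICT_ERR_OPEN,
    DICT_ERR_READ
};

/*
 * Simplified stand-in for readDictionary: counts non-empty lines
 * (assumed to be one word per line, shorter than the buffer) instead
 * of mapping the file and filling a hash table.
 */
enum dict_status read_dictionary(const char *path, size_t *word_count)
{
    char line[256];
    FILE *fp = fopen(path, "r");
    if (!fp)
        return DICT_ERR_OPEN;

    *word_count = 0;
    while (fgets(line, sizeof line, fp)) {
        line[strcspn(line, "\r\n")] = '\0';   /* strip the line ending */
        if (line[0] != '\0')
            ++*word_count;
    }

    enum dict_status status = ferror(fp) ? DICT_ERR_READ : DICT_OK;
    fclose(fp);
    return status;
}
```

A caller would then compare explicitly, for example `if (read_dictionary(path, &count) != DICT_OK)`, so no failure value can be mistaken for success.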

It's also unfortunate that the fileName parameter of the function is exactly the same as the dictionary global variable.
Either use the global variable and drop the parameter,
or use the parameter instead of the global variable.

Naming

When I see dictionary, I think of some kind of hash table.
But in this program it's a char* variable,
storing the name of the dictionary file.
So I'd call it dictionary_path.

Usability

Instead of hardcoding correctArray and checkArray inside main,
it would be easier to play with and test the program if it took filenames as command line arguments.
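
A rough sketch of how the argument handling might look, assuming the program takes a dictionary file and a file of words to check. The `options` struct, `parse_args`, and the usage text are illustrative assumptions, not part of the project:

```c
#include <stdio.h>

/* Hypothetical option struct; the original main hardcodes its word lists. */
struct options {
    const char *dictionary_path;   /* replaces the global "dictionary" */
    const char *words_path;        /* file with the words to check */
};

/* Returns 0 on success, -1 on a usage error (after printing a hint). */
int parse_args(int argc, char **argv, struct options *opts)
{
    if (argc != 3) {
        fprintf(stderr, "usage: %s DICTIONARY WORDS\n",
                argc > 0 ? argv[0] : "checkmate");
        return -1;
    }
    opts->dictionary_path = argv[1];
    opts->words_path = argv[2];
    return 0;
}
```

main would call `parse_args(argc, argv, &opts)` first and pass `opts.dictionary_path` on to the dictionary loader, which also removes the need for the global variable.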

Writing style

I was a bit surprised by this code at first:

return (length)                + // deletion
(length - 1)                   + // transposition
(length * ALPHABET_SIZE)       + // alteration
(length + 1) * ALPHABET_SIZE;    // insertion


At first I didn't understand why those expressions were lined up vertically.
It only became clear once I read the comments at the right-hand end of the lines.
Written this way, it would have been obvious right off the bat:

return (length)                         // deletion
       + (length - 1)                   // transposition
       + (length * ALPHABET_SIZE)       // alteration
       + (length + 1) * ALPHABET_SIZE;  // insertion
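
As a worked check of that formula, here is a small sketch using the alphabet from the question (26 uppercase letters, 26 lowercase letters, a space and an apostrophe, so ALPHABET_SIZE is 54). The function name `total_edits` and its length parameter are assumptions; the sketch also assumes the length is at least 1, since `length - 1` would wrap around for an empty word:

```c
#include <stddef.h>

const char alphabet[] = "ABCDEFGHIJKLMNOPQRSTUVWXYZ"
                        "abcdefghijklmnopqrstuvwxyz"
                        " '";
#define ALPHABET_SIZE (sizeof(alphabet) - 1)   /* 54: terminating NUL excluded */

/* Number of single-edit candidates for a word of the given length (>= 1). */
size_t total_edits(size_t length)
{
    return (length)                         // deletion
           + (length - 1)                   // transposition
           + (length * ALPHABET_SIZE)       // alteration
           + (length + 1) * ALPHABET_SIZE;  // insertion
}
```

For a five-letter word this gives 5 + 4 + 270 + 324 = 603 candidates.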

Context

StackExchange Code Review Q#98069, answer score: 11
