patterncMinor
Open addressed hash table
Viewed 0 times
hashopentableaddressed
Problem
I'm attempting to implement various data structures and algorithms to better learn C. I would appreciate comments on style, correctness of the implementation, memory management, etc.
One specific question is how to address an error case where either the table or key is null in the hash function. In get or rm I can use an unsigned return of 0 or 1 representing a boolean value.
```
#include
#include
#include
typedef struct HashTable {
size_t size;
char ** buckets;
} HashTable;
HashTable * create_table(size_t size);
int hash_code(char key, HashTable ht);
char get_val(char key, HashTable * ht);
unsigned int set_val(char key, char val, HashTable * ht);
unsigned int rm_val(char key, HashTable ht);
//Bernstein's function uses INITIAL_VALUE of 5381 and M of 33;
unsigned int INITIAL_VALUE = 5381;
unsigned int M = 33;
/*
* Function which allocates a Hash table and
* bucket array of specified size
* @param size the size of the bucket array
*
*/
HashTable * create_table(size_t size)
{
HashTable ht = (HashTable)malloc(sizeof(*ht));
ht->size = size;
ht->buckets = (char **)malloc(sizeof(char) size);
return ht;
}
/*
* Function which returns multiplicative hash of key, which is used as bucket
* @param key pointer to key
* @param ht HashTable structure
*
*/
int hash_code(char key, HashTable ht)
{
if (ht == NULL || key == NULL)
return -1;
unsigned int hash = INITIAL_VALUE; //seed
unsigned int i;
for (i=0;isize;
}
/*
* Function which gets a value from the hashtable, computes hash of key
* and queries the buckets at idx of hash
* @param key pointer to key
* @param ht HashTable structure
*
*/
char get_val(char key, HashTable * ht)
{
if (ht == NULL || key == NULL)
return NULL;
unsigned int hash = hash_code(key,ht);
char * val = ht->buckets[hash];
if (val != NULL)
return val;
return NULL;
}
/*
* Function which sets a value in the hashtable, computes hash of key
* and qu
One specific question is how to address an error case where either the table or key is null in the hash function. In get or rm I can use an unsigned return of 0 or 1 representing a boolean value.
```
#include
#include
#include
typedef struct HashTable {
size_t size;
char ** buckets;
} HashTable;
HashTable * create_table(size_t size);
int hash_code(char key, HashTable ht);
char get_val(char key, HashTable * ht);
unsigned int set_val(char key, char val, HashTable * ht);
unsigned int rm_val(char key, HashTable ht);
//Bernstein's function uses INITIAL_VALUE of 5381 and M of 33;
unsigned int INITIAL_VALUE = 5381;
unsigned int M = 33;
/*
* Function which allocates a Hash table and
* bucket array of specified size
* @param size the size of the bucket array
*
*/
HashTable * create_table(size_t size)
{
HashTable ht = (HashTable)malloc(sizeof(*ht));
ht->size = size;
ht->buckets = (char **)malloc(sizeof(char) size);
return ht;
}
/*
* Function which returns multiplicative hash of key, which is used as bucket
* @param key pointer to key
* @param ht HashTable structure
*
*/
int hash_code(char key, HashTable ht)
{
if (ht == NULL || key == NULL)
return -1;
unsigned int hash = INITIAL_VALUE; //seed
unsigned int i;
for (i=0;isize;
}
/*
* Function which gets a value from the hashtable, computes hash of key
* and queries the buckets at idx of hash
* @param key pointer to key
* @param ht HashTable structure
*
*/
char get_val(char key, HashTable * ht)
{
if (ht == NULL || key == NULL)
return NULL;
unsigned int hash = hash_code(key,ht);
char * val = ht->buckets[hash];
if (val != NULL)
return val;
return NULL;
}
/*
* Function which sets a value in the hashtable, computes hash of key
* and qu
Solution
Regarding the problem of the error case where either the table or key is null in hash_code, you might pass in ht->size instead of ht, thus halving the problem. If you redefine the 'contract' so that a NULL key is treated the same as an empty string "", then you no longer have an error case.
Some comments on the code:
-
I would omit things like 'Function which'... from comments. If it is a function then this can be assumed.
-
hash_code uses -1 as an error code while set/rm_val use 0. Inconsistent.
create_table()
-
ht->buckets not initialised to 0; use calloc instead?
-
you could avoid the first malloc by passing in a pointer to a statically allocated HashTable or by defining the buckets array as a flexible member and then allocate the necessary total size
typedef struct HashTable {
} HashTable;
hash_code():
-
parameter
-
use of strlen in loop control is inefficient; key size does not change but will be evaluated each loop. Perhaps for (; *key; ++key) {...}
-
prefer spaces after ';' in for-loop conditions
-
get_val()
set_val()
rm_val()
main()
Some comments on the code:
- I'd expect a short description of the nature of the hashtable and how it is used at the start.
- INITIAL_VALUE and M should be
const
- function parameters should be
constwhere possible
- return values not stated in headers
-
I would omit things like 'Function which'... from comments. If it is a function then this can be assumed.
-
hash_code uses -1 as an error code while set/rm_val use 0. Inconsistent.
- Generally, unsigned should be used for values that are not numeric, such as bitmaps, not for numbers that you don't intend to go negative (although size_t contradicts that).
create_table()
- don't cast malloc return value
- check malloc return
-
ht->buckets not initialised to 0; use calloc instead?
-
you could avoid the first malloc by passing in a pointer to a statically allocated HashTable or by defining the buckets array as a flexible member and then allocate the necessary total size
typedef struct HashTable {
size_t size;
char *buckets[];} HashTable;
hash_code():
-
parameter
ht: I'd pass in the size (ht->size) instead of ht. In general, it is easier to test a function that takes simple scalar inputs than one that receives structures. -
use of strlen in loop control is inefficient; key size does not change but will be evaluated each loop. Perhaps for (; *key; ++key) {...}
-
prefer spaces after ';' in for-loop conditions
-
return hash % ht->size; loses precision (compiler warning due to type conversion). Just use int throughout unless your hash is ever going to be big enough for the extra bit to matter. If you thought it might get big enough (eg if using a processor where int is small) then you would have to check for overflow of an unsigned int too (I'm not suggesting you do that here).get_val()
- ignores -ve return from hash_code on error
- last 4 lines reduce to
return ht->buckets[hash];
set_val()
- Better to return int, not unsigned int
hashis signed. Other uses are unsigned.
- should
valbe duplicated (allocated)? If not you have to tell the caller that the string must be persistent (ie. the caller must not free it or pass a stack value).
else { return 0; }at the end of function can be reduced toreturn 0;
- comment could be said to be excessive - describing what is obvious from the code; its first ',' should be '.' I would say just: "Set hashtable value unless already set. Returns 1 on success, 0 on failure (value already set)."
rm_val()
- Better to return int, not unsigned int
- ignores -ve return from hash_code on error
else { return 0; }at the end of function can be reduced toreturn 0;
main()
- use EXIT_SUCCESS and EXIT_FAILURE
Code Snippets
size_t size;
char *buckets[];Context
StackExchange Code Review Q#11766, answer score: 3
Revisions (0)
No revisions yet.