HiveBrain v1.2.0
Get Started
← Back to all entries
patterncMinor

Hash table OOP implementation in C

Submitted by: @import:stackexchange-codereview··
0
Viewed 0 times
implementationooptablehash

Problem

To brush up my C I decided to do some OOP programming in C. I have never tested this before, but I was curious to try it since seeing an example of it a while ago. I decided to implement a simple hash table. My first approach was to give each instance of the object its own methods, i.e. pointers to functions, but this felt meaningless because they could not be used for more than shared state (without giving the methods a self reference). Instead I decided that each "object" should only hold the private state and use normal functions.

Here follows a usage of the "class", main.c:

#include 
#include "hashtable.h"

int main(int argc, char *argv[]) {

    HashTable ht = newHashTable(65535);

    HashTable_put(ht, "1", "two");
    HashTable_put(ht, "2", "three");
    HashTable_put(ht, "3", "five");
    HashTable_put(ht, "4", "seven");
    HashTable_put(ht, "5", "eleven");
    HashTable_put(ht, "6", "thirteen");
    HashTable_put(ht, "7", "seventeen");

    HashTable_remove(ht, "1");
    HashTable_remove(ht, "2");
    HashTable_remove(ht, "6");
    HashTable_remove(ht, "7");

    HashTable_put(ht, "3", "not allowed");

    printf("%s\n", HashTable_get(ht, "1"));
    printf("%s\n", HashTable_get(ht, "2"));
    printf("%s\n", HashTable_get(ht, "3"));
    printf("%s\n", HashTable_get(ht, "4"));
    printf("%s\n", HashTable_get(ht, "5"));
    printf("%s\n", HashTable_get(ht, "6"));
    printf("%s\n", HashTable_get(ht, "7"));

    printf("size: %d\n", HashTable_size(ht));   

    freeHashTable(&ht);

    return 0;
}


And the header hashtable.h:

#ifndef HASHTABLE_H
#define HASHTABLE_H

#include 

typedef struct hashtable *HashTable;

// Create, destory
HashTable newHashTable(int);
void freeHashTable(HashTable *);

// Public methods
int HashTable_buckets(HashTable);
const char *HashTable_get(HashTable, char *);
bool HashTable_put(HashTable, char *, char *);
bool HashTable_remove(HashTable, char *);
int HashTable_size(HashTable);

#endif


And lastly the actual i

Solution

-
Recommend to include "hashtable.h" in "hashtable.c" first as a test to insure "hashtable.h" does not rely on the .c file to include .h files prior.

#include "hashtable.h"
#include 
#include 
#include 
// #include "hashtable.h"


-
Really not a fan of using typedef to hide a pointer. In the end this is a style decision. In my experience, hidden pointer types are harder to debug as they tend to be surprising.

// typedef struct hashtable *HashTable;
typedef struct hashtable HashTable;

// const char *HashTable_get(HashTable, char *);
const char *HashTable_get(HashTable *, char *);


-
Naming convention inconsistency. Suggest below (spacing added for clarity here, not needed for code.)

// HashTable newHashTable(int);
// void      freeHashTable(HashTable *);
HashTable    HashTable_new(int);
void         HashTable_free(HashTable *);

int          HashTable_buckets(HashTable);
const char * HashTable_get(HashTable, char *);
bool         HashTable_put(HashTable, char *, char *);


-
Use type size_t to index arrays. It really is the best type for arrays being not too wide nor too narrow. int may be insufficient.

struct hashtable {
    // int buckets;
    // int size;
    size_t buckets;
    size_t size;
    struct entry **table;
};


-
Functions that do not modify the "pointed to" data, should use const

// int HashTable_size(HashTable);
int HashTable_size(const HashTable);

// or including above ideas
size_t HashTable_size(const *HashTable);
// bool HashTable_put(HashTable this, char *key, char *value)
bool HashTable_put(const *HashTable this, const char *key, const char *value)


-
OP all ready knows "would be better if the the number of buckets was dynamic,". It is important to add that making the bucket size a prime number has performance benefits in that a prime makes a weak pre-hash function better.

-
Rather than allocate the size of a type, allocate to the size of the pointer's reference type. It is easier to maintain and less likely to be wrong.

// ht = malloc(sizeof(struct hashtable));
ht = malloc(sizeof *ht);


-
Check range of values, especially during debug. Consider debugging code where newHashTable(0) invoked a %0 in a latter call HashTable_get(). Another approach is preemptively adjust buckets

HashTable newHashTable(int buckets) {
  assert(buckets > 0);
  // or 
  if (buckets buckets = buckets;

...
djb2_hash(key) % this->buckets;  // prevent % 0


-
Missing test. Tests never try a failed get().

printf("%p\n", (void*) HashTable_get(ht, "x"));


-
I reserve C++ keywords in C. From time to time, it is useful to port C code to C++

// HashTable this
HashTable This


-
IMO, free_like() functions should be NULL tolerant like free(NULL). It does not cost much and is C-like.

void freeHashTable(HashTable *thisPtr) {
  if (thisPtr == NULL) return;


-
Also, IMO, use freeHashTable(HashTable thisPtr). Do not pass the HT variable address. As able, keep the similar HT function signatures using the HT variable in a consistent manor.

-
Naming convention: filenames should match the case of the type, as able. IOWs HashTable.c and HashTable.h

-
No need to declare a variable until it is needed.

// Entry next;
...
        // next = (*ePtr)->next;
        Entry next = (*ePtr)->next;


-
Design: "I also put in effort to make the functions as concise as possible." is a worthy goal, but the conciseness and cohesiveness of the functions/type declarations in the .h file trumps the implementation details - that is good OOP.

-
Hash performance. I see value in widening the pre-hash function to long long at little cost. If long long is a high cost, then code should use size_t, rather than unsigned long. Suggest *33 rather than `((hash

Code Snippets

#include "hashtable.h"
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
// #include "hashtable.h"
// typedef struct hashtable *HashTable;
typedef struct hashtable HashTable;

// const char *HashTable_get(HashTable, char *);
const char *HashTable_get(HashTable *, char *);
// HashTable newHashTable(int);
// void      freeHashTable(HashTable *);
HashTable    HashTable_new(int);
void         HashTable_free(HashTable *);

int          HashTable_buckets(HashTable);
const char * HashTable_get(HashTable, char *);
bool         HashTable_put(HashTable, char *, char *);
struct hashtable {
    // int buckets;
    // int size;
    size_t buckets;
    size_t size;
    struct entry **table;
};
// int HashTable_size(HashTable);
int HashTable_size(const HashTable);

// or including above ideas
size_t HashTable_size(const *HashTable);
// bool HashTable_put(HashTable this, char *key, char *value)
bool HashTable_put(const *HashTable this, const char *key, const char *value)

Context

StackExchange Code Review Q#129971, answer score: 4

Revisions (0)

No revisions yet.