patterncMinor

Unordered (hash) map for C

Submitted by: @import:stackexchange-codereview·Mar 10, 2026·

Viewed 0 times

unorderedmapforhash

Problem

I am still working on basic data structures for C. Now I came up with a hash map:

unordered_map.h:

```
#ifndef UNORDERED_MAP_H
#define UNORDERED_MAP_H

#include
#include

#ifdef __cplusplus
extern "C" {
#endif

typedef struct unordered_map_t unordered_map_t;
typedef struct unordered_map_iterator_t unordered_map_iterator_t;

/*
Allocates a new, empty map with given comparator function.
*/
unordered_map_t* unordered_map_t_alloc
(size_t initial_capacity,
float load_factor,
size_t (p_hash_function)(void),
bool (p_equals_function)(void, void*));

/*
If p_map contains the key p_key, associates it with value p_value and
returns the old value of that key.
*/
void unordered_map_t_put (unordered_map_t p_map,
void* p_key,
void* p_value);

/*
Returns a positive value if p_key is mapped to some value in this map.
*/
bool unordered_map_t_contains_key (unordered_map_t* p_map,
void* p_key);

/*
Returns the value associated with the p_key, or NULL if p_key is not
mapped in the map.
****

Solution

Pointer used after free

In your clear() function, there is this code:

while (p_entry)
{
    index = p_map->p_hash_function(p_entry->p_key) & p_map->mask;
    p_next_entry = p_entry->p_next;
    free(p_entry);
    p_entry = p_next_entry;

    if (p_map->p_table[index])
    {
        p_map->p_table[index] = p_map->p_table[index]->p_chain_next;
    }
}

Notice that after the entry is freed, you take the bucket head and move it forward by one element. But the head of that bucket is the entry you just freed. So when you access p_chain_next, you are reading into the entry you just freed.

Actually I wouldn't even bother advancing the bucket heads one by one. Since you are clearing the entire hash table, I would probably just memset the entire table to zero. That way you don't have to call the hash function on each entry.

Double call to hash function

In your put() function, you end up calling the hash function twice if the element doesn't already exist, because you double check in case the table may have increased in size since the first call. Since the hash function may be slow, you should store the result of the hash in a local variable to avoid calling the hash function again. An alternative solution would be to add the entry first and then resize the table after.

Out of memory check

In your alloc() function, you're missing a NULL check for when you allocate p_table. That allocation may well fail because the user can pass in an arbitrarily large initial capacity.

Simplify list removal

In your remove() function, you use an if-else block of four cases to remove a node from a doubly linked list:

/* Unlink from the global iteration chain. */
        if (p_current_entry->p_prev && p_current_entry->p_next) 
        {
            /* Once here, the current entry has both next and previous. */
            p_current_entry->p_prev->p_next = p_current_entry->p_next;
            p_current_entry->p_next->p_prev = p_current_entry->p_prev;
        }
        else if (!p_current_entry->p_prev && !p_current_entry->p_next)
        {
            /* Once here, the current entry 
               is the only entry in the chain. */
            p_map->p_head = NULL;
            p_map->p_tail = NULL;
        }
        else if (p_current_entry->p_next)
        {
            /* Once here, the current entry is the head of the chain. */
            p_map->p_head = p_current_entry->p_next;
            p_map->p_head->p_prev = NULL;
        }
        else
        {
            /* Once here, the current entry is the tail of the chain. */
            p_map->p_tail = p_current_entry->p_prev;
            p_map->p_tail->p_next = NULL;
        }

You can actually handle each of the head and tail cases separately like this:

/* Unlink from the global iteration chain. */
        if (p_current_entry->p_prev) {
            p_current_entry->p_prev->p_next = p_current_entry->p_next;
        } else {
            p_map->p_head = p_current_entry->p_next;
        }
        if (p_current_entry->p_next) {
            p_current_entry->p_next->p_prev = p_current_entry->p_prev;
        } else {
            p_map->p_tail = p_current_entry->p_prev;
        }

Use common find function

Right now, you have four nearly identical copies of code to find an existing entry in get(), contains_key(), put(), and remove(). I would recommend writing an internal find_entry() function that all four functions can use.

Note that the four uses aren't all the same. remove() also needs to know the previous entry pointer, and put() also needs to know the hash value if the entry didn't exist. So you'll have to figure out how you want to write your find_entry() function to make that happen.

Use size_t

You use size_t almost everywhere, which is good. But you missed two spots. The return values of your size() and has_next() functions should return size_t instead of int.

Out of date comment

The comment for the is_healthy function says something about an AVL-tree. It must have been copy and pasted from your previous project.

const

A lot of your functions could take const arguments since they do not modify the argument. For example:

int unordered_map_t_size(const unordered_map_t* p_map);

Floating point

The Floating Point Police™ would like to point out that you could avoid using floating point. Instead of a floating point load factor, you could use a fixed point value. For example, the load factor could be a value between 1..256 and represent a fraction out of 256.

Why bother, you ask? When you use floating point:

The OS needs to start saving/restoring floating point state for your process every time you context switch.

If the OS decides to lazily save/restore your floating point state, you will take an exception the first time you use floating point after a context switch. This exception tells the OS that it needs to swap in the floating point state now.

Of course, you pro

Code Snippets

while (p_entry)
{
    index = p_map->p_hash_function(p_entry->p_key) & p_map->mask;
    p_next_entry = p_entry->p_next;
    free(p_entry);
    p_entry = p_next_entry;

    if (p_map->p_table[index])
    {
        p_map->p_table[index] = p_map->p_table[index]->p_chain_next;
    }
}

/* Unlink from the global iteration chain. */
        if (p_current_entry->p_prev && p_current_entry->p_next) 
        {
            /* Once here, the current entry has both next and previous. */
            p_current_entry->p_prev->p_next = p_current_entry->p_next;
            p_current_entry->p_next->p_prev = p_current_entry->p_prev;
        }
        else if (!p_current_entry->p_prev && !p_current_entry->p_next)
        {
            /* Once here, the current entry 
               is the only entry in the chain. */
            p_map->p_head = NULL;
            p_map->p_tail = NULL;
        }
        else if (p_current_entry->p_next)
        {
            /* Once here, the current entry is the head of the chain. */
            p_map->p_head = p_current_entry->p_next;
            p_map->p_head->p_prev = NULL;
        }
        else
        {
            /* Once here, the current entry is the tail of the chain. */
            p_map->p_tail = p_current_entry->p_prev;
            p_map->p_tail->p_next = NULL;
        }

/* Unlink from the global iteration chain. */
        if (p_current_entry->p_prev) {
            p_current_entry->p_prev->p_next = p_current_entry->p_next;
        } else {
            p_map->p_head = p_current_entry->p_next;
        }
        if (p_current_entry->p_next) {
            p_current_entry->p_next->p_prev = p_current_entry->p_prev;
        } else {
            p_map->p_tail = p_current_entry->p_prev;
        }

int unordered_map_t_size(const unordered_map_t* p_map);

Context

StackExchange Code Review Q#102575, answer score: 3

Revisions (0)

No revisions yet.