
ConcurrentDictionary, Store Key and Compression

Submitted by: @import:stackexchange-codereview

Problem

So I have a ConcurrentDictionary<int, string>.

The int is merely how I store the key, and I would like recommendations on that.

Currently I do it like this:

public string memoize(Func<string, uint, string> functor, string code, uint id)
    {
        // Used for the lookup, to identify whether this code has been seen before
        int codeHash = code.GetHashCode();
        string functor_return;

        if (_compilerCache.TryGetValue(codeHash, out functor_return))
            return functor_return;

        functor_return = functor?.Invoke(code, id);
        _compilerCache[codeHash] = functor_return;

        return functor_return;
    }


Why do I use GetHashCode()?
I don't know; it was the only way I could think of to avoid storing the string value itself.

I also save this cache to a file, which is why the compression and the key matter.

//Serialize and Compress object to file
    public static void SerializeObject<T>(string filename, T obj)
    {
        using (Stream stream = File.Open(filename, FileMode.Create))
        using (var cStream = new GZipStream(stream, CompressionLevel.Optimal))
        {
            BinaryFormatter binaryFormatter = new BinaryFormatter();
            binaryFormatter.Serialize(cStream, obj);
        }
    }

    //DeSerialize and Decompress object from file
    public static T DeSerializeObject<T>(string filename)
    {
        T objectToBeDeSerialized = default(T);
        try
        {
            using (Stream stream = File.Open(filename, FileMode.Open))
            using (var cStream = new GZipStream(stream, CompressionMode.Decompress))
            {
                BinaryFormatter binaryFormatter = new BinaryFormatter();
                objectToBeDeSerialized = (T)binaryFormatter.Deserialize(cStream);
            }

        }
        catch (Exception e)
        {
            MessageBox.Show(e.Message + " : Error DeSerializing Cache, corrupted?");
        }

        return objectToBeDeSerialized;
    }


It all works, but I would like to improve it.
I don't re

Solution

GetHashCode is used internally by ConcurrentDictionary to speed up access to the real key.

But there is no guarantee that two different strings return two different hash codes, so collisions can occur. Inside ConcurrentDictionary such collisions are handled for you, because the dictionary also compares the real keys for equality.

With your code, _compilerCache is (it seems) a Dictionary<int, string> and not a Dictionary<string, string>, so the hash code itself is the key. On the rare occasion that two different code strings produce the same hash code, the second one gets the first one's cached result. That is a subtle bug that is hard to reproduce.
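The failure mode can be made visible with a small sketch. Real GetHashCode collisions between strings are rare and hard to produce on demand, so this example swaps in a deliberately weak hash: `WeakHash`, `MemoizeByHash` and `CollisionDemo` are hypothetical names invented for illustration, and the one-argument `compile` delegate is a simplification of the question's functor.

```csharp
using System;
using System.Collections.Concurrent;

public static class CollisionDemo
{
    // Hypothetical stand-in for GetHashCode: any two strings of the
    // same length collide. Used only to force a collision on demand.
    public static int WeakHash(string s) => s.Length;

    // Same shape as the question's memoize: the hash is the dictionary key,
    // so the dictionary never sees the real string and cannot tell
    // two colliding inputs apart.
    public static string MemoizeByHash(
        ConcurrentDictionary<int, string> cache,
        Func<string, string> compile,
        string code)
    {
        int key = WeakHash(code);
        if (cache.TryGetValue(key, out string cached))
            return cached;

        string result = compile(code);
        cache[key] = result;
        return result;
    }
}
```

Calling `MemoizeByHash` first with "abc" and then with "xyz" on the same cache returns the result computed for "abc" both times, because both inputs map to the key 3.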

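I suggest the following:

- stick to a Dictionary<string, string> and use the real key

- do not store a hash code in place of the real key

- if you need custom hashing, you can pass your own IEqualityComparer<string> (which supplies both GetHashCode and Equals) to the Dictionary constructor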

Using code instead of an int as the key has a price: the memory footprint is larger. With small strings (10-20 characters), that should not be an issue.
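The suggestions above could look roughly like the following. This is a minimal sketch, not the original author's code: the `CompilerCache` class and its `Count` property are assumptions made for illustration, and it uses a ConcurrentDictionary, as in the question, rather than a plain Dictionary. The comparer passed to the constructor is the built-in `StringComparer.Ordinal`; a custom IEqualityComparer<string> would go in the same place.

```csharp
using System;
using System.Collections.Concurrent;

public sealed class CompilerCache
{
    // Keyed by the code string itself. The dictionary hashes the key
    // internally and falls back to an equality check on collisions.
    private readonly ConcurrentDictionary<string, string> _compilerCache =
        new ConcurrentDictionary<string, string>(StringComparer.Ordinal);

    public string Memoize(Func<string, uint, string> functor, string code, uint id)
    {
        if (functor == null) throw new ArgumentNullException(nameof(functor));
        if (code == null) throw new ArgumentNullException(nameof(code));

        // As in the question, only `code` identifies an entry; `id` is
        // passed through to the functor but is not part of the key.
        // Under contention GetOrAdd may run the factory more than once,
        // but only one result is stored and returned to all callers.
        return _compilerCache.GetOrAdd(code, key => functor(key, id));
    }

    // Number of cached entries, exposed for inspection.
    public int Count => _compilerCache.Count;
}
```

Unlike the question's version, this sketch throws on a null functor instead of caching a null result. Whether that is the desired behavior depends on the caller.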

Hope this helps!

Context

StackExchange Code Review Q#147577, answer score: 3