patternsqlModerate
Best Data Type to Store Result of HASHBYTES('MD5', ...)
Viewed 0 times
resultmd5storetypehashbytesdatabest
Problem
What would be the best data type to store the results of the
It outputs 16 bytes of binary as follows: e.g.
I could store it in the following data types:
Every row will have a value (no nulls), and the column will be used for comparison against a similar column in another table.
Which is the best data type to store
I was thinking that since fixed length data types can sometimes be more efficient on joins, etc.
HASHBYTES('MD5', ...)?It outputs 16 bytes of binary as follows: e.g.
0x5CFCD77F9FF836189D2F647EBCEA183EI could store it in the following data types:
char(34)
binary(16)(I think - I read here (https://stackoverflow.com/questions/14722305/what-kind-of-datatype-should-one-use-to-store-hashes#16680423) that using the same algo should return the same number of bytes every time regardless of the input string)
- other?
Every row will have a value (no nulls), and the column will be used for comparison against a similar column in another table.
Which is the best data type to store
HASHBYTES output for use as described above? I was thinking that since fixed length data types can sometimes be more efficient on joins, etc.
binary(16) vs varbinary(8000) (the default output of HASHBYTES) seems best, and binary(16) vs a varchar(34) is better since it would use less storage space.Solution
According to Books Online:
The output conforms to the algorithm standard: 128 bits (16 bytes) for
MD2, MD4, and MD5; 160 bits (20 bytes) for SHA and SHA1; 256 bits (32
bytes) for SHA2_256, and 512 bits (64 bytes) for SHA2_512.
So despite the fact that the function technically returns varbinary(8000), you'll only ever get 16 bytes with MD5. If you are completely certain you only need MD5, then a binary(16) will do it. If you want to play it safe, and allow any possible algorithm (current or future) that HASHBYTES can handle, go with varbinary(8000).
I definitely wouldn't recommend a char type, since that will involve string conversion overhead, and unnecessary storage overhead.
The output conforms to the algorithm standard: 128 bits (16 bytes) for
MD2, MD4, and MD5; 160 bits (20 bytes) for SHA and SHA1; 256 bits (32
bytes) for SHA2_256, and 512 bits (64 bytes) for SHA2_512.
So despite the fact that the function technically returns varbinary(8000), you'll only ever get 16 bytes with MD5. If you are completely certain you only need MD5, then a binary(16) will do it. If you want to play it safe, and allow any possible algorithm (current or future) that HASHBYTES can handle, go with varbinary(8000).
I definitely wouldn't recommend a char type, since that will involve string conversion overhead, and unnecessary storage overhead.
Context
StackExchange Database Administrators Q#191979, answer score: 10
Revisions (0)
No revisions yet.