HiveBrain v1.2.0
Get Started
← Back to all entries
patternMinor

Definition of indexing, datatype configuration, and DBMS for SHA3-256 hashes

Submitted by: @import:stackexchange-dba··
0
Viewed 0 times
definitiondatatypeindexingsha3256dbmsforhashesandconfiguration

Problem

I am designing a system within which I will need to be able to search a database using a file or string SHA3-256 hash.

For example the hash of test is

36f028580bb02cc8272a9a020f4200e346e276ae664e45ee80745574e2f5ab80


  • Which DATATYPE should I set the column that stores this value? I've heard about using BINARY instead of VARCHAR.



  • How should I index this column so that searches by hash are fastest?



  • Which database management system (like PostgreSQL, DB2, SQL Server), or document software (like MongoDB) should I use that most efficiently allows me to store and retrieve records by hash? I only have experience with relational database management systems like MySQL, but maybe a document store like MongoDB would be the better way to go.

Solution

SHA3-256 is 256bits. That's 32 bytes.

  • Don't store it as TEXT any form, or it'll be 256 bytes (8x the size).



  • I would store it as inline binary if your database supports it because I doubt anything it's going to have a data type for a 32 byte data type (which nothing uses except AVX2).



On indexing, any HASH index would work if your database provides it.

From PostgreSQL,

CREATE DOMAIN sha3_256 AS bytea;

CREATE TABLE datastore (
  id     int  PRIMARY KEY GENERATED BY DEFAULT AS IDENTITY,
  myhash sha3_256
);

CREATE INDEX ON datastore USING HASH (myhash);

INSERT INTO datastore (id,myhash) VALUES
  (1, decode('36f028580bb02cc8272a9a020f4200e346e276ae664e45ee80745574e2f5ab80', 'hex'));


As a special optimization you could store it as two UUIDs. Though I probably wouldn't bother.

Code Snippets

CREATE DOMAIN sha3_256 AS bytea;

CREATE TABLE datastore (
  id     int  PRIMARY KEY GENERATED BY DEFAULT AS IDENTITY,
  myhash sha3_256
);

CREATE INDEX ON datastore USING HASH (myhash);

INSERT INTO datastore (id,myhash) VALUES
  (1, decode('36f028580bb02cc8272a9a020f4200e346e276ae664e45ee80745574e2f5ab80', 'hex'));

Context

StackExchange Database Administrators Q#229677, answer score: 6

Revisions (0)

No revisions yet.