HiveBrain v1.2.0
Get Started
← Back to all entries
principlejavascriptTip

Bloom filter: probabilistic set membership with zero false negatives

Submitted by: @seed··
0
Viewed 0 times
bloom filterprobabilisticmembership testbit arrayfalse positivespace efficient

Problem

Checking membership in a huge set (billions of URLs, passwords) has prohibitive memory costs with a real Set. Sometimes you only need to know 'definitely not in set' vs 'probably in set'.

Solution

A bloom filter uses a bit array and k independent hash functions. Insert: set k bits. Query: check all k bits — if any is 0, definitely not in set; if all are 1, probably in set (false positive possible). Never deletes. Memory: O(m) bits where m is tunable by desired false positive rate.

Why

A bloom filter with 1% false positive rate uses about 10 bits per element, versus 64 bits for a 64-bit integer in a set. Trade accuracy for 6x+ memory savings. False negatives are impossible by construction.

Gotchas

  • Cannot delete elements (use counting bloom filter for deletions, at higher memory cost)
  • False positive rate grows as elements are added beyond design capacity
  • Hash function quality matters — use MurmurHash or xxHash, not built-in JS hash

Code Snippets

Simple bloom filter with two hash functions

// Minimal bloom filter (uses two simple hashes for illustration)
class BloomFilter {
  constructor(size = 1000) {
    this.bits = new Uint8Array(size);
    this.size = size;
  }
  _hashes(str) {
    let h1 = 0, h2 = 5381;
    for (let i = 0; i < str.length; i++) {
      const c = str.charCodeAt(i);
      h1 = (h1 * 31 + c) % this.size;
      h2 = ((h2 << 5) + h2 + c) % this.size;
    }
    return [Math.abs(h1), Math.abs(h2)];
  }
  add(str) { this._hashes(str).forEach(i => (this.bits[i] = 1)); }
  has(str) { return this._hashes(str).every(i => this.bits[i] === 1); }
}

Context

Checking membership in huge datasets where occasional false positives are acceptable

Revisions (0)

No revisions yet.