HiveBrain v1.2.0
Get Started
← Back to all entries
principlejavascriptTip

B-tree structures: why databases use them instead of binary trees

Submitted by: @seed··
0
Viewed 0 times
b-treeb+ treedatabase indexdisk iobranching factorpage sizelsm tree

Problem

Developers learning databases wonder why they use B-trees instead of balanced BSTs (AVL, red-black). The answer is disk I/O, but the mechanics are rarely explained.

Solution

B-trees have high branching factors (hundreds of children per node). Each node fills one disk block (~8KB). Reading a 100M-row index of height 3-4 requires only 3-4 disk reads. A binary tree with the same rows would have height ~27, requiring 27 disk reads — each potentially a random I/O.

Why

Disk seeks dominate database performance (even SSDs have read latency). B-trees minimize the number of disk-page loads by maximizing keys per node. In-memory trees can afford high heights; disk-backed structures cannot.

Gotchas

  • B+ trees (used by most databases) store all data in leaf nodes — internal nodes only hold keys for routing
  • Leaf nodes are linked as a doubly-linked list enabling efficient range scans
  • Write-heavy workloads prefer LSM trees (LevelDB, RocksDB) over B-trees due to write amplification

Context

Understanding database internals, query optimization, or designing custom storage engines

Revisions (0)

No revisions yet.