
What is the point of column families?

Submitted by: @import:stackexchange-dba

Problem

I've seen that NoSQL database systems such as RocksDB offer a feature called column families. I believe I understand what the concept refers to, but what are the actual (practical) benefits of using them? I presume they can improve lookup performance in some cases, or at the very least the spatial locality of key-value entries. As far as I understand, however, they would not seem to affect the actual semantics of database access. Is this correct? Is there something I'm missing?

Solution

I've just found some relevant information in the RocksDB FAQ. (RocksDB is a key-value store.)

Here are some relevant extracts.


Q: What are column families used for?


A: The most common reasons of using column families: (1) use different
compaction setting, comparators, compression types, merge operators,
or compaction filters in different parts of data; (2) drop a column
family to delete its data; (3) one column family to store metadata and
another one to store the data.
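
To make reasons (1) to (3) concrete, here is a minimal in-memory sketch in Python. It is a toy model, not the RocksDB API: `ToyDB`, `FamilyOptions`, and the `compression` and `reverse_order` fields are hypothetical names chosen for illustration, with `reverse_order` standing in for a per-family comparator.

```python
from dataclasses import dataclass, field


@dataclass
class FamilyOptions:
    # Hypothetical per-family settings; a real store exposes many more knobs.
    compression: str = "none"
    reverse_order: bool = False  # stands in for a custom comparator


@dataclass
class ToyDB:
    # Maps a family name to a pair of (options, key-value data).
    families: dict = field(default_factory=dict)

    def create_family(self, name, options=None):
        self.families[name] = (options or FamilyOptions(), {})

    def put(self, family, key, value):
        self.families[family][1][key] = value

    def get(self, family, key):
        return self.families[family][1].get(key)

    def scan(self, family):
        options, data = self.families[family]
        # Iteration order follows the family's own "comparator".
        return sorted(data.items(), reverse=options.reverse_order)

    def drop_family(self, name):
        # Dropping the family discards all of its data in one step,
        # with no need to delete keys one by one.
        del self.families[name]


db = ToyDB()
db.create_family("metadata")  # small family with default settings
db.create_family("events", FamilyOptions(compression="zstd", reverse_order=True))
db.put("metadata", "schema_version", "3")
db.put("events", "0001", "login")
db.put("events", "0002", "logout")

newest_first = db.scan("events")
db.drop_family("events")
```

After this runs, `newest_first` lists the events in reverse key order, the `events` family is gone, and the `metadata` family is untouched.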


Q: What's the difference between storing data in multiple column
family and in multiple rocksdb database?


A: The main differences will be backup, atomic writes and performance
of writes. The advantage of using multiple databases: database is the
unit of backup or checkpoint. It's easier to copy a database to
another host than a column family. Advantages of using multiple column
families: (1) write batches are atomic across multiple column families
on one database. You can't achieve this using multiple RocksDB
databases. (2) If you issue sync writes to WAL, too many databases may
hurt the performance.
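
The atomic write batch in point (1) can be illustrated with another toy model in Python. This is only a sketch of the all-or-nothing behaviour: the `ToyDB` class and its `write` method are hypothetical, and a real store would get this guarantee from its write-ahead log rather than from up-front validation.

```python
class ToyDB:
    def __init__(self, family_names):
        self.families = {name: {} for name in family_names}

    def write(self, batch):
        """Apply every (family, key, value) operation, or none of them."""
        # Validate first so that a bad operation cannot leave a partial write.
        for family, _key, _value in batch:
            if family not in self.families:
                raise KeyError(f"unknown column family: {family}")
        for family, key, value in batch:
            self.families[family][key] = value


db = ToyDB(["data", "index"])

# A record and its secondary-index entry become visible together.
db.write([
    ("data", "user:42", "alice"),
    ("index", "name:alice", "user:42"),
])

# A batch that touches a missing family is rejected as a whole.
try:
    db.write([
        ("data", "user:43", "bob"),
        ("no_such_family", "name:bob", "user:43"),
    ])
    rejected = False
except KeyError:
    rejected = True
```

With two separate databases, the record and its index entry would be two independent writes, so a failure between them could leave one without the other.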


Q: I have different key spaces. Should I separate them by prefixes, or
use different column families?


A: If each key space is reasonably large, it's a good idea to put them
in different column families. If it can be small, then you should
consider to pack multiple key spaces into one column family, to avoid
the trouble of maintaining too many column families.
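
The two layouts in this last answer can be compared in a short Python sketch. The key names, the `cfg:` and `usr:` prefixes, and the `prefix_range` helper are all made up for illustration; plain dictionaries stand in for column families.

```python
from bisect import bisect_left


def prefix_range(sorted_keys, prefix):
    """Return the keys starting with `prefix` from a sorted list of string keys."""
    start = bisect_left(sorted_keys, prefix)
    end = start
    while end < len(sorted_keys) and sorted_keys[end].startswith(prefix):
        end += 1
    return sorted_keys[start:end]


# Option A: one shared key space, separated by a prefix convention.
shared = sorted(["cfg:theme", "cfg:lang", "usr:1", "usr:2", "usr:3"])
users_by_prefix = prefix_range(shared, "usr:")

# Option B: one mapping per key space, standing in for column families.
families = {
    "cfg": {"theme": "dark", "lang": "en"},
    "usr": {"1": "alice", "2": "bob", "3": "carol"},
}
users_by_family = sorted(families["usr"])
```

Option A keeps a single structure to maintain but requires every reader and writer to respect the prefix convention. Option B isolates each key space at the cost of one more structure per key space, which is the maintenance burden the FAQ warns about when the key spaces are small.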

Context

StackExchange Database Administrators Q#166159, answer score: 9
