Do SSDs reduce the usefulness of Databases

Submitted by: @import:stackexchange-dba

Problem

I only heard of Robert Martin today, and he seems to be a notable figure in the software world. I don't mean for my title to read as clickbait or to put words in his mouth; it is simply how I interpreted what he said, given my limited experience and understanding.

Today I watched a video of a talk on software architecture by Robert C. Martin, and the latter half of it focused mainly on databases.

From my understanding of what he said, it seemed like he was saying that SSDs will reduce the usefulness of databases (considerably).

To explain how I came to this interpretation:

He discussed how with HDDs/spinning disks, retrieving data is slow. However, these days we use SSDs, he noted. He starts off with "RAM is coming" and then continues by mentioning RAM disks, but then says he can't call it RAM disk, so resorts to just saying RAM. So with RAM, we don't need the indexes, because every byte takes the same time to get. (this paragraph is paraphrased by me)

Suggesting RAM (as in computer memory) as a replacement for databases, which is how I interpreted his statement, doesn't make sense to me: it would mean all the records are held and processed in memory for the lifetime of the application (unless you pull them from a disk file on demand).

So I assumed that by RAM he means SSD. In that case, he's saying SSDs reduce the usefulness of databases. He even says "If I was Oracle, I'd be scared. The very foundation of why I exist is evaporating."

From my limited understanding of SSDs, unlike HDDs, whose seek time is O(n) (I'd think), SSDs are near O(1), that is, close to true random access. So his suggestion was interesting to me, because I'd never thought about it like that.
The first time I was introduced to databases, a few years ago, when a professor was describing their benefits over a regular filesystem, I concluded the primary role of a database is essentially being a very indexed filesystem (as w

Solution

There are some things in a database that should be tweaked when you use SSDs. For instance, in PostgreSQL you can adjust effective_io_concurrency and random_page_cost. However, faster reads and faster random access are not what a database is fundamentally for. A database provides:

  • ACID (Atomicity, Consistency, Isolation, Durability)
  • Some form of concurrency control, such as MVCC (multiversion concurrency control)
  • A standardized access language for libraries (SQL or XQuery)
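
To make the first bullet concrete, here is a minimal sketch of atomicity using Python's built-in sqlite3 module with an in-memory database. SQLite stands in for PostgreSQL here, and the accounts table and transfer helper are made up for illustration. The data sits entirely in RAM, and the transaction guarantee is still what keeps it consistent.

```python
import sqlite3

# In-memory SQLite database: every row lives in RAM, yet transactions still matter.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE accounts ("
    " name TEXT PRIMARY KEY,"
    " balance INTEGER CHECK (balance >= 0))"
)
conn.execute("INSERT INTO accounts VALUES ('alice', 100), ('bob', 0)")
conn.commit()


def transfer(conn, src, dst, amount):
    """Move money atomically: both updates take effect, or neither does."""
    try:
        with conn:  # commits on success, rolls back if an exception escapes
            conn.execute(
                "UPDATE accounts SET balance = balance + ? WHERE name = ?",
                (amount, dst),
            )
            conn.execute(
                "UPDATE accounts SET balance = balance - ? WHERE name = ?",
                (amount, src),
            )
        return True
    except sqlite3.IntegrityError:
        return False


# Alice only has 100, so the debit violates the CHECK constraint
# and the credit to Bob that already ran is rolled back with it.
ok = transfer(conn, "alice", "bob", 150)
balances = dict(conn.execute("SELECT name, balance FROM accounts"))
print(ok, balances)
```

Doing the same with a flat file in RAM would mean writing that rollback logic yourself.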



He's just wrong about indexes. Even if the whole table can be read into RAM, an index is still useful. Don't believe me? Let's do a thought experiment:

  • Imagine you have a table with one indexed column:

    CREATE TABLE foobar ( id text PRIMARY KEY );

  • Imagine that there are 500 million rows in that table.
  • Imagine all 500 million rows are concatenated together into a file.

Which is faster?

  • grep 'keyword' file
  • SELECT * FROM foobar WHERE id = 'keyword'
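
You can try a scaled-down version of this thought experiment entirely in RAM. The sketch below is a simplification with assumptions of my own: one million made-up keys instead of 500 million, a plain equality scan standing in for grep, and a binary search over sorted keys standing in for a B-tree descent. It counts comparisons rather than time, so the storage medium is out of the picture.

```python
def linear_scan(rows, key):
    """Check rows one by one, roughly how grep walks a file."""
    comparisons = 0
    for row in rows:
        comparisons += 1
        if row == key:
            return True, comparisons
    return False, comparisons


def index_lookup(sorted_rows, key):
    """Binary search over sorted keys, a rough stand-in for a B-tree descent."""
    lo, hi = 0, len(sorted_rows)
    comparisons = 0
    while lo < hi:
        mid = (lo + hi) // 2
        comparisons += 1
        if sorted_rows[mid] < key:
            lo = mid + 1
        else:
            hi = mid
    found = lo < len(sorted_rows) and sorted_rows[lo] == key
    return found, comparisons


# Zero-padded keys are already in sorted order, so the list doubles as the "index".
rows = [f"key{i:07d}" for i in range(1_000_000)]
target = rows[-1]  # worst case for the scan

scan_found, scan_steps = linear_scan(rows, target)
index_found, index_steps = index_lookup(rows, target)
print("scan comparisons: ", scan_steps)
print("index comparisons:", index_steps)
```

Both lookups touch only memory, yet the scan needs a million comparisons where the search needs about twenty. Faster hardware shrinks the cost of each step, not the number of steps.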



It's not just about where the data lives; it's about how you order it and what operations you must perform to find what you're looking for. PostgreSQL supports B-tree, Hash, GiST, SP-GiST, GIN and BRIN indexes (and Bloom through an extension). You'd be foolish to think that all of that math and functionality goes away because you have faster random access.
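
If you want to see a query planner make this choice with everything held in RAM, here is a small sketch using Python's built-in sqlite3 module. It uses SQLite rather than PostgreSQL, only because SQLite ships with Python and can run fully in memory. The foobar_plain table is a hypothetical unindexed twin added for comparison.

```python
import sqlite3

conn = sqlite3.connect(":memory:")  # the whole database lives in RAM
conn.execute("CREATE TABLE foobar (id text PRIMARY KEY)")  # indexed, as above
conn.execute("CREATE TABLE foobar_plain (id text)")        # hypothetical unindexed twin


def plan(sql):
    """Return SQLite's description of how it would execute the query."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql).fetchall()
    # The last column of each row holds the human-readable detail text.
    return " ".join(row[-1] for row in rows)


indexed_plan = plan("SELECT * FROM foobar WHERE id = 'keyword'")
unindexed_plan = plan("SELECT * FROM foobar_plain WHERE id = 'keyword'")
print(indexed_plan)    # a SEARCH that uses the primary key's index
print(unindexed_plan)  # a SCAN of the whole table
```

The exact wording of the plan differs between SQLite versions, but the indexed table is searched through its index while the unindexed one is scanned row by row, even though neither ever touches a disk.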

Context

StackExchange Database Administrators Q#158957, answer score: 62
