HiveBrain v1.2.0
Get Started
← Back to all entries
patternCritical

Duplicate column for faster queries?

Submitted by: @import:stackexchange-dba··
0
Viewed 0 times
columnduplicatefasterforqueries

Problem

The title doesn't make too much sense, but I couldn't think a better title for this problem.

I have the following tables

Projects

  • id



  • name



Customers

  • id



  • id_project



  • name



Payments

  • id



  • id_customer



  • date



  • sum



When a users enters the system, he will have access to a certain project. Now, I want to list all the payments for that project, and it should be pretty easy:

SELECT FROM payments where id_customer in (SELECT id from customers where id_project = 5)


My question is : if it isn't better to add a column id_project to payments table this way the queries will be easier and faster.

Solution

It seems you are asking if denormalization makes sense.


Denormalization is the process of attempting to optimize the read performance of a database by adding redundant data or by grouping data. In some cases, denormalization helps cover up the inefficiencies inherent in relational database software. A relational normalized database imposes a heavy access load over physical storage of data even if it is well tuned for high performance.

The answer is always "it depends", so here's my rule of thumb:

If ...

  • the amount of data is not large



  • you aren't doing a ton of joins already



  • and/or database performance is not currently a bottleneck



then stay normalized. Yes, denormalization is faster, but it also means you have redundant data in the system -- data that has to be maintained and kept in sync. There is no longer "one source" for that data, but multiple sources that can deviate. This is risky over time, so you shouldn't do it unless you have very good reasons to do it, backed by some benchmarks.

I would only denormalize when ...

  • the amount of data is very large



  • joins are expensive and you have to do a lot of them to get even trivial queries returned



  • database performance is a bottleneck and/or you want to go as fast as possible



Joins are very fast on modern hardware, but they are never free.

Context

StackExchange Database Administrators Q#461, answer score: 65

Revisions (0)

No revisions yet.