patternMajor
Bad Practice to Store Calculated Data in Database?
Viewed 0 times
practicebadstoredatabasedatacalculated
Problem
Is it bad practice to store calculated data in each row, or is it better to calculate at the application layer with every read from the database.
Storing in the database avoids the need to calculate multiple times, but if an error is made then data needs to be updated rather than just changing the application level calculations.
I think the latter is better, but is there a general rule of thumb?
I need to, for example, calculate total daily nutritional intake of foods. So various
You can imagine if you had to calculate yearly averages, monthly averages, daily averages, etc. for a long period of time it could get quite unwieldy.
What about using materialized views that would get recomputed every time old data, say a week or older, gets updated based on a trigger or something along those lines?
Storing in the database avoids the need to calculate multiple times, but if an error is made then data needs to be updated rather than just changing the application level calculations.
I think the latter is better, but is there a general rule of thumb?
I need to, for example, calculate total daily nutritional intake of foods. So various
portions of energy of foods. I can either calculate the portion energy based on the corresponding food and store the energy of each portion in the portions table OR I can calculate from the join with the corresponding food every time.You can imagine if you had to calculate yearly averages, monthly averages, daily averages, etc. for a long period of time it could get quite unwieldy.
What about using materialized views that would get recomputed every time old data, say a week or older, gets updated based on a trigger or something along those lines?
Solution
If computers were infinitely fast, then, No, you would never store a value that could be calculated from other columns in the database. Storing calculated values is a violation of database normalization.
Examples would include:
In the real world, we do sometimes choose to violate normalization. The motivation is usually for performance reasons.
Never violate normal forms without much thought, and hopefully a consultation with another DBA or database developer. And always document thoroughly your decision and its specific motivation.
is there a general rule of thumb?
Yes: Normalize, and test/profile.
Always start with a normalized design. Load with fake data, test performance. Verify any discovered bottleneck is indeed due to your on-the-fly calculations.
Examples would include:
- The extended cost on an invoice line that carries price and quantity fields.
- The total cost of an invoice’s lines.
- The elapsed time between two moments.
In the real world, we do sometimes choose to violate normalization. The motivation is usually for performance reasons.
Never violate normal forms without much thought, and hopefully a consultation with another DBA or database developer. And always document thoroughly your decision and its specific motivation.
is there a general rule of thumb?
Yes: Normalize, and test/profile.
Always start with a normalized design. Load with fake data, test performance. Verify any discovered bottleneck is indeed due to your on-the-fly calculations.
Context
StackExchange Database Administrators Q#202758, answer score: 25
Revisions (0)
No revisions yet.