patternsqlMinor
Does separating VARCHAR values based on size have a performance gain?
Viewed 0 times
sizegainvarcharperformancebaseddoesvaluesseparatinghave
Problem
I am working with a team that is trying to implement an EAV system. They have decided to split the attribute-value tables out by type and they are debating using different tables for different size ranges of varchar.
EX:
I have always been under the impression that varchar was only going to use the size that it needed.
Do you know if this is going to have any gains in performance and would it be worth the extra coding / logic that would be needed?
EX:
- table_1 - up to varchar(10)
- table_2 - varchar(11) TO varchar(500)
- table_3 - varchar(501) TO varchar(MAX)
I have always been under the impression that varchar was only going to use the size that it needed.
Do you know if this is going to have any gains in performance and would it be worth the extra coding / logic that would be needed?
Solution
My gut says that any performance gain you get is unlikely to be worth the extra hassle (and potential for bugs ) resulting from needing to enforce the separation and perform multiple lookups in your application logic.
If you have a lot of small values and were only querying them and none of the rest you would see some performance gain as more rows would fit in each page so overall less pages will need to be processed in RAM or read from disk for a given query. As soon as you need all the properties in one go (or just a mix) this benefit is going to be blown out of the water by either needing to query multiple tables separately or via a UNION.
The only way to be sure of course is to rig up a reasonably realistic large dataset and run some performance tests against the arrangements you are considering. But I very much doubt you will see any change worth the extra complexity. If you data can be split in more logical ways (i.e. ways implied by your business logic) than I suggest you look into data partitioning, particularly if you can split the partitions over different drives. Whenever you find yourself contemplating a potentially complexity laden optimisation (including partitioning) always go back and reconsider your overall data structure and make sure it is not working against your business logic and check your hardware is sufficient for the load you expect - while it is not guaranteed you are likely to find much more significant gains through looking into these core areas instead.
If you have a lot of small values and were only querying them and none of the rest you would see some performance gain as more rows would fit in each page so overall less pages will need to be processed in RAM or read from disk for a given query. As soon as you need all the properties in one go (or just a mix) this benefit is going to be blown out of the water by either needing to query multiple tables separately or via a UNION.
The only way to be sure of course is to rig up a reasonably realistic large dataset and run some performance tests against the arrangements you are considering. But I very much doubt you will see any change worth the extra complexity. If you data can be split in more logical ways (i.e. ways implied by your business logic) than I suggest you look into data partitioning, particularly if you can split the partitions over different drives. Whenever you find yourself contemplating a potentially complexity laden optimisation (including partitioning) always go back and reconsider your overall data structure and make sure it is not working against your business logic and check your hardware is sufficient for the load you expect - while it is not guaranteed you are likely to find much more significant gains through looking into these core areas instead.
Context
StackExchange Database Administrators Q#1126, answer score: 8
Revisions (0)
No revisions yet.