patternsqlMinor
Optimizing queries against a set of tables that contain partitioned data
Viewed 0 times
tablescontaindataagainstoptimizingthatqueriespartitionedset
Problem
I have a third party program that creates and populates a series of tables to hold some historical data:
Unfortunately, while these tables are indexed, they are not created with a partitioning scheme or partition function so I'm left to manually iterate through the tables. Now I want to allow the user generation of reports, but I'm unsure of how to best write a stored procedure to query against these tables in an optimized manner.
My first idea was to just
My second idea was to programmatically
What would be the best way to optimize presenting this mass of data in a somewhat optimized manner?
data_33 -- contains data for 12/01/11 - 12/05/11
data_34 -- contains data for 12/05/11 - 12/14/11
data_35 -- contains data for 12/14/11 - 12/20/11
data_36 -- contains data for 12/20/11 - 12/22/11
data_37 -- contains data for 12/22/11 - 12/23/11
data_38 -- contains data for 12/24/11 - 12/28/11
data_39 -- contains data for 12/28/11 - 01/02/12
data_(you get the idea)Unfortunately, while these tables are indexed, they are not created with a partitioning scheme or partition function so I'm left to manually iterate through the tables. Now I want to allow the user generation of reports, but I'm unsure of how to best write a stored procedure to query against these tables in an optimized manner.
My first idea was to just
UNION ALL all the tables into one gigantic temporary table, but I'm guessing this is going to be costly as a single table has a large number of records and even doing a SELECT * FROM data_33 is taking about 10 minutes to resolve. On top of that, I'll have to maintain this (materialized) view as the software generates new tables.My second idea was to programmatically
UNION ALL everything by grabbing all the tables in the database matching the results of SELECT name FROM sys.tables WHERE name LIKE 'data_%' but without any other limit I'm afraid this would still be extremely slow.What would be the best way to optimize presenting this mass of data in a somewhat optimized manner?
Solution
Before table partitioning was introduced the way of doing the very thing you asked was called Partitioned Views:
A partitioned view joins horizontally partitioned data from a set of
member tables across one or more servers, making the data appear as if
from one table.
Partitioned Views are not only a UNION ALL over each table. By using appropriate check constraints on the underlying tables partitioning column the optimizer can do some fancy stuff, including partition elimination.
PS. Obviously you can generate the view pragmatically. You can even use Event Notifications to generate it automatically every time your 3rd party tool creates a new table, assuming you're running on SQL Server 2005 or later.
A partitioned view joins horizontally partitioned data from a set of
member tables across one or more servers, making the data appear as if
from one table.
Partitioned Views are not only a UNION ALL over each table. By using appropriate check constraints on the underlying tables partitioning column the optimizer can do some fancy stuff, including partition elimination.
PS. Obviously you can generate the view pragmatically. You can even use Event Notifications to generate it automatically every time your 3rd party tool creates a new table, assuming you're running on SQL Server 2005 or later.
Context
StackExchange Database Administrators Q#17266, answer score: 4
Revisions (0)
No revisions yet.