HiveBrain v1.2.0
Get Started
← Back to all entries
patternsqlMinor

What can be done to further enhance performance of Multiple Join and Aggregate Queries?

Submitted by: @import:stackexchange-dba··
0
Viewed 0 times
canwhatandfurtherjoindoneenhanceperformancemultiplequeries

Problem

I have a typical star schema simulated here, and I am mentioning two queries: first query simply joins the fact table with 2 dimension tables and 1 calendar table, and the second query joins and aggregates.

I have experimented and have created indexes by studying the execution plan and some by reading the suggested indexes and all of them have improved performance by some extent.

My question is what can further be done in this case, what indexes can be applied or how can the query be modified to gain better performance and to reduce execution time?

So first the query to create and fill the tables and to create Indexes:

```
CREATE TABLE FactTable (id BIGINT IDENTITY PRIMARY KEY, FKDim1 BIGINT NOT NULL, FKDim2 BIGINT, DateRef DATETIME, Fact1 MONEY, Fact2 MONEY)
CREATE TABLE Dim1Table (id BIGINT IDENTITY PRIMARY KEY, Dim1Name NVARCHAR(20), Dim1Val1 MONEY, Dim1Val2 MONEY)
CREATE TABLE Dim2Table (id BIGINT IDENTITY PRIMARY KEY, Dim2Name NVARCHAR(20), Dim2Val1 MONEY, Dim2Val2 MONEY)
CREATE TABLE CalendarTable (id BIGINT IDENTITY PRIMARY KEY, [Date] DATETIME UNIQUE NONCLUSTERED, [Weekday] NVARCHAR(10), [Month] NVARCHAR(10))

ALTER TABLE FactTable ADD CONSTRAINT FK_Dim1 FOREIGN KEY (FKDim1 ) REFERENCES Dim1Table(ID);
ALTER TABLE FactTable ADD CONSTRAINT FK_Dim2 FOREIGN KEY (FKDim2 ) REFERENCES Dim1Table(ID);
ALTER TABLE FactTable ADD CONSTRAINT FK_Calendar FOREIGN KEY (DateRef) REFERENCES CalendarTable([Date]);

DECLARE @counter INT;
SET @counter = 1;

WHILE @counter < 10000
BEGIN
INSERT INTO Dim1Table(Dim1Name,Dim1Val1,Dim1Val2)VALUES('Dim1-'+CAST((@counter % 100) AS NVARCHAR),RAND() 10000,RAND() 20000);
INSERT INTO Dim2Table(Dim2Name,Dim2Val1,Dim2Val2)VALUES('Dim2-'+CAST(@counter AS NVARCHAR),RAND() 10000,RAND() 20000);
SET @counter = @counter + 1;
END

DECLARE @StartDate DATETIME
DECLARE @EndDate DATETIME
SET @StartDate = CAST('1/1/1995' AS DATETIME)
SET @EndDate = DATEADD(d, 3650, @StartDate)

WHILE @StartDate <= @EndDate
BEGIN
INSERT INTO CalendarTable([Da

Solution

There is rarely any need, point or benefit trying to micro optimise star schema queries with non-clustered indexes laden with included columns. Fact tables are built to be scanned.

The indexes you've created in your examples are subset copies of the parent table, which are being scanned (no seeks). The minor performance improvements come from scanning marginally fewer pages than the parent table. Given that star schemas are built to support ad-hoc query patterns it is not viable to create the indexes to support every possible enquiry.

  • Create your fact table clustered index on the date key. The majority of (typical) fact table queries include a time element and clustering on the date key enables range scanning of fact table rows.



  • Add non-clustered indexes on the foreign keys of your fact tables to assist with highly selective queries. Foreign keys to dimension tables can be created with NOCHECK to prevent any impact on ETL.



  • Cluster your dimension tables on their surrogate keys.



  • Create a non-clustered index on the natural key of each dimension table.



  • Stop.



The optimiser detects star schema query patterns and has strategies to deal with them efficiently, utilising scans and hash joins in Standard Edition or bitmap filtering in Enterprise. Follow the indexing strategy outlined above and let the optimiser deal with the rest.

Context

StackExchange Database Administrators Q#52845, answer score: 8

Revisions (0)

No revisions yet.