HiveBrain v1.2.0
Get Started
← Back to all entries
patternsqlModerate

Sql server table insert performance optimization

Submitted by: @import:stackexchange-dba··
0
Viewed 0 times
insertsqloptimizationperformanceservertable

Problem

Setting

In a datawarehouse, I am joining a fact table to 20 dimensions. The fact table has 32 million rows and 30 columns. This is a temporary staging table so I don't have to deal with other users reading or writing to the table. I select 10 columns from the base table and 20 columns from the respective dimensions. The dimension tables are small (between 3 and 15.000 rows). The fields on which are joined are both integers and nvarchars. I use a SELECT ... INTO statement. There are no indexes on the tables.

The execution speed of this query is too slow to be useful.

Tried-out solutions

Because the query takes too long to process, I tried out following solutions:

  • Split the 20 joins into 4 joins on 5 tables. The query performance remains low however.



  • Put indexes on the foreign key columns. No significant time decrease.



  • Make sure the fields of the join condition are integers. I noticed a performance increase of 25%. Not quite what I am searching for.



  • Use an insert into statement instead of select into. Worse performance because of log file growth although the database is in simple recovery mode.



These findings led me to including the actual execution plan which shows that 89% of the cost lies in the table insert. The other costs are 8% table scan on the fact table and 2% on hash matching for the inner joins.

Questions

  • What are the possible reasons of the slow table insert?



  • What are ways to identify this bottleneck without the execution plan?



  • What actions can I take to reduce the cost of the table insert?

Solution

What are the possible reasons of the slow table insert?
What are ways to identify this bottleneck without the execution plan?

Read How to analyse SQL Server performance, specially the part about Analyzing individual query execution wait times.


What actions can I take to reduce the cost of the table insert?

That would depend largely on the result of the performance analysis. First and foremost, ensure the SELECT part is as fast as possible. Assuming that problem is the single threaded fully logged insert, some solutions are:

-
Use partition switch to move 'in' the data. This is, by far, the best solution. Prepare the staging data in a separate staging table, then switch this staging table into the DW table. Read Transferring Data Efficiently by Using Partition Switching.

-
Make sure the INSERT is minimally logged. Read Operations That Can Be Minimally Logged and Prerequisites for Minimal Logging. Even if you use partition switch operations, is still worth making sure the build of the staging table is minimally logged.

-
Make sure your IO subsystem is capable of driving a fast load. Read Introducing SSDs.

Context

StackExchange Database Administrators Q#84164, answer score: 14

Revisions (0)

No revisions yet.