HiveBrain v1.2.0
Get Started
← Back to all entries
patternsqlModerate

T-SQL INNER JOIN Optimisation

Submitted by: @import:stackexchange-dba··
0
Viewed 0 times
sqlinneroptimisationjoin

Problem

Well, I really have read so many articles, used different techniques and done some random things to do this.

My problem was that a have big tables (over one million rows) and some small (with few hundreds rows) - we are talking about 8 inner joins involved.

What reduces the execution time from over 2 minutes to 30 seconds was very strange for me and I was not able to find why this happens.

When I select columns from the tables I cast them. What I did was to cast the column to the most small possible type.

For example:

  • nvarchar(4000) to nvarchar(50 or 25)



  • bigint to int



  • int to tinyint



The result was over 1 minute and 30 second better execution time. Why this happen?

For example if my variable is string with length 10 the nvarchar(4000) and the nvarchar(50) will cast it to nvarchar(10) (or something close to that). So, why when I reduce the outcome type from cast the things go better?

Something more - I make a lot of test to check which function is better - cast or convert (do test for string to string, string to number, number to string) but was not able to define which works better. Sometime the convert gives me few seconds better execution time, but not enough to make a conclusion. Has anyone do something like this and succeeded in?

Thank for your time in advance.

Solution

The reason the performance is better is that the smaller data types have much less working set - see the execution plan (http://www.red-gate.com/our-company/about/book-store/assets/sql-server-execution-plans.pdf)

I expect you get the largest benefit from the nvarchar(4000) to nvarchar(50) - that's a reduction of 80x - and nvarchar(4000) can use up to 8K! of space. For a key, that is not a good idea.

In addition, the fact that there are no foreign keys probably mean you don't have a very good indexing strategy either. If you did have indexes (even for ridiculously large columns), you would probably find they could outperform the cast since it probably wouldn't spool as much.

In general, you don't want any operations on your keys in the join if at all possible, especially for large data sets.

Context

StackExchange Database Administrators Q#11645, answer score: 10

Revisions (0)

No revisions yet.