How to avoid tempdb spill
Problem
I have several tempdb spills in queries which currently serve business purposes. Online sources tell me that tempdb spills should never be ignored, as they can cause performance problems, but it's still a bit unclear to me why that is. I have also read that SQL Server will by itself request additional memory for a query if it notices its estimate was wrong for a certain operator. It would seem logical, then, that there would only be a performance problem when SQL Server is actually out of memory, which would happen regardless, since the memory requirement and available memory would not change if only the estimate were corrected.
Do tempdb spills mean that the SQL Server needs more memory added to it or is there some other reason why tempdb spills are bad that I am not accounting for? I do notice performance drops for queries where the size of the tempdb spills is large, so I am assuming it is a problem which will only get worse as data is added.
To avoid tempdb spills I see the following mentioned online:
- Make sure statistics are up to date.
- Build indexes in a correct manner.
- Adjust query if possible.
I believe I've done all of this, though I am definitely happy to be proven wrong. It is also worth noting that I receive tempdb spills even when converting said queries into parameterized queries.
I'd really like insights into how a sort operator, and more specifically a tempdb spill, can be completely avoided, whether by table design or any other possibility I can't think of. I am finding very little info about this online apart from the rules above, which I have already applied. Even a prominent book on SQL Server performance only mentions the word spill twice. Is there any good resource on this which really goes into depth?
This is the sort operator on the production server:
Below you will find the query I am trying to execute, which leads to a sort operator that does not spill locally with 390 thousand Attribute records, but does spill on the production server.
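For what it's worth, the spills themselves can be observed server-wide with an Extended Events session built on the sort_warning and hash_warning events. This is only a sketch; the session name and target file name are arbitrary choices:

```sql
-- Sketch: capture sort and hash spills server-wide with Extended Events.
-- Session name [TrackSpills] and file name are arbitrary examples.
CREATE EVENT SESSION [TrackSpills] ON SERVER
ADD EVENT sqlserver.sort_warning (
    ACTION (sqlserver.sql_text, sqlserver.database_name)),
ADD EVENT sqlserver.hash_warning (
    ACTION (sqlserver.sql_text, sqlserver.database_name))
ADD TARGET package0.event_file (SET filename = N'TrackSpills.xel');

ALTER EVENT SESSION [TrackSpills] ON SERVER STATE = START;
```

Reading the captured .xel file (or watching the live data viewer in SSMS) shows which statements are spilling and how often, which helps decide whether a spill is worth chasing.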
Solution
Ok. You've got the Entity-Attribute-Value (EAV) pattern there, and it's notorious for poor performance. If the tables are not too big, you won't notice. But don't expect blazing fast performance. You should expect TempDb spills, memory grant warnings, locking issues, and poor execution plans. So if you really care about optimal performance, choose a different way to model the data.
SQL Server has this really great feature where you model entities as rows and attributes as columns. It's called the relational model, and it's the modeling approach that the SQL Server engine is optimized for.
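As a sketch of what that looks like here, each attribute becomes a typed, nullable column on Product itself (the attribute names are taken from the JSON example later in the answer; the column types are assumptions):

```sql
-- Attributes as typed columns instead of EAV rows (types are illustrative).
CREATE TABLE Product
(
    Id INT NOT NULL IDENTITY PRIMARY KEY,
    Name NVARCHAR(20) NOT NULL,
    Description NVARCHAR(200) NULL,
    ManufacturerName NVARCHAR(100) NULL,
    ValidOn DATETIME2 NULL,
    SafetyStockLevel INT NULL
);

-- Lookups are plain sargable predicates that ordinary indexes can serve.
CREATE INDEX IX_Product_ManufacturerName ON Product (ManufacturerName);
```

With this shape the optimizer has real column statistics to estimate from, which is exactly what the EAV pattern takes away.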
If you can't use the relational model for your Product attributes because they are too variable, then model a subset of the attributes using XML or JSON, which SQL Server also supports, e.g.:
CREATE TABLE [Product]
(
[Id] UNIQUEIDENTIFIER NOT NULL PRIMARY KEY DEFAULT NEWSEQUENTIALID(),
Name nvarchar(20) not null,
ProductType int not null,
Active bit not null default 1,
Attributes nvarchar(max),
constraint ck_is_valid_json check (isjson(Attributes) = 1)
)

The JSON should be a simple list of name/value pairs, like this:
{"Description":"My Product","ManufacturerName":"Acme Widgets","ValidOn":"2019-01-01T12:00:00","SafetyStockLevel":10}

You can do lookups like this:
SELECT *
FROM Product p
WHERE json_value(p.Attributes,'$.Description') = 'My Cool Product'

And if you have performance-critical attribute lookups, put an indexed computed column on Product that "promotes" the attribute and gives you fast lookups, like this:
alter table product add Description as json_value(Attributes,'$.Description') persisted
create index ix_product_description on product(Description)

On top of that, the table design for EAV is not optimal. You should not use UNIQUEIDENTIFIERs for the keys, and even if you do, the key structure should look like this:
CREATE TABLE [Product] (
[Id] UNIQUEIDENTIFIER NOT NULL PRIMARY KEY DEFAULT NEWSEQUENTIALID()
)
CREATE TABLE [AttributeType] (
[Id] UNIQUEIDENTIFIER NOT NULL PRIMARY KEY DEFAULT NEWSEQUENTIALID(),
[Description] NVARCHAR(MAX) NOT NULL
)
CREATE TABLE [Attribute] (
[ProductId] UNIQUEIDENTIFIER NOT NULL REFERENCES Product,
[AttributeTypeId] UNIQUEIDENTIFIER NOT NULL REFERENCES AttributeType,
[Value] NVARCHAR(MAX) NULL,
CONSTRAINT [PK_Attribute] PRIMARY KEY CLUSTERED (ProductId, AttributeTypeId)
)

In EAV, accessing the attributes for a single entity is the primary access path, so the attributes should be clustered by the entity key (ProductId here). And the AttributeValue table doesn't really belong.
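With that clustered key, reading all attributes for one product becomes a single range seek rather than a scan-plus-sort. A sketch of the primary access path, with the product id picked arbitrarily for illustration:

```sql
-- All attributes for one product: a range seek on the
-- clustered key (ProductId, AttributeTypeId).
DECLARE @ProductId UNIQUEIDENTIFIER = (SELECT TOP (1) Id FROM Product);

SELECT a.AttributeTypeId, t.Description, a.Value
FROM Attribute AS a
JOIN AttributeType AS t
    ON t.Id = a.AttributeTypeId
WHERE a.ProductId = @ProductId;
```

Because the rows for one product are physically adjacent in the clustered index, this pattern needs no large memory grant, so it is far less likely to produce the sort spills described in the question.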
Context
StackExchange Database Administrators Q#267750, answer score: 3