patternsqlMinor
Very slow update statement taking up to 11min to update 20 or 0 rows even with indexes
Viewed 0 times
rowsupdatestatementwithslowindexesveryeven11mintaking
Problem
When I update 20 rows or no rows, it takes 11 minutes.
I mean 20 rows or no rows by different on
Why is my update so slow, even though I update a small number of rows, or even no rows?
How to handle that?
My actual execution plan:
https://www.brentozar.com/pastetheplan/?id=HJlS11Fy5
Statement update that takes too long:
when try the above statement by replacing update with select, it takes the same time, 11 minutes.
The table I need updated
Sample tables script:
```
CREATE TABLE [dbo].ManufactureMaximumReflowTemperatures NULL,
[value] varchar NULL,
[Z2ValueID] [int] NULL,
[csfeatureid] [int] NULL,
[csvalueid] [int] NULL,
PRIMARY KEY CLUSTERED
(
[ID] ASC
)WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, IGNORE_DUP_KEY = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON) ON [PRIMARY]
) ON [PRIMARY]
GO
ALTER TABLE [dbo].[ManufactureMaximumReflowTemperatures] ADD PRIMARY KEY CLUSTERED
(
[ID] ASC
)WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, SORT_IN_TEMPDB = OFF, IGNORE_DUP_KEY = OFF, ONLINE = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON) ON [PRIMARY]
GO
CREATE NONCLUSTERED INDEX [featurenameandvalue_idx] ON [dbo].[ManufactureMaximumReflowTemperatures]
(
[csfeatureid] ASC,
[Z2ValueID] ASC
)WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, SORT_IN_TEMPDB = OFF, DROP_EXISTING = OFF, ONLINE = OFF, ALLOW_ROW_LOCKS = O
I mean 20 rows or no rows by different on
m.MaximumReflowTemperatureID <> r.z2valueid between two tables.Why is my update so slow, even though I update a small number of rows, or even no rows?
How to handle that?
My actual execution plan:
https://www.brentozar.com/pastetheplan/?id=HJlS11Fy5
Statement update that takes too long:
update r
set r.z2valueid=m.MaximumReflowTemperatureID
from [OperationsTablesDB].[dbo].[ManufactureMaximumReflowTemperatures] r
inner join z2datacore.parts.manufacturingdata m with(nolock)
on m.partid=r.zpartid
where m.MaximumReflowTemperatureID <> r.z2valueidwhen try the above statement by replacing update with select, it takes the same time, 11 minutes.
The table I need updated
[OperationsTablesDB].[dbo].[ManufactureMaximumReflowTemperatures] has 14 million rows and the other table on the join has 15 milion rows.Sample tables script:
```
CREATE TABLE [dbo].ManufactureMaximumReflowTemperatures NULL,
[value] varchar NULL,
[Z2ValueID] [int] NULL,
[csfeatureid] [int] NULL,
[csvalueid] [int] NULL,
PRIMARY KEY CLUSTERED
(
[ID] ASC
)WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, IGNORE_DUP_KEY = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON) ON [PRIMARY]
) ON [PRIMARY]
GO
ALTER TABLE [dbo].[ManufactureMaximumReflowTemperatures] ADD PRIMARY KEY CLUSTERED
(
[ID] ASC
)WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, SORT_IN_TEMPDB = OFF, IGNORE_DUP_KEY = OFF, ONLINE = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON) ON [PRIMARY]
GO
CREATE NONCLUSTERED INDEX [featurenameandvalue_idx] ON [dbo].[ManufactureMaximumReflowTemperatures]
(
[csfeatureid] ASC,
[Z2ValueID] ASC
)WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, SORT_IN_TEMPDB = OFF, DROP_EXISTING = OFF, ONLINE = OFF, ALLOW_ROW_LOCKS = O
Solution
These two points from your post are going to be your biggest contribution factors.
have 14 milion rows
where m.MaximumReflowTemperatureID <>r.z2valueid
For starters, anytime you compare two columns in a table, regardless if equality or inequality comparisons, SQL Server is going to have to compare every row in the table to see if two columns meet the condition. It's not like a scenario where you're doing an index seek to find a row with ID = 5. In your case, the value you would be comparing to would be changing row by row. Therefore, there is no way to SEEK this data.
Add to the fact that you have 14 million rows in one table, and 15 million rows in the other.
Picture your query as a SELECT, but without the <> condition.
Run that select and see how long it takes, and how many rows it returns. Maybe grab an execution plan while you're at it. Now picture SQL Server having to pull all those rows, every time you run your update, even if there is no unequal values. That is because it has to look at both columns, side by side, before it knows if it needs to update that row.
Comparing two columns will almost always be a painful query. However, the following indexes could at least make it more tolerable.
With these indexes, you'll hopefully be pulling smaller indexes into memory, I.E. fewer reads. This is because each of these indexes has two columns each, instead of all columns of the base tables. This should help your query run faster. Without these, you're most liklely doing clustered scans.
Also, make sure to take out the NOLOCK hint.
have 14 milion rows
where m.MaximumReflowTemperatureID <>r.z2valueid
For starters, anytime you compare two columns in a table, regardless if equality or inequality comparisons, SQL Server is going to have to compare every row in the table to see if two columns meet the condition. It's not like a scenario where you're doing an index seek to find a row with ID = 5. In your case, the value you would be comparing to would be changing row by row. Therefore, there is no way to SEEK this data.
Add to the fact that you have 14 million rows in one table, and 15 million rows in the other.
Picture your query as a SELECT, but without the <> condition.
select m.MaximumReflowTemperatureID, r.z2valueid
from [OperationsTablesDB].[dbo].[ManufactureMaximumReflowTemperatures] r
inner join z2datacore.parts.manufacturingdata m with(nolock)
on m.partid=r.zpartidRun that select and see how long it takes, and how many rows it returns. Maybe grab an execution plan while you're at it. Now picture SQL Server having to pull all those rows, every time you run your update, even if there is no unequal values. That is because it has to look at both columns, side by side, before it knows if it needs to update that row.
Comparing two columns will almost always be a painful query. However, the following indexes could at least make it more tolerable.
CREATE NONCLUSTERED INDEX [zpartid_idx]
ON [dbo].[ManufactureMaximumReflowTemperatures] ([zpartid])
INCLUDE (z2valueid)
CREATE NONCLUSTERED INDEX [manufacturingdata_partid]
ON [dbo].[manufacturingdata] ([partid])
INCLUDE (MaximumReflowTemperatureID)With these indexes, you'll hopefully be pulling smaller indexes into memory, I.E. fewer reads. This is because each of these indexes has two columns each, instead of all columns of the base tables. This should help your query run faster. Without these, you're most liklely doing clustered scans.
Also, make sure to take out the NOLOCK hint.
Code Snippets
select m.MaximumReflowTemperatureID, r.z2valueid
from [OperationsTablesDB].[dbo].[ManufactureMaximumReflowTemperatures] r
inner join z2datacore.parts.manufacturingdata m with(nolock)
on m.partid=r.zpartidCREATE NONCLUSTERED INDEX [zpartid_idx]
ON [dbo].[ManufactureMaximumReflowTemperatures] ([zpartid])
INCLUDE (z2valueid)
CREATE NONCLUSTERED INDEX [manufacturingdata_partid]
ON [dbo].[manufacturingdata] ([partid])
INCLUDE (MaximumReflowTemperatureID)Context
StackExchange Database Administrators Q#307708, answer score: 4
Revisions (0)
No revisions yet.