HiveBrain v1.2.0
Get Started
← Back to all entries
patternsqlMinor

Very slow update statement taking up to 11min to update 20 or 0 rows even with indexes

Submitted by: @import:stackexchange-dba··
0
Viewed 0 times
rowsupdatestatementwithslowindexesveryeven11mintaking

Problem

When I update 20 rows or no rows, it takes 11 minutes.

I mean 20 rows or no rows by different on m.MaximumReflowTemperatureID <> r.z2valueid between two tables.

Why is my update so slow, even though I update a small number of rows, or even no rows?

How to handle that?

My actual execution plan:

https://www.brentozar.com/pastetheplan/?id=HJlS11Fy5

Statement update that takes too long:

update r
set r.z2valueid=m.MaximumReflowTemperatureID
from [OperationsTablesDB].[dbo].[ManufactureMaximumReflowTemperatures] r
    inner join z2datacore.parts.manufacturingdata m with(nolock)
        on m.partid=r.zpartid
    where m.MaximumReflowTemperatureID <> r.z2valueid


when try the above statement by replacing update with select, it takes the same time, 11 minutes.

The table I need updated [OperationsTablesDB].[dbo].[ManufactureMaximumReflowTemperatures] has 14 million rows and the other table on the join has 15 milion rows.

Sample tables script:

```
CREATE TABLE [dbo].ManufactureMaximumReflowTemperatures NULL,
[value] varchar NULL,
[Z2ValueID] [int] NULL,
[csfeatureid] [int] NULL,
[csvalueid] [int] NULL,
PRIMARY KEY CLUSTERED
(
[ID] ASC
)WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, IGNORE_DUP_KEY = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON) ON [PRIMARY]
) ON [PRIMARY]
GO

ALTER TABLE [dbo].[ManufactureMaximumReflowTemperatures] ADD PRIMARY KEY CLUSTERED
(
[ID] ASC
)WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, SORT_IN_TEMPDB = OFF, IGNORE_DUP_KEY = OFF, ONLINE = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON) ON [PRIMARY]
GO

CREATE NONCLUSTERED INDEX [featurenameandvalue_idx] ON [dbo].[ManufactureMaximumReflowTemperatures]
(
[csfeatureid] ASC,
[Z2ValueID] ASC
)WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, SORT_IN_TEMPDB = OFF, DROP_EXISTING = OFF, ONLINE = OFF, ALLOW_ROW_LOCKS = O

Solution

These two points from your post are going to be your biggest contribution factors.

have 14 milion rows

where m.MaximumReflowTemperatureID <>r.z2valueid

For starters, anytime you compare two columns in a table, regardless if equality or inequality comparisons, SQL Server is going to have to compare every row in the table to see if two columns meet the condition. It's not like a scenario where you're doing an index seek to find a row with ID = 5. In your case, the value you would be comparing to would be changing row by row. Therefore, there is no way to SEEK this data.

Add to the fact that you have 14 million rows in one table, and 15 million rows in the other.

Picture your query as a SELECT, but without the <> condition.

select m.MaximumReflowTemperatureID, r.z2valueid
from [OperationsTablesDB].[dbo].[ManufactureMaximumReflowTemperatures] r
    inner join z2datacore.parts.manufacturingdata m with(nolock)
        on m.partid=r.zpartid


Run that select and see how long it takes, and how many rows it returns. Maybe grab an execution plan while you're at it. Now picture SQL Server having to pull all those rows, every time you run your update, even if there is no unequal values. That is because it has to look at both columns, side by side, before it knows if it needs to update that row.

Comparing two columns will almost always be a painful query. However, the following indexes could at least make it more tolerable.

CREATE NONCLUSTERED INDEX [zpartid_idx]
    ON [dbo].[ManufactureMaximumReflowTemperatures] ([zpartid])
    INCLUDE (z2valueid)

CREATE NONCLUSTERED INDEX [manufacturingdata_partid]
    ON [dbo].[manufacturingdata] ([partid])
    INCLUDE (MaximumReflowTemperatureID)


With these indexes, you'll hopefully be pulling smaller indexes into memory, I.E. fewer reads. This is because each of these indexes has two columns each, instead of all columns of the base tables. This should help your query run faster. Without these, you're most liklely doing clustered scans.

Also, make sure to take out the NOLOCK hint.

Code Snippets

select m.MaximumReflowTemperatureID, r.z2valueid
from [OperationsTablesDB].[dbo].[ManufactureMaximumReflowTemperatures] r
    inner join z2datacore.parts.manufacturingdata m with(nolock)
        on m.partid=r.zpartid
CREATE NONCLUSTERED INDEX [zpartid_idx]
    ON [dbo].[ManufactureMaximumReflowTemperatures] ([zpartid])
    INCLUDE (z2valueid)

CREATE NONCLUSTERED INDEX [manufacturingdata_partid]
    ON [dbo].[manufacturingdata] ([partid])
    INCLUDE (MaximumReflowTemperatureID)

Context

StackExchange Database Administrators Q#307708, answer score: 4

Revisions (0)

No revisions yet.