patternsqlMajor
UPDATE performance where no data changes
Viewed 0 times
updatewherechangesperformancedata
Problem
If I have an
For example would there be any difference in execution speed between UPDATE 1 and UPDATE 2 in the following:
The reason I ask is that I need the row count to include the unchanged row so I know whether to do an insert if the ID does not exist. As such I used the UPDATE 2 form. If there is a performance benefit to using the UPDATE 1 form, is it possible to get the row count that I need somehow?
UPDATE statement that does not actually change any data (because the data is already in the updated state). Is there any performance benefit in putting a check in the WHERE clause to prevent the update?For example would there be any difference in execution speed between UPDATE 1 and UPDATE 2 in the following:
CREATE TABLE MyTable (ID int PRIMARY KEY, Value int);
INSERT INTO MyTable (ID, Value)
VALUES
(1, 1),
(2, 2),
(3, 3);
-- UPDATE 1
UPDATE MyTable
SET
Value = 2
WHERE
ID = 2
AND Value <> 2;
SELECT @@ROWCOUNT;
-- UPDATE 2
UPDATE MyTable
SET
Value = 2
WHERE
ID = 2;
SELECT @@ROWCOUNT;
DROP TABLE MyTable;The reason I ask is that I need the row count to include the unchanged row so I know whether to do an insert if the ID does not exist. As such I used the UPDATE 2 form. If there is a performance benefit to using the UPDATE 1 form, is it possible to get the row count that I need somehow?
Solution
If I have an UPDATE statement that does not actually change any data (because the data is already in the updated state), is there any performance benefit in putting a check in the where clause to prevent the update?
There certainly could be as there is a slight performance difference due to UPDATE 1:
However, how much of a difference there is would need to be measured by you on your system with your schema, and data, and system load. There are several factors that play into how much impact a non-updating UPDATE has:
Also, the following summary section is found in Paul White's article, The Impact of Non-Updating Updates (as noted by @spaghettidba in a comment on his answer):
SQL Server contains a number of optimisations to avoid unnecessary logging or page flushing when processing an UPDATE operation that will not result in any change to the persistent database.
Please keep in mind (especially if you don't follow the link to see Paul's full article), the following two items:
-
Non-updating updates still have some log activity, showing that a transaction is beginning and ending. It is just that no data modification happens (which is still a good savings).
-
As I stated above, you need to test on your system. Use the same research queries that Paul is using and see if you get the same results. I am seeing slightly different results on my system than what is shown in the article. Still no dirty pages to be written, but a little more log activity.
... I need the row count to include the unchanged row so I know whether to do an insert if the ID does not exist. ... is it possible to get the row count that I need somehow?
Simplistically, if you are just dealing with a single row, you can do the following:
For multiple rows, you can get the information needed to make that decision by using the
I show the basic implementation in the following answer:
How to avoid using Merge query when upserting multiple data using xml parameter?
The method shown in that answer doesn't filter out rows that exist yet do not need to be updated. That portion could be added, but you would first need to show exactly where you are getting your dataset that you are merging into
UPDATE 1:
I was finally able to do some testing and here is what I found regarding transaction l
There certainly could be as there is a slight performance difference due to UPDATE 1:
- not actually updating any rows (hence nothing to write to disk, not even minimal log activity), and
- taking out less restrictive locks than what are required for doing the actual update (hence better for concurrency) (Please see Update section towards the end)
However, how much of a difference there is would need to be measured by you on your system with your schema, and data, and system load. There are several factors that play into how much impact a non-updating UPDATE has:
- the amount of contention on the table being updated
- the number of rows being updated
- if there are UPDATE Triggers on the table being updated (as noted by Mark in a comment on the Question). If you execute
UPDATE TableName SET Field1 = Field1, then an Update Trigger will fire and indicate that the field was updated (if you check using either the UPDATE() or COLUMNS_UPDATED functions), and that the field in bothINSERTEDandDELETEDtables are the same value.
Also, the following summary section is found in Paul White's article, The Impact of Non-Updating Updates (as noted by @spaghettidba in a comment on his answer):
SQL Server contains a number of optimisations to avoid unnecessary logging or page flushing when processing an UPDATE operation that will not result in any change to the persistent database.
- Non-updating updates to a clustered table generally avoid extra logging and page flushing, unless a column that forms (part of) the cluster key is affected by the update operation.
- If any part of the cluster key is ‘updated’ to the same value, the operation is logged as if data had changed, and the affected pages are marked as dirty in the buffer pool. This is a consequence of the conversion of the UPDATE to a delete-then-insert operation.
- Heap tables behave the same as clustered tables, except they do not have a cluster key to cause any extra logging or page flushing. This remains the case even where a non-clustered primary key exists on the heap. Non-updating updates to a heap therefore generally avoid the extra logging and flushing (but see below).
- Both heaps and clustered tables will suffer the extra logging and flushing for any row where a LOB column containing more than 8000 bytes of data is updated to the same value using any syntax other than ‘SET column_name = column_name’.
- Simply enabling either type of row versioning isolation level on a database always causes the extra logging and flushing. This occurs regardless of the isolation level in effect for the update transaction.
Please keep in mind (especially if you don't follow the link to see Paul's full article), the following two items:
-
Non-updating updates still have some log activity, showing that a transaction is beginning and ending. It is just that no data modification happens (which is still a good savings).
-
As I stated above, you need to test on your system. Use the same research queries that Paul is using and see if you get the same results. I am seeing slightly different results on my system than what is shown in the article. Still no dirty pages to be written, but a little more log activity.
... I need the row count to include the unchanged row so I know whether to do an insert if the ID does not exist. ... is it possible to get the row count that I need somehow?
Simplistically, if you are just dealing with a single row, you can do the following:
UPDATE MyTable
SET Value = 2
WHERE ID = 2
AND Value <> 2;
IF (@@ROWCOUNT = 0)
BEGIN
IF (NOT EXISTS(
SELECT *
FROM MyTable
WHERE ID = 2 -- or Value = 2 depending on the scenario
)
)
BEGIN
INSERT INTO MyTable (ID, Value) -- or leave out ID if it is an IDENTITY
VALUES (2, 2);
END;
END;
For multiple rows, you can get the information needed to make that decision by using the
OUTPUT clause. By capturing exactly what rows were updated, then you can narrow down the items to look up to know the difference between not updating rows that don't exist as opposed to not updating rows that exist but don't need the update.I show the basic implementation in the following answer:
How to avoid using Merge query when upserting multiple data using xml parameter?
The method shown in that answer doesn't filter out rows that exist yet do not need to be updated. That portion could be added, but you would first need to show exactly where you are getting your dataset that you are merging into
MyTable. Are they coming from a temporary table? A table-valued parameter (TVP)?UPDATE 1:
I was finally able to do some testing and here is what I found regarding transaction l
Context
StackExchange Database Administrators Q#114360, answer score: 28
Revisions (0)
No revisions yet.