patternModerate
How does SQL Server perfom lock when using CROSS APPLY?
Viewed 0 times
perfomsqlcrossapplyusingdoeshowserverwhenlock
Problem
The best way I think I can ask my question is by starting with an example.
Imagine these two queries :
and
NOTE: tTire.CarID is indexed and tCar.Brand is indexed
These two would return the same output.
I know the second query is just foolish as it stand, but again, it just serve as an example for my question.
Now, I guess (even if I don't know the details) SQL Server will perform a clever and limited lock on the data read by the first query (JOIN).
But I'm wondering how it will perform lock for the data read by the second query (APPLY), mainly (and obviously imo) on the "tTire" data read.
In case of
EDIT: This example is so simple it seems SQL Server manage to create the same execution plan, but my "APPLY" question should be though in terms where a RBAR effect occurs.
Imagine these two queries :
SELECT CAR.SerialNumber, TIRE.SerialNumber
FROM dbo.tCar AS CAR
INNER JOIN dbo.tTire AS TIRE ON ( TIRE.CarID = CAR.ID)
WHERE CAR.Brand = 'Jaguar'and
SELECT CAR.SerialNumber, TIRE.SerialNumber
FROM dbo.tCar AS CAR
CROSS APPLY (
SELECT TIRE.SerialNumber
FROM dbo.tTire AS TIRE
WHERE TIRE.CarID = CAR.ID
) AS TIRE
WHERE CAR.Brand = 'Jaguar'NOTE: tTire.CarID is indexed and tCar.Brand is indexed
These two would return the same output.
I know the second query is just foolish as it stand, but again, it just serve as an example for my question.
Now, I guess (even if I don't know the details) SQL Server will perform a clever and limited lock on the data read by the first query (JOIN).
But I'm wondering how it will perform lock for the data read by the second query (APPLY), mainly (and obviously imo) on the "tTire" data read.
In case of
APPLY how do SQL Server perfom lock?EDIT: This example is so simple it seems SQL Server manage to create the same execution plan, but my "APPLY" question should be though in terms where a RBAR effect occurs.
Solution
A query is not an executable program. The SQL query specification language enables us to define the logical results of a query, but it is up to components like the query optimizer and execution engine to take care of the physical processing involved in delivering those results.
In principle, the physical processing side of things is free to perform whatever operations it likes, so long as the data returned matches the query specification. It might use hashing, scanning, sorting ... anything it likes, so long as the results match the logic of the query.
For example, the optimizer might well produce a very similar (or even identical) user-visible execution plan for both your sample queries, because they logically describe the same result. On the other hand, it might not; and it does not really matter either way.
The second major factor here concerns the transaction isolation level that the query is executed at. The different isolation levels available each provide different guarantees regarding the effects on one query due to concurrent data modifications by another.
Locks are just one possible mechanism by which these isolation guarantees might be enforced by the database engine. Other isolation implementations that do not use locks at all are available, in SQL Server as well as most mainstream products.
The point is that the engine guarantees to respect the requested isolation level, and it may do so using locks, or it may not. These details may be important sometimes from a practical point of view (for example when blocking occurs) but they are not truly fundamental.
As a result of all this, there is usually no direct relationship between the original written form of the query and the locks that might be taken - there's just too much other 'stuff' going on in between.
Isn't there even known "tendencies" of the SQL Server's behavior when it deals with RBAR scenarios? Something to be aware of maybe?
That comes with experience, though that is not universally a good thing. Certainly there is plenty of advice out there, though some is outdated, or based on a misconception.
In SQL Server, we can often safely encourage "set-based thinking" and say cursors, loops, and row-by-row processing should most often be avoided, but that is not the whole story, though this is all going a bit beyond your original question.
My usual advice is to write good queries first (using whatever syntax you find most natural) against well-designed and maintained databases, then worry about any specific issues like locking (optimizer limitations, plan shape issues etc.) if and when they arise. People do sometimes learn to avoid things like 'APPLY' for the wrong reasons, and have to unlearn it all later.
In general, I am pretty skeptical of advice that says to always prefer one syntax element or construction over another.
Resources:
In principle, the physical processing side of things is free to perform whatever operations it likes, so long as the data returned matches the query specification. It might use hashing, scanning, sorting ... anything it likes, so long as the results match the logic of the query.
For example, the optimizer might well produce a very similar (or even identical) user-visible execution plan for both your sample queries, because they logically describe the same result. On the other hand, it might not; and it does not really matter either way.
The second major factor here concerns the transaction isolation level that the query is executed at. The different isolation levels available each provide different guarantees regarding the effects on one query due to concurrent data modifications by another.
Locks are just one possible mechanism by which these isolation guarantees might be enforced by the database engine. Other isolation implementations that do not use locks at all are available, in SQL Server as well as most mainstream products.
The point is that the engine guarantees to respect the requested isolation level, and it may do so using locks, or it may not. These details may be important sometimes from a practical point of view (for example when blocking occurs) but they are not truly fundamental.
As a result of all this, there is usually no direct relationship between the original written form of the query and the locks that might be taken - there's just too much other 'stuff' going on in between.
Isn't there even known "tendencies" of the SQL Server's behavior when it deals with RBAR scenarios? Something to be aware of maybe?
That comes with experience, though that is not universally a good thing. Certainly there is plenty of advice out there, though some is outdated, or based on a misconception.
In SQL Server, we can often safely encourage "set-based thinking" and say cursors, loops, and row-by-row processing should most often be avoided, but that is not the whole story, though this is all going a bit beyond your original question.
My usual advice is to write good queries first (using whatever syntax you find most natural) against well-designed and maintained databases, then worry about any specific issues like locking (optimizer limitations, plan shape issues etc.) if and when they arise. People do sometimes learn to avoid things like 'APPLY' for the wrong reasons, and have to unlearn it all later.
In general, I am pretty skeptical of advice that says to always prefer one syntax element or construction over another.
Resources:
- Lesson 2: Understanding Logical Query Processing - Microsoft Press
- Basics of Transaction Isolation Levels - SQL Server Customer Advisory Team
- SQL Server Isolation Levels - My series of articles
Context
StackExchange Database Administrators Q#111013, answer score: 10
Revisions (0)
No revisions yet.