patternMinor
Sudden Query Performance Degradation On A Linked Server
Viewed 0 times
suddenquerydegradationperformanceserverlinked
Problem
I have an application that uses a view on a SQL Server 2008 R2 instance. This view actually queries a linked server (SQL Server 2008) that has replicated data from our main ERP database. Yesterday, we began experiencing some timeout on the application that uses this view and we discovered that the view was taking over 1 minute to complete (before this it was a matter of seconds).
Here's what we know:
matter of milliseconds.
first server simply query the view on the linked server restores
performance to its original state. (view source), (workaround)
Because of this, I am inclined to believe that there is an issue with either the network, or the configuration on my linked server. I am not sure that there is anything wrong with it, but I am posting it too.
What other things should I be looking at to help me to get to the root cause of this?
EDIT
Query Plan: https://i.stack.imgur.com/muSUR.png
IO Stats: https://i.stack.imgur.com/VVoLf.png
Query Columns: https://i.stack.imgur.com/8tzWk.png
Top Operations: https://i.stack.imgur.com/bAOh5.png
Plan Tree: https://i.stack.imgur.com/3AWBk.png
Here's what we know:
- Running the query on the linked server directly returns results in a
matter of milliseconds.
- Creating a view on the linked server, and then having the view on the
first server simply query the view on the linked server restores
performance to its original state. (view source), (workaround)
- When running the query on the first server, I notice that there are quite a few
OLEDBwaits (usingsp_WhoIsActive). Because of this, I decided to check the linked server while the query was running (on the first server). I noticed that the primary wait wasASYNC_NETWORK_IO.
Because of this, I am inclined to believe that there is an issue with either the network, or the configuration on my linked server. I am not sure that there is anything wrong with it, but I am posting it too.
What other things should I be looking at to help me to get to the root cause of this?
EDIT
Query Plan: https://i.stack.imgur.com/muSUR.png
IO Stats: https://i.stack.imgur.com/VVoLf.png
Query Columns: https://i.stack.imgur.com/8tzWk.png
Top Operations: https://i.stack.imgur.com/bAOh5.png
Plan Tree: https://i.stack.imgur.com/3AWBk.png
Solution
If I'm reading the plan and operations correctly, it looks like those nested loops joins with the inner remote operations are the problem.
They were estimated to execute ~750 times apiece, but actually executed almost 6,000 times each. Normally that might not be such an issue, but if network latency is involved with every call, the query becomes extremely sensitive to network conditions. This would certainly explain the
In any event, I would stick with your workaround -- it's probably better to protect the tables behind a view anyway instead of trying to do all the joins remotely.
They were estimated to execute ~750 times apiece, but actually executed almost 6,000 times each. Normally that might not be such an issue, but if network latency is involved with every call, the query becomes extremely sensitive to network conditions. This would certainly explain the
ASYNC_NETWORK_IO waits you're seeing.In any event, I would stick with your workaround -- it's probably better to protect the tables behind a view anyway instead of trying to do all the joins remotely.
Context
StackExchange Database Administrators Q#22179, answer score: 2
Revisions (0)
No revisions yet.