patternsqlMinor
After Restoring Log Shipping to Secondary Server, First Stored Procedure Execution is Slow
Viewed 0 times
afterstoredlogslowrestoringproceduresecondaryfirstshippingserver
Problem
We've set up log shipping to a secondary SQL server on Standby/ Read-Only to offload all SSRS report generation.
This works fine within the restrictions imposed by:
Unfortunately, the first time any/ all stored procedure are run, after the transaction log was restored, it takes a much longer time to complete than normal. All subsequent executions of that same stored procedure complete within the expected time. If we then execute another stored procedure, the first time it is slow and all subsequent executions complete in the expected time.
For reference, the difference in execution is ~00:02 normally compared to ~01:00 on the first run.
I assume this has something to do with either the server execution statistics or the stored procedure parameter sniffing/ stored execution plan.
Is there any way to get around this issue? Or is this inherent to the transaction log restore?
If it was just the very first execution of any stored procedure we could get around this easily by executing any stored procedure upon restore, but it appears to affect the first time all stored procedures are executed.
I tried running
I don't see any results on either my primary or secondary servers for
I'm currently on SQL Server 2012
Query Exection Plan:
The query execution plan at first glance appears significantly different, however, upon saving the execution plan and open
This works fine within the restrictions imposed by:
- Kicking out the user during the transaction log restore (we got around this by setting up multiple instances and restoring the most recent transaction logs using a round-robin schedule)
- The data being out of date by, at most, the time-frame indicated by the scheduled transaction log backup/ restore job.
Unfortunately, the first time any/ all stored procedure are run, after the transaction log was restored, it takes a much longer time to complete than normal. All subsequent executions of that same stored procedure complete within the expected time. If we then execute another stored procedure, the first time it is slow and all subsequent executions complete in the expected time.
For reference, the difference in execution is ~00:02 normally compared to ~01:00 on the first run.
I assume this has something to do with either the server execution statistics or the stored procedure parameter sniffing/ stored execution plan.
Is there any way to get around this issue? Or is this inherent to the transaction log restore?
If it was just the very first execution of any stored procedure we could get around this easily by executing any stored procedure upon restore, but it appears to affect the first time all stored procedures are executed.
I tried running
count( ) on the 11 tables the stored procedure I'm using for testing touches. The first run took 00:32, and subsequent count() took 00:00. Unfortunately, this did not have any impact on the first run of the stored procedure.I don't see any results on either my primary or secondary servers for
is_temporary stats, either before or after execution of a stored procedure. I'm currently on SQL Server 2012
Query Exection Plan:
The query execution plan at first glance appears significantly different, however, upon saving the execution plan and open
Solution
There are a few possible things going on here, here's a non-exhaustive list:
Some things you could do to mitigate this are
Providing us with a "fast" and "slow" example of an execution plan could help us track down exactly which thing is happening.
If you are on SQL Server 2012 or newer, then it's possible that sync stats updates are causing the delay. These "readable secondary stats" get created in TempDB, since the log shipping secondary is read-only. You can read more about that here (the article is about AGs, but the same thing applies in this scenario):
AlwaysOn: Making latest statistics available on Readable Secondary, Read-Only database and Database Snapshot
If this is the problem causing your slowdown, then one solution would be to find those stats, and then create them in the production database, so that they are up-to-date and available after the restore. You can look for temp stats with this query:
Based on the wait stats you provided, and the fact that the plans are the same, this is pretty conclusively due to the buffer pool being cleared by the log restore.
On a normal run, you get 12,768 ms (almost 13 seconds) of IO waits.
On the first run, you get 411,129 ms (almost 7 minutes) of IO waits.
The
- the execution plan cache is cleared by the log restore, so plans will need to be recompiled the first time. If your plans have long compile times, this could explain the difference. You didn't mention exactly how long the delay is on the first run compared to the subsequent runs
- this one seems like the least likely - you can see your plan compilation time in actual execution plan XML
- the buffer pool is also cleared during the restore, so all data has to be read from disk on the first execution
- if this is the case, you'll likely see high
PAGEIOLATCH*waits during the initial run if you check wait stats
Some things you could do to mitigate this are
- "warm up" the buffer cache (by reading all data from important tables into memory using
SELECT COUNT(*) FROM dbo.YourTable),
- "warm up" the proc cache by running all the critical stored procedures as a post-restore step
Providing us with a "fast" and "slow" example of an execution plan could help us track down exactly which thing is happening.
If you are on SQL Server 2012 or newer, then it's possible that sync stats updates are causing the delay. These "readable secondary stats" get created in TempDB, since the log shipping secondary is read-only. You can read more about that here (the article is about AGs, but the same thing applies in this scenario):
AlwaysOn: Making latest statistics available on Readable Secondary, Read-Only database and Database Snapshot
If this is the problem causing your slowdown, then one solution would be to find those stats, and then create them in the production database, so that they are up-to-date and available after the restore. You can look for temp stats with this query:
SELECT * FROM sys.stats WHERE is_temporary = 1;Based on the wait stats you provided, and the fact that the plans are the same, this is pretty conclusively due to the buffer pool being cleared by the log restore.
On a normal run, you get 12,768 ms (almost 13 seconds) of IO waits.
On the first run, you get 411,129 ms (almost 7 minutes) of IO waits.
The
SELECT COUNT() approach you tried may not have helped due to different indexes being used by the actual procedure vs the COUNT() query. You have a few options here:- Go through each execution plan and make note of the indexes being used, and then pull those indexes into memory as a post-restore step - using index hints this time (
SELECT COUNT(*) FROM dbo.YourTable WITH (INDEX (IX_Index_Being_Used_By_Proc)))
- Go through the process of scripting out a process to run each procedure as a post-restore step (this seems a bit easier than option 1)
- Tune the queries so that they don't need to do so many reads (not sure how feasible this is)
- Speed up the I/O subsystem - get faster disks, local SSDs, more channels to the SAN, etc (this is probably the hardest and most expensive option
Code Snippets
SELECT * FROM sys.stats WHERE is_temporary = 1;Context
StackExchange Database Administrators Q#238034, answer score: 8
Revisions (0)
No revisions yet.