HiveBrain v1.2.0
Get Started
← Back to all entries
patternsqlMinor

Unexplained log ship slow down

Submitted by: @import:stackexchange-dba··
0
Viewed 0 times
logslowunexplaineddownship

Problem

I have a number of SQL 2008 instances all running Microsoft SQL Server 2008 (SP4) (confirmed with select @@VERSION on the servers in question). They run on either Windows Server 2008 or Windows Server 2008 R2.

Two of them exist solely to log ship with Red Gate SQL Backup 7.4.0.23, and I'm having issues with one of them. (It's one of the 2008 R2 servers, if that makes a difference.) I'm using a t-sql job that scrolls through a very long list of databases (dynamically pulled from the other servers) and restores them.

Previously, this job took less than 10 minutes. It's now taking an hour and a half to two and a half hours. There have been no code changes and no radical increases in the number of databases to restore. Its sibling server, with almost identical code, is running this job in under 4 minutes. (The sibling server is one of the non-R2 servers, if that makes a difference.)

The Event Log and SQL Error log show an error of:


Operating system error 0x80770006 (failed to retrieve text for this error. Reason: 317)."

I don't know if this is the cause of the problem or not; Google suggests that this occurs when different versions of SQL Server coexist, or Red Gate SQL Backup 6.x needs a special patch. I don't think either of these are the issue because the error is intermittent, the SQL Server versions are identical, and I'm running Red Gate SQL Backup 7.x, but I could certainly be wrong. Red Gate forums suggested running a query to see if VAS memory was low, since that could cause similar issues.

VAS Total avail mem, KB Max free size, KB
8320072080 8314974784


Other things I've tried to resolve the issue include:

  • Cleaning old log files out of "C:\ProgramData\Red Gate\SQL Backup\Log[instancename]", because the last time the job slowed down it was because there were too many log files in that directory.



  • Checking for and resolving any memory issues on the server.



  • Making sure that antivirus has exclusions fo

Solution

This is bigger than a comment but something to test first and then implement :

You are logshipping 800+ databases. Thats a large amount of databases that you logship every 15 mins.

You should offload some databases to another server. IMHO 800 databases on a single server is a lot!

We had similar problem with logshipping when we logshipped 200+ databases from NY to LD region.

What we did is as below :

-
There was blocking writing to msdb.dbo.sysjobhistory with (TABLOCKX). The TABLOCK hint means that access to the sysjobhistory is always serialized. And since you have lot of jobs running every 15mins, there will be contention (blocking).

-
We implemented trace flag TF – 1236. It will introduce Database lock partitioning. Partitioning the DATABASE lock keeps the depth of the lock list manageable in each local partition. This significantly optimizes the access path that is used to obtain a DATABASE lock.

-
create indexes on sysjobhistory and log_shipping_monitor_history_detail tables as below :

use msdb
    go

    create nonclustered index [nc_DBA_sysjobhistory] on dbo.sysjobhistory (
            ,[job_id] 
            ,[step_id]     
            ,[run_date]
            ) 
    include (
            [instance_id]
            ,[step_name]
            ,[sql_message_id]
            ,[sql_severity]
            ,[message]
            ,[run_status]
            ,[run_time]
            ,[run_duration]
            ,[retries_attempted]
            ,[server]
            )
    go

    use [msdb]
    go
    ----- this will help sys.sp_MSprocesslogshippingretentioncleanup proc (delete from msdb.dbo.log_shipping_monitor_history_detail).. that does the cleanup of logshipping.

    create nonclustered index [nc_DBA_LogShipping_monitor_history_detail] on [dbo].[log_shipping_monitor_history_detail] (
            [agent_id] asc
            ,[agent_type] asc
            ,[log_time_utc] asc
            )
    go

Code Snippets

use msdb
    go

    create nonclustered index [nc_DBA_sysjobhistory] on dbo.sysjobhistory (
            ,[job_id] 
            ,[step_id]     
            ,[run_date]
            ) 
    include (
            [instance_id]
            ,[step_name]
            ,[sql_message_id]
            ,[sql_severity]
            ,[message]
            ,[run_status]
            ,[run_time]
            ,[run_duration]
            ,[retries_attempted]
            ,[server]
            )
    go

    use [msdb]
    go
    ----- this will help sys.sp_MSprocesslogshippingretentioncleanup proc (delete from msdb.dbo.log_shipping_monitor_history_detail).. that does the cleanup of logshipping.

    create nonclustered index [nc_DBA_LogShipping_monitor_history_detail] on [dbo].[log_shipping_monitor_history_detail] (
            [agent_id] asc
            ,[agent_type] asc
            ,[log_time_utc] asc
            )
    go

Context

StackExchange Database Administrators Q#139555, answer score: 4

Revisions (0)

No revisions yet.