What can cause a mirroring session to time out and then fail over?

Problem

We have two production servers running SQL Server 2005 SP4 with Cumulative Update 3. Both run on identical physical machines: Dell PowerEdge R815s with 4 x 12-core CPUs and 512 GB (yes, GB) of RAM, with 10Gb iSCSI SAN-connected drives for all SQL databases and logs. The OS is Microsoft Windows Server 2008 R2 Enterprise Edition with all service packs and Windows updates. The OS drive is a RAID 5 array of 3 x 72 GB 2.5" 15K SAS drives. The SAN is a Dell EqualLogic 6510 with 48 x 10K SAS 3.5" drives, configured in RAID 50 and sliced into various LUNs for the two SQL Servers; it is also shared with an Exchange machine and several VMware servers.

We have over 20 databases, 11 of which are mirrored in high-availability mode with a witness server. The witness is a lower-powered machine running a SQL Server instance that is used for nothing other than providing witness services. The biggest mirrored database is 450 GB and generates around 100-300 IOPS. Database Mirroring Monitor reports a current send rate of around 100 KB to 10 MB per second and a mirror commit overhead of (typically) 0 milliseconds. The mirror server has no problem keeping up with the principal.

We are consistently experiencing mirroring failovers. Sometimes a single database fails over; other times almost all databases fail over simultaneously. For instance, last night 10 of 11 databases failed over, and the remaining database stayed accessible until I manually failed it over.

I have gone through several troubleshooting steps to attempt to identify the problem, but have so far not been able to resolve the issue:

- The machine came with a Broadcom BCM5709C NetXtreme II 4-port Gigabit network adapter, which we initially used as the primary network connection. We have since installed an Intel(R) PRO/1000 PT Dual Port Server Adapter on both machines to eliminate the NIC as the issue.

- All databases have an automatic full backup nightly, along with a log backup for databases involved in mirroring. Log file

Solution

It sounds like you might be running out of TCP ports on the SQL Server. How many connections are you seeing to the server at a time?

If connections are timing out because no TCP ports are free, that would definitely be causing the problem.
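One rough way to answer the "how many connections" question from the OS side is to save the output of `netstat -an` on the SQL Server and tally it. The sketch below is only an illustration, not part of the original answer: the function names and the sample text are hypothetical, and it assumes the default dynamic port range of Windows Server 2008 R2 (49152-65535), which should be confirmed on the box with `netsh int ipv4 show dynamicport tcp`.

```python
from collections import Counter

# Assumption: default dynamic (ephemeral) TCP port range on Windows Server 2008 R2.
# Verify with: netsh int ipv4 show dynamicport tcp
DYNAMIC_PORT_START = 49152
DYNAMIC_PORT_END = 65535

# Hypothetical sample of `netstat -an` output; addresses and ports are made up.
SAMPLE = """
Active Connections

  Proto  Local Address          Foreign Address        State
  TCP    0.0.0.0:135            0.0.0.0:0              LISTENING
  TCP    10.0.0.5:1433          10.0.0.9:51000         ESTABLISHED
  TCP    10.0.0.5:49200         10.0.0.6:5022          ESTABLISHED
  TCP    10.0.0.5:49201         10.0.0.6:5022          TIME_WAIT
  UDP    0.0.0.0:123            *:*
"""


def local_port(address):
    """Return the port from '10.0.0.5:1433' or '[::1]:1433'."""
    return int(address.rsplit(":", 1)[1])


def summarize_netstat(text):
    """Count TCP rows per state and distinct local ports inside the dynamic range."""
    states = Counter()
    dynamic_ports = set()
    for line in text.splitlines():
        fields = line.split()
        # TCP rows look like: TCP <local> <remote> <state> [pid]
        if len(fields) < 4 or fields[0].upper() != "TCP":
            continue
        states[fields[3]] += 1
        port = local_port(fields[1])
        if DYNAMIC_PORT_START <= port <= DYNAMIC_PORT_END:
            dynamic_ports.add(port)
    return {
        "states": dict(states),
        "dynamic_in_use": len(dynamic_ports),
        "dynamic_pool": DYNAMIC_PORT_END - DYNAMIC_PORT_START + 1,
    }


if __name__ == "__main__":
    summary = summarize_netstat(SAMPLE)
    print(summary["states"])
    print(summary["dynamic_in_use"], "of", summary["dynamic_pool"], "dynamic ports in use")
```

If the number of dynamic ports in use (including a large pile of `TIME_WAIT` entries) is close to the pool size around the time of a failover, port exhaustion becomes a more plausible suspect; if it is nowhere near, the cause probably lies elsewhere.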

Context

StackExchange Database Administrators Q#22402, answer score: 6
