patternsqlMinor
Connection Timeout Expired without apparent network issue
Viewed 0 times
withoutapparentexpiredissuetimeoutnetworkconnection
Problem
We have one particular SQL Server which is intermittently timing out when accepting connections. The issue is consistent throughout the day, but occurs at a very low incidence. How can I continue to troubleshoot?
Connection Timeout Expired. The timeout period elapsed while
attempting to consume the pre-login handshake acknowledgement. This
could be because the pre-login handshake failed or the server was
unable to respond back in time. The duration spent while attempting
to connect to this server was - [Pre-Login] initialization=0;
handshake=15002; (Microsoft SQL Server, Error: -2)
Server Configuration:
Test Script (executed once per second):
Observations:
Connection Timeout Expired. The timeout period elapsed while
attempting to consume the pre-login handshake acknowledgement. This
could be because the pre-login handshake failed or the server was
unable to respond back in time. The duration spent while attempting
to connect to this server was - [Pre-Login] initialization=0;
handshake=15002; (Microsoft SQL Server, Error: -2)
Server Configuration:
- SQL Server 2016 SP1 CU5 Enterprise (issue also occurred prior to SP1)
- Windows Server 2012 R2 on both server and client
- VMware ESXi, 6.5.0 on HP ProLiant DL360 Gen9
- VM has 8 vCPU, 64 GiB of memory (fully reserved)
Test Script (executed once per second):
$failed = $false;
$loginDuration = (Measure-Command {
$ncon = New-Object System.Data.SqlClient.SqlConnection `
@( 'Data Source=1.2.3.4,16143;Database=Test;User=Test;Password=****;Pooling=false;' );
try
{
$ncon.Open();
$cmd = New-Object System.Data.SqlClient.SqlCommand `
@( 'SELECT @@VERSION', $ncon );
$cmd.ExecuteNonQuery();
$ncon.Dispose();
}
catch
{
$failed = $true;
}
}).TotalMilliseconds;
Write-Metric -metric 'itp.dbserver.logintime' -unit 'milliseconds' `
-value (&{if ($failed) { 120000 } else { $loginDuration }});Observations:
- Issue started occurring after OS Updates, SQL Server Updates, San move, and move from Hyper-V to VMWare
- Most connections succeed (4 failures out of 1,440 attempts)
- Failures are always listed with a low number in "[Pre-Login] initialization=0;" and a high number in "handshake=15002". We do not get errors like "Not found" or "No such host is known", only "Connection Timeout"
- No encryption is enabled for the listener
- Pings show no loss over
Solution
This was eventually identified as a side-effect of VMWare LRO. Disabling host-based LRO resolved the issue. See
- Enable or Disable LRO on a VMXNET3 Adapter on a Windows Virtual Machine
- Large Receive Offload
- Poor network performance or high network latency on Windows virtual machines (2008925)
- vmxnet3 adapter on windows server 2012 with MSSQL server bottleneck problem
Context
StackExchange Database Administrators Q#188255, answer score: 3
Revisions (0)
No revisions yet.