patternsqlMinor
AG available from 1st node, but not 2nd node
Viewed 0 times
1stavailablenodebutfromnot2nd
Problem
I am in the process of configuring an Availability Group in SQL Server 2019 on Windows Server 2019.
We have two Windows Servers, UT01 and UT02, configured with a single network adapter, using static IP addresses on an Active Directory domain.
The networking group assigned a static IP address for the cluster Virtual Computer Object, UTCL, 192.168.0.12. They also assigned a static IP address for the Availability Group Listener, UTAG, 192.168.0.13.
Server
IP Address
subnet
UT01
192.168.0.10
192.168.0.0/24
UT02
192.168.0.11
192.168.0.0/24
UTCL
192.168.0.12
192.168.0.0/24
UTAG
192.168.0.13
192.168.0.0/24
The server team configured with Windows Server Failover Cluster, and I configured the SQL Servers and the Availability Group. Both SQL Servers are configured as named instances; so
From any machine on the local subnet (192.168.0.0/24) I can connect to the Availability Group using sqlcmd as such:
When I run
I can connect to the individual SQL Servers from a network outside the local subnet, via a router, using sqlcmd, i.e.
All good so far, and completely as expected.
However, when I attempt to connect to the AG name via a non-local subnet, the AG only responds when UT01 is running the primary node. When UT02 is running the primary node, we get the typical connectivity error:
A network-related or instance-specific error has occurred while establishing a connection to SQL Server. Server is not found or not accessible. Check if instance name is correct and if SQL Server is configured to allow remote connections.
SQL Server Error Log on UT02 shows no failed logins (it is configured with th
We have two Windows Servers, UT01 and UT02, configured with a single network adapter, using static IP addresses on an Active Directory domain.
The networking group assigned a static IP address for the cluster Virtual Computer Object, UTCL, 192.168.0.12. They also assigned a static IP address for the Availability Group Listener, UTAG, 192.168.0.13.
Server
IP Address
subnet
UT01
192.168.0.10
192.168.0.0/24
UT02
192.168.0.11
192.168.0.0/24
UTCL
192.168.0.12
192.168.0.0/24
UTAG
192.168.0.13
192.168.0.0/24
The server team configured with Windows Server Failover Cluster, and I configured the SQL Servers and the Availability Group. Both SQL Servers are configured as named instances; so
UT01\INS and UT02\INS are the names returned by @@SERVERNAME;From any machine on the local subnet (192.168.0.0/24) I can connect to the Availability Group using sqlcmd as such:
sqlcmd -S UTAG\INSWhen I run
PRINT @@SERVERNAME;, either UT01\INS or UT02\INS is returned, depending on which node is currently running the primary node of the Availability Group.I can connect to the individual SQL Servers from a network outside the local subnet, via a router, using sqlcmd, i.e.
sqlcmd -S UT01\INS and sqlcmd -S UT02\INS work correctly, returning the appropriate instance name for @@SERVERNAME.All good so far, and completely as expected.
However, when I attempt to connect to the AG name via a non-local subnet, the AG only responds when UT01 is running the primary node. When UT02 is running the primary node, we get the typical connectivity error:
A network-related or instance-specific error has occurred while establishing a connection to SQL Server. Server is not found or not accessible. Check if instance name is correct and if SQL Server is configured to allow remote connections.
SQL Server Error Log on UT02 shows no failed logins (it is configured with th
Solution
So, "GARP-based detection" needed to be enabled in the network switch serving the local subnet connecting the two SQL Servers with the other subnets.
GARP is Gratuitous Address Resolution Protocol, which is used to broadcast the MAC address of the network adapter hosting the Availability Group. In our case, the router connected to the switch was not seeing the ARP packets that are automatically generated whenever an Availability Group failover occurs, preventing it from sending TCP packets destined to the AG to the correct MAC address - it was simply always sending those packets to the first node.
This Microsoft page explains the issue.
GARP is Gratuitous Address Resolution Protocol, which is used to broadcast the MAC address of the network adapter hosting the Availability Group. In our case, the router connected to the switch was not seeing the ARP packets that are automatically generated whenever an Availability Group failover occurs, preventing it from sending TCP packets destined to the AG to the correct MAC address - it was simply always sending those packets to the first node.
This Microsoft page explains the issue.
Context
StackExchange Database Administrators Q#321958, answer score: 4
Revisions (0)
No revisions yet.