patternMinor
Why are these two HADR DMVs reporting different states?
Problem
SQL Server 2012 (11.0.5058.0) Enterprise Edition
We have 8 Availability Groups in a 2(HA)+1(DR) cluster, and our monitoring DMVs are reporting results that confuse me. 6 Availability Groups are configured for HA and DR, 1 is configured for HA only, and 1 is configured for DR only.
Each of the 6 HA/DR Availability Groups has "SQLB" as the primary, "SQLA" as a secondary (synchronous) HA replica, and "SQLC" as a secondary (async) replica.
On both secondaries:
    SELECT dhags.group_id, dhags.synchronization_health_desc
    FROM sys.dm_hadr_availability_group_states dhags
reports that every Availability Group has a sync health of NOT_HEALTHY, and
    SELECT replica_id, synchronization_health_desc
    FROM sys.dm_hadr_availability_replica_states
reports that all replicas have a sync health of HEALTHY.
The primary replica reports all Availability Groups and replicas with a sync health of HEALTHY.
While I understand that one reports on replica sync health and the other reports on AG sync health, it seems logical to me that if the more granular (AG) state was not healthy, that would affect the overall health of the broader context (replica). I cannot find MSDN documentation that describes how the health is determined at each level.
Why would the secondaries report NOT_HEALTHY for Availability Group sync health, but HEALTHY for replica sync health, and why does this differ from the primary's report?
Solution
Sadly, sys.dm_hadr_availability_replica_states is not a reliable indicator of replica health. Here's the Connect item on one of the bugs we've run into where that DMV stops refreshing. Note in the comments that log_send_queue_size in the DMV sys.dm_hadr_database_replica_states shows 0 even when there's log data to be sent.
Note that the Connect item is marked as Won't Fix. Sad trombone.
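Since no single one of these DMVs can be trusted on its own, it may help to look at the AG-level, replica-level, and database-level health side by side and compare what each replica reports. The query below is only a sketch of that idea, not a fix for the bug: it assumes you run it separately on the primary and on each secondary and compare the output by hand. The column aliases are my own, and the LEFT JOINs are there because a given instance may not return state rows for every replica.

```sql
-- Sketch: show AG, replica, and database sync health in one result set.
-- Run on each replica (e.g. SQLA, SQLB, SQLC) and compare the results.
SELECT
    ag.name                          AS ag_name,
    ar.replica_server_name,
    ars.role_desc,
    ags.synchronization_health_desc  AS ag_sync_health,
    ars.synchronization_health_desc  AS replica_sync_health,
    DB_NAME(drs.database_id)         AS database_name,
    drs.synchronization_state_desc   AS db_sync_state,
    drs.synchronization_health_desc  AS db_sync_health,
    drs.log_send_queue_size,         -- in KB; per the answer above, may show 0 incorrectly
    drs.redo_queue_size              -- in KB
FROM sys.availability_groups AS ag
JOIN sys.dm_hadr_availability_group_states AS ags
    ON ags.group_id = ag.group_id
JOIN sys.availability_replicas AS ar
    ON ar.group_id = ag.group_id
LEFT JOIN sys.dm_hadr_availability_replica_states AS ars
    ON ars.replica_id = ar.replica_id
LEFT JOIN sys.dm_hadr_database_replica_states AS drs
    ON drs.replica_id = ar.replica_id
   AND drs.group_id  = ar.group_id
ORDER BY ag.name, ar.replica_server_name, database_name;
```

If the three health columns disagree for the same Availability Group, or a secondary's view differs from the primary's, that is a hint that one of the DMVs is stale rather than proof that replication is actually unhealthy.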
Context
StackExchange Database Administrators Q#107540, answer score: 5