How can a Cassandra node see another node as down?

Submitted by: @import:stackexchange-dba

Problem

I'm running Cassandra on three nodes. Here's their nodetool status output:

ubuntu@ip-10-0-8-8:~$ nodetool status
Datacenter: us-east
===================
Status=Up/Down
|/ State=Normal/Leaving/Joining/Moving
--  Address    Load       Tokens  Owns    Host ID                               Rack
UN  10.0.9.8   2.07 MB    256     ?       c8d574b9-540c-410f-9326-789eb75d3d14  1c
UN  10.0.8.8   2.06 MB    256     ?       d9454056-a358-4428-ab5f-c03e8042167e  1d
UN  10.0.10.8  2.01 MB    256     ?       3617643d-b0a8-4b72-a9d4-feded4445292  1a


ubuntu@ip-10-0-9-8:~$ nodetool status
Datacenter: us-east
===================
Status=Up/Down
|/ State=Normal/Leaving/Joining/Moving
--  Address    Load       Tokens  Owns    Host ID                               Rack
UN  10.0.9.8   2.07 MB    256     ?       c8d574b9-540c-410f-9326-789eb75d3d14  1c
UN  10.0.8.8   2.06 MB    256     ?       d9454056-a358-4428-ab5f-c03e8042167e  1d
DN  10.0.10.8  2.09 MB    256     ?       3617643d-b0a8-4b72-a9d4-feded4445292  1a


ubuntu@ip-10-0-10-8:~$ nodetool status
Datacenter: us-east
===================
Status=Up/Down
|/ State=Normal/Leaving/Joining/Moving
--  Address    Load       Tokens  Owns    Host ID                               Rack
UN  10.0.9.8   2.07 MB    256     ?       c8d574b9-540c-410f-9326-789eb75d3d14  1c
UN  10.0.8.8   2.06 MB    256     ?       d9454056-a358-4428-ab5f-c03e8042167e  1d
UN  10.0.10.8  2.01 MB    256     ?       3617643d-b0a8-4b72-a9d4-feded4445292  1a


Everything looks fine except for one thing (the last line of the second block):

DN  10.0.10.8  2.09 MB    256     ?       3617643d-b0a8-4b72-a9d4-feded4445292  1a


The D at the start of the line indicates that the node is down. How can 10.0.9.8 see the node as down while the other nodes see it as up? Can this lead to inconsistencies?

I'm using Cassandra 2.1.1, by the way.
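
When comparing the view from several nodes, it can help to reduce each listing to just the addresses flagged as down. The following is a minimal sketch, assuming the column layout shown above (a two-letter status/state code, then the address); `down_nodes` is a hypothetical helper name, not part of nodetool.

```shell
# Print the address of every node that a `nodetool status` listing marks as down.
# Reads the listing on stdin; `down_nodes` is a hypothetical helper name.
down_nodes() {
  # Node rows start with a two-letter code: status (U/D), then state (N/L/J/M).
  # Header rows such as "Datacenter: us-east" do not match this pattern.
  awk '$1 ~ /^D[NLJM]$/ { print $2 }'
}

# Usage on a live node (requires nodetool on PATH):
#   nodetool status | down_nodes
```

Run against the second listing above, this would print only 10.0.10.8; against the first and third it would print nothing.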

Solution

Running nodetool enablegossip on the host that appeared down to the other nodes fixed it for me, at least for now. In my case, however, the host appeared down to every other node I checked. I'm running in a non-cloud environment.

I was curious what my other nodes said, and I found one that had the same issue (like your 10.0.9.8, showing 10.0.10.8 as down). Running only nodetool enablegossip on 10.8 didn't help, but running disablegossip first and then enablegossip again did!
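
The sequence described above can be sketched as a small shell function, to be run on the node that the others report as down. This is only an illustration: `restart_gossip` and the `NODETOOL` override variable are hypothetical names, and the final check assumes that `nodetool info` prints a "Gossip active" line, as it does in the versions I am aware of.

```shell
# Command used to reach Cassandra's admin tool; override NODETOOL if it is not on PATH.
: "${NODETOOL:=nodetool}"

# Turn gossip off and back on for the local node (hypothetical helper name).
restart_gossip() {
  "$NODETOOL" disablegossip || return 1
  "$NODETOOL" enablegossip || return 1
  # Show the "Gossip active" line from `nodetool info` to confirm the result.
  "$NODETOOL" info | grep -i 'gossip active'
}

# Usage on the affected node:
#   restart_gossip
```

Afterwards, running nodetool status again on the node that reported DN shows whether its view has recovered.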

Context

StackExchange Database Administrators Q#184133, answer score: 4
