How efficiently does GlusterFS deal with failures?

Submitted by: @import:stackexchange-devops

Problem

What happens when one of the storage bricks goes down (for example, because of an HDD failure)? How does GlusterFS recover the data? Is the stored data still safe?

Solution

GlusterFS is built from stackable "translators," and the one that matters here is AFR (Automatic File Replication). In a replicated volume, AFR writes every file to each brick in a replica set; it works together with the DHT (distributed hash table) translator, which decides which replica set a file lands on. Replication is not the default for every volume type, so the volume has to be created with replicas. You need at least two replica bricks, ideally on separate servers, because a single brick is a single point of failure. With two or more replicas, GlusterFS's self-heal daemon (started automatically with the volume) repairs a failed or replaced brick from the surviving copies, so the data stays available while one brick is down. Actual disaster recovery that the auto-healing can't fix requires going through a careful process documented here.
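To make the replicate-then-heal idea concrete, here is a minimal toy model in Python. It is only a sketch: the Brick and ReplicaSet classes, the brick names, and the heal() method are invented for illustration and are not Gluster APIs. Real AFR decides which copy is the good one from metadata it keeps on each brick, whereas this sketch is simply told which brick is the healthy source.

```python
# Toy model of one replica set. All names are hypothetical, not Gluster APIs.
class Brick:
    def __init__(self, name):
        self.name = name
        self.online = True
        self.files = {}  # path -> content


class ReplicaSet:
    def __init__(self, bricks):
        self.bricks = bricks

    def write(self, path, data):
        """Write to every online brick; fail only if none is available."""
        targets = [b for b in self.bricks if b.online]
        if not targets:
            raise IOError("no brick online in this replica set")
        for brick in targets:
            brick.files[path] = data

    def read(self, path):
        """Serve the file from the first online brick that holds it."""
        for brick in self.bricks:
            if brick.online and path in brick.files:
                return brick.files[path]
        raise FileNotFoundError(path)

    def heal(self, stale, source):
        """Copy to the stale brick whatever it is missing or has outdated."""
        healed = []
        for path, data in source.files.items():
            if stale.files.get(path) != data:
                stale.files[path] = data
                healed.append(path)
        return sorted(healed)


a, b = Brick("server1:/data/brick"), Brick("server2:/data/brick")
volume = ReplicaSet([a, b])

volume.write("/report.txt", "v1")    # lands on both bricks
b.online = False                     # simulated disk failure on brick b
volume.write("/report.txt", "v2")    # only brick a sees the update
volume.write("/new.txt", "hello")

still_readable = volume.read("/report.txt")  # served by brick a

b.online = True                      # brick replaced or brought back
healed_paths = volume.heal(stale=b, source=a)
```

After the heal, both bricks hold the same files again, which is the state the self-heal daemon works toward. On a real cluster, running gluster volume heal VOLNAME info lists the entries that still need healing.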

Other guidance on the topic (for example, Red Hat's documentation) recommends at least six bricks in two sets. This way, "even if we lose two bricks from each set, there is no data loss". In general, the more redundant copies you keep across bricks, the lower the chance of data loss, although the cost and the time spent keeping copies in sync eventually become prohibitive.
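The trade-off between redundancy and cost can be checked with a few lines of Python. This is a rough sketch under a stated assumption: the six bricks are arranged as two replica sets of three, which the quoted recommendation does not spell out. The function names and brick labels are hypothetical, and the model only answers "is a copy of every file left?"; it ignores quorum rules and whether the volume stays writable.

```python
def data_survives(replica_sets, failed):
    """True if every replica set still has at least one healthy brick."""
    failed = set(failed)
    return all(
        any(brick not in failed for brick in replica_set)
        for replica_set in replica_sets
    )


def usable_fraction(replica_sets):
    """Share of raw capacity holding unique data, assuming equal-size bricks."""
    total_bricks = sum(len(replica_set) for replica_set in replica_sets)
    return len(replica_sets) / total_bricks


# Assumed layout: two replica sets of three bricks each (hypothetical labels).
sets = [["b1", "b2", "b3"], ["b4", "b5", "b6"]]

two_lost_per_set = data_survives(sets, ["b1", "b2", "b4", "b5"])
whole_set_lost = data_survives(sets, ["b1", "b2", "b3"])
fraction = usable_fraction(sets)
```

Under this assumed layout, losing two bricks in each set still leaves one copy of every file, while losing all three bricks of one set does not. The price is that only a third of the raw capacity stores unique data, which is the cost side of adding more copies.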

Context

StackExchange DevOps Q#181, answer score: 4
