patternModerate
What is the best option for High availability on a data warehouse?
Viewed 0 times
thewhatwarehousehighoptionforavailabilitydatabest
Problem
I have a customer who have an existing 1.5Tb data warehouse and are currently planning for a complete refresh of the DW on a new environment. Their infrastructure manager has organised 2 servers with SQL 2017 standard on each and has now asked me to plan a HA/DR plan for the new DW database/instance.
I immediately thought of using AlwaysOn Availability Groups, although I have never used them before, and none of the articles I read talk about typical data warehouse workloads - it's all OLTP applications. With a large daily ETL process and a smaller intra-day ETL process running on their current DW, will that have an impact on how we approach this?
Thanks - any help to point me in the right direction here would be beneficial!
I immediately thought of using AlwaysOn Availability Groups, although I have never used them before, and none of the articles I read talk about typical data warehouse workloads - it's all OLTP applications. With a large daily ETL process and a smaller intra-day ETL process running on their current DW, will that have an impact on how we approach this?
Thanks - any help to point me in the right direction here would be beneficial!
Solution
Most data warehouses are in Simple recovery model, and most true HA solutions require Full recovery (AGs, Mirroring). The minimum for Log Shipping is Bulk Logged, but that's not really true HA since there's no automatic failover.
If that's the case for yours (because really, a data warehouse in Full recovery is banana-town-crazy), your best bet would be a Failover Cluster.
It doesn't care what recovery model your databases are in, since it's more reliant on Windows than SQL Server. Your SQL Server just has the option to live on different nodes in the cluster if something goes amok with one. The one requirement of this technology is using shared storage, like a SAN.
You get automatic failover in most situations, but no readable replica (you don't get that with BAG, either). With Standard Edition, you're limited to a two node cluster, but that shouldn't be a big deal.
Failover Clusters are also a lot easier to manage if you're not too savvy with SQL Server. You really need basic Windows sysadmin skills. AGs can be a tough cookie when they go down, or when a patch goes bad (which sadly happens rather often).
If that's the case for yours (because really, a data warehouse in Full recovery is banana-town-crazy), your best bet would be a Failover Cluster.
It doesn't care what recovery model your databases are in, since it's more reliant on Windows than SQL Server. Your SQL Server just has the option to live on different nodes in the cluster if something goes amok with one. The one requirement of this technology is using shared storage, like a SAN.
You get automatic failover in most situations, but no readable replica (you don't get that with BAG, either). With Standard Edition, you're limited to a two node cluster, but that shouldn't be a big deal.
Failover Clusters are also a lot easier to manage if you're not too savvy with SQL Server. You really need basic Windows sysadmin skills. AGs can be a tough cookie when they go down, or when a patch goes bad (which sadly happens rather often).
Context
StackExchange Database Administrators Q#245995, answer score: 10
Revisions (0)
No revisions yet.