Failover Groups vs Active Geo-Replication

Both provide cross-region disaster recovery for Azure SQL — but they work differently and suit different scenarios. This page makes the choice clear.


Side-by-Side Comparison

| Aspect | Active Geo-Replication | Auto-Failover Groups |
| --- | --- | --- |
| What it is | Async readable secondary in any region | Managed failover with single endpoint |
| Max secondaries | 4 | 1 partner (primary ↔ secondary) |
| Auto failover | ❌ Manual only | ✅ Automatic (with grace period) |
| Single endpoint | ❌ Each secondary has its own | fog.database.windows.net (never changes) |
| Read-only endpoint | ✅ Per secondary | fog.secondary.database.windows.net |
| Scope | Per database | Multiple databases (group) |
| Failover unit | One database at a time | All databases in the group together |
| DNS update on failover | ❌ Must update connection strings | ✅ Automatic DNS flip |
| Grace period | N/A | Configurable (default 60 min) |
| Works with SQL DB | ✅ | ✅ |
| Works with MI | ❌ | ✅ (all user DBs, all-or-nothing) |
| Replication | Async | Async |
| RPO | < 5 seconds | < 5 seconds |
| RTO | Manual trigger (~30 sec) | Grace period + ~30 sec |
| Cost | Secondary DB billed | Secondary DB billed |

What Each Option Is

Active Geo-Replication

A per-database feature that creates an asynchronous readable copy in another region. You manage failover manually. Each secondary has its own connection string.

Think of it as: "I want read replicas in other regions, and I'll handle failover myself."

Auto-Failover Groups

A managed group of databases with automatic failover, a single DNS endpoint that never changes, and built-in read/write + read-only listeners.

Think of it as: "I want Azure to handle everything — one endpoint, automatic failover, multiple databases fail over together."


Endpoint Behavior

This is the biggest practical difference and the one most often tested on DP-300.

Geo-Replication Endpoints

| Database | Endpoint |
| --- | --- |
| Primary | server1.database.windows.net |
| Secondary 1 | server2-region2.database.windows.net |
| Secondary 2 | server3-region3.database.windows.net |

Problem: On failover, your app must update its connection string to point to the new primary.

Failover Group Endpoints

| Endpoint | Points To | Changes on Failover? |
| --- | --- | --- |
| fog-name.database.windows.net | Current primary | ❌ Never changes |
| fog-name.secondary.database.windows.net | Current secondary | ❌ Never changes |

Advantage: Your app connects to the fog endpoint and never needs to change connection strings, even after failover. DNS flips automatically.
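As a rough illustration of why the listener endpoints keep connection strings stable, the sketch below builds ODBC-style connection strings from a failover group name. The function name `fog_connection_string`, the group name `myfog`, and the database name `appdb` are hypothetical, and the string assumes the ODBC Driver 18 keyword format with credentials left out. Treat it as a sketch under those assumptions, not a definitive client configuration.

```python
def fog_connection_string(fog_name: str, database: str, read_only: bool = False) -> str:
    """Build a connection string that targets a failover group listener.

    Hypothetical helper for illustration. The host never names a specific
    server, so the string stays valid after a failover.
    """
    if read_only:
        # Read-only listener: resolves to the current secondary.
        host = f"{fog_name}.secondary.database.windows.net"
    else:
        # Read-write listener: resolves to the current primary.
        host = f"{fog_name}.database.windows.net"

    parts = [
        "Driver={ODBC Driver 18 for SQL Server}",
        f"Server=tcp:{host},1433",
        f"Database={database}",
        "Encrypt=yes",
    ]
    if read_only:
        # Declare read intent for connections sent to the secondary listener.
        parts.append("ApplicationIntent=ReadOnly")
    return ";".join(parts) + ";"


print(fog_connection_string("myfog", "appdb"))
print(fog_connection_string("myfog", "appdb", read_only=True))
```

Because the server name in both strings is the group's listener rather than an individual server, the application configuration can stay the same before and after a failover.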

🎯 Exam Focus

DP-300 loves this question: "How do you ensure application connection strings don't change during failover?" → Auto-Failover Groups. The fog endpoint abstracts the primary/secondary — DNS updates happen automatically behind the scenes.


Failover Behavior

Geo-Replication Failover

  1. You detect the outage (or Azure notifies you)
  2. You manually trigger failover to a specific secondary
  3. That secondary becomes the new primary
  4. You update application connection strings to the new server
  5. The old primary (when recovered) becomes a secondary
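The role swap in the steps above can be sketched with a toy model. This is not an Azure API: the class `GeoReplicationTopology` and its methods are hypothetical, and the server names are the example names from the endpoint table. It only shows why the application's target server changes after a Geo-Replication failover.

```python
from dataclasses import dataclass, field


@dataclass
class GeoReplicationTopology:
    """Toy model of one geo-replicated database (hypothetical, not an Azure API)."""

    primary: str
    secondaries: list[str] = field(default_factory=list)

    def fail_over_to(self, target: str) -> None:
        """Promote one specific secondary; the old primary becomes a secondary."""
        if target not in self.secondaries:
            raise ValueError(f"{target} is not a secondary of {self.primary}")
        self.secondaries.remove(target)
        self.secondaries.append(self.primary)
        self.primary = target

    def primary_endpoint(self) -> str:
        """Server endpoint the application must point at for read-write traffic."""
        return f"{self.primary}.database.windows.net"


topology = GeoReplicationTopology("server1", ["server2-region2", "server3-region3"])
before = topology.primary_endpoint()

topology.fail_over_to("server2-region2")
after = topology.primary_endpoint()

# The read-write endpoint differs, so the connection string must be updated.
print(before)  # server1.database.windows.net
print(after)   # server2-region2.database.windows.net
```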

Failover Group Failover

  1. Azure detects the outage
  2. Grace period passes (default 60 min — configurable)
  3. Automatic failover triggers (or you trigger manually before grace period)
  4. DNS endpoint fog.database.windows.net automatically points to the new primary
  5. Applications don't need any changes
  6. The old primary (when recovered) automatically becomes the secondary

⚠️ Watch Out

Grace period trap: The default grace period is 60 minutes, so Azure waits 60 minutes before failing over automatically. During that wait, the database is unavailable. One hour is also the lowest value you can set, so if your RTO is shorter than that, plan to trigger a manual failover instead of waiting for the automatic one. Choose the grace period based on your RTO requirement.
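The arithmetic behind the trap can be sketched as follows. The function names `worst_case_rto_seconds` and `meets_rto` are hypothetical, and the figures (60-minute grace period, roughly 30 seconds of failover time) are the approximate values quoted on this page. The manual path assumes you trigger failover immediately, so it leaves out the time needed to notice the outage.

```python
def worst_case_rto_seconds(
    grace_period_minutes: int = 60,
    failover_seconds: int = 30,
    manual: bool = False,
) -> int:
    """Estimate the recovery time for a failover group (illustrative only)."""
    if manual:
        # A manual failover skips the grace period entirely.
        return failover_seconds
    # An automatic failover waits out the grace period first.
    return grace_period_minutes * 60 + failover_seconds


def meets_rto(target_minutes: float, **kwargs) -> bool:
    """Check whether the estimated recovery time fits within an RTO target."""
    return worst_case_rto_seconds(**kwargs) <= target_minutes * 60


print(worst_case_rto_seconds())             # 3630 (60 min + 30 sec)
print(worst_case_rto_seconds(manual=True))  # 30
print(meets_rto(15))                        # False: automatic path is too slow
print(meets_rto(15, manual=True))           # True
```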


Scope: Per-Database vs Group

| Scenario | Geo-Replication | Failover Groups |
| --- | --- | --- |
| Fail over one database | ✅ | ❌ (all or nothing) |
| Fail over 10 databases together consistently | ❌ (10 separate failovers) | ✅ (one action) |
| Different databases in different regions | ✅ (flexible) | ❌ (one partner region) |
| Mix of databases with and without DR | ✅ (per DB choice) | ❌ (all in group) |

🏢 Real-World DBA Note

Production pattern: Use Failover Groups for your core application databases that must fail over together (consistency). Use Geo-Replication for analytics/reporting replicas that don't need automatic failover.


Managed Instance Differences

| Feature | SQL DB | MI |
| --- | --- | --- |
| Geo-Replication | ✅ | ❌ Not supported |
| Failover Groups | ✅ | ✅ |
| FG scope | Selected databases | ALL user databases (all-or-nothing) |
| FG limit | Multiple groups per server | 1 failover group per MI |

🎯 Exam Focus

MI critical facts: MI does NOT support Active Geo-Replication — only Failover Groups. And MI Failover Groups replicate ALL user databases — you can't choose which ones. Only 1 failover group per MI.


Choose This When...

| Requirement | Choose |
| --- | --- |
| "Connection strings must never change on failover" | Failover Groups |
| "Failover must be automatic" | Failover Groups |
| "I need readable replicas in 3+ regions" | Geo-Replication (up to 4 secondaries) |
| "Multiple databases must fail over as a unit" | Failover Groups |
| "I want control over which databases have DR" | Geo-Replication (per database) |
| "I use Managed Instance" | Failover Groups (only option) |
| "I need the simplest setup" | Failover Groups |
| "I need read locality in multiple regions" | Geo-Replication (secondary per region) |
| "Recommended by Microsoft for production DR" | Failover Groups |
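The table can be read as a small set of rules. The sketch below encodes them in a hypothetical helper, `recommend_dr_option`, whose parameter names are invented for illustration. It is a simplification: it treats requirements from both columns as a conflict to resolve by hand, and it falls back to Failover Groups when nothing specific is requested, following the last rows of the table.

```python
def recommend_dr_option(
    uses_managed_instance: bool = False,
    needs_automatic_failover: bool = False,
    needs_stable_endpoint: bool = False,
    databases_fail_over_together: bool = False,
    per_database_dr_choice: bool = False,
    secondary_regions: int = 1,
) -> str:
    """Map requirements to a DR option, following the table above (illustrative)."""
    if uses_managed_instance:
        # Managed Instance supports Failover Groups only.
        return "Failover Groups"
    if not 1 <= secondary_regions <= 4:
        raise ValueError("Geo-Replication supports between 1 and 4 secondaries")

    wants_failover_groups = (
        needs_automatic_failover
        or needs_stable_endpoint
        or databases_fail_over_together
    )
    wants_geo_replication = per_database_dr_choice or secondary_regions > 1

    if wants_failover_groups and wants_geo_replication:
        raise ValueError("Requirements point both ways; review them individually")
    if wants_geo_replication:
        return "Geo-Replication"
    # Default: simplest setup and the usual production recommendation.
    return "Failover Groups"


print(recommend_dr_option(needs_stable_endpoint=True))  # Failover Groups
print(recommend_dr_option(secondary_regions=3))         # Geo-Replication
```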

Common Misconceptions

| Misconception | Reality |
| --- | --- |
| "Geo-Replication has automatic failover" | ❌ Manual only. Failover Groups have automatic. |
| "Failover Groups support 4 secondaries" | ❌ Only 1 partner. Geo-Replication supports up to 4. |
| "I can choose which MI databases to replicate" | ❌ MI Failover Groups replicate ALL user databases. |
| "Failover is instant" | ❌ Grace period (default 60 min) + ~30 sec failover. |
| "Both protect against accidental DELETE" | ❌ Neither does. Deletion replicates. Use PITR. |
| "Geo-Replication works on MI" | ❌ MI only supports Failover Groups. |
| "Failover Group DNS takes hours to update" | ❌ DNS TTL is 30 seconds. Failover is fast once triggered. |
| "I need both for full DR" | ❌ Usually one or the other. FG is sufficient for most production scenarios. |

RPO/RTO Summary

| Solution | RPO | RTO |
| --- | --- | --- |
| Geo-Replication (manual failover) | < 5 sec | Manual trigger + ~30 sec |
| Failover Groups (auto) | < 5 sec | Grace period + ~30 sec |
| Failover Groups (manual) | < 5 sec | ~30 sec |
| PITR (for comparison) | 5-10 min | Hours |

Both have identical RPO (< 5 seconds) because both use async replication. The RTO difference is the grace period — Geo-Rep depends on how fast you react, Failover Groups wait for the grace period then auto-failover.


Flashcards

What is the biggest practical difference between Geo-Replication and Failover Groups?
Failover Groups provide a single DNS endpoint that never changes on failover. Geo-Replication requires updating connection strings to point to the new primary.

Quiz

An application needs cross-region DR with zero connection string changes on failover. What should you use?