SQL Managed Instance — High Availability Architecture
SQL Managed Instance has built-in HA that works differently from SQL Server on VMs. You don't configure WSFC or load balancers — Microsoft handles it. But understanding the architecture is critical for DP-300 and for choosing the right service tier.
General Purpose vs Business Critical HA
General Purpose Tier
| Aspect | Details |
|---|---|
| HA Mechanism | Azure Service Fabric manages compute. Storage is remote (Azure Premium Storage). |
| Replicas | 1 compute node + 1 standby (no data, spun up on failure) |
| Failover time | ~30 seconds (new node must attach remote storage) |
| Read replicas | ❌ None |
| Storage | Remote — Azure Premium Storage (up to 16 TB) |
| IOPS | Lower — depends on storage tier, not local SSD |
| Use case | Cost-sensitive workloads, dev/test, apps tolerant of brief failover |
| SLA | 99.99% |
How General Purpose failover works:
- Azure detects compute node failure
- A new compute node is provisioned (from standby pool)
- The new node attaches to the existing remote storage
- Databases come online — no data loss (storage is durable)
- Client connections are redirected via the gateway
General Purpose failover can take up to 30 seconds or longer because the standby node needs to start SQL Server and attach remote storage files. Applications should implement retry logic with exponential backoff.
Business Critical Tier
| Aspect | Details |
|---|---|
| HA Mechanism | Built-in Always On AG (similar technology as SQL Server AG) |
| Replicas | 1 primary + 3 secondary replicas (all with local data) |
| Failover time | ~5-10 seconds |
| Read replicas | ✅ 1 free read-only replica (built-in read scale-out) |
| Storage | Local SSD on each replica — much faster I/O |
| IOPS | Higher — local NVMe/SSD storage |
| Use case | Mission-critical workloads, OLTP, low latency requirements |
| SLA | 99.99% |
How Business Critical failover works:
- Primary replica failure is detected
- One of the 3 secondaries is promoted to primary (all have local data copies)
- Failover is near-instant because data is already local
- The gateway redirects connections to the new primary
- A new secondary is provisioned in the background
Business Critical = Always On AG under the hood (4-node AG with synchronous commit). You get one free read-only replica accessible via the connection string option ApplicationIntent=ReadOnly. This is the same technology as SQL Server AG but fully managed.
VNet Architecture
Unlike Azure SQL Database, MI is deployed inside your VNet:
Subnet Requirements
| Requirement | Details |
|---|---|
| Dedicated subnet | MI requires its own subnet — no other resources allowed |
| Minimum size | /27 (32 IPs) for a single MI, /26 recommended |
| NSG | Auto-created and managed by Azure; you can add additional rules |
| Route Table | Auto-created; includes mandatory routes for MI management |
| Delegation | Subnet must be delegated to Microsoft.Sql/managedInstances |
Don't put anything else in the MI subnet! The subnet must be delegated exclusively to Managed Instance. Also, you cannot easily change the subnet size after MI is deployed — plan for growth (use /26 or /25).
Failover Groups for Managed Instance
For cross-region DR, MI uses Failover Groups (similar to Azure SQL DB):
# Create MI failover group
az sql instance-failover-group create \
--resource-group rg-dp300 \
--location eastus \
--mi mi-primary \
--partner-mi mi-secondary \
--partner-resource-group rg-dp300-dr \
--name fog-mi-dp300 \
--failover-policy Automatic \
--grace-period 60
# Connection endpoints (auto-generated):
# Read-Write: fog-mi-dp300.zone-id.database.windows.net
# Read-Only: fog-mi-dp300.secondary.zone-id.database.windows.net
| Feature | Details |
|---|---|
| Scope | All user databases on the instance (all or nothing) |
| Endpoint | Single DNS endpoint, no change on failover |
| Failover | Automatic with configurable grace period |
| Replication | Asynchronous geo-replication |
| Read-Only | Secondary is readable via .secondary endpoint |
MI Failover Groups replicate all user databases — you can't pick individual databases like in Azure SQL DB. The system databases (master, msdb) are NOT replicated. Logins and agent jobs must be manually synced or scripted.
MI Link Feature
The MI Link enables near real-time data replication between on-premises SQL Server and Managed Instance:
Use cases:
- Offload reporting to MI while keeping on-prem as primary
- Migration staging — replicate continuously, then cut over
- Hybrid DR — MI as a disaster recovery target
Performance Comparison
| Metric | General Purpose | Business Critical |
|---|---|---|
| vCores | 4-80 | 4-128 |
| Max storage | 16 TB | 16 TB |
| IOPS | ~500-7500 (storage-dependent) | ~30,000-400,000+ (local SSD) |
| Latency | ~5-10ms (remote storage) | ~1-2ms (local SSD) |
| Read replicas | 0 | 1 (free) |
| Failover time | ~30s | ~5-10s |
| Zone redundancy | ✅ (optional) | ✅ (optional) |
| In-memory OLTP | ❌ | ✅ |
| Price | ~40-60% less | Higher |
Start with General Purpose for most workloads. Move to Business Critical only if you need: sub-2ms latency, very high IOPS, read replicas, or In-memory OLTP. The cost difference is significant — Business Critical is roughly 2-3x the price.