Backup, Restore, PITR & LTR
Backup and restore is not the same as HA/DR — it's your last line of defense when everything else fails. HA/DR keeps you running during failures. Backup/restore recovers you from data corruption, accidental deletion, compliance needs, and disasters where HA/DR can't help.
Why Backup ≠ HA/DR
| Scenario | HA/DR Helps? | Backup/Restore Helps? |
|---|---|---|
| Server hardware failure | ✅ | ❌ (too slow) |
| Region outage | ✅ (Failover Groups) | ✅ (Geo-restore) |
Accidental DELETE / DROP TABLE | ❌ (replicates the mistake) | ✅ (PITR) |
| Data corruption | ❌ (corruption replicates) | ✅ (restore pre-corruption point) |
| Ransomware / malicious change | ❌ (encrypted data replicates) | ✅ (restore clean backup) |
| Compliance (retain 7 years) | ❌ | ✅ (LTR) |
| Dev/test environment clone | ❌ | ✅ (restore copy) |
DP-300 trap: Many exam questions describe a corruption or accidental deletion scenario and offer Failover Groups as an option. Failover Groups replicate corruption too — they protect against infrastructure failure, not data-level mistakes. The answer is always PITR or backup restore.
What Microsoft Manages vs What You Manage
| Aspect | Azure SQL DB | SQL MI | SQL on VM |
|---|---|---|---|
| Full backups | ✅ Auto (weekly) | ✅ Auto (weekly) | ❌ You configure |
| Differential backups | ✅ Auto (12-24h) | ✅ Auto (12-24h) | ❌ You configure |
| Log backups | ✅ Auto (5-10 min) | ✅ Auto (5-10 min) | ❌ You configure |
| Backup storage | ✅ Managed | ✅ Managed | ❌ You configure |
| Encryption at rest | ✅ Auto (TDE) | ✅ Auto (TDE) | ❌ You configure |
| Retention period | You set (7-35 days) | You set (7-35 days) | You manage entirely |
| LTR policies | You configure | You configure | You manage (backup to Blob) |
| Backup verification | ✅ Auto | ✅ Auto | ❌ You run RESTORE VERIFYONLY |
| Point-in-time restore | ✅ Built-in | ✅ Built-in | ❌ You restore manually |
The biggest PaaS advantage: On Azure SQL DB and MI, you will never lose a backup and never need to write a backup script. Microsoft takes full, differential, and log backups automatically. Your only job is to set the retention period and LTR policy.
Automated Backup Schedule
| Backup Type | Frequency | You Configure? | Purpose |
|---|---|---|---|
| Full | Weekly | ❌ Automatic | Base for all restores |
| Differential | Every 12-24 hours | ❌ Automatic | Changes since last full |
| Transaction Log | Every 5-10 minutes | ❌ Automatic | Point-in-time granularity |
How PITR Uses These Backups
To restore to a specific point in time, Azure:
- Restores the most recent full backup before that point
- Applies the most recent differential after that full
- Replays transaction log backups up to the exact second you specified
This is why log backups every 5-10 minutes give you RPO of seconds — you can restore to any point within the retention window.
Point-in-Time Restore (PITR)
PITR is the most commonly used restore method for Azure SQL DB and MI.
Key Facts
| Aspect | Detail |
|---|---|
| Retention | 7-35 days (configurable, default 7) |
| Granularity | Any second within the retention window |
| Result | Creates a NEW database (never overwrites original) |
| Speed | Minutes to hours (depends on database size) |
| Cost | New database billed at the same tier |
| Cross-region | ❌ Same region only (use Geo-restore for cross-region) |
PITR creates a NEW database — it never overwrites the existing one. This catches many exam takers. After PITR, you have both the original (possibly corrupted) and the restored copy. You must manually swap them (rename, update connection strings).
PITR Scenarios
| Scenario | Action |
|---|---|
| "Restore to 3 hours ago" | PITR to GETDATE() - 3 hours → new DB |
| "Undo accidental DELETE" | PITR to just before the DELETE → extract missing rows → insert back |
| "Clone for dev/test" | PITR to current time → rename for dev use |
| "Corrupted stored procedure" | PITR to before corruption → script the correct proc → apply to production |
Long-Term Retention (LTR)
For compliance and regulatory requirements beyond 35 days:
| Aspect | Detail |
|---|---|
| Max retention | Up to 10 years |
| Storage | Azure Blob Storage (RA-GRS by default) |
| Policy | Weekly (W), Monthly (M), Yearly (Y) retention |
| Cost | Per GB stored (separate from DB billing) |
| Restore | Creates a new database from LTR backup |
LTR Policy Examples
| Policy | What It Keeps | Use Case |
|---|---|---|
W=4 | Last 4 weekly backups | General disaster recovery |
M=12 | Last 12 monthly backups | Quarterly compliance audits |
Y=7, WeekOfYear=1 | Last 7 yearly (first week of year) | 7-year regulatory retention |
W=4, M=12, Y=10 | Weekly + Monthly + Yearly | Full compliance coverage |
LTR stores full backups only — no differentials or log backups. You cannot do point-in-time restore from an LTR backup. LTR gives you the database state at the time of that weekly backup snapshot.
Backup Storage Redundancy
| Option | Copies | Protection | Cost | Use Case |
|---|---|---|---|---|
| LRS | 3 in one datacenter | Disk/rack failure | Lowest | Dev/test |
| ZRS | 3 across availability zones | Zone failure | Medium | Production (same region) |
| GRS | 6 (3 local + 3 paired region) | Region failure | Higher | Production (cross-region DR) |
| RA-GRS | 6 + read access to secondary | Region failure + read | Highest | Critical compliance |
Backup storage redundancy can only be set at database creation time (or within a short window). You cannot change from LRS to GRS after the fact. Choose based on your DR requirements from day one.
Backup to URL (SQL on VM / SQL MI) — Credential Patterns
For SQL Server on Azure VM and SQL MI, you back up directly to Azure Blob Storage with BACKUP TO URL. Both backup and restore require a SQL Server CREDENTIAL, but whether you write WITH CREDENTIAL in the T-SQL depends on which auth pattern you use.
The two auth patterns (memorize the difference)
| SAS Token (recommended) | Storage Account Key (legacy) | |
|---|---|---|
| Credential NAME | The container URL exactly | Any friendly name |
IDENTITY value | 'SHARED ACCESS SIGNATURE' | '<storage account name>' |
SECRET value | SAS token (no leading ?) | Storage account key |
WITH CREDENTIAL in BACKUP/RESTORE? | ❌ Omit — auto-matched by URL | ✅ Required — must reference the name |
| Blob type | Block blob (striping, up to 12 TB) | Page blob (legacy, 1 TB) |
Pattern A — SAS (modern, what the exam expects)
-- One-time setup: credential name MUST equal the container URL
CREATE CREDENTIAL [https://acct.blob.core.windows.net/backups]
WITH IDENTITY = 'SHARED ACCESS SIGNATURE',
SECRET = 'sv=2022-...&sig=...'; -- no leading "?"
-- Backup — no WITH CREDENTIAL clause
BACKUP DATABASE db1
TO URL = 'https://acct.blob.core.windows.net/backups/db1.bak'
WITH COMPRESSION, FORMAT, STATS = 10;
-- Restore — same rule, no WITH CREDENTIAL clause
RESTORE DATABASE db1
FROM URL = 'https://acct.blob.core.windows.net/backups/db1.bak'
WITH REPLACE, RECOVERY;
Pattern B — Storage Account Key (legacy)
CREATE CREDENTIAL MyAzureCred
WITH IDENTITY = 'acct',
SECRET = '<storage account key>';
BACKUP DATABASE db1
TO URL = 'https://acct.blob.core.windows.net/backups/db1.bak'
WITH CREDENTIAL = 'MyAzureCred', FORMAT; -- must specify
Decision rule (this is the trick that fails most candidates):
- Credential name starts with
https://→ SAS pattern → omitWITH CREDENTIAL - Credential name is a friendly word (
MyAzureCred) → Key pattern → must includeWITH CREDENTIAL = 'name'
Memory hook: "URL-named credential = silent. Friendly-named credential = must shout its name."
Cross-server restore: The CREDENTIAL is server-scoped. To restore on a different SQL Server, you must CREATE CREDENTIAL again on that target server with the same name and SAS/key. The credential is not stored in the backup file.
SAS permission requirements
| Operation | SAS permissions needed |
|---|---|
BACKUP TO URL | Read, Write, Create, List |
RESTORE FROM URL | Read, List |
Striping for large backups
BACKUP DATABASE db1
TO URL = 'https://acct.blob.core.windows.net/backups/db1_1.bak',
URL = 'https://acct.blob.core.windows.net/backups/db1_2.bak',
URL = 'https://acct.blob.core.windows.net/backups/db1_3.bak'
WITH COMPRESSION, FORMAT;
Block blob max = 200 GB per stripe → striping enables backups up to ~12 TB.
Managed Identity (SQL Server 2022+ / SQL MI)
CREATE CREDENTIAL [https://acct.blob.core.windows.net/backups]
WITH IDENTITY = 'Managed Identity';
No SAS or key. Grant the SQL Server's managed identity the Storage Blob Data Contributor role on the container.
BACKUP / RESTORE WITH Options — Effects Matrix
The exam frequently shows a BACKUP LOG or RESTORE statement and asks you to predict its side effects on the log chain, the differential base, the database state, and recoverability. Memorize this matrix.
| Option | Effect on log chain | Effect on differential base | Effect on DB state | Typical use |
|---|---|---|---|---|
NORECOVERY | n/a (BACKUP) / preserves chain (RESTORE) | n/a | After RESTORE: DB stays in RESTORING state, ready for more logs | Tail-log + log-shipping; preparing AG/MI seeding |
STANDBY = 'undo.bak' | Preserves chain (RESTORE) | n/a | DB is READ-ONLY; further logs can still be applied | Log shipping with reporting on the standby |
RECOVERY (default on RESTORE) | Closes chain on the restored DB | n/a | DB is ONLINE | Final restore step |
COPY_ONLY | Does NOT break the log chain (LOG) | Does NOT reset the differential base (FULL) | n/a | Ad-hoc backup that must not interfere with the regular schedule |
CONTINUE_AFTER_ERROR | Allows backup to proceed past damaged pages | n/a | n/a | Damaged-DB tail-log to capture as much as possible |
NO_TRUNCATE | Backs up the active log without truncating | n/a | Works even if DB is damaged / SUSPECT | Tail-log of a damaged DB before restoring |
INIT | n/a | n/a | Overwrites existing backup sets in the file | Single-file rotating backup |
FORMAT | n/a | n/a | Reinitializes the media header (destroys all backup sets) | First write to a media set |
CHECKSUM | n/a | n/a | Verifies page checksums during backup | Always recommended |
COMPRESSION | n/a | n/a | Smaller backup file, more CPU | Default-on for most workloads |
Worked example — what does this actually do?
BACKUP LOG DB1
TO DISK = 'X:\Tail.trn'
WITH NORECOVERY, COPY_ONLY, CONTINUE_AFTER_ERROR;
Line by line:
| Clause | Effect |
|---|---|
BACKUP LOG DB1 | Captures the active portion of the transaction log. |
WITH NORECOVERY | Puts DB1 into RESTORING state immediately after the backup — classic tail-log + restore pattern. |
COPY_ONLY | The log backup does not participate in the normal log chain (no truncation, no chain advance). Subsequent regular log backups still restore cleanly. |
CONTINUE_AFTER_ERROR | If the database has damaged pages, the backup continues instead of aborting — best-effort capture. |
Trap pairing: NORECOVERY + COPY_ONLY is unusual together. NORECOVERY says "I'm about to restore over this DB", but COPY_ONLY says "don't touch the chain". The combination shows up when you want to take an out-of-band tail capture while leaving the official backup chain alone — for example, before a forensic investigation. If the question asks "which backup chain is affected?" the answer is none (because of COPY_ONLY).
COPY_ONLY on a FULL backup does not reset the differential base. COPY_ONLY on a LOG backup does not truncate the log. Differential backups cannot be COPY_ONLY.
Restore Options by Platform
Azure SQL Database
| Restore Type | RPO | RTO | Creates New DB? | Cross-Region? |
|---|---|---|---|---|
| PITR | Seconds | Minutes-hours | ✅ Yes | ❌ Same region |
| Deleted DB restore | To deletion time | Minutes | ✅ Yes | ❌ Same region |
| Geo-restore | Up to 1 hour | Hours | ✅ Yes | ✅ Any region |
| LTR restore | Weekly snapshot | Hours | ✅ Yes | ✅ Any region |
| Copy | Current point | Minutes | ✅ Yes | ❌ Same server |
SQL Managed Instance
| Restore Type | RPO | RTO | Notes |
|---|---|---|---|
| PITR | Seconds | Minutes-hours | Same as SQL DB |
| Cross-instance restore | Seconds | Minutes | Restore to different MI |
| Geo-restore | Up to 1 hour | Hours | From GRS backup |
| LTR restore | Weekly | Hours | Same as SQL DB |
| Native backup to URL | You control | You control | Manual backup to Azure Blob |
SQL Server on VM
| Restore Type | RPO | RTO | Notes |
|---|---|---|---|
| Native BACKUP/RESTORE | You control | You control | Full control, full responsibility |
| Backup to Azure Blob | You control | Hours | BACKUP TO URL |
| Automated Backup (IaaS Agent) | Configurable | Hours | If SQL IaaS Agent configured |
| Azure Backup for SQL VM | 15 min | Hours | Managed backup vault |
| File-level backup (Azure Backup) | Daily | Hours | VM-level backup (not SQL-aware) |
Azure Backup for SQL Server in Azure VM is a separate service from SQL Server's native backup. It provides centralized backup management, 15-minute RPO, long-term retention in Recovery Services Vault, and no backup scheduling scripts needed.
RPO and RTO Explained
| Term | What It Means | Business Question |
|---|---|---|
| RPO (Recovery Point Objective) | How much data can you afford to lose | "How old can the restored data be?" |
| RTO (Recovery Time Objective) | How long can you be down | "How fast do we need to be back online?" |
RPO/RTO by Solution
| Solution | RPO | RTO | Cost |
|---|---|---|---|
| Automated backups (PITR) | 5-10 min | Hours | Free (included) |
| Geo-restore | Up to 1 hour | Hours | Free (included with GRS) |
| Active Geo-Replication | < 5 seconds | ~30 seconds | 💰 (secondary DB cost) |
| Failover Groups | < 5 seconds | Grace period + ~30s | 💰 (secondary DB cost) |
| BC tier local replica | 0 (synchronous) | < 10 seconds | 💰💰 (BC tier cost) |
| LTR backup | Weekly | Hours | 💰 (storage cost) |
The cost-RPO tradeoff: Automated backups (free) give you hours of RTO and minutes of RPO. Failover Groups (extra DB cost) give you seconds of both. Choose based on your business requirements — not every database needs sub-second RPO.
Common Misunderstandings
| Misconception | Reality |
|---|---|
| "PITR overwrites my database" | ❌ PITR creates a NEW database. Original is untouched. |
| "I can restore to any point from LTR" | ❌ LTR only stores weekly full backups. No point-in-time granularity. |
| "Geo-replication protects against accidental DELETE" | ❌ Deletion replicates instantly. Use PITR instead. |
| "I can change backup redundancy anytime" | ❌ Only at creation (or within a short window). |
| "Azure SQL DB backups are my responsibility" | ❌ Fully automated. You only set retention + LTR policy. |
| "Backup retention of 7 days means 7 recovery points" | ❌ You can restore to ANY SECOND within those 7 days. |
| "PITR can restore to a different region" | ❌ Same region only. Use Geo-restore for cross-region. |
Backup Strategy Decision Matrix
| Requirement | Solution |
|---|---|
| "Recover from accidental deletion" | PITR (within 7-35 day retention) |
| "Recover from data corruption" | PITR (restore to before corruption) |
| "Keep backups for 5+ years" | LTR policy (W/M/Y) |
| "Cross-region disaster recovery" | Geo-restore (hours RTO) or Failover Groups (seconds) |
| "Clone database for dev/test" | PITR to current time or database copy |
| "Recover a deleted database" | Deleted database restore (within retention) |
| "Restore to on-premises" | Export BACPAC or native backup (VM/MI only) |
Anti-Patterns
- "PITR is enough for compliance retention." PITR caps at 35 days. For 5/7/10-year retention you need LTR policy (Weekly / Monthly / Yearly). Treat them as complementary.
- "Restore over the original DB." PITR / Geo-restore always create a NEW database. Plan a swap (rename + connection string update) as part of the runbook.
- "Geo-restore = Failover Group." Geo-restore = hours RTO, manual, latest geo-replicated backup. Failover Group = seconds RTO, automated, continuous async replication. Pick by RTO.
- "BACPAC is a real backup." BACPAC is schema + data, not transactionally consistent unless DB is offline. Use only for migration / clone, not DR.
- "Native backups on SQL VM → local D: drive." D: drive is ephemeral on Azure VMs. Backup to Azure Blob (URL) so the backup survives VM redeploy.
- "Disable auto-backups during a heavy load to save IO." You can't on Azure SQL DB / MI — they're system-managed. On SQL VM, disabling = no recovery point that window. Throttle via Resource Governor or schedule heavy load off-window instead.
LTR policies apply going forward only. Enabling LTR today does NOT retroactively keep backups from yesterday. Plan retention before the data is created — or accept the gap.
Migration Between Backup / Restore Strategies
| From → To | Path | Cost |
|---|---|---|
| Default 7-day PITR → 35-day PITR | Configure on DB | Linear backup storage cost |
| PITR only → PITR + LTR (W/M/Y) | Configure LTR policy | LTR storage cost; gain compliance retention |
| LTR W/M/Y → LTR Y only | Edit policy; W/M backups age out | Cuts cost; lose granular long-term recovery |
| Geo-restore → Failover Group | Create FOG; flip over read-write listener | Cuts RTO from hours to seconds; secondary compute billed |
| Native backup to local disk → Backup-to-URL (Blob) | Change BACKUP DATABASE TO URL | Survives VM redeploy; egress cost negligible |
| Manual backup script → Maintenance Plan / Ola Hallengren | Replace with proven scripts | Battle-tested; less ops |
| Single-region LTR → LTR with geo-redundant storage | Storage account RA-GRS for LTR | Doubles storage cost; gains regional resilience |
| BACPAC export → Database copy + transactional replication for migration | DB copy via PITR; replicate delta | Lower downtime than BACPAC import |
Most expensive moves: enabling LTR + geo-redundant storage (storage cost compounds over years) and PITR → Failover Group (always-on secondary compute).
Real Scenarios
- Regulated bank, 7-year retention → PITR 35d + LTR W/M/Y geo-redundant. Driver: regulator; trade-off: storage cost grows linearly forever — budget for it.
- SaaS app, RTO < 1 min, RPO < 5 sec → Failover Group (auto-failover policy 1h grace, manual approval). Driver: business SLA. Trade-off: secondary compute billed full price.
- Dev clone of prod every Monday → Database copy via PITR → dev sub (or
az sql db copy). Driver: latest data, no schema drift. Trade-off: dev compute cost. - Recover from accidental TRUNCATE 2h ago → PITR to 5 minutes before TRUNCATE → new DB → export the table → BCP into prod. Driver: surgical recovery. Trade-off: time to recover ~30 min.
- SQL VM in IaaS, hands-off backups → Azure Backup for SQL Server in VM (auto-protection on VM tag) + retention W/M/Y in Recovery Services Vault. Driver: native managed service. Trade-off: vault pricing per protected instance.