Index Design & Query Tuning Deep Dive
How SQL Server Finds Data — The Foundation
Index Architecture — B-Tree Structure
Key insight: B-tree depth is typically 3-4 levels even for millions of rows. An Index Seek reads 3-4 pages (microseconds) vs a Table Scan reading thousands of pages.
The Key Lookup Problem
The #1 performance killer after missing indexes.
The problem: Index on City finds 500 rows quickly, but Name and Email aren't in the index. SQL Server must do 500 individual lookups back to the clustered index — each one is a random I/O.
The solution: Add INCLUDE columns to the index:
-- Before: Key Lookup for every row
CREATE INDEX IX_City ON Customers(City);
-- After: Covering index — no Key Lookup needed
CREATE INDEX IX_City ON Customers(City) INCLUDE (Name, Email);
INCLUDE columns are stored ONLY at the leaf level of the index (not in the B-tree navigation structure). They make the index "cover" the query without making the B-tree wider. If you see Key Lookups in an execution plan → add the missing columns to INCLUDE.
SARGable vs Non-SARGable Queries
SARGable (Search ARGument able) = the query can use an index seek. Non-SARGable = forced to scan.
| Pattern | SARGable? | Why |
|---|---|---|
WHERE LastName = 'Smith' | ✅ Yes | Direct equality match |
WHERE LastName LIKE 'Sm%' | ✅ Yes | Leading characters known |
WHERE OrderDate >= '2025-01-01' | ✅ Yes | Range on indexed column |
WHERE YEAR(OrderDate) = 2025 | ❌ No | Function on column — can't seek |
WHERE LastName LIKE '%mith' | ❌ No | Leading wildcard — must scan |
WHERE Price * 1.1 > 100 | ❌ No | Expression on column |
WHERE ISNULL(City, 'N/A') = 'Cairo' | ❌ No | Function wrapping column |
WHERE CAST(OrderDate AS DATE) = '2025-01-01' | ❌ No | CAST on column |
How to Fix Non-SARGable Queries
| Non-SARGable | SARGable Rewrite |
|---|---|
WHERE YEAR(OrderDate) = 2025 | WHERE OrderDate >= '2025-01-01' AND OrderDate < '2026-01-01' |
WHERE LastName LIKE '%mith' | Use Full-Text Search or reverse the string |
WHERE Price * 1.1 > 100 | WHERE Price > 100 / 1.1 |
WHERE ISNULL(City, 'N/A') = 'Cairo' | WHERE City = 'Cairo' |
Golden rule: Never apply a function/expression to the LEFT side of a WHERE clause on an indexed column. Move the function to the RIGHT side (constant side). This is heavily tested on DP-300.
Oracle DBA parallel: SARGability in SQL Server = Oracle's ability to use "index range scan" vs "full table scan". The same rule applies in Oracle — WHERE TRUNC(date_col) = DATE won't use the index. The fix is the same: use range predicates instead of functions on columns.
Index Design Strategy
The Decision Checklist
Column Order Matters!
The order of columns in a composite index is critical:
Index: IX_Orders (CustomerID, OrderDate, Status)
| Query | Uses Index? | Why |
|---|---|---|
WHERE CustomerID = 5 | ✅ Seek | Leftmost column |
WHERE CustomerID = 5 AND OrderDate > '2025-01-01' | ✅ Seek | Left-to-right prefix |
WHERE OrderDate > '2025-01-01' | ❌ Scan | Skips leftmost column |
WHERE Status = 'Active' | ❌ Scan | Skips first 2 columns |
WHERE CustomerID = 5 AND Status = 'Active' | ✅ Partial seek | Seeks on CustomerID, scans Status |
Put the most selective (equality) columns FIRST in a composite index. Range columns go last. Example: INDEX (CustomerID, Status, OrderDate) — equality, equality, range.
Index Anti-Patterns
| Anti-Pattern | Problem | Fix |
|---|---|---|
| Too many indexes | Slows INSERT/UPDATE/DELETE (each index must be updated) | Drop unused indexes (check sys.dm_db_index_usage_stats) |
| Overlapping indexes | IX1(A, B) and IX2(A, B, C) — IX1 is redundant | Drop the shorter one |
| Wide indexes (many columns in key) | Larger B-tree, slower maintenance | Move non-key columns to INCLUDE |
| Index on low-cardinality column | IX(Gender) — only 2 values, scan anyway | Remove or combine with high-cardinality column |
| No index maintenance | Fragmented indexes = slow reads | Schedule REORGANIZE/REBUILD |
Wait Statistics — Systematic Troubleshooting
Wait stats tell you what SQL Server is waiting for — the most powerful diagnostic tool.
Top Wait Types Cheat Sheet
| Wait Type | Means | Oracle Equivalent | Fix |
|---|---|---|---|
| PAGEIOLATCH_SH/EX | Waiting for data page from disk | db file sequential read | Add missing indexes, scale storage IOPS |
| LCK_M_S / LCK_M_X | Waiting for lock (blocking) | enq: TX - row lock contention | Fix long transactions, use RCSI |
| CXPACKET | Parallelism overhead | PX Deq: Execute Reply | Lower MAXDOP, update stats |
| SOS_SCHEDULER_YIELD | CPU-bound query | CPU used by this session | Optimize query, scale compute |
| WRITELOG | Waiting for log flush | log file sync | Scale to higher tier (more log throughput) |
| RESOURCE_SEMAPHORE | Waiting for memory grant | PGA memory | Reduce query memory needs, scale up |
| THREADPOOL | No available worker threads | Session limit hit | Reduce connections, connection pooling |
| IO_COMPLETION | Waiting for I/O | direct path read/write | Scale storage, reduce I/O |
Exam pattern: "Database is slow, top wait is X. What should you do?" → Map the wait type to the fix. PAGEIOLATCH = I/O/indexes. LCK = blocking. CXPACKET = MAXDOP. SOS_SCHEDULER = CPU. WRITELOG = log throughput.
Blocking & Deadlocks
Blocking Chain
How to investigate:
sys.dm_exec_requests→blocking_session_idcolumn shows who's blocking whomsys.dm_tran_locks→ shows what locks are held and requested- Read Committed Snapshot Isolation (RCSI) = the best fix for most blocking scenarios
RCSI — The Blocking Eliminator
| Isolation Level | Readers Block Writers? | Writers Block Readers? | Row Versioning? |
|---|---|---|---|
| Read Committed (default) | ❌ | ✅ Yes (readers wait) | ❌ |
| Read Committed Snapshot (RCSI) | ❌ | ❌ No blocking! | ✅ (tempdb) |
| Snapshot | ❌ | ❌ | ✅ (tempdb) |
RCSI is enabled by default in Azure SQL DB and MI. On SQL Server VMs, you must enable it: ALTER DATABASE mydb SET READ_COMMITTED_SNAPSHOT ON. This is a major exam topic — Azure SQL = no reader/writer blocking by default.
Deadlocks
Deadlock resolution strategy:
- Check system_health XEvent session (always captures deadlock graphs)
- Analyze the deadlock graph — identify the two resources and sessions
- Fix by: accessing tables in consistent order, keeping transactions short, adding appropriate indexes
TempDB Optimization
TempDB is used for sorts, hash joins, version store (RCSI), temp tables, and spills.
TempDB Contention Pattern
TempDB best practices:
- Multiple data files — 1 per CPU core (up to 8 files for most)
- All files same size (ensures even round-robin allocation)
- Pre-size files — don't rely on autogrow
- In Azure SQL DB/MI: Microsoft manages this for you
- On SQL VMs: you must configure this manually
If the exam mentions PAGELATCH waits in tempdb → add more tempdb data files. If it mentions tempdb filling up → check for long-running transactions (version store), large sorts (missing indexes), or explicit temp table abuse.
Parameter Sniffing — Detection & Solutions
Solutions (from least to most aggressive)
| Solution | How | When |
|---|---|---|
| Update Statistics | UPDATE STATISTICS table | Stale stats causing bad estimates |
| OPTIMIZE FOR UNKNOWN | Query hint | Removes sniffing — uses average stats |
| RECOMPILE | OPTION(RECOMPILE) or sp-level | Always gets fresh plan (CPU cost) |
| Plan Guides | Force a specific plan | Emergency fix for vendor code you can't change |
| Parameter Sensitive Plan (IQP) | Compat level 160 | Best: automatic multiple plan variants |
Database-Scoped Configuration
Important settings that control query behavior per database.
| Setting | What It Does | Recommended |
|---|---|---|
| MAXDOP | Max parallelism per query | 8 or less (depends on cores) |
| LEGACY_CARDINALITY_ESTIMATION | Use old CE model | OFF (unless regressions) |
| OPTIMIZE_FOR_AD_HOC_WORKLOADS | Cache only plan stubs for one-time queries | ON (saves plan cache memory) |
| QUERY_OPTIMIZER_HOTFIXES | Enable latest optimizer fixes | ON |
| LAST_QUERY_PLAN_STATS | Capture last actual plan | ON (enables sys.dm_exec_query_plan_stats) |
OPTIMIZE_FOR_AD_HOC_WORKLOADS = ON prevents single-use query plans from bloating the plan cache. Instead of caching the full plan on first execution, it caches a small "stub." Full plan is only cached on second execution. Exam-tested!
DBCC CHECKDB — Database Integrity
Verifies database integrity — catches corruption before it causes data loss.
| Command | What It Checks | Impact |
|---|---|---|
| CHECKDB | Everything (allocation, structural, logical) | Heavy — schedule off-peak |
| CHECKTABLE | Single table | Medium — targeted fix |
| CHECKALLOC | Allocation consistency | Light |
| CHECKFILEGROUP | One filegroup | Medium |
Strategy: Run CHECKDB weekly on SQL VMs (Azure SQL DB/MI does this automatically). Check msdb.dbo.suspect_pages for known corruption.
In Azure SQL DB and MI, Microsoft runs integrity checks automatically — you don't need to schedule DBCC CHECKDB. On SQL VMs, YOU must schedule it (via SQL Agent). This is an exam trap — knowing who's responsible for what.