Skip to main content

Index Design & Query Tuning Deep Dive

How SQL Server Finds Data — The Foundation

Data Access Methods
🐢
Table Scan
Reads every page. O(n) cost. No useful index exists for the query predicate.
🎯
Index Seek
B-tree navigation: 3-4 page reads. O(log n). Index matches the WHERE clause.
Covering Index
Seek + all columns in index (INCLUDE). Zero Key Lookups. Best possible performance.

Index Architecture — B-Tree Structure

B-Tree Index Structure
🏛️
Root Node
Entry point. Contains key ranges pointing to intermediate pages. Single page.
📎
Intermediate Levels
Navigate key ranges to narrow down to the correct leaf page. 1-2 levels typical.
🍃
Leaf Level
Contains actual data (clustered) or row pointers (non-clustered). Doubly-linked list for range scans.

Key insight: B-tree depth is typically 3-4 levels even for millions of rows. An Index Seek reads 3-4 pages (microseconds) vs a Table Scan reading thousands of pages.

The Key Lookup Problem

The #1 performance killer after missing indexes.

The Key Lookup Problem
🔍
Index Seek on City
NC index on City finds 500 matching rows
Fast: only 3-4 page reads
⚠️
500 Key Lookups
Name and Email not in index
Each row requires a random I/O back to clustered index
Fix: INCLUDE Columns
CREATE INDEX IX_City ON Customers(City) INCLUDE (Name, Email)
Covering index eliminates all lookups

The problem: Index on City finds 500 rows quickly, but Name and Email aren't in the index. SQL Server must do 500 individual lookups back to the clustered index — each one is a random I/O.

The solution: Add INCLUDE columns to the index:

-- Before: Key Lookup for every row
CREATE INDEX IX_City ON Customers(City);

-- After: Covering index — no Key Lookup needed
CREATE INDEX IX_City ON Customers(City) INCLUDE (Name, Email);
🎯 Exam Focus

INCLUDE columns are stored ONLY at the leaf level of the index (not in the B-tree navigation structure). They make the index "cover" the query without making the B-tree wider. If you see Key Lookups in an execution plan → add the missing columns to INCLUDE.

SARGable vs Non-SARGable Queries

SARGable (Search ARGument able) = the query can use an index seek. Non-SARGable = forced to scan.

PatternSARGable?Why
WHERE LastName = 'Smith'✅ YesDirect equality match
WHERE LastName LIKE 'Sm%'✅ YesLeading characters known
WHERE OrderDate >= '2025-01-01'✅ YesRange on indexed column
WHERE YEAR(OrderDate) = 2025❌ NoFunction on column — can't seek
WHERE LastName LIKE '%mith'❌ NoLeading wildcard — must scan
WHERE Price * 1.1 > 100❌ NoExpression on column
WHERE ISNULL(City, 'N/A') = 'Cairo'❌ NoFunction wrapping column
WHERE CAST(OrderDate AS DATE) = '2025-01-01'❌ NoCAST on column

How to Fix Non-SARGable Queries

Non-SARGableSARGable Rewrite
WHERE YEAR(OrderDate) = 2025WHERE OrderDate >= '2025-01-01' AND OrderDate < '2026-01-01'
WHERE LastName LIKE '%mith'Use Full-Text Search or reverse the string
WHERE Price * 1.1 > 100WHERE Price > 100 / 1.1
WHERE ISNULL(City, 'N/A') = 'Cairo'WHERE City = 'Cairo'
🎯 Exam Focus

Golden rule: Never apply a function/expression to the LEFT side of a WHERE clause on an indexed column. Move the function to the RIGHT side (constant side). This is heavily tested on DP-300.

🏢 Real-World DBA Note

Oracle DBA parallel: SARGability in SQL Server = Oracle's ability to use "index range scan" vs "full table scan". The same rule applies in Oracle — WHERE TRUNC(date_col) = DATE won't use the index. The fix is the same: use range predicates instead of functions on columns.

Index Design Strategy

The Decision Checklist

Index Design Checklist
🎯
Equality Columns First
Put most selective equality columns (=) first in composite index key
📈
Range Columns Last
Range predicates (>, <, BETWEEN) go after equality columns in the key
📋
INCLUDE for SELECT
Add columns from SELECT list as INCLUDE to avoid Key Lookups
🗑️
Drop Overlapping
IX(A,B) is redundant if IX(A,B,C) exists. Drop the shorter one.

Column Order Matters!

The order of columns in a composite index is critical:

Index: IX_Orders (CustomerID, OrderDate, Status)
QueryUses Index?Why
WHERE CustomerID = 5✅ SeekLeftmost column
WHERE CustomerID = 5 AND OrderDate > '2025-01-01'✅ SeekLeft-to-right prefix
WHERE OrderDate > '2025-01-01'❌ ScanSkips leftmost column
WHERE Status = 'Active'❌ ScanSkips first 2 columns
WHERE CustomerID = 5 AND Status = 'Active'✅ Partial seekSeeks on CustomerID, scans Status
🎯 Exam Focus

Put the most selective (equality) columns FIRST in a composite index. Range columns go last. Example: INDEX (CustomerID, Status, OrderDate) — equality, equality, range.

Index Anti-Patterns

Anti-PatternProblemFix
Too many indexesSlows INSERT/UPDATE/DELETE (each index must be updated)Drop unused indexes (check sys.dm_db_index_usage_stats)
Overlapping indexesIX1(A, B) and IX2(A, B, C) — IX1 is redundantDrop the shorter one
Wide indexes (many columns in key)Larger B-tree, slower maintenanceMove non-key columns to INCLUDE
Index on low-cardinality columnIX(Gender) — only 2 values, scan anywayRemove or combine with high-cardinality column
No index maintenanceFragmented indexes = slow readsSchedule REORGANIZE/REBUILD

Wait Statistics — Systematic Troubleshooting

Wait stats tell you what SQL Server is waiting for — the most powerful diagnostic tool.

Wait Statistics Troubleshooting
💾
PAGEIOLATCH
Waiting for data page from disk. Fix: add missing indexes, scale storage IOPS.
🔒
LCK_M_S / LCK_M_X
Waiting for row/page lock (blocking). Fix: shorten transactions, enable RCSI.
⚙️
CXPACKET
Parallelism synchronization overhead. Fix: lower MAXDOP, update statistics.
💻
SOS_SCHEDULER_YIELD
CPU-bound query yielding scheduler. Fix: optimize query, scale compute.
📝
WRITELOG
Waiting for log flush to disk. Fix: scale to higher tier for more log throughput.
🧠
RESOURCE_SEMAPHORE
Waiting for memory grant. Fix: reduce query memory needs, scale up RAM.

Top Wait Types Cheat Sheet

Wait TypeMeansOracle EquivalentFix
PAGEIOLATCH_SH/EXWaiting for data page from diskdb file sequential readAdd missing indexes, scale storage IOPS
LCK_M_S / LCK_M_XWaiting for lock (blocking)enq: TX - row lock contentionFix long transactions, use RCSI
CXPACKETParallelism overheadPX Deq: Execute ReplyLower MAXDOP, update stats
SOS_SCHEDULER_YIELDCPU-bound queryCPU used by this sessionOptimize query, scale compute
WRITELOGWaiting for log flushlog file syncScale to higher tier (more log throughput)
RESOURCE_SEMAPHOREWaiting for memory grantPGA memoryReduce query memory needs, scale up
THREADPOOLNo available worker threadsSession limit hitReduce connections, connection pooling
IO_COMPLETIONWaiting for I/Odirect path read/writeScale storage, reduce I/O
🎯 Exam Focus

Exam pattern: "Database is slow, top wait is X. What should you do?" → Map the wait type to the fix. PAGEIOLATCH = I/O/indexes. LCK = blocking. CXPACKET = MAXDOP. SOS_SCHEDULER = CPU. WRITELOG = log throughput.

Blocking & Deadlocks

Blocking Chain

Blocking Chain
🔒
Session A Holds Lock
BEGIN TRAN + UPDATE on Row 1
Exclusive lock held, transaction not committed
Session B Waits
SELECT on same Row 1
Blocked by Session A's exclusive lock (LCK_M_S wait)
Session C Waits
Also needs Row 1
Blocked by Session A → chain grows
Fix: RCSI or Short Txns
RCSI: readers get versioned snapshot, no blocking
Keep transactions short, commit quickly

How to investigate:

  • sys.dm_exec_requestsblocking_session_id column shows who's blocking whom
  • sys.dm_tran_locks → shows what locks are held and requested
  • Read Committed Snapshot Isolation (RCSI) = the best fix for most blocking scenarios

RCSI — The Blocking Eliminator

Isolation LevelReaders Block Writers?Writers Block Readers?Row Versioning?
Read Committed (default)✅ Yes (readers wait)
Read Committed Snapshot (RCSI)❌ No blocking!✅ (tempdb)
Snapshot✅ (tempdb)
🎯 Exam Focus

RCSI is enabled by default in Azure SQL DB and MI. On SQL Server VMs, you must enable it: ALTER DATABASE mydb SET READ_COMMITTED_SNAPSHOT ON. This is a major exam topic — Azure SQL = no reader/writer blocking by default.

Deadlocks

Deadlock — Circular Wait
🔄
Session A → B
Session A holds lock on Resource 1, waits for Resource 2 (held by B)
🔄
Session B → A
Session B holds lock on Resource 2, waits for Resource 1 (held by A). SQL kills the cheaper victim.

Deadlock resolution strategy:

  1. Check system_health XEvent session (always captures deadlock graphs)
  2. Analyze the deadlock graph — identify the two resources and sessions
  3. Fix by: accessing tables in consistent order, keeping transactions short, adding appropriate indexes

TempDB Optimization

TempDB is used for sorts, hash joins, version store (RCSI), temp tables, and spills.

TempDB Contention Pattern

TempDB Contention
⚠️
PAGELATCH Waits
Contention on PFS/GAM/SGAM allocation pages when many sessions create temp objects
📁
Multiple Data Files
Add 1 file per CPU core (up to 8). All same size for even round-robin allocation.
☁️
Azure SQL DB/MI
Microsoft manages TempDB config automatically. On VMs: YOU must configure this.

TempDB best practices:

  • Multiple data files — 1 per CPU core (up to 8 files for most)
  • All files same size (ensures even round-robin allocation)
  • Pre-size files — don't rely on autogrow
  • In Azure SQL DB/MI: Microsoft manages this for you
  • On SQL VMs: you must configure this manually
🎯 Exam Focus

If the exam mentions PAGELATCH waits in tempdb → add more tempdb data files. If it mentions tempdb filling up → check for long-running transactions (version store), large sorts (missing indexes), or explicit temp table abuse.

Parameter Sniffing — Detection & Solutions

Parameter Sniffing
📝
First Execution
@Country = 'Luxembourg' (10 rows)
Optimizer creates plan optimized for small result set
📦
Plan Cached
Nested Loop plan cached for reuse
Works great for small datasets
💥
Second Execution
@Country = 'USA' (5M rows)
Reuses small-data plan → catastrophically slow
Fix Options
RECOMPILE, OPTIMIZE FOR UNKNOWN
Parameter Sensitive Plan (compat 160)

Solutions (from least to most aggressive)

SolutionHowWhen
Update StatisticsUPDATE STATISTICS tableStale stats causing bad estimates
OPTIMIZE FOR UNKNOWNQuery hintRemoves sniffing — uses average stats
RECOMPILEOPTION(RECOMPILE) or sp-levelAlways gets fresh plan (CPU cost)
Plan GuidesForce a specific planEmergency fix for vendor code you can't change
Parameter Sensitive Plan (IQP)Compat level 160Best: automatic multiple plan variants

Database-Scoped Configuration

Important settings that control query behavior per database.

SettingWhat It DoesRecommended
MAXDOPMax parallelism per query8 or less (depends on cores)
LEGACY_CARDINALITY_ESTIMATIONUse old CE modelOFF (unless regressions)
OPTIMIZE_FOR_AD_HOC_WORKLOADSCache only plan stubs for one-time queriesON (saves plan cache memory)
QUERY_OPTIMIZER_HOTFIXESEnable latest optimizer fixesON
LAST_QUERY_PLAN_STATSCapture last actual planON (enables sys.dm_exec_query_plan_stats)
🎯 Exam Focus

OPTIMIZE_FOR_AD_HOC_WORKLOADS = ON prevents single-use query plans from bloating the plan cache. Instead of caching the full plan on first execution, it caches a small "stub." Full plan is only cached on second execution. Exam-tested!

DBCC CHECKDB — Database Integrity

Verifies database integrity — catches corruption before it causes data loss.

CommandWhat It ChecksImpact
CHECKDBEverything (allocation, structural, logical)Heavy — schedule off-peak
CHECKTABLESingle tableMedium — targeted fix
CHECKALLOCAllocation consistencyLight
CHECKFILEGROUPOne filegroupMedium

Strategy: Run CHECKDB weekly on SQL VMs (Azure SQL DB/MI does this automatically). Check msdb.dbo.suspect_pages for known corruption.

🎯 Exam Focus

In Azure SQL DB and MI, Microsoft runs integrity checks automatically — you don't need to schedule DBCC CHECKDB. On SQL VMs, YOU must schedule it (via SQL Agent). This is an exam trap — knowing who's responsible for what.


Flashcards

What is a Key Lookup and how do you fix it?
Click to reveal answer
When a non-clustered index find the row but must go back to the clustered index for additional columns. Fix: add missing columns to INCLUDE in the non-clustered index (creating a 'covering' index).
1 / 10

Quiz

Q1/6
0 correct
A query on a 10M row table runs slowly. The execution plan shows an Index Seek on CustomerID followed by 50,000 Key Lookups. What's the best fix?