Skip to main content

Lab: Performance Troubleshooting with Query Store & DMVs

Deploy an Azure SQL Database, simulate real-world performance problems (missing indexes, parameter sniffing, blocking chains), then hunt and fix them using Query Store, DMVs, wait stats, and execution plan analysis.

DP-300 Domain: 3 — Monitor, Configure, and Optimize (20-25%)

Architecture

Performance troubleshooting lab architecture: PerfLabDB S2 with Query Store, workload scripts and DBA investigation hitting the same database

Performance Problems You'll Investigate

5 Real-World Scenarios
📉
Missing Indexes
Table scans on large tables. Find them with DMVs, create the right index, measure improvement.
🔒
Blocking Chains
Long-running transactions blocking others. Identify the blocker, understand lock escalation.
📊
Parameter Sniffing
Same query, wildly different performance. Query Store shows plan regression.
⏱️
High Wait Stats
PAGEIOLATCH, LCK_M_X, CXPACKET — decode what the server is waiting on.
📋
Query Store Regression
A query that was fast yesterday is slow today. Force a known-good plan.

Prerequisites

  • Azure subscription
  • Azure CLI installed
  • sqlcmd or VS Code SQL Server extension

Lab Steps

Deploy Lab Database
Create Azure SQL DB (S2) with AdventureWorksLT sample data and Query Store enabled
Create Test Tables & Workload
Add larger tables to simulate real production scenarios
Scenario 1: Missing Indexes
Run table scans, find missing indexes with DMVs, create and measure
Scenario 2: Blocking
Create blocking chains, identify blockers with DMVs, resolve
Scenario 3: Parameter Sniffing
Force plan regression, use Query Store to identify and fix
Scenario 4: Wait Stats Analysis
Analyze top waits, correlate with performance issues
Bonus: Forced Plan in Query Store
Pin a known-good execution plan to prevent regression

Step 1: Deploy Lab Database

$resourceGroup = "rg-perf-lab"
$location = "westeurope"
$suffix = Get-Random -Maximum 9999
$serverName = "sql-perflab-$suffix"
$adminPassword = "<YOUR_STRONG_PASSWORD>"

# Create resource group + server + database
az group create --name $resourceGroup --location $location

az sql server create `
--resource-group $resourceGroup `
--name $serverName `
--admin-user sqladmin `
--admin-password $adminPassword `
--location $location

# S2 tier gives enough DTU for performance testing
az sql db create `
--resource-group $resourceGroup `
--server $serverName `
--name PerfLabDB `
--service-objective S2 `
--sample-name AdventureWorksLT

# Allow your IP
$myIp = (Invoke-RestMethod -Uri https://api.ipify.org)
az sql server firewall-rule create `
--resource-group $resourceGroup `
--server $serverName `
--name AllowMyIP `
--start-ip-address $myIp `
--end-ip-address $myIp

Step 2: Create Test Data

Connect to PerfLabDB and run:

-- Create a larger table for performance testing
-- (AdventureWorksLT tables are small)
SELECT TOP 500000
ROW_NUMBER() OVER (ORDER BY a.object_id) AS OrderID,
CAST(DATEADD(DAY, -ABS(CHECKSUM(NEWID())) % 1000, GETDATE()) AS DATE) AS OrderDate,
ABS(CHECKSUM(NEWID())) % 10000 AS CustomerID,
ABS(CHECKSUM(NEWID())) % 500 AS ProductID,
CAST(ABS(CHECKSUM(NEWID())) % 10000 / 100.0 AS DECIMAL(10,2)) AS Amount,
CASE ABS(CHECKSUM(NEWID())) % 4
WHEN 0 THEN 'Pending'
WHEN 1 THEN 'Shipped'
WHEN 2 THEN 'Delivered'
ELSE 'Cancelled'
END AS Status
INTO dbo.BigOrders
FROM sys.all_objects a CROSS JOIN sys.all_objects b;

-- Verify row count
SELECT COUNT(*) AS TotalRows FROM dbo.BigOrders;
-- Should be 500,000 rows

-- Ensure Query Store is ON and capturing
ALTER DATABASE PerfLabDB SET QUERY_STORE = ON;
ALTER DATABASE PerfLabDB SET QUERY_STORE (
OPERATION_MODE = READ_WRITE,
DATA_FLUSH_INTERVAL_SECONDS = 60,
INTERVAL_LENGTH_MINUTES = 1,
QUERY_CAPTURE_MODE = ALL
);

Step 3: Scenario 1 — Missing Indexes

Problem: A query scans the entire 500K row table because there's no index.

-- Run this query (will be slow — full table scan)
SELECT OrderDate, SUM(Amount) AS TotalAmount, COUNT(*) AS OrderCount
FROM dbo.BigOrders
WHERE CustomerID = 4242
GROUP BY OrderDate
ORDER BY OrderDate;

-- Check the execution plan — you'll see a TABLE SCAN
-- Now find what indexes SQL Server recommends:
SELECT
OBJECT_NAME(mid.object_id) AS TableName,
mid.equality_columns,
mid.inequality_columns,
mid.included_columns,
migs.avg_user_impact AS [Improvement%],
migs.user_seeks,
migs.last_user_seek
FROM sys.dm_db_missing_index_details mid
JOIN sys.dm_db_missing_index_groups mig ON mid.index_handle = mig.index_handle
JOIN sys.dm_db_missing_index_group_stats migs ON mig.index_group_handle = migs.group_handle
WHERE mid.database_id = DB_ID()
ORDER BY migs.avg_user_impact DESC;

Fix: Create the recommended index:

CREATE NONCLUSTERED INDEX IX_BigOrders_CustomerID
ON dbo.BigOrders (CustomerID)
INCLUDE (OrderDate, Amount);

-- Re-run the query — observe INDEX SEEK instead of TABLE SCAN
SELECT OrderDate, SUM(Amount) AS TotalAmount, COUNT(*) AS OrderCount
FROM dbo.BigOrders
WHERE CustomerID = 4242
GROUP BY OrderDate
ORDER BY OrderDate;
⚠️ Watch Out

Don't blindly create every missing index DMV suggests. Each index has a write cost (INSERT/UPDATE/DELETE overhead). In production, only create indexes with high impact (>80%) and frequent seeks.

Step 4: Scenario 2 — Blocking

Open two query windows (Session A and Session B):

Session A (the blocker):

BEGIN TRANSACTION;
UPDATE dbo.BigOrders SET Status = 'Processing' WHERE OrderID = 1;
-- DON'T commit yet — this holds a lock

Session B (the blocked query):

-- This will wait (blocked by Session A's lock)
UPDATE dbo.BigOrders SET Status = 'Shipped' WHERE OrderID = 1;

Investigation (open a third window):

-- Find blocking chains
SELECT
r.session_id AS blocked_session,
r.blocking_session_id AS blocker_session,
r.wait_type,
r.wait_time / 1000.0 AS wait_seconds,
t.text AS blocked_query,
s.login_name AS blocker_login
FROM sys.dm_exec_requests r
JOIN sys.dm_exec_sessions s ON r.blocking_session_id = s.session_id
CROSS APPLY sys.dm_exec_sql_text(r.sql_handle) t
WHERE r.blocking_session_id <> 0;

-- View all active locks
SELECT
l.request_session_id,
l.resource_type,
l.request_mode,
l.request_status,
OBJECT_NAME(p.object_id) AS table_name
FROM sys.dm_tran_locks l
LEFT JOIN sys.partitions p ON l.resource_associated_entity_id = p.hobt_id
WHERE l.resource_database_id = DB_ID()
ORDER BY l.request_session_id;

Resolution:

-- Option 1: Commit/rollback Session A
-- In Session A:
ROLLBACK TRANSACTION;

-- Option 2 (emergency): Kill the blocking session
-- KILL <blocker_session_id>;

Step 5: Scenario 3 — Parameter Sniffing

-- Create a stored procedure with uneven data distribution
CREATE OR ALTER PROCEDURE dbo.GetOrdersByStatus
@Status NVARCHAR(20)
AS
BEGIN
SELECT OrderID, OrderDate, CustomerID, Amount
FROM dbo.BigOrders
WHERE Status = @Status
ORDER BY OrderDate DESC;
END;
GO

-- First execution with a RARE value (compiles a plan optimized for few rows)
EXEC dbo.GetOrdersByStatus @Status = 'Cancelled';

-- Second execution with a COMMON value (reuses the "few rows" plan — BAD)
EXEC dbo.GetOrdersByStatus @Status = 'Shipped';

Investigate with Query Store:

-- Find the procedure's query in Query Store
SELECT
q.query_id,
qt.query_sql_text,
p.plan_id,
rs.count_executions,
rs.avg_duration / 1000.0 AS avg_duration_ms,
rs.avg_logical_io_reads,
TRY_CAST(p.query_plan AS XML) AS query_plan
FROM sys.query_store_query q
JOIN sys.query_store_query_text qt ON q.query_text_id = qt.query_text_id
JOIN sys.query_store_plan p ON q.query_id = p.query_id
JOIN sys.query_store_runtime_stats rs ON p.plan_id = rs.plan_id
WHERE qt.query_sql_text LIKE '%BigOrders%Status%'
ORDER BY rs.avg_duration DESC;

Fix options:

-- Option 1: RECOMPILE hint (recompiles every execution)
CREATE OR ALTER PROCEDURE dbo.GetOrdersByStatus
@Status NVARCHAR(20)
AS
BEGIN
SELECT OrderID, OrderDate, CustomerID, Amount
FROM dbo.BigOrders
WHERE Status = @Status
ORDER BY OrderDate DESC
OPTION (RECOMPILE);
END;
GO

-- Option 2: OPTIMIZE FOR UNKNOWN (uses average statistics)
-- OPTION (OPTIMIZE FOR (@Status UNKNOWN));

-- Option 3: Force a specific plan in Query Store (see bonus section)

Step 6: Scenario 4 — Wait Stats Analysis

-- Top wait types (what is the server spending time on?)
SELECT TOP 10
wait_type,
waiting_tasks_count,
wait_time_ms / 1000.0 AS wait_time_seconds,
signal_wait_time_ms / 1000.0 AS signal_wait_seconds,
(wait_time_ms - signal_wait_time_ms) / 1000.0 AS resource_wait_seconds
FROM sys.dm_os_wait_stats
WHERE wait_type NOT IN (
'WAITFOR', 'SLEEP_TASK', 'BROKER_TASK_STOP',
'CLR_AUTO_EVENT', 'CLR_MANUAL_EVENT',
'REQUEST_FOR_DEADLOCK_SEARCH', 'LAZYWRITER_SLEEP',
'CHECKPOINT_QUEUE', 'XE_TIMER_EVENT'
)
ORDER BY wait_time_ms DESC;

Common waits and what they mean:

Wait TypeMeaningCommon Fix
PAGEIOLATCH_SH/EXReading data from disk (not in buffer pool)Add memory, create better indexes, reduce table scans
LCK_M_X, LCK_M_SLock contention (blocking)Optimize transactions, reduce lock duration
CXPACKETParallelism waitsAdjust MAXDOP, check for skewed parallelism
WRITELOGTransaction log writesMove log to faster disk, batch commits
SOS_SCHEDULER_YIELDCPU pressureQuery tuning, add CPU, reduce unnecessary work

Bonus: Force a Plan in Query Store

-- Find query_id and the good plan_id from Step 5
-- Then force the good plan:
EXEC sp_query_store_force_plan
@query_id = <query_id>,
@plan_id = <good_plan_id>;

-- Verify the forced plan
SELECT
q.query_id,
p.plan_id,
p.is_forced_plan,
p.force_failure_count
FROM sys.query_store_query q
JOIN sys.query_store_plan p ON q.query_id = p.query_id
WHERE p.is_forced_plan = 1;
🏢 Real-World DBA Note

Forcing plans is a DP-300 exam topic and a real production lifesaver. When a critical query regresses after a statistics update or recompile, forcing the known-good plan gives you immediate relief while you investigate the root cause.

Verification Checklist

ScenarioSymptomTool UsedFixed?
Missing IndexTable scan on 500K rowssys.dm_db_missing_index_details✅ Index created
BlockingSession waiting on locksys.dm_exec_requests✅ Transaction resolved
Parameter SniffingWrong plan reusedQuery Store + OPTION (RECOMPILE)✅ Plan regression fixed
Wait StatsHigh PAGEIOLATCHsys.dm_os_wait_stats✅ Diagnosed root cause
Plan RegressionQuery slow after recompilesp_query_store_force_plan✅ Good plan forced

Common Failures

SymptomLikely CauseFix
Query Store empty after restartNot enabled on DBALTER DATABASE [DB] SET QUERY_STORE = ON
sp_BlitzFirst fails on Azure SQL DBFirst Responder Kit needs master accessUse sys.dm_exec_* DMVs directly; FRK works on MI/IaaS, not DB
Forced plan rolls back after restartQS in READ_ONLY mode (storage full)ALTER DATABASE [DB] SET QUERY_STORE (OPERATION_MODE = READ_WRITE, MAX_STORAGE_SIZE_MB = 2000)
Wait stats look clean but app slowHigh network/client-side latencyMeasure ASYNC_NETWORK_IO; check app connection pooling
Index recommendation hurts other queriesSingle-query optimization, ignored workloadValidate with sys.dm_db_missing_index_* aggregated; weigh write cost
⚠️ Watch Out

Don't blindly accept SSMS "Missing Index" hints. Each new index slows writes and adds storage. Aggregate recommendations across the whole workload (Query Store + DMVs) before creating any index. Drop unused indexes (sys.dm_db_index_usage_stats) first.

Variations to Try

  • Run the same workload on General Purpose vs Business Critical — compare wait stats
  • Force a bad plan intentionally with QS, then unforce — observe behavior
  • Enable Automatic Tuning (FORCE_LAST_GOOD_PLAN) — watch it self-correct over 24h
  • Add Extended Events session for query_post_execution_showplan — capture top regressors
  • Compare DTU vs vCore behavior on same workload

Production Application

  • Daily perf triage → sp_BlitzFirst (or wait stats DMV) → Query Store top regressed queries → plan forcing.
  • Post-deploy regression hunt → Query Store "Regressed Queries" report, force the previous good plan as a hotfix while devs fix the query.
  • Capacity planning → wait stats + resource_stats DMVs over 7 days → right-size tier.
  • Anti-pattern: don't add indexes from a single SSMS green-bar suggestion — you'll end up with overlapping/unused indexes that slow writes.

Clean Up

az group delete --name rg-perf-lab --yes --no-wait

Key Takeaways

What are the three most important DMVs for performance troubleshooting?
Click to reveal answer
1. **sys.dm_exec_requests** — currently running queries, wait types, blocking. 2. **sys.dm_os_wait_stats** — cumulative wait statistics showing what the server spends time on. 3. **sys.dm_db_missing_index_details** — indexes SQL Server recommends based on query patterns.
1 / 4