Advanced Monitoring & Optimization
The Complete Monitoring Stack
Extended Events (XEvents)
The modern, lightweight replacement for SQL Trace/Profiler. Think of it as Oracle's event tracing (10046, 10053) but better.
Architecture
Key Predefined Sessions
| Session | What It Captures | Enabled By Default |
|---|---|---|
| system_health | Errors, deadlocks, memory issues | ✅ Yes |
| AlwaysOn_health | AG state changes, failover events | ✅ (if AG configured) |
Common XEvent Use Cases
| Scenario | Events to Capture |
|---|---|
| Slow queries | sql_statement_completed (filter: duration > 5s) |
| Deadlocks | xml_deadlock_report (or check system_health) |
| Login failures | error_reported (error_number = 18456) |
| Plan changes | query_post_compilation_showplan |
| Wait analysis | wait_info, wait_completed |
Extended Events replaces SQL Profiler (Profiler is deprecated). XEvents uses less CPU, supports filtering at the source (predicates), and works on Azure SQL DB, MI, and VMs. system_health session is always running — check it first for deadlocks and errors.
Oracle DBA parallel: Extended Events = Oracle's ALTER SESSION SET EVENTS + DBMS_MONITOR + AWR event tracing, but unified in one framework. The system_health session is like Oracle's incident capture — always running, catches critical events automatically.
Database Watcher (Preview)
Near real-time monitoring dashboard that doesn't require manual setup of diagnostic pipelines.
| Feature | Details |
|---|---|
| Data store | Azure Data Explorer (Kusto) |
| Query language | KQL |
| Latency | Near real-time (~seconds) |
| Supported | Azure SQL DB, MI |
| Dashboards | Pre-built Grafana dashboards |
| Key metrics | Sessions, waits, query stats, resource usage, blocking |
Database Watcher is Microsoft's newest monitoring solution — it streams data to Azure Data Explorer for KQL analysis. If the exam mentions "near real-time monitoring with KQL dashboards" → Database Watcher.
SQL Insights (Azure Monitor)
Deeper telemetry than basic Azure Monitor metrics. Uses a monitoring VM to collect DMV data.
Resource Governor (VM & MI)
Isolates workloads by limiting CPU, memory, and I/O per workload group.
Architecture components:
- Resource Pool — defines CPU/memory/I/O limits
- Workload Group — maps to a resource pool with additional settings (request limits)
- Classifier Function — routes incoming sessions to workload groups based on rules
Resource Governor is available on SQL Server on VM and MI (not Azure SQL DB). If the exam says "prevent reporting queries from consuming all CPU" → Resource Governor. For Azure SQL DB, use read replicas (BC tier) or Named Replicas (Hyperscale) to isolate workloads instead.
Oracle DBA parallel: Resource Governor = Oracle Resource Manager. Resource Pools = Consumer Groups. Classifier Function = Resource Manager plan rules. Same concept: prevent one workload from starving another.
Intelligent Query Processing (IQP)
A family of automatic query optimization features. No code changes needed — just upgrade compatibility level.
Key exam-tested IQP features:
| Feature | Problem It Solves | Compat Level |
|---|---|---|
| Memory Grant Feedback | Queries request too much/too little memory (spills to disk) | 150 |
| Adaptive Joins | Optimizer picks wrong join type at compile time | 150 |
| Batch Mode on Rowstore | Analytics queries slow on OLTP tables | 150 |
| Parameter Sensitive Plan | Parameter sniffing → bad plan for some values | 160 |
| DOP Feedback | Over/under-parallelism wastes resources | 160 |
IQP requires the right compatibility level (150 = SQL 2019, 160 = SQL 2022). If the exam asks "how to enable adaptive joins" → set compatibility level to 150+. No query changes needed — it's automatic.
Statistics Maintenance
Statistics tell the query optimizer about data distribution. Stale statistics → bad plans → poor performance.
| Setting | Behavior | Best For |
|---|---|---|
| AUTO_UPDATE_STATISTICS | Updates when ~20% of rows change | Most workloads (default: ON) |
| AUTO_UPDATE_STATISTICS_ASYNC | Updates in background (doesn't block queries) | OLTP (avoids query compilation delays) |
| Manual UPDATE STATISTICS | On-demand, full scan or sample % | After bulk loads, partition switches |
After large data loads: always run UPDATE STATISTICS manually. Auto-update won't trigger until the 20% threshold. For partitioned tables, update stats on specific partitions with UPDATE STATISTICS ... WITH RESAMPLE ON PARTITIONS(N).
Execution Plans — What to Look For
| Red Flag in Plan | What It Means | Fix |
|---|---|---|
| Table/Index Scan (on large table) | Missing index or unSARGable WHERE | Create covering index |
| Key Lookup (high cost) | Index doesn't include needed columns | Add INCLUDE columns |
| Sort (with warning ⚠️) | Memory spill to tempdb | Add index matching ORDER BY, or increase memory |
| Parallelism (CXPACKET waits) | Skewed distribution across threads | Update stats, limit MAXDOP |
| Thick arrows (many rows) | Cardinality estimate wrong | Update stats, rewrite query |
| Nested Loops (on large sets) | Should be Hash Match | Update stats, hint if needed |
Oracle DBA parallel: Execution plans in SQL Server are like Oracle explain plans. Key difference: SQL Server shows actual rows vs estimated in actual plans (like Oracle's V$SQL_PLAN_STATISTICS_ALL). Look for big gaps between estimated and actual — that means stale stats.