AZ-204 · Azure Developer Associate

Azure Application Insights & App Performance Monitoring

A complete one-note covering APM fundamentals, instrumentation strategies, metrics, availability testing, and Application Map — with analogies, definitions, and real-world use cases.

APM · Telemetry · Distributed Tracing · OpenTelemetry · Availability Tests · Application Map · KQL
🗺️
Mental Map — Where does this fit?
Big Picture Analogy
Azure Monitor is the hospital. Application Insights is the specialist doctor assigned to your apps — wiring up heart-rate monitors (metrics), running blood tests (logs), watching vital signs in real time (Live Metrics), and alerting nurses when something goes wrong (Smart Detection). Without this doctor, you only know the patient is sick after they collapse.
[Diagram: Application Insights in the Azure Monitor ecosystem. Azure Monitor is the umbrella service containing Log Analytics (KQL queries), Application Insights (APM, this module), Metrics Explorer (time-series data), and Alerts & Actions (notifications).]
🔬
1 · What is Application Insights?
Core Definition
Application Insights is an extension of Azure Monitor that provides Application Performance Monitoring (APM). It collects metrics, telemetry, and trace logs from your live applications and visualises them so you can proactively detect issues before users notice, and reactively diagnose root causes when incidents happen.
APM
Application Performance Monitoring — the discipline of tracking an app's health, speed, and error rates continuously across its full lifecycle (dev → test → production).
Telemetry
Automated data collected from a running system. Includes request counts, response times, exceptions, page views, dependency calls, and custom events written by developers.
Trace Logs
Diagnostic log messages emitted at runtime (e.g., log.Information("Order placed")). App Insights correlates these with the specific HTTP request that produced them.
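A minimal sketch of how trace logs can be tied to the request that produced them, using only the Python standard library. The names operation_id, OperationIdFilter, ListHandler, and handle_request are hypothetical; this illustrates the correlation idea, not the App Insights SDK's own mechanism.

```python
import contextvars
import logging
import uuid

# Hypothetical context variable holding the ID of the request being handled.
operation_id = contextvars.ContextVar("operation_id", default="-")


class OperationIdFilter(logging.Filter):
    """Copy the current operation ID onto every log record."""

    def filter(self, record: logging.LogRecord) -> bool:
        record.operation_id = operation_id.get()
        return True  # never drop the record, only enrich it


class ListHandler(logging.Handler):
    """Collect formatted records in memory so the result can be inspected."""

    def __init__(self) -> None:
        super().__init__()
        self.lines: list[str] = []

    def emit(self, record: logging.LogRecord) -> None:
        self.lines.append(self.format(record))


def handle_request(logger: logging.Logger, order: str) -> str:
    """Simulate one HTTP request: assign an ID, then log under it."""
    op_id = uuid.uuid4().hex
    token = operation_id.set(op_id)
    try:
        logger.info("Order placed: %s", order)
        logger.info("Payment authorised: %s", order)
    finally:
        operation_id.reset(token)  # restore the previous context
    return op_id


handler = ListHandler()
handler.setFormatter(logging.Formatter("%(operation_id)s %(message)s"))
handler.addFilter(OperationIdFilter())

logger = logging.getLogger("orders")
logger.setLevel(logging.INFO)
logger.propagate = False
logger.addHandler(handler)

first = handle_request(logger, "A-100")
second = handle_request(logger, "B-200")
```

Every line in handler.lines now starts with the ID of the request that emitted it, which is what makes "show me all logs for this one request" possible later.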
Analogy — The Flight Black Box
Your app is an aeroplane. Without Application Insights it has no flight recorder. If it crashes you have no idea why. App Insights is the Flight Data Recorder + Cockpit Voice Recorder — continuously capturing altitude (response time), engine metrics (CPU/memory), co-pilot chatter (trace logs), and GPS track (distributed traces).
Feature | What it does | Analogy
Live Metrics | Real-time stream of requests, failures, and server health | Ambulance dashboard while driving
Availability Tests | Probes your URL from global Azure datacentres on a schedule | Security guard doing hourly rounds
Smart Detection | ML-based anomaly detection; alerts on sudden failure spikes | Smoke detector in the kitchen
Application Map | Visual topology of microservices + health overlay | Circuit board with red LEDs on failures
Distributed Tracing | End-to-end journey of a single request through N services | Parcel tracking from warehouse to door
Usage Analytics | Funnels, retention, user flows | CCTV heat-map in a retail store
GitHub / Azure DevOps (ADO) | Create work items directly from an exception | Auto-filing bug report to Jira
What App Insights monitors — at a glance
Request rates & response times · Failure rates · Dependency calls (SQL, HTTP) · Exceptions & stack traces · Page views & AJAX · User & session counts · Performance counters (CPU/RAM) · Docker / Azure host diagnostics · Custom events & metrics · Distributed traces
🌍
Real-World Use Cases
E-Commerce — Flipkart / Amazon scenario
Detecting checkout failures during a flash sale
During a sale, thousands of concurrent users hit checkout. Smart Detection fires an alert within 2 minutes when the payment-service failure rate jumps from 0.5% → 12%. The on-call engineer opens Application Map, sees the payment-service node glowing red, clicks through to Distributed Traces, and finds a SQL connection-pool timeout. Fix deployed before most users notice.
SaaS Platform — Salesforce / Dynamics scenario
Identifying a slow dependency degrading UX globally
Average response time creeps from 200ms → 1.4s over three weeks. Log-based metrics show the "ReportGenerator" external API is 5× slower. Usage analytics reveal 22% of users abandoned the report page. PM and dev team correlate performance data with user drop-off — making a business case to replace the third-party API.
Banking / FinTech — HDFC / Revolut scenario
Availability SLA monitoring across geographies
Availability Tests probe the login endpoint every 5 minutes from 5 Azure regions. One morning, the probe closest to Indian users fails while the US and Europe probes pass. An alert fires, and the team traces the failure to a geo-routing issue in the Azure Front Door configuration. Without global probes, the team would have relied on customer complaints, which damages trust in a regulated industry.
📊
2 · Log-Based vs Standard Metrics
Analogy — CCTV vs Sensor Dashboard
Log-based metrics = full CCTV footage. Every event is stored; you can rewind and inspect any moment in detail. But replaying hours of footage at query time is slow. Standard (pre-aggregated) metrics = the live sensor dashboard on the security desk — it shows "42 people entered in the last minute" instantly because it was computed as it happened. You can't drill into individuals, but it's blazing fast for alerts and dashboards.
Attribute | 📼 Log-Based Metrics | ⚡ Standard (Pre-aggregated) Metrics
Storage | Individual events in Log Analytics | Time-series aggregates
Query time | Kusto query at read time (slower) | Pre-computed; very fast
Detail | All event properties available | Only key dimensions kept
Best for | Ad-hoc diagnostics, root-cause analysis | Dashboards, real-time alerts
Sampling | Accuracy drops if sampling is on | Unaffected; aggregated before sampling is applied
SDK | Any version | SDK 2.7 or later for client-side preaggregation
💡 Key insight: Both metric types coexist in App Insights. In Metrics Explorer, use the namespace selector to switch between "Log-based metrics" and "Standard metrics (preview)". Use Standard for operations dashboards; switch to Log-based when investigating a specific incident.
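A small simulation of why sampling affects the two metric types differently. This is a sketch under stated assumptions: the 10% rate, the 5% failure rate, and the in-memory lists are invented for illustration. item_count mirrors the idea of a stored event recording how many original events it represents.

```python
import random
from collections import Counter

random.seed(7)  # fixed seed so the sketch is repeatable

SAMPLING_RATE = 0.10  # assumed: keep roughly 10% of events

# 1,000 simulated requests, about 5% of them failing.
requests = [{"success": random.random() > 0.05} for _ in range(1000)]

# Standard metrics: aggregate BEFORE sampling, so every request is counted.
pre_aggregated = Counter()
for r in requests:
    pre_aggregated["requests"] += 1
    if not r["success"]:
        pre_aggregated["failed"] += 1

# Log-based metrics: only sampled events are stored. Each stored event
# carries an item_count saying how many originals it stands for.
stored = [
    {**r, "item_count": round(1 / SAMPLING_RATE)}
    for r in requests
    if random.random() < SAMPLING_RATE
]

# Query time: the log-based estimate sums item_count over stored events.
log_based_requests = sum(e["item_count"] for e in stored)
log_based_failed = sum(e["item_count"] for e in stored if not e["success"])

print("pre-aggregated:", pre_aggregated["requests"], pre_aggregated["failed"])
print("log-based estimate:", log_based_requests, log_based_failed)
```

The pre-aggregated counts are exact by construction, while the log-based numbers are estimates that depend on which events happened to be kept.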
🔧
3 · Instrumenting Your App
Instrumentation
Enabling an application to capture and emit telemetry data. Think of it as "wiring up sensors inside the app." More sensors = more visibility, but also more effort to configure.
Zero code changes
Auto-Instrumentation
Enable via Azure Portal on App Service, AKS, VMs — no code changes needed.

Best for: Already-deployed apps, migrations, teams without SDK expertise.

Limitation: Less configurable; not available for all languages. Check docs before assuming coverage.
SDK in code
Manual Instrumentation
Install App Insights SDK or Azure Monitor OpenTelemetry Distro in your project.

Best for: Custom events, business metrics, full telemetry pipeline control.

Requirement: You manage SDK version updates yourself.
Use the SDK only when…
① Custom events / metrics — e.g., "items sold", "loan approved", "game won"
② Control telemetry flow — filter, enrich, or sample before sending to Azure
③ Autoinstrumentation unavailable — language or platform not supported
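Point ② can be pictured as a chain of small processors that each item passes through before it is sent. The sketch below is stdlib-only and every name in it (drop_health_checks, add_environment, sample_by_operation, run_pipeline, the /health path) is hypothetical; real SDKs expose their own processor and sampler hooks. Hashing the operation ID is one assumed way to keep or drop all items of a request together.

```python
import hashlib
from typing import Callable, Optional

Telemetry = dict
Processor = Callable[[Telemetry], Optional[Telemetry]]


def drop_health_checks(item: Telemetry) -> Optional[Telemetry]:
    """Filter: discard noisy probe traffic (the /health path is an assumption)."""
    return None if item.get("url", "").endswith("/health") else item


def add_environment(item: Telemetry) -> Optional[Telemetry]:
    """Enrich: stamp every item with a custom property."""
    return {**item, "environment": "production"}


def sample_by_operation(percent: int) -> Processor:
    """Sample: keep or drop whole operations so related items stay together."""

    def process(item: Telemetry) -> Optional[Telemetry]:
        digest = hashlib.sha256(item["operation_id"].encode()).digest()
        bucket = int.from_bytes(digest[:4], "big") % 100
        return item if bucket < percent else None

    return process


def run_pipeline(items, processors):
    """Pass each item through every processor; None means 'dropped'."""
    sent = []
    for item in items:
        for proc in processors:
            item = proc(item)
            if item is None:
                break
        else:
            sent.append(item)  # reached only if no processor dropped the item
    return sent


items = [
    {"operation_id": "op-1", "url": "/checkout"},
    {"operation_id": "op-1", "url": "/checkout/pay"},
    {"operation_id": "op-2", "url": "/health"},
]
sent = run_pipeline(
    items, [drop_health_checks, add_environment, sample_by_operation(100)]
)
```

With sampling at 100%, only the health-check item is dropped and the two checkout items go out enriched; lowering the percentage drops whole operations rather than individual items.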
What is OpenTelemetry?
OpenTelemetry (OTel) is the open-source, vendor-neutral standard for capturing traces, metrics, and logs. It is a Cloud Native Computing Foundation (CNCF) project, and Microsoft is a Platinum Member of the CNCF. Think of it as the USB standard for observability: instead of every cloud vendor having a proprietary cable (SDK), OTel gives you one universal connector that works anywhere, whether AWS, GCP, Azure, or on-prem.
App Insights → OpenTelemetry terminology map
Old (Application Insights) | New (OpenTelemetry)
Autocollectors | Instrumentation libraries
Channel | Exporter
Codeless / Agent-based | Autoinstrumentation
Traces | Logs
Requests | Server Spans
Dependencies | Client / Internal Spans
Operation ID | Trace ID
Operation Parent ID | Span ID
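To make Trace ID and Span ID concrete, here is a sketch of the W3C Trace Context traceparent header, which carries both between services. The header layout (version, 32-hex trace ID, 16-hex span ID, 2-hex flags) follows the W3C specification; the helper names new_traceparent and child_traceparent are hypothetical, and real SDKs generate and propagate this header for you.

```python
import re
import secrets

# version "00" - trace-id (32 hex) - parent span-id (16 hex) - flags (2 hex)
TRACEPARENT = re.compile(r"^00-([0-9a-f]{32})-([0-9a-f]{16})-([0-9a-f]{2})$")


def new_traceparent() -> str:
    """Start a new trace: fresh trace ID, fresh span ID, sampled flag set."""
    trace_id = secrets.token_hex(16)  # 16 random bytes -> 32 hex characters
    span_id = secrets.token_hex(8)    # 8 random bytes -> 16 hex characters
    return f"00-{trace_id}-{span_id}-01"


def child_traceparent(incoming: str) -> str:
    """Continue a trace: keep the trace ID, mint a new span ID for this hop."""
    match = TRACEPARENT.match(incoming)
    if match is None:
        return new_traceparent()  # unparseable header: start a new trace
    trace_id, _parent_span_id, flags = match.groups()
    return f"00-{trace_id}-{secrets.token_hex(8)}-{flags}"


root = new_traceparent()
downstream = child_traceparent(root)
```

Both headers share one trace ID (the old Operation ID), while each hop gets its own span ID; the span ID received from the caller plays the role of the old Operation Parent ID.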
🌐
4 · Availability Tests
Definition
Availability Tests (aka Synthetic Transaction Monitoring) are scheduled HTTP probes fired from Azure datacentres across the world to verify your app is up, responsive, and returning correct results. No code changes required.
Analogy — Mystery Shoppers at Every Branch
Imagine your app is a bank with branches worldwide. Azure sends "mystery shoppers" (probes) every 5 minutes to every branch, checking: Is the door open? Is service quick? Is the ATM working? If a branch fails 3 checks in a row, the regional manager (you) gets an alert.
✅ Standard Test
Standard Availability Test
Single HTTP request. Can check TLS/SSL certificate validity and remaining certificate lifetime, send custom headers, and use GET, HEAD, or POST verbs. Recommended for most scenarios.
🛠 Custom
Custom TrackAvailability()
Your own code runs the test logic and calls TrackAvailability(). Use for multi-step, login-required, or non-HTTP tests.
⚠ Retiring Sep 2026
URL Ping Test (Classic)
Simple URL check via portal. Being retired Sep 30, 2026. Migrate to Standard Tests before the deadline.
⚠️ Retirement warning: URL Ping Tests retire September 30, 2026. Migrate existing tests to Standard Tests before then to continue running single-step availability tests in your App Insights resources.
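For the custom option, the shape of "your own code runs the test logic" might look like the sketch below. It is stdlib-only and stops short of sending anything: AvailabilityResult and run_availability_test are hypothetical names, the fields loosely mirror what a TrackAvailability() call reports (name, run location, success, duration, message), and the two step functions are stand-ins for real HTTP calls.

```python
import time
from dataclasses import dataclass
from typing import Callable


@dataclass
class AvailabilityResult:
    name: str
    run_location: str
    success: bool
    duration_ms: float
    message: str


def run_availability_test(
    name: str,
    run_location: str,
    steps: list[tuple[str, Callable[[], None]]],
) -> AvailabilityResult:
    """Run each step in order; the first exception fails the whole test."""
    start = time.perf_counter()
    success, message = True, "all steps passed"
    for step_name, step in steps:
        try:
            step()
        except Exception as exc:
            success, message = False, f"{step_name} failed: {exc}"
            break
    duration_ms = (time.perf_counter() - start) * 1000
    return AvailabilityResult(name, run_location, success, duration_ms, message)


def open_login_page() -> None:
    pass  # stand-in for a real HTTP GET


def submit_credentials() -> None:
    raise TimeoutError("login API did not respond")  # simulated failure


result = run_availability_test(
    "login-flow",
    "local-sketch",
    [("open login page", open_login_page),
     ("submit credentials", submit_credentials)],
)
```

In a real test the result would then be handed to TrackAvailability() on a schedule you control, for example from a timer-triggered function.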
Healthcare SaaS — Epic / MedGenix scenario
Detecting regional outages before patients can't book appointments
Availability tests ping the appointment booking endpoint from East US, West Europe, and Southeast Asia every 5 minutes. When a CDN misconfiguration affects Southeast Asia traffic, the Singapore probe fails while US and Europe pass. The infrastructure team receives an alert and rolls back the CDN config within 8 minutes — before any patient calls the helpdesk.
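Whether a single failing probe raises an alert depends on how many failed locations the alert rule requires. The sketch below illustrates that trade-off; failed_locations, should_alert, and the thresholds are assumptions for illustration, not the service's actual rule engine.

```python
def failed_locations(latest: dict[str, bool]) -> list[str]:
    """Return the locations whose most recent probe failed."""
    return sorted(loc for loc, ok in latest.items() if not ok)


def should_alert(latest: dict[str, bool], min_failed_locations: int = 2) -> bool:
    """Alert when at least min_failed_locations probes failed."""
    return len(failed_locations(latest)) >= min_failed_locations


# Most recent result per probe location in the scenario above.
latest = {"East US": True, "West Europe": True, "Southeast Asia": False}

strict = should_alert(latest, min_failed_locations=1)    # regional outage counts
tolerant = should_alert(latest, min_failed_locations=2)  # wait for a second region
```

A threshold of 1 catches regional problems like the one in this scenario at the cost of more noise from transient probe failures; a higher threshold trades sensitivity for fewer false alarms.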
🗺️
5 · Application Map
Definition
Application Map is a visual, auto-discovered topology of your distributed application. Every microservice and external dependency becomes a node. Node colour = health status. Click any node to drill into its performance, failures, and traces.
Analogy — Google Maps for Your Microservices
Google Maps shows every city (service), road (API call), and roadblock (failure). You can instantly see which junction is causing the traffic jam — rather than reading through thousands of log lines. Application Map is the Google Maps for your microservices. Red nodes = jams. Click to get the Waze-style drill-down of why it's slow.
[Diagram: Application Map example topology. Web Frontend (Blazor SPA) calls API Gateway (APIM / Ocelot), which calls Order Service (.NET API) and Payment Service (⚠ 12% failure). Dependencies shown: Azure SQL DB (internal) and Stripe API (external).]
Healthy service
Failing / hot-spot
Internal dependency (SQL, Storage)
External dependency (3rd party API)
Key concepts in Application Map
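Component: any independently deployable part of your application (for example, a microservice) that is instrumented with Application Insights, whether through the SDK or autoinstrumentation.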
Dependency — external services (SQL, Redis, Event Hub, REST API) your component calls. Observed but not instrumented by your team.
Cloud Role Name — property App Insights uses to label each node on the map. Can be overridden in code.
Progressive Discovery — the map builds itself by following HTTP dependency calls. Hit "Update map components" to refresh.
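The idea behind the map can be sketched as folding dependency-call records into edges with a failure rate. This is a stdlib-only illustration: build_map, hot_spots, the 5% threshold, and the role names are hypothetical, and the real Application Map derives its view from collected telemetry rather than an in-memory list.

```python
from collections import defaultdict


def build_map(calls):
    """Fold (caller_role, target, success) records into per-edge statistics."""
    stats = defaultdict(lambda: {"calls": 0, "failed": 0})
    for caller, target, ok in calls:
        edge = stats[(caller, target)]
        edge["calls"] += 1
        edge["failed"] += 0 if ok else 1
    return {
        edge: {**s, "failure_rate": s["failed"] / s["calls"]}
        for edge, s in stats.items()
    }


def hot_spots(app_map, threshold=0.05):
    """Targets whose incoming failure rate exceeds the (assumed) threshold."""
    return sorted(
        {target for (_, target), s in app_map.items()
         if s["failure_rate"] > threshold}
    )


# 100 calls per edge; 12 of the payment calls fail, as in the diagram above.
calls = (
    [("web-frontend", "api-gateway", True)] * 100
    + [("api-gateway", "order-service", True)] * 100
    + [("api-gateway", "payment-service", True)] * 88
    + [("api-gateway", "payment-service", False)] * 12
)

app_map = build_map(calls)
red_nodes = hot_spots(app_map)
```

Each caller name here plays the part of a Cloud Role Name: it is the label that decides which node a record is attributed to.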
Ride-sharing — Uber / OLA scenario
Pinpointing the broken link in a 15-service chain
A rider reports "app stuck on booking." With 15 microservices, reading logs would take hours. Application Map shows the Pricing Service in amber (elevated latency) while downstream Surge-Calculator is red (failures). Drill-down reveals a Redis cache eviction storm. Team fixes cache TTL config in 20 minutes — without App Map it would have been a multi-hour bridge call.
🚀
6 · Getting Started — Decision Flow
Path A · At Runtime: No code change. Enable in portal. Best for live apps.
Path B · At Dev Time: Add SDK. Custom events + full telemetry control.
Path C · Browser / SPA: JS snippet for page views, AJAX & user flows.
Path D · Availability: Add Standard Tests from Azure Portal.
📖
Glossary of Key Terms
Instrumentation Key / Connection String: Unique token linking your app to an App Insights resource
Sampling: Reduces telemetry volume by storing only a percentage of events
Kusto (KQL): Query language for log-based metrics in Log Analytics
Span / Trace ID: OTel identifiers; a trace is the full request journey, a span is one step
Preaggregation: Computing metric summaries at collection time, not query time
Smart Detection: Automated ML anomaly alerts; no manual rule setup needed
Live Metrics Stream: Near-real-time dashboard (about one-second latency); no effect on the host environment
Cloud Role Name: Property used to label nodes on Application Map
CNCF: Cloud Native Computing Foundation; hosts OTel, Kubernetes, and more
OTel Distro: Microsoft's pre-packaged OTel SDK + Azure Monitor exporter bundle
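A connection string is a semicolon-separated list of key=value pairs, one of which is the InstrumentationKey. The parser below is a minimal sketch of that layout; parse_connection_string is a hypothetical helper, and the all-zero key and example.invalid endpoint are placeholders, not real values.

```python
def parse_connection_string(value: str) -> dict[str, str]:
    """Split 'Key=Value;Key=Value' into a dict and check the key is present."""
    parts: dict[str, str] = {}
    for segment in value.split(";"):
        if not segment.strip():
            continue  # tolerate a trailing semicolon
        key, sep, val = segment.partition("=")
        if not sep:
            raise ValueError(f"segment without '=': {segment!r}")
        parts[key.strip()] = val.strip()
    if "InstrumentationKey" not in parts:
        raise ValueError("connection string has no InstrumentationKey")
    return parts


# Placeholder values only.
settings = parse_connection_string(
    "InstrumentationKey=00000000-0000-0000-0000-000000000000;"
    "IngestionEndpoint=https://example.invalid/"
)
```

In practice the string is read from configuration (for example an environment variable) and passed to the SDK as a whole; parsing it yourself is mainly useful for validation.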
🎯
AZ-204 Exam Tips
✅ Remember
App Insights is an extension of Azure Monitor — not a standalone service
Availability tests require no code changes — any HTTP/HTTPS endpoint works
URL Ping Tests retire Sep 30, 2026 — migrate to Standard Tests
Both Log-based & Standard metrics coexist — switch via namespace selector
SDK 2.7+ enables client-side preaggregation → less cost + better accuracy
Smart Detection needs no manual alert rules — it is fully automatic
App Map uses Cloud Role Name property to label each node
Live Metrics has no effect on the host environment; its data is streamed on demand rather than stored
⚠ Trick Questions
Standard metrics are NOT affected by sampling — log-based are
App Map discovers topology via HTTP dependency tracking, not network scans
Custom TrackAvailability() = you own test logic; Standard Test = Azure owns it
App Insights "Traces" = OTel "Logs" (an OTel trace is the end-to-end request journey); an easy terminology mix-up!
Autoinstrumentation may not support all languages — always check docs
Log-based metrics use Kusto queries AT READ TIME — slower than Standard
Both metric types visible in the SAME Metrics Explorer — switch via namespace
SDK not required if autoinstrumentation is available and no custom events needed
Quick Decision Guide
If you need… | Use this
Monitor deployed app with no code change | Autoinstrumentation via Azure Portal
Custom business events (orders, logins, etc.) | Manual SDK or OTel Distro
Near-real-time dashboards, low query cost | Standard (pre-aggregated) metrics
Drill into a specific user session or exception | Log-based metrics + KQL query
Check public endpoint availability globally | Standard Availability Tests
Test a multi-step login flow for availability | Custom TrackAvailability()
See which of 20 microservices is failing | Application Map: look for red/amber nodes
Trace a single slow request end-to-end | Distributed Tracing: search by Trace ID
Alerts without writing alert rules manually | Enable Smart Detection