AZ-204 · Azure Developer Associate

Azure Application Insights & App Performance Monitoring

A complete one-page note covering APM fundamentals, instrumentation strategies, metrics, availability testing, and Application Map, with analogies, definitions, and real-world use cases.

APM · Telemetry · Distributed Tracing · OpenTelemetry · Availability Tests · Application Map · KQL

🗺️

Mental Map — Where does this fit?

Big Picture Analogy

Azure Monitor is the hospital. Application Insights is the specialist doctor assigned to your apps — wiring up heart-rate monitors (metrics), running blood tests (logs), watching vital signs in real time (Live Metrics), and alerting nurses when something goes wrong (Smart Detection). Without this doctor, you only know the patient is sick after they collapse.

🔬

1 · What is Application Insights?

Core Definition

Application Insights is an extension of Azure Monitor that provides Application Performance Monitoring (APM). It collects metrics, telemetry, and trace logs from your live applications and visualises them so you can proactively detect issues before users notice, and reactively diagnose root causes when incidents happen.

APM

Application Performance Monitoring — the discipline of tracking an app's health, speed, and error rates continuously across its full lifecycle (dev → test → production).

Telemetry

Automated data collected from a running system. Includes request counts, response times, exceptions, page views, dependency calls, and custom events written by developers.

Trace Logs

Diagnostic log messages emitted at runtime (e.g., log.Information("Order placed")). App Insights correlates these with the specific HTTP request that produced them.
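To make the correlation idea concrete, here is a minimal sketch using only Python's standard `logging` module. It is not the App Insights SDK: the `current_operation_id` context variable, the `OperationIdFilter` class, and the `handle_request` function are hypothetical names chosen for illustration. The sketch only shows the principle of stamping each log line with the ID of the request that produced it.

```python
import contextvars
import logging
import uuid

# Hypothetical per-request context: the operation ID of the request in flight.
current_operation_id = contextvars.ContextVar("current_operation_id", default="none")


class OperationIdFilter(logging.Filter):
    """Annotates every log record with the current request's operation ID."""

    def filter(self, record: logging.LogRecord) -> bool:
        record.operation_id = current_operation_id.get()
        return True  # never drop records, only annotate them


class ListHandler(logging.Handler):
    """Keeps records in memory so the correlation can be inspected."""

    def __init__(self) -> None:
        super().__init__()
        self.records = []

    def emit(self, record: logging.LogRecord) -> None:
        self.records.append(record)


def handle_request(logger: logging.Logger, order_id: int) -> str:
    """Simulates one HTTP request and returns the operation ID it ran under."""
    operation_id = uuid.uuid4().hex
    token = current_operation_id.set(operation_id)
    try:
        logger.info("Order placed: %s", order_id)
    finally:
        current_operation_id.reset(token)  # restore the previous context
    return operation_id


logger = logging.getLogger("orders_demo")
logger.setLevel(logging.INFO)
logger.propagate = False
handler = ListHandler()
handler.addFilter(OperationIdFilter())
logger.addHandler(handler)

first = handle_request(logger, 1)
second = handle_request(logger, 2)
for record in handler.records:
    print(record.operation_id, record.getMessage())
```

Each printed line carries the ID of the simulated request that emitted it, which is the property that lets a monitoring backend group trace logs under a single request.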

Analogy — The Flight Black Box

Your app is an aeroplane. Without Application Insights it has no flight recorder. If it crashes you have no idea why. App Insights is the Flight Data Recorder + Cockpit Voice Recorder — continuously capturing altitude (response time), engine metrics (CPU/memory), co-pilot chatter (trace logs), and GPS track (distributed traces).

Feature	What it does	Analogy
Live Metrics	Real-time stream of requests, failures, and server health	Ambulance dashboard while driving
Availability Tests	Probes your URL from global Azure datacentres on a schedule	Security guard doing hourly rounds
Smart Detection	ML-based anomaly detection — alerts on sudden failure spikes	Smoke detector in the kitchen
Application Map	Visual topology of microservices + health overlay	Circuit board with red LEDs on failures
Distributed Tracing	End-to-end journey of a single request through N services	Parcel tracking from warehouse to door
Usage Analytics	Funnels, retention, user flows	CCTV heat-map in a retail store
GitHub / Azure DevOps	Create work items directly from an exception	Auto-filing bug report to Jira

What App Insights monitors — at a glance

Request rates & response times · Failure rates · Dependency calls (SQL, HTTP) · Exceptions & stack traces · Page views & AJAX · User & session counts · Performance counters (CPU/RAM) · Docker / Azure host diagnostics · Custom events & metrics · Distributed traces

🌍

Real-World Use Cases

E-Commerce — Flipkart / Amazon scenario

Detecting checkout failures during a flash sale

During a sale, thousands of concurrent users hit checkout. Smart Detection fires an alert within 2 minutes when the payment-service failure rate jumps from 0.5% → 12%. The on-call engineer opens Application Map, sees the payment-service node glowing red, clicks through to Distributed Traces, and finds a SQL connection-pool timeout. Fix deployed before most users notice.

SaaS Platform — Salesforce / Dynamics scenario

Identifying a slow dependency degrading UX globally

Average response time creeps from 200ms → 1.4s over three weeks. Log-based metrics show the "ReportGenerator" external API is 5× slower. Usage analytics reveal 22% of users abandoned the report page. PM and dev team correlate performance data with user drop-off — making a business case to replace the third-party API.

Banking / FinTech — HDFC / Revolut scenario

Availability SLA monitoring across geographies

Availability Tests ping the login endpoint every 5 minutes from 5 Azure regions. One morning, the India-East probe fails while US and Europe pass. An alert fires, and the team traces it to a geo-routing issue in the Azure Front Door configuration. Without global probes, the team would have relied on customer complaints, damaging trust in a regulated industry.

📊

2 · Log-Based vs Standard Metrics

Analogy — CCTV vs Sensor Dashboard

Log-based metrics = full CCTV footage. Every event is stored; you can rewind and inspect any moment in detail. But replaying hours of footage at query time is slow. Standard (pre-aggregated) metrics = the live sensor dashboard on the security desk — it shows "42 people entered in the last minute" instantly because it was computed as it happened. You can't drill into individuals, but it's blazing fast for alerts and dashboards.

Aspect	📼 Log-Based Metrics	⚡ Standard (Pre-aggregated)
Storage	Individual events in Log Analytics	Time-series aggregates
Query time	Kusto query at read time (slower)	Pre-computed, very fast
Detail	All event properties available	Only key dimensions kept
Best for	Ad-hoc diagnostics, root-cause	Dashboards, real-time alerts
Sampling	Accuracy drops if sampling on	Unaffected (aggregated before sampling)
SDK	Any version	SDK 2.7+ for preaggregation

💡 Key insight: Both metric types coexist in App Insights. In Metrics Explorer, use the namespace selector to switch between "Log-based metrics" and "Standard metrics (preview)". Use Standard for operations dashboards; switch to Log-based when investigating a specific incident.
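The sampling difference between the two metric types can be illustrated with a small simulation. This is a sketch under stated assumptions, not how the service is implemented: the `simulate` function and its `keep_one_in` parameter are hypothetical, and the `itemCount` key is modelled on the field App Insights uses to record how many original events a retained event represents.

```python
import random


def simulate(total_requests: int, keep_one_in: int, seed: int = 0):
    """Counts requests two ways: pre-aggregated and from sampled events."""
    rng = random.Random(seed)
    preaggregated_count = 0  # standard metric: updated before sampling
    stored_events = []       # log-based: only sampled events are stored

    for request_id in range(total_requests):
        preaggregated_count += 1
        if rng.randrange(keep_one_in) == 0:
            # A retained event notes how many originals it stands for.
            stored_events.append({"id": request_id, "itemCount": keep_one_in})

    # A log-based count must be rebuilt from the sample at query time.
    log_based_estimate = sum(event["itemCount"] for event in stored_events)
    return preaggregated_count, len(stored_events), log_based_estimate


exact, stored, estimate = simulate(total_requests=10_000, keep_one_in=10)
print(f"pre-aggregated: {exact}, events stored: {stored}, log-based estimate: {estimate}")
```

The pre-aggregated counter is exact because it is updated before any event is dropped, while the log-based figure is an estimate scaled up from the retained sample.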

🔧

3 · Instrumenting Your App

Instrumentation

Enabling an application to capture and emit telemetry data. Think of it as "wiring up sensors inside the app." More sensors = more visibility, but also more effort to configure.

Zero code changes

Auto-Instrumentation

Enable via Azure Portal on App Service, AKS, VMs — no code changes needed.

Best for: Already-deployed apps, migrations, teams without SDK expertise.

Limitation: Less configurable; not available for all languages. Check docs before assuming coverage.

SDK in code

Manual Instrumentation

Install App Insights SDK or Azure Monitor OpenTelemetry Distro in your project.

Best for: Custom events, business metrics, full telemetry pipeline control.

Requirement: You manage SDK version updates yourself.

Use the SDK only when…

① Custom events / metrics — e.g., "items sold", "loan approved", "game won"
② Control telemetry flow — filter, enrich, or sample before sending to Azure
③ Autoinstrumentation unavailable — language or platform not supported
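Points ① and ② above can be sketched as a tiny processing pipeline in plain Python. This is an illustration only, not the App Insights or OpenTelemetry API: `drop_health_checks`, `add_environment`, `make_sampler`, and `run_pipeline` are hypothetical names, and the telemetry items are ordinary dictionaries.

```python
import hashlib
from typing import Callable, List, Optional

Telemetry = dict
Processor = Callable[[Telemetry], Optional[Telemetry]]


def drop_health_checks(item: Telemetry) -> Optional[Telemetry]:
    """Filter: discard noisy health-probe requests."""
    return None if item.get("name") == "GET /health" else item


def add_environment(item: Telemetry) -> Optional[Telemetry]:
    """Enrich: attach a custom property to every item."""
    properties = {**item.get("properties", {}), "environment": "production"}
    return {**item, "properties": properties}


def make_sampler(percentage: float) -> Processor:
    """Sample: keep roughly `percentage` percent of operations.

    Hashing the operation ID keeps or drops all items of one operation together.
    """
    def sample(item: Telemetry) -> Optional[Telemetry]:
        digest = hashlib.sha256(item["operation_id"].encode()).digest()
        bucket = int.from_bytes(digest[:4], "big") / 2**32 * 100  # 0 <= bucket < 100
        return item if bucket < percentage else None
    return sample


def run_pipeline(item: Telemetry, processors: List[Processor]) -> Optional[Telemetry]:
    """Runs processors in order; a None result means the item is not sent."""
    for processor in processors:
        item = processor(item)
        if item is None:
            return None
    return item


pipeline = [drop_health_checks, add_environment, make_sampler(100)]
custom_event = {"name": "items sold", "operation_id": "op-1", "properties": {"count": 3}}
print(run_pipeline(custom_event, pipeline))
print(run_pipeline({"name": "GET /health", "operation_id": "op-2"}, pipeline))
```

The design point is ordering: filtering first avoids spending work on items that will be dropped, and sampling last means enrichment is applied consistently to everything that survives.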

What is OpenTelemetry?

OpenTelemetry (OTel) is the open-source, vendor-neutral standard for capturing traces, metrics, and logs, governed by the CNCF, of which Microsoft is a Platinum Member. Think of it as the USB standard for observability: instead of every cloud vendor having a proprietary cable (SDK), OTel gives you one universal connector that works anywhere, whether AWS, GCP, Azure, or on-prem.

App Insights → OpenTelemetry terminology map

Old (Application Insights)	New (OpenTelemetry)
Autocollectors	Instrumentation libraries
Channel	Exporter
Codeless / Agent-based	Autoinstrumentation
Traces	Logs
Requests	Server Spans
Dependencies	Client / Internal Spans
Operation ID	Trace ID
Operation Parent ID	Span ID
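The last two rows of the map are easier to remember once you see how the IDs travel between services. The sketch below builds and parses a W3C Trace Context `traceparent` header (`version-traceid-parentid-flags`) with the standard library only. The `SpanContext` type and the helper functions are hypothetical names for illustration, not part of any SDK.

```python
import re
import secrets
from typing import NamedTuple, Optional


class SpanContext(NamedTuple):
    trace_id: str  # shared by the whole request journey (old "Operation ID")
    span_id: str   # one step; a callee sees it as its parent (old "Operation Parent ID")
    sampled: bool


_TRACEPARENT = re.compile(r"^00-([0-9a-f]{32})-([0-9a-f]{16})-([0-9a-f]{2})$")


def new_root() -> SpanContext:
    """Starts a new trace: 16-byte trace ID, 8-byte span ID."""
    return SpanContext(secrets.token_hex(16), secrets.token_hex(8), True)


def child_of(parent: SpanContext) -> SpanContext:
    """A child span keeps the trace ID and gets a fresh span ID."""
    return SpanContext(parent.trace_id, secrets.token_hex(8), parent.sampled)


def to_header(ctx: SpanContext) -> str:
    """Serialises the context as a version-00 traceparent value."""
    flags = "01" if ctx.sampled else "00"
    return f"00-{ctx.trace_id}-{ctx.span_id}-{flags}"


def from_header(value: str) -> Optional[SpanContext]:
    """Parses a traceparent value; returns None if it is malformed."""
    match = _TRACEPARENT.match(value)
    if match is None:
        return None
    trace_id, span_id, flags = match.groups()
    if set(trace_id) == {"0"} or set(span_id) == {"0"}:
        return None  # all-zero IDs are invalid
    return SpanContext(trace_id, span_id, bool(int(flags, 16) & 1))


root = new_root()
outgoing = to_header(child_of(root))
print(outgoing)
print(from_header(outgoing).trace_id == root.trace_id)
```

Because every hop reuses the same trace ID, searching by that one value retrieves the whole request journey across services.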

🌐

4 · Availability Tests

Definition

Availability Tests (aka Synthetic Transaction Monitoring) are scheduled HTTP probes fired from Azure datacentres across the world to verify your app is up, responsive, and returning correct results. No code changes required.

Analogy — Mystery Shoppers at Every Branch

Imagine your app is a bank with branches worldwide. Azure sends "mystery shoppers" (probes) every 5 minutes to every branch, checking: Is the door open? Is service quick? Is the ATM working? If a branch fails 3 checks in a row, the regional manager (you) gets an alert.

✅ Standard Test

Standard Availability Test

Single HTTP request. Checks TLS/SSL certificate validity and upcoming expiry, and supports custom headers and the GET, POST, and HEAD verbs. Recommended for most scenarios.

🛠 Custom

Custom TrackAvailability()

Your own code runs the test logic and calls TrackAvailability(). Use for multi-step, login-required, or non-HTTP tests.

⚠ Retiring Sep 2026

URL Ping Test (Classic)

Simple URL check via portal. Being retired Sep 30, 2026. Migrate to Standard Tests before the deadline.

⚠️ Retirement warning: URL Ping Tests retire September 30, 2026. Migrate existing tests to Standard Tests before then to continue running single-step availability tests in your App Insights resources.
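For the Custom TrackAvailability() option, the shape of "your own code runs the test logic" can be sketched as follows. This is a standard-library illustration, not the SDK call itself: `AvailabilityResult` and `run_availability_test` are hypothetical names, and the fields loosely mirror what an availability telemetry item carries (name, run location, success, duration, message). Sending the result to App Insights is deliberately left out.

```python
import time
from dataclasses import dataclass
from typing import Callable


@dataclass
class AvailabilityResult:
    name: str
    run_location: str
    success: bool
    duration_ms: float
    message: str


def run_availability_test(
    name: str, run_location: str, check: Callable[[], None]
) -> AvailabilityResult:
    """Runs `check`, times it, and records pass/fail instead of raising."""
    start = time.perf_counter()
    try:
        check()  # your multi-step, login-required, or non-HTTP logic
        success, message = True, "passed"
    except Exception as exc:  # any failure marks the test as failed
        success, message = False, f"{type(exc).__name__}: {exc}"
    duration_ms = (time.perf_counter() - start) * 1000
    return AvailabilityResult(name, run_location, success, duration_ms, message)


def login_flow() -> None:
    """Placeholder for a real multi-step check (hypothetical)."""
    raise TimeoutError("login step timed out")


result = run_availability_test("login-flow", "hypothetical-region", login_flow)
print(result)
# In a real test, this result would be handed to the SDK's TrackAvailability call.
```

Catching the exception inside the runner matters: a failed check should still produce a result with `success=False`, otherwise outages would show up as missing data rather than failures.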

Healthcare SaaS — Epic / MedGenix scenario

Detecting regional outages before patients can't book appointments

Availability tests ping the appointment booking endpoint from East US, West Europe, and Southeast Asia every 5 minutes. When a CDN misconfiguration affects Southeast Asia traffic, the Singapore probe fails while US and Europe pass. The infrastructure team receives an alert and rolls back the CDN config within 8 minutes — before any patient calls the helpdesk.

🗺️

5 · Application Map

Definition

Application Map is a visual, auto-discovered topology of your distributed application. Every microservice and external dependency becomes a node. Node colour = health status. Click any node to drill into its performance, failures, and traces.

Analogy — Google Maps for Your Microservices

Google Maps shows every city (service), road (API call), and roadblock (failure). You can instantly see which junction is causing the traffic jam — rather than reading through thousands of log lines. Application Map is the Google Maps for your microservices. Red nodes = jams. Click to get the Waze-style drill-down of why it's slow.

Map legend: Healthy service · Failing / hot-spot · Internal dependency (SQL, Storage) · External dependency (3rd party API)

Key concepts in Application Map

Component — any independently deployable microservice with App Insights SDK installed.
Dependency — external services (SQL, Redis, Event Hub, REST API) your component calls. Observed but not instrumented by your team.
Cloud Role Name — property App Insights uses to label each node on the map. Can be overridden in code.
Progressive Discovery — the map builds itself by following HTTP dependency calls. Hit "Update map components" to refresh.
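The concepts above can be sketched as a small graph-building routine: nodes come from Cloud Role Names and dependency targets, and edges come from observed dependency calls. This is a simplified illustration, not the actual discovery algorithm. The record keys (`cloud_role_name`, `target`, `success`) and the functions `build_map` and `failure_rate` are hypothetical.

```python
from collections import defaultdict


def build_map(dependency_calls):
    """Builds nodes and edges from observed dependency calls."""
    nodes = set()
    edges = defaultdict(lambda: {"calls": 0, "failures": 0})
    for call in dependency_calls:
        source, target = call["cloud_role_name"], call["target"]
        nodes.update((source, target))  # both ends become map nodes
        edge = edges[(source, target)]
        edge["calls"] += 1
        if not call["success"]:
            edge["failures"] += 1
    return nodes, dict(edges)


def failure_rate(edge) -> float:
    """Share of failed calls on one edge, used to colour the map."""
    return edge["failures"] / edge["calls"]


observed = [
    {"cloud_role_name": "booking-api", "target": "pricing-service", "success": True},
    {"cloud_role_name": "pricing-service", "target": "surge-calculator", "success": False},
    {"cloud_role_name": "pricing-service", "target": "surge-calculator", "success": True},
    {"cloud_role_name": "surge-calculator", "target": "redis-cache", "success": False},
]

nodes, edges = build_map(observed)
for (source, target), stats in sorted(edges.items()):
    print(f"{source} -> {target}: {failure_rate(stats):.0%} failing")
```

Note that `redis-cache` appears as a node even though it emits no telemetry of its own: it is discovered purely because an instrumented component called it, which matches the Dependency definition above.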

Ride-sharing — Uber / OLA scenario

Pinpointing the broken link in a 15-service chain

A rider reports "app stuck on booking." With 15 microservices, reading logs would take hours. Application Map shows the Pricing Service in amber (elevated latency) while downstream Surge-Calculator is red (failures). Drill-down reveals a Redis cache eviction storm. Team fixes cache TTL config in 20 minutes — without App Map it would have been a multi-hour bridge call.

🚀

6 · Getting Started — Decision Flow

Path	When / Where	What to do
Path A	At Runtime	No code change. Enable in portal. Best for live apps.
Path B	At Dev Time	Add SDK. Custom events + full telemetry control.
Path C	Browser / SPA	JS snippet for page views, AJAX & user flows.
Path D	Availability	Add Standard Tests from Azure Portal.

📖

Glossary of Key Terms

Term	Definition
Instrumentation Key / Connection String	Unique token linking your app to an App Insights resource
Sampling	Reduces telemetry volume; stores only a % of events
Kusto (KQL)	Query language for log-based metrics in Log Analytics
Span / Trace ID	OTel IDs: trace = full request journey, span = one step
Preaggregation	Computing metric summaries at collection time, not query time
Smart Detection	Automated ML anomaly alerts; no manual rule setup needed
Live Metrics Stream	Sub-second real-time dashboard; zero storage impact on host
Cloud Role Name	Property used to label nodes on Application Map
CNCF	Cloud Native Computing Foundation; governs OTel, Kubernetes & more
OTel Distro	Microsoft's pre-packaged OTel SDK + Azure Monitor exporter bundle

🎯

AZ-204 Exam Tips

✅ Remember

App Insights is an extension of Azure Monitor — not a standalone service

Availability tests require no code changes — any HTTP/HTTPS endpoint works

URL Ping Tests retire Sep 30, 2026 — migrate to Standard Tests

Both Log-based & Standard metrics coexist — switch via namespace selector

SDK 2.7+ enables client-side preaggregation → less cost + better accuracy

Smart Detection needs no manual alert rules — it is fully automatic

App Map uses Cloud Role Name property to label each node

Live Metrics has zero storage impact on the host environment

⚠ Trick Questions

Standard metrics are NOT affected by sampling — log-based are

App Map discovers topology via HTTP dependency tracking, not network scans

Custom TrackAvailability() = you own test logic; Standard Test = Azure owns it

App Insights "Traces" = OTel "Logs" (confusing terminology: an OTel "trace" is the end-to-end request journey, not a log line)

Autoinstrumentation may not support all languages — always check docs

Log-based metrics use Kusto queries AT READ TIME — slower than Standard

Both metric types visible in the SAME Metrics Explorer — switch via namespace

SDK not required if autoinstrumentation is available and no custom events needed

⚡

Quick Decision Guide

If you need…	Use this
Monitor deployed app with no code change	Autoinstrumentation via Azure Portal
Custom business events (orders, logins, etc.)	Manual SDK or OTel Distro
Near-real-time dashboards, low query cost	Standard (pre-aggregated) metrics
Drill into a specific user session or exception	Log-based metrics + KQL query
Check public endpoint availability globally	Standard Availability Tests
Test a multi-step login flow for availability	Custom TrackAvailability()
See which of 20 microservices is failing	Application Map — look for red/amber nodes
Trace a single slow request end-to-end	Distributed Tracing — search by Trace ID
Alerts without writing alert rules manually	Enable Smart Detection