Azure Cost Management — Find and Kill Your Wasted Spend
Your Azure bill is climbing, and nobody can explain why. Every team points fingers at another team's subscription. Sound familiar? Azure Cost Management gives you the visibility to trace every dollar back to its source and the tools to stop the bleeding before your finance team loses their mind.
Cost Management + Billing Overview
Azure Cost Management + Billing is built into the Azure portal at no extra cost for your Azure spend. It pulls data from all your subscriptions and management groups into a single pane. (The connector that used to import AWS costs was retired in March 2025, so AWS spend now needs a separate tool.) The data refreshes multiple times per day, and Cost Analysis covers up to 13 months of history.
# Check your current month's cost for a subscription
az consumption usage list \
--subscription "Production" \
--start-date 2025-08-01 \
--end-date 2025-08-09 \
--query "[].{Service:consumedService, Cost:pretaxCost, Currency:currency}" \
--output table
The three pillars are Cost Analysis (understand what you spent), Budgets (set guardrails), and Advisor (get actionable recommendations). Most teams skip straight to Advisor. Do not do that. You need to understand your cost shape first.
Cost Analysis Views
Cost Analysis is where you interrogate your spend. The built-in views cover most questions:
# Get a month-to-date cost breakdown by resource group
# (needs the costmanagement extension: az extension add --name costmanagement)
az costmanagement query \
--type ActualCost \
--scope "subscriptions/<subscription-id>" \
--timeframe MonthToDate \
--dataset-grouping name="ResourceGroup" type="Dimension" \
--output table
The portal provides five default views you should use daily:
| View | What It Shows | Best For |
|---|---|---|
| Accumulated costs | Running total over time | Spotting cost acceleration |
| Daily costs | Day-by-day bar chart | Identifying spikes |
| Cost by service | Breakdown by Azure service | Knowing which services dominate |
| Cost by resource | Individual resource costs | Finding the expensive outliers |
| Invoice details | Line items matching your bill | Reconciling with finance |
Create custom views by adding filters (resource group, tag, location, meter) and save them for your team. Export these views as CSV or schedule automated exports to a storage account for ingestion into Power BI.
# Schedule a daily cost export to a storage account
az costmanagement export create \
--name "daily-cost-export" \
--scope "subscriptions/<subscription-id>" \
--type ActualCost \
--timeframe MonthToDate \
--recurrence Daily \
--recurrence-period from="2025-08-09T00:00:00Z" to="2025-12-31T00:00:00Z" \
--storage-account-id "/subscriptions/<sub-id>/resourceGroups/rg-billing/providers/Microsoft.Storage/storageAccounts/costexports" \
--storage-container "exports"
Budgets and Alerts
Budgets are your first line of defense. Set them at the subscription or resource group level, and Azure notifies you as spend crosses the thresholds you choose. Budgets only alert; they do not cap or stop spending on their own.
# Create a monthly budget with alerts at 50%, 80%, and 100%
# (create-with-rg is used because plain `az consumption budget create` has no
# notification settings; confirm the flags with --help on your CLI version)
az consumption budget create-with-rg \
--budget-name "prod-monthly-budget" \
--resource-group rg-production \
--amount 10000 \
--category Cost \
--time-grain Monthly \
--time-period start-date="2025-08-01T00:00:00Z" end-date="2026-07-31T00:00:00Z" \
--notifications \
'{"50-percent":{"enabled":true,"operator":"GreaterThanOrEqualTo","threshold":50,"contactEmails":["platform-team@company.com"]},
"80-percent":{"enabled":true,"operator":"GreaterThanOrEqualTo","threshold":80,"contactEmails":["platform-team@company.com","finance@company.com"]},
"100-percent":{"enabled":true,"operator":"GreaterThanOrEqualTo","threshold":100,"contactEmails":["platform-team@company.com","finance@company.com","cto@company.com"]}}'
You can also trigger Action Groups from budget alerts. An Action Group can call a webhook, run a Logic App, or invoke an Azure Function. A common pattern: when spend hits 90%, an Azure Function stops all non-essential dev VMs automatically.
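To make the threshold logic concrete, here is a minimal sketch of the comparison a budget alert boils down to. The function name `crossed_thresholds` and the spend figure are assumptions for illustration only; the real evaluation happens inside Cost Management, not in your shell.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not an Azure command): print every alert threshold,
# in percent, that the current spend has reached for a given budget.
# Usage: crossed_thresholds <spend> <budget> <threshold>...
crossed_thresholds() {
  local spend="$1" budget="$2" t
  shift 2
  for t in "$@"; do
    # Compare spend*100 >= threshold*budget in awk so decimal amounts work
    # and no division is needed.
    if awk -v s="$spend" -v b="$budget" -v t="$t" \
         'BEGIN { exit !(s * 100 >= t * b) }'; then
      printf '%s\n' "$t"
    fi
  done
}

# Example: 8,200 spent against a 10,000 budget reaches the 50 and 80 marks
crossed_thresholds 8200 10000 50 80 100
```

The same comparison is what you would put in front of an automated action, such as the "stop dev VMs at 90%" pattern, if you ever need to re-check spend before acting.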
Azure Advisor Recommendations
Azure Advisor scans your resources and gives you free money — or at least tells you how to stop wasting it. The Cost section surfaces recommendations like:
- Shut down or resize underutilized VMs
- Buy Reserved Instances for steady-state workloads
- Delete orphaned disks and public IPs
- Right-size underused SQL databases
# List all cost-saving recommendations from Advisor
az advisor recommendation list \
--category Cost \
--output table
# Show the savings estimate attached to each recommendation
az advisor recommendation list \
--category Cost \
--query "[].{Resource:resourceMetadata.resourceId, Savings:extendedProperties.savingsAmount, Currency:extendedProperties.savingsCurrency}" \
--output table
Do not just read these recommendations. Act on them weekly. Assign an owner on your team for the Advisor Cost review. First passes often turn up a surprising amount of waste; 20-30% is a commonly quoted figure.
Reserved Instances and Spot VMs
Reservations and Spot VMs are your two biggest cost levers. Here is how they compare with the other pricing models:
| Pricing Model | Discount | Commitment | Best For |
|---|---|---|---|
| Pay-as-you-go | 0% (baseline) | None | Short experiments |
| Reserved Instances (1-year) | ~20-40% | 1 year, paid upfront or monthly | Steady-state production |
| Reserved Instances (3-year) | ~40-60% | 3 years, paid upfront or monthly | Databases, core infra |
| Spot VMs | Up to 90% | None (can be evicted) | Batch, CI/CD, stateless |
| Savings Plans | ~15-30% | 1 or 3 years of fixed hourly compute spend | Variable workloads |
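# Purchase a 1-year reservation for 4 Standard_D4s_v5 VMs
# (needs the reservation extension; run `az reservations reservation-order calculate`
# first to get the price and the order ID; flag names may differ by extension version)
az reservations reservation-order purchase \
--reservation-order-id "<reservation-order-id>" \
--sku Standard_D4s_v5 \
--location eastus \
--reserved-resource-type VirtualMachines \
--quantity 4 \
--term P1Y \
--billing-scope-id "/subscriptions/<subscription-id>" \
--display-name "Prod VMs 1-Year RI" \
--applied-scope-type Single \
--applied-scope "/subscriptions/<subscription-id>"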
Start by reserving your baseline — the VMs, databases, and services that run 24/7. Use Spot VMs for anything that can tolerate interruption. Savings Plans are a good middle ground when your compute usage is steady but you frequently change VM sizes.
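One way to decide what counts as "baseline": a reservation bills for every hour of the term whether the VM runs or not, so it only beats pay-as-you-go when the VM runs more than (100 minus the discount) percent of the time. The sketch below works through that arithmetic. The function names and the 40% discount are illustrative assumptions, not quoted Azure prices.

```shell
#!/usr/bin/env bash
# Hypothetical helper: given a reservation discount in percent, print the
# share of hours a VM must run for the reservation to beat pay-as-you-go.
# A reservation costs (100 - discount)% of the full-time pay-as-you-go price
# regardless of usage, so break-even utilization is 100 - discount.
break_even_utilization() {
  awk -v d="$1" 'BEGIN { printf "%.0f\n", 100 - d }'
}

# Hypothetical helper: print "reserve" when expected utilization (percent of
# hours running) exceeds the break-even point, otherwise "pay-as-you-go".
reserve_or_payg() {
  local discount="$1" utilization="$2"
  if awk -v d="$discount" -v u="$utilization" 'BEGIN { exit !(u > 100 - d) }'; then
    echo "reserve"
  else
    echo "pay-as-you-go"
  fi
}

break_even_utilization 40      # a 40% discount breaks even at 60% utilization
reserve_or_payg 40 100         # runs 24/7
reserve_or_payg 40 45          # runs under half the time
```

This ignores everything except the hourly rate, so treat it as a first filter rather than a purchase decision.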
Right-Sizing VMs
The most common waste is running oversized VMs. A D8s_v5 (8 vCPUs, 32 GB RAM) hosting a service that never exceeds 15% CPU is money on fire.
# Check daily average and peak CPU for one VM over the last 30 days
az monitor metrics list \
--resource "/subscriptions/<sub-id>/resourceGroups/rg-production/providers/Microsoft.Compute/virtualMachines/vm-api-server" \
--metric "Percentage CPU" \
--interval P1D \
--aggregation Average Maximum \
--start-time 2025-07-09T00:00:00Z \
--end-time 2025-08-09T00:00:00Z \
--output table
If the average is below 20% and the max is below 50%, drop one size. If it never crosses 10%, drop two sizes. Check memory use as well before you resize, since CPU alone can hide a memory-bound workload. Resize during a maintenance window, because resizing restarts the VM.
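The rule of thumb above is easy to encode, which helps when you are checking dozens of VMs rather than one. This is a minimal sketch: `resize_advice` is a made-up name, and the thresholds are simply the ones from this section, not an Azure recommendation engine.

```shell
#!/usr/bin/env bash
# Hypothetical helper encoding the CPU rule of thumb from this section.
# Usage: resize_advice <avg_cpu_percent> <max_cpu_percent>
resize_advice() {
  awk -v avg="$1" -v max="$2" 'BEGIN {
    if (max < 10) { print "drop two sizes" }
    else if (avg < 20 && max < 50) { print "drop one size" }
    else { print "keep current size" }
  }'
}

resize_advice 4 8      # never crosses 10%
resize_advice 15 42    # low average, modest peaks
resize_advice 15 95    # low average but real peaks
```

Feed it the Average and Maximum values from the metrics query, taking the highest daily maximum over the window as the second argument.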
Hunting Orphaned Resources
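Orphaned resources are the silent killers. When you delete a VM, Azure by default leaves its managed disks, public IP, and NIC behind unless you set the delete options for them. Unattached disks and static public IPs keep billing. NICs are free, but they add clutter and hold on to subnet addresses.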
# Find unattached managed disks
az disk list \
--query "[?managedBy==null].{Name:name, RG:resourceGroup, Size:diskSizeGb, SKU:sku.name, State:diskState}" \
--output table
# Find unattached public IP addresses
az network public-ip list \
--query "[?ipConfiguration==null].{Name:name, RG:resourceGroup, IP:ipAddress, SKU:sku.name}" \
--output table
# Find NICs not attached to a VM (private endpoint NICs also match, so check before deleting)
az network nic list \
--query "[?virtualMachine==null].{Name:name, RG:resourceGroup}" \
--output table
Run these queries monthly. Better yet, create an Azure Automation runbook that runs them weekly and emails the results to your team. A company with 200+ VMs can easily have 50+ orphaned disks totaling hundreds of dollars per month.
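For the weekly report, a one-line summary is usually more useful than a raw table. The sketch below assumes you have reduced the unattached-disk query to two tab-separated columns, name and size in GB, using `--output tsv`; the function name `summarize_orphaned_disks` and the sample rows are made up for illustration.

```shell
#!/usr/bin/env bash
# Hypothetical helper: summarize "name<TAB>sizeGB" rows read from stdin.
# Rows with fewer than two columns are ignored.
summarize_orphaned_disks() {
  awk -F '\t' '
    NF >= 2 { count++; gb += $2 }
    END { printf "%d unattached disks, %d GB provisioned\n", count, gb }
  '
}

# Example with made-up rows
printf 'disk-old-api\t128\ndisk-test\t64\n' | summarize_orphaned_disks
```

Tracking that total from week to week tells you whether cleanup is keeping pace with new orphans.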
Auto-Shutdown for Dev VMs
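Dev and test VMs do not need to run at 3 AM. Auto-shutdown can save 60% or more on non-production compute, depending on how many hours the VMs stay off. It deallocates the VM, so compute billing stops, but the attached disks still bill.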
# Enable auto-shutdown at 19:00 for a dev VM (--time is hhmm in UTC, so adjust for your time zone)
az vm auto-shutdown \
--resource-group rg-dev \
--name vm-dev-01 \
--time 1900 \
--email "dev-team@company.com"
For starting VMs back up in the morning, use Azure Automation with a scheduled runbook or an Azure Function on a timer trigger. The auto-shutdown feature only handles stopping — you need automation for the start side.
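How much a schedule saves is plain arithmetic over the 168 hours in a week. The sketch below assumes the VM is deallocated whenever it is off and counts compute hours only; `shutdown_savings_pct` is a hypothetical name.

```shell
#!/usr/bin/env bash
# Hypothetical helper: percent of compute hours saved when a VM runs only
# <hours_per_day> hours on <days_per_week> days instead of all 168 hours.
shutdown_savings_pct() {
  awk -v h="$1" -v d="$2" 'BEGIN { printf "%.1f\n", (168 - h * d) / 168 * 100 }'
}

shutdown_savings_pct 12 5   # 7 AM to 7 PM, weekdays only
shutdown_savings_pct 12 7   # 7 AM to 7 PM, every day
```

The weekday-only schedule comes out at roughly 64%, while running 12 hours every day saves 50%, so keeping VMs off over the weekend is what gets you past the 60% mark.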
Cost Allocation Tags
Tags are how you answer "who spent what." Without them, your cost reports are a mess of anonymous resource groups. Enforce tags with Azure Policy (we will cover that next), and use them as grouping dimensions in Cost Analysis.
A minimum tagging standard:
| Tag Key | Purpose | Example Values |
|---|---|---|
| CostCenter | Maps to finance cost center | CC-1234, CC-5678 |
| Environment | Lifecycle stage | Production, Staging, Dev |
| Team | Owning team | Platform, Backend, Data |
| Project | Business project | ProjectX, Migration2025 |
| Owner | Responsible person | jane@company.com |
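# Tag the resource group itself (resources inside it do not inherit these tags)
# Note: az tag create replaces any existing tags; use `az tag update --operation Merge` to add to them
az tag create \
--resource-id "/subscriptions/<sub-id>/resourceGroups/rg-production" \
--tags CostCenter=CC-1234 Environment=Production Team=Platform Project=ProjectX Owner=jane@company.com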
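Before you have Azure Policy in place, a small check in your deployment scripts can catch incomplete tag sets. This is a minimal sketch: `missing_tags` is a hypothetical helper, the required keys mirror the tagging standard in this section, and it matches keys exactly as written even though Azure treats tag names as case-insensitive.

```shell
#!/usr/bin/env bash
# Hypothetical helper: print the required tag keys that are missing from a
# list of key=value pairs passed as arguments.
missing_tags() {
  local required="CostCenter Environment Team Project Owner"
  local key pair found
  for key in $required; do
    found=0
    for pair in "$@"; do
      # Everything before the first "=" is the tag key
      [ "${pair%%=*}" = "$key" ] && found=1
    done
    [ "$found" -eq 1 ] || printf '%s\n' "$key"
  done
}

# Example: a tag set that forgot the Project tag
missing_tags CostCenter=CC-1234 Environment=Production Team=Platform Owner=jane@company.com
```

An empty result means the tag set is complete, so a script can fail the deployment whenever the function prints anything.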
Azure vs AWS Cost Tools
If you have worked with AWS, here is how Azure tools map:
| Capability | Azure | AWS |
|---|---|---|
| Cost dashboard | Cost Management + Billing | Cost Explorer |
| Budget alerts | Budgets | AWS Budgets |
| Recommendations | Azure Advisor (Cost) | AWS Trusted Advisor / Cost Optimization Hub |
| Reservations | Reserved Instances / Savings Plans | Reserved Instances / Savings Plans |
| Spot compute | Spot VMs | Spot Instances |
| Tagging enforcement | Azure Policy | AWS SCP + Tag Policies |
| Export / Reporting | Cost Exports + Power BI | CUR (Cost and Usage Reports) + Athena |
Both clouds have similar tooling. Azure's Cost Management has a slight edge in the portal experience — the drilldown and pivot capabilities are more intuitive. AWS wins on third-party ecosystem with tools like Vantage, Kubecost, and Infracost having deeper integrations.
Wrapping Up
Cloud cost management is not a one-time project. It is a weekly discipline. Start with visibility (Cost Analysis and tags), add guardrails (budgets and alerts), act on Advisor recommendations, and invest in reservations for steady-state workloads. Hunt orphaned resources monthly. Right-size VMs quarterly. The companies that treat cost optimization as an engineering practice — not an afterthought — are the ones that scale without their CFO having a breakdown.
Next up: We will dive into Azure Policy and Blueprints — how to enforce governance rules across your entire organization so teams cannot create untagged resources, deploy to unapproved regions, or spin up oversized VMs in the first place.
