Terraform Multi-Environment — Terragrunt, Workspaces, or Directory Structure?
Every team eventually needs the same infrastructure in multiple environments — development, staging, production. The configuration is 90% identical, but the instance sizes, replica counts, and domain names differ. Terraform provides no built-in "environment" concept, so the community has developed three approaches. Each has trade-offs, and picking the wrong one for your team size and complexity leads to pain that compounds over time.
The Three Approaches
| Approach | How It Works | Best For |
|---|---|---|
| Workspaces | Single codebase, terraform workspace select switches state | Small teams, similar environments |
| Directory Structure | Separate directory per environment, duplicated (or symlinked) config | Medium teams, environments that diverge |
| Terragrunt | Wrapper tool that generates backend config and keeps modules DRY | Large teams, many environments, complex dependencies |
Approach 1 — Terraform Workspaces
Workspaces are built into Terraform. Each workspace has its own state file but shares the same configuration code:
# Create and switch workspaces
terraform workspace new staging
terraform workspace new production
terraform workspace list
# Switch between them
terraform workspace select staging
terraform plan
terraform apply
terraform workspace select production
terraform plan
terraform apply
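Where each workspace's state ends up depends on the backend. As a sketch, assuming the S3 backend with the bucket and lock table names used in the directory example later in this post (the key `app/terraform.tfstate` is illustrative):

```hcl
terraform {
  backend "s3" {
    bucket         = "company-terraform-state"
    key            = "app/terraform.tfstate" # illustrative key
    region         = "us-east-1"
    dynamodb_table = "terraform-state-locks"

    # Optional; "env:" is the default prefix for non-default workspaces
    workspace_key_prefix = "env:"
  }
}

# Resulting object keys in the bucket:
#   default     -> app/terraform.tfstate
#   staging     -> env:/staging/app/terraform.tfstate
#   production  -> env:/production/app/terraform.tfstate
```

All three states share one bucket and one set of backend credentials, so anyone who can apply staging can usually read and write production state as well.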
Use terraform.workspace in your code to vary configuration per environment:
# main.tf
locals {
  env_config = {
    staging = {
      instance_type  = "t3.small"
      instance_count = 1
      db_class       = "db.t3.micro"
      multi_az       = false
    }
    production = {
      instance_type  = "t3.large"
      instance_count = 3
      db_class       = "db.r6g.large"
      multi_az       = true
    }
  }

  # This lookup errors in any workspace without an entry above, including "default"
  config = local.env_config[terraform.workspace]
}

resource "aws_instance" "web" {
  count         = local.config.instance_count
  instance_type = local.config.instance_type
  ami           = var.ami_id

  tags = {
    Name        = "web-${terraform.workspace}-${count.index}"
    Environment = terraform.workspace
  }
}

resource "aws_db_instance" "main" {
  instance_class = local.config.db_class
  multi_az       = local.config.multi_az
  identifier     = "app-db-${terraform.workspace}"
  # engine, storage, and credential arguments omitted for brevity
}
Pros: zero extra tooling, built into Terraform, DRY code.
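Cons: every environment shares one configuration, so a resource that should exist only in production needs a conditional (count or for_each keyed on terraform.workspace) instead of simply being left out. Running apply in the wrong workspace is a real risk, because the selected workspace does not appear in the plan and apply commands themselves. State isolation is weak: all workspaces share one backend, so with S3 their state files sit in the same bucket behind the same credentials.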
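The usual workaround for an environment-specific resource under workspaces is a count guard on the workspace name. This is a minimal sketch; the SNS topic and its name are hypothetical stand-ins for whatever should exist only in production.

```hcl
# Hypothetical production-only resource: an SNS topic for alerts.
resource "aws_sns_topic" "prod_alerts" {
  # One instance in the "production" workspace, zero everywhere else
  count = terraform.workspace == "production" ? 1 : 0

  name = "app-alerts-${terraform.workspace}"
}

output "prod_alerts_arn" {
  # one() returns the single element, or null when the list is empty
  value = one(aws_sns_topic.prod_alerts[*].arn)
}
```

Every reference to such a resource has to tolerate the empty case, which is why the comparison table below rates unique per-environment resources as hard with workspaces.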
Approach 2 — Directory Structure with tfvars
Each environment gets its own directory with its own backend configuration and variable values:
infra/
  modules/
    vpc/
    compute/
    database/
  environments/
    staging/
      main.tf            # Calls modules
      backend.tf         # Unique state path
      terraform.tfvars
    production/
      main.tf            # Same modules, different vars
      backend.tf         # Unique state path
      terraform.tfvars
# environments/staging/backend.tf
terraform {
  backend "s3" {
    bucket         = "company-terraform-state"
    key            = "staging/terraform.tfstate"
    region         = "us-east-1"
    dynamodb_table = "terraform-state-locks"
  }
}
# environments/staging/terraform.tfvars
environment    = "staging"
vpc_cidr       = "10.0.0.0/16" # read by main.tf as var.vpc_cidr
instance_type  = "t3.small"
instance_count = 1
db_class       = "db.t3.micro"
multi_az       = false
domain_name    = "staging.example.com"

# environments/production/terraform.tfvars
environment    = "production"
vpc_cidr       = "10.1.0.0/16" # illustrative value
instance_type  = "t3.large"
instance_count = 3
db_class       = "db.r6g.large"
multi_az       = true
domain_name    = "example.com"
# environments/staging/main.tf
module "vpc" {
  source      = "../../modules/vpc"
  environment = var.environment
  cidr        = var.vpc_cidr
}

module "compute" {
  source         = "../../modules/compute"
  vpc_id         = module.vpc.vpc_id
  subnet_ids     = module.vpc.private_subnet_ids
  instance_type  = var.instance_type
  instance_count = var.instance_count
  environment    = var.environment
}
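Each environment directory also needs declarations for the variables that main.tf reads. A minimal sketch of such a file follows; it is an assumption (the layout above does not list a variables.tf), and the names simply match the var.* references in main.tf plus the remaining keys in terraform.tfvars.

```hcl
# environments/staging/variables.tf (assumed file, not shown in the layout)

variable "environment" {
  type        = string
  description = "Environment name, used for naming and tagging."

  validation {
    condition     = contains(["staging", "production"], var.environment)
    error_message = "The environment value must be \"staging\" or \"production\"."
  }
}

variable "vpc_cidr" {
  type        = string
  description = "CIDR block for the VPC."

  validation {
    # cidrhost() errors on a malformed CIDR; can() turns that into false
    condition     = can(cidrhost(var.vpc_cidr, 0))
    error_message = "The vpc_cidr value must be a valid CIDR block."
  }
}

variable "instance_type" {
  type = string
}

variable "instance_count" {
  type = number

  validation {
    condition     = var.instance_count >= 1
    error_message = "The instance_count value must be at least 1."
  }
}

variable "db_class" {
  type = string
}

variable "multi_az" {
  type = bool
}

variable "domain_name" {
  type = string
}
```

Any variable declared without a default has to be set in terraform.tfvars; otherwise Terraform prompts for it interactively, which stalls unattended CI runs.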
# Deploy staging
cd infra/environments/staging
terraform init
terraform plan
terraform apply
# Deploy production
cd ../production
terraform init
terraform plan
terraform apply
Pros: complete isolation between environments, each environment can have unique resources, and a much lower risk of applying to the wrong environment, since the target is the directory you cd into.
Cons: main.tf is duplicated across environments. When you add a new module, you must update every environment directory. Drift between environments' main.tf files is common and silent.
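One common way to limit this duplication is to move the module wiring into a single wrapper module, so that each environment's main.tf shrinks to one call. A minimal sketch, assuming a hypothetical modules/stack directory that contains the vpc and compute calls and exposes the same variables:

```hcl
# environments/staging/main.tf
# Assumption: modules/stack is a hypothetical wrapper module (not part of the
# layout above) that wires vpc, compute, and database together internally.
module "stack" {
  source = "../../modules/stack"

  environment    = var.environment
  vpc_cidr       = var.vpc_cidr
  instance_type  = var.instance_type
  instance_count = var.instance_count
}
```

With this layout, adding a module means editing modules/stack once. The trade-off is that environment-specific resources have to live beside the call or behind a flag inside the wrapper.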
Approach 3 — Terragrunt
Terragrunt is a thin wrapper around Terraform that eliminates the duplication of the directory approach while keeping the isolation:
infra/
  terragrunt.hcl             # Root config (backend, provider)
  modules/
    vpc/
      main.tf
      variables.tf
      outputs.tf
    compute/
      main.tf
      variables.tf
      outputs.tf
  environments/
    staging/
      env.hcl                # Environment variables
      vpc/
        terragrunt.hcl       # Just inputs, ~10 lines
      compute/
        terragrunt.hcl
    production/
      env.hcl
      vpc/
        terragrunt.hcl
      compute/
        terragrunt.hcl
# infra/terragrunt.hcl (root)
remote_state {
  backend = "s3"

  generate = {
    path      = "backend.tf"
    if_exists = "overwrite"
  }

  config = {
    bucket         = "company-terraform-state"
    key            = "${path_relative_to_include()}/terraform.tfstate"
    region         = "us-east-1"
    dynamodb_table = "terraform-state-locks"
    encrypt        = true
  }
}

generate "provider" {
  path      = "provider.tf"
  if_exists = "overwrite"
  contents  = <<EOF
provider "aws" {
  region = "us-east-1"
  default_tags {
    tags = {
      ManagedBy = "terraform"
    }
  }
}
EOF
}
# environments/staging/env.hcl
locals {
  environment   = "staging"
  instance_type = "t3.small"
}

# environments/staging/vpc/terragrunt.hcl
include "root" {
  path = find_in_parent_folders()
}

locals {
  env = read_terragrunt_config(find_in_parent_folders("env.hcl"))
}

terraform {
  source = "../../../modules/vpc"
}

inputs = {
  environment = local.env.locals.environment
  cidr        = "10.0.0.0/16"
  azs         = ["us-east-1a", "us-east-1b", "us-east-1c"]
}

# environments/staging/compute/terragrunt.hcl
include "root" {
  path = find_in_parent_folders()
}

locals {
  env = read_terragrunt_config(find_in_parent_folders("env.hcl"))
}

terraform {
  source = "../../../modules/compute"
}

dependency "vpc" {
  config_path = "../vpc"
}

inputs = {
  vpc_id         = dependency.vpc.outputs.vpc_id
  subnet_ids     = dependency.vpc.outputs.private_subnet_ids
  instance_type  = local.env.locals.instance_type
  instance_count = 1
  environment    = local.env.locals.environment
}
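For comparison, the production counterpart of the compute unit differs only in its values. The sketch below reuses the production sizes from the earlier tfvars example and assumes that environments/production/env.hcl mirrors the staging file with environment = "production" and instance_type = "t3.large".

```hcl
# environments/production/compute/terragrunt.hcl
# Assumption: production/env.hcl sets environment = "production"
# and instance_type = "t3.large".
include "root" {
  path = find_in_parent_folders()
}

locals {
  env = read_terragrunt_config(find_in_parent_folders("env.hcl"))
}

terraform {
  source = "../../../modules/compute"
}

dependency "vpc" {
  config_path = "../vpc"
}

inputs = {
  vpc_id         = dependency.vpc.outputs.vpc_id
  subnet_ids     = dependency.vpc.outputs.private_subnet_ids
  instance_type  = local.env.locals.instance_type
  instance_count = 3
  environment    = local.env.locals.environment
}
```

Because the state key comes from path_relative_to_include() in the root file, this unit's state lands under environments/production/compute/ without any backend block of its own.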
The dependency block is Terragrunt's killer feature. It reads outputs from another Terragrunt module's state, creating explicit dependency chains without terraform_remote_state data sources.
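One practical wrinkle: those outputs only exist once the dependency has been applied, so planning a brand-new environment can fail at the dependency lookup. Terragrunt's mock_outputs setting covers that case. A sketch, with placeholder IDs that are purely illustrative:

```hcl
# environments/staging/compute/terragrunt.hcl (dependency block only)
dependency "vpc" {
  config_path = "../vpc"

  # Placeholder values used when the vpc unit has no outputs yet.
  # The IDs are illustrative, not real resources.
  mock_outputs = {
    vpc_id             = "vpc-00000000"
    private_subnet_ids = ["subnet-00000000"]
  }

  # Restrict mocks to read-only commands so apply always needs real outputs
  mock_outputs_allowed_terraform_commands = ["validate", "plan"]
}
```

Limiting the mocks to validate and plan keeps placeholder IDs out of applied infrastructure, at the price of plans for a fresh environment showing fake values until the VPC exists.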
# Deploy everything in staging
cd infra/environments/staging
# Recent Terragrunt releases deprecate run-all in favor of: terragrunt run --all <command>
terragrunt run-all plan
terragrunt run-all apply
# Deploy a single component
cd compute
terragrunt plan
terragrunt apply
Pros: DRY configuration, full isolation, explicit dependencies, automatic backend generation, run-all for orchestrated applies.
Cons: an extra tool to learn, an additional abstraction layer, slower runs (each unit runs its own init against a copy of the module in .terragrunt-cache), and harder debugging, because part of the Terraform you are debugging is generated.
Comparison Table
| Criteria | Workspaces | Directory Structure | Terragrunt |
|---|---|---|---|
| DRY code | Excellent | Poor (duplication) | Excellent |
| Isolation | Weak (shared config) | Strong | Strong |
| Risk of wrong env | High | Low | Low |
| Extra tooling | None | None | Terragrunt CLI |
| Learning curve | Low | Low | Medium |
| Unique resources per env | Hard | Easy | Easy |
| Dependency management | Manual | Manual | Built-in |
| Team size | 1-5 engineers | 5-15 engineers | 10+ engineers |
| CI/CD complexity | Low | Medium | Medium |
When to Use Each
Use workspaces when your environments are nearly identical (same resources, different sizes), your team is small, and you want zero extra tooling.
Use directory structure when environments diverge significantly (production has WAF, CDN, multi-region — staging does not), you want explicit isolation, and your team is comfortable with some duplication.
Use Terragrunt when you have many environments (dev, staging, production, DR, per-client), your infrastructure has complex inter-module dependencies, and you want DRY configuration without sacrificing isolation.
Hybrid Approaches
Many teams combine approaches:
# Terragrunt for environment orchestration + workspaces for feature branches
infra/
  environments/
    staging/
      terragrunt.hcl   # Terragrunt manages environments
    production/
      terragrunt.hcl

# Developers use workspaces for personal dev environments
# terraform workspace new dev-alice
Another common hybrid is directory structure with symlinks to reduce duplication:
# Shared configuration via symlinks
cd environments/staging
ln -s ../../shared/main.tf main.tf
ln -s ../../shared/variables.tf variables.tf
# Only backend.tf and terraform.tfvars are unique per environment
Closing Notes
There is no universally correct answer. Start simple: if workspaces cover your needs, use them. When you hit their limitations (environments that need different resources, the risk of applying to the wrong workspace, weak isolation), graduate to directory structure or Terragrunt. The underlying Terraform modules stay the same regardless of which approach orchestrates them, so the main cost of migrating is moving existing state to its new backend location. In the next post, we will tackle drift detection in depth: what causes it, how to detect it automatically, and strategies for preventing it entirely.
