Terraform State Surgery — Move, Remove, and Recover State

Goel Academy
DevOps & Cloud Learning Hub

Terraform state is the single source of truth for what Terraform manages. When you refactor your code — rename a resource, move it into a module, split a monolith into separate state files — the state needs to match. If it does not, Terraform sees a "delete old thing, create new thing" plan instead of recognizing it as the same resource with a new address. State surgery is how you fix that mismatch without destroying and recreating production infrastructure.

When State Surgery Is Needed

You will reach for state commands in these situations:

  1. Renaming a resource — you change aws_instance.web to aws_instance.app_server in code.
  2. Moving into a module — you refactor inline resources into a reusable module.
  3. Splitting state — you break a monolith state file into per-service or per-team state files.
  4. Adopting existing resources — infrastructure was created outside Terraform and you need to import it.
  5. Removing from management — you want Terraform to stop managing a resource without destroying it.
  6. Recovering from corruption — the state file is damaged and you need to restore from a backup.

terraform state mv — Rename and Relocate

The state mv command updates the address of a resource in state without touching the actual cloud resource:

# Rename a resource
terraform state mv aws_instance.web aws_instance.app_server

# Move into a module
terraform state mv aws_instance.web module.compute.aws_instance.web

# Move between modules
terraform state mv module.old.aws_s3_bucket.data module.new.aws_s3_bucket.data

# Migrate an instance from a count index to a for_each key
terraform state mv 'aws_subnet.private[0]' 'aws_subnet.private["us-east-1a"]'

After running state mv, run terraform plan to verify the result shows no changes. If the plan is clean, your surgery succeeded.

Critical rule: always update your .tf code to match the new address before running terraform plan. The state tells Terraform where the resource lives; the code tells Terraform what it should look like. Both must agree.
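The rename above can be wrapped in a small helper that takes a backup and previews the move first. This is a minimal sketch rather than a definitive workflow: safe_state_mv is a hypothetical function name and the backup file naming is an assumption. It relies on the -dry-run flag of terraform state mv, which reports what would be moved without changing state.

```shell
# Hypothetical helper: back up state, preview the move, then apply it.
# Usage: safe_state_mv aws_instance.web aws_instance.app_server
safe_state_mv() {
  src="$1"
  dst="$2"
  backup="backup_$(date +%s).json"

  # Take a recovery point before touching anything.
  terraform state pull > "$backup" || return 1
  echo "Backup written to $backup"

  # Preview: -dry-run prints what would be moved without changing state.
  terraform state mv -dry-run "$src" "$dst" || return 1

  # Perform the real move only if the preview succeeded.
  terraform state mv "$src" "$dst"
}
```

If the preview fails (for example, because the source address does not exist), the function returns before the real move runs, and the backup file is still available.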

moved Blocks (Terraform 1.1+)

Starting with Terraform 1.1, you can declare resource moves in code instead of running CLI commands. This is the preferred approach for teams because it is version-controlled and self-documenting:

# Tell Terraform the resource was renamed
moved {
from = aws_instance.web
to = aws_instance.app_server
}

# Tell Terraform the resource moved into a module
moved {
from = aws_security_group.web_sg
to = module.networking.aws_security_group.web_sg
}

# Tell Terraform a module was renamed
moved {
from = module.old_name
to = module.new_name
}

When someone runs terraform plan after pulling the code, Terraform reports the move instead of a destroy and create. The next terraform apply records the new address in state:

Terraform will perform the following actions:

# aws_instance.web has moved to aws_instance.app_server
resource "aws_instance" "app_server" {
id = "i-0abc123def456"
# (no changes)
}

Plan: 0 to add, 0 to change, 0 to destroy.

The moved block is superior to state mv because it works for everyone on the team automatically. No one needs to run manual commands.
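To confirm that a plan contains only the moves you expect, the machine-readable plan can be inspected. The sketch below assumes jq is installed and that the plan was saved to a file named tfplan (both assumptions); list_moves is a hypothetical helper name. It reads the previous_address field that Terraform's JSON plan output sets on resources tracked at a new address.

```shell
# Hypothetical helper: read plan JSON on stdin, print "old -> new" per move.
list_moves() {
  jq -r '.resource_changes[]?
         | select(.previous_address != null)
         | "\(.previous_address) -> \(.address)"'
}

# Usage (assumes a plan saved with: terraform plan -out=tfplan):
#   terraform show -json tfplan | list_moves
```

Resources that were not moved have no previous_address and are skipped, so an empty result means the plan contains no moves.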

terraform state rm — Remove Without Destroying

state rm removes a resource from Terraform state without destroying the actual cloud resource. This is useful when you want to stop managing a resource or hand it off to another state file:

# Remove a single resource
terraform state rm aws_instance.legacy_server

# Remove an entire module
terraform state rm module.deprecated_service

# Remove a resource with an index
terraform state rm 'aws_subnet.public[2]'

After removal, Terraform no longer knows about the resource. As long as its resource block is also gone from your code, terraform plan will not mention it at all: it will not try to create it or destroy it. The cloud resource continues to exist untouched.

Warning: if the resource is still defined in your .tf code, Terraform will try to create a new one. Remove the resource block from code before or immediately after running state rm.
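One way to guard against the situation in the warning is to check the code for a leftover resource block before removing anything from state. This is a rough sketch: still_declared is a hypothetical helper, it only handles root-level addresses of the form TYPE.NAME, and it assumes the resource block starts on its own line.

```shell
# Hypothetical helper: succeed if a resource block for TYPE.NAME is still
# declared in any .tf file under DIR (default: current directory).
still_declared() {
  address="$1"
  dir="${2:-.}"
  type="${address%%.*}"   # text before the first dot
  name="${address#*.}"    # text after the first dot
  pattern="^[[:space:]]*resource[[:space:]]+\"$type\"[[:space:]]+\"$name\""

  # grep -l prints matching file names; the final grep succeeds if any exist.
  find "$dir" -name '*.tf' -exec grep -l -E "$pattern" {} + | grep -q .
}

# Usage:
#   if still_declared aws_instance.legacy_server; then
#     echo "Remove the resource block before running state rm" >&2
#   fi
```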

terraform state pull and push

state pull downloads the current state as JSON. state push uploads a state file to the backend. These are your low-level tools for manual state repair:

# Download current state
terraform state pull > state_backup.json

# Inspect the state
jq -r '.resources[] | "\(.type).\(.name)"' state_backup.json

# Push a modified or recovered state
terraform state push state_backup.json

Use state push with extreme caution. It overwrites the remote state with whatever you provide. If you push a stale state, you can orphan cloud resources or cause Terraform to destroy things it should not.

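# Safer: push with lineage and serial checks (default)
terraform state push state_backup.json

# Force push (skips the lineage and serial checks, DANGEROUS)
terraform state push -force state_backup.json

The lineage check ensures you are pushing to the same state chain, and the serial check ensures you are not replacing a newer state with an older one. If the lineage does not match, or the remote serial is higher than the one in your file, Terraform refuses the push. Only use -force when you are intentionally replacing the state entirely, such as during a migration or a rollback to an older version.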
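Before pushing, the lineage and serial of the candidate file can be compared against a fresh copy of the remote state. The sketch below only approximates Terraform's own checks and is not a substitute for them; check_push_safe is a hypothetical helper and it assumes jq is available.

```shell
# Hypothetical helper: compare a candidate state file against the remote copy.
# Both arguments are paths to state JSON files.
check_push_safe() {
  candidate="$1"
  remote="$2"
  c_lineage="$(jq -r '.lineage' "$candidate")"
  r_lineage="$(jq -r '.lineage' "$remote")"
  c_serial="$(jq -r '.serial' "$candidate")"
  r_serial="$(jq -r '.serial' "$remote")"

  if [ "$c_lineage" != "$r_lineage" ]; then
    echo "lineage mismatch: push would be refused without -force"
    return 1
  fi
  if [ "$c_serial" -lt "$r_serial" ]; then
    echo "candidate serial $c_serial is older than remote serial $r_serial"
    return 1
  fi
  echo "ok: lineage matches, serial $c_serial >= $r_serial"
}

# Usage:
#   terraform state pull > remote_now.json
#   check_push_safe state_backup.json remote_now.json
```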

Recovering Corrupted State

If your state file gets corrupted — truncated during a failed apply, accidentally overwritten, or damaged by a concurrent write — you need to recover from a backup.

S3 backend with versioning (recommended):

# List state file versions
aws s3api list-object-versions \
--bucket my-terraform-state \
--prefix production/terraform.tfstate \
--query 'Versions[0:5].{VersionId:VersionId,Modified:LastModified,Size:Size}'

# Download a specific version
aws s3api get-object \
--bucket my-terraform-state \
--key production/terraform.tfstate \
--version-id "abc123xyz" \
recovered_state.json

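# Verify the recovered state
jq '.serial, .lineage, (.resources | length)' recovered_state.json

# Push the recovered state
# (refused if the remote serial is higher; raise .serial in the file,
# or use -force only after verifying this is the state you want)
terraform state push recovered_state.json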

This is why S3 versioning is non-negotiable for state buckets. Without it, corruption means starting from scratch with terraform import for every resource.
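Checking and enabling versioning on the state bucket can be scripted with the AWS CLI. A minimal sketch, assuming the CLI is configured with permission to read and change bucket versioning; ensure_versioning is a hypothetical helper name and my-terraform-state is the example bucket from above.

```shell
# Hypothetical helper: enable versioning on a state bucket if it is not on.
ensure_versioning() {
  bucket="$1"
  status="$(aws s3api get-bucket-versioning \
    --bucket "$bucket" --query 'Status' --output text)"

  if [ "$status" = "Enabled" ]; then
    echo "Versioning already enabled on $bucket"
  else
    # Status is "None" (never enabled) or "Suspended" at this point.
    aws s3api put-bucket-versioning \
      --bucket "$bucket" \
      --versioning-configuration Status=Enabled
    echo "Versioning enabled on $bucket"
  fi
}

# Usage:
#   ensure_versioning my-terraform-state
```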

Splitting State Files

When a monolith state file grows too large (500+ resources), it becomes slow and risky. Splitting it into per-service state files improves performance and reduces blast radius.

The process:

# Step 1: Create the new state file's backend configuration
# services/api/backend.tf
# terraform {
# backend "s3" {
# bucket = "my-terraform-state"
# key = "services/api/terraform.tfstate"
# }
# }

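# Step 2: Pull the monolith state to a local file, then move resources
# into a new local state file (-state and -state-out act on local files)
terraform state pull > old.tfstate
terraform state mv -state=old.tfstate -state-out=new.tfstate \
module.api module.api

# Step 3: Push new state to the new backend
(cd services/api && terraform init && terraform state push ../../new.tfstate)

# Step 4: Remove from the old remote state once the new state is in place
terraform state rm module.api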

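Note that moved blocks cannot do this on their own: they only change addresses within a single state. On newer Terraform versions, a declarative alternative is a removed block (Terraform 1.7+) with destroy = false in the old configuration, which drops the resource from state without destroying it, paired with an import block (Terraform 1.5+) in the new configuration.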
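After a split, it is worth confirming that no address is tracked by both state files, since a resource managed twice can be changed or destroyed from either side. The following sketch compares two saved terraform state list outputs; state_overlap is a hypothetical helper and the file paths in the usage comment are placeholders.

```shell
# Hypothetical helper: print addresses present in both lists.
# Each argument is a file with one resource address per line,
# e.g. the saved output of `terraform state list` from each side.
state_overlap() {
  old_sorted="$(mktemp)"
  new_sorted="$(mktemp)"
  sort "$1" > "$old_sorted"
  sort "$2" > "$new_sorted"
  comm -12 "$old_sorted" "$new_sorted"   # lines common to both files
  rm -f "$old_sorted" "$new_sorted"
}

# Usage (run from the monolith directory):
#   terraform state list > /tmp/old.txt
#   (cd services/api && terraform state list) > /tmp/new.txt
#   state_overlap /tmp/old.txt /tmp/new.txt    # expect no output
```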

Importing After Accidental Removal

If a resource was accidentally dropped from state but still exists in the cloud (for example, someone ran state rm on the wrong address), you can re-import it:

# Import the resource back into state
terraform import aws_instance.web i-0abc123def456

# For resources in modules
terraform import module.compute.aws_instance.web i-0abc123def456

# For resources with count
terraform import 'aws_instance.web[0]' i-0abc123def456

# For resources with for_each
terraform import 'aws_instance.web["app-1"]' i-0abc123def456

After importing, run terraform plan to check for drift between the code and the actual resource. Fix any differences in your .tf files until the plan is clean.
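When several resources need to be re-imported, the commands above can be driven from a mapping file. This is a sketch under assumptions: import_from_file is a hypothetical helper, and the file format (one "ADDRESS ID" pair per line, separated by whitespace) is invented for illustration. Addresses containing quoted for_each keys work as long as they contain no spaces.

```shell
# Hypothetical helper: import every "ADDRESS ID" pair listed in a file.
# Blank lines and lines starting with # are skipped.
import_from_file() {
  while read -r address id; do
    case "$address" in ''|'#'*) continue ;; esac
    echo "Importing $address <- $id"
    # </dev/null keeps terraform from consuming the rest of the mapping file.
    terraform import "$address" "$id" </dev/null || return 1
  done < "$1"
}

# Usage:
#   import_from_file imports.txt
```

The loop stops at the first failed import so that a typo in one address does not go unnoticed among later successes.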

State Surgery Safety Checklist

Before performing any state surgery:

| Step | Command | Purpose |
| --- | --- | --- |
| 1. Back up state | terraform state pull > backup_$(date +%s).json | Recovery point |
| 2. List current resources | terraform state list | Know what you have |
| 3. Inspect target resource | terraform state show <address> | Verify identity |
| 4. Perform the operation | terraform state mv/rm | The surgery |
| 5. Verify with plan | terraform plan | Should show no changes |
| 6. Commit code changes | git add && git commit | Keep code and state in sync |

#!/bin/bash
# Full safety script
set -euo pipefail

echo "=== State Surgery: $(date) ==="

# Backup
backup="backup_$(date +%s).json"
terraform state pull > "$backup"
echo "Backup created: $backup"

# Show current state
echo "$(terraform state list | wc -l) resources in state"

# Perform operation (replace with your command)
terraform state mv aws_instance.old aws_instance.new

# Verify. -detailed-exitcode returns 0 (no changes), 1 (error), 2 (changes),
# so capture the code instead of letting set -e abort on a non-zero exit.
echo "Running plan to verify..."
rc=0
terraform plan -detailed-exitcode || rc=$?
case "$rc" in
0) echo "SUCCESS: No changes detected" ;;
2) echo "WARNING: Plan shows changes, review carefully" ;;
*) echo "ERROR: terraform plan failed"; exit "$rc" ;;
esac

Closing Notes

State surgery is Terraform's escape hatch for refactoring. Prefer moved blocks over manual state mv commands — they are version-controlled, team-friendly, and self-documenting. Always back up before any operation, and always verify with terraform plan afterward. A clean plan after surgery means your code, state, and cloud reality all agree. In the next post, we will explore Terraform module design patterns — how to build composable, reusable modules that scale across teams.