AWS Backup & Recovery Safety Nets
Even with the best guardrails, accidents can happen. Build automated backup strategies and recovery procedures so you can restore any accidentally deleted resource within minutes.
AWS Backup: Centralized Backup Management
AWS Backup is a fully managed service that centralizes and automates backups across AWS services. It supports EC2, EBS, RDS, DynamoDB, EFS, S3, and more. For AI agent safety, AWS Backup is your last line of defense.
# Create a backup vault (encrypted storage for backups)
aws backup create-backup-vault \
--backup-vault-name production-vault \
--encryption-key-arn arn:aws:kms:us-east-1:123456789012:key/mrk-xxxx
# Create a backup plan with daily backups and 35-day retention
aws backup create-backup-plan --backup-plan '{
"BackupPlanName": "production-daily",
"Rules": [
{
"RuleName": "DailyBackup",
"TargetBackupVaultName": "production-vault",
"ScheduleExpression": "cron(0 3 * * ? *)",
"StartWindowMinutes": 60,
"CompletionWindowMinutes": 180,
"Lifecycle": {
"DeleteAfterDays": 35
},
"CopyActions": [
{
"DestinationBackupVaultArn": "arn:aws:backup:us-west-2:123456789012:backup-vault:dr-vault",
"Lifecycle": {
"DeleteAfterDays": 90
}
}
]
},
{
"RuleName": "HourlyBackup",
"TargetBackupVaultName": "production-vault",
"ScheduleExpression": "cron(0 * * * ? *)",
"StartWindowMinutes": 60,
"CompletionWindowMinutes": 120,
"Lifecycle": {
"DeleteAfterDays": 7
}
}
]
}'
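The selection step that follows needs the ID of the plan created here. One way to avoid copying it by hand is to look it up by name. This is a minimal sketch: the helper name `backup_plan_id` is hypothetical, and it assumes the plan name is unique in the current account and Region.

```bash
# Hypothetical helper: print the ID of the backup plan with the given name.
# Assumes the plan name is unique in the current account and Region.
backup_plan_id() {
  plan_name="$1"
  aws backup list-backup-plans \
    --query "BackupPlansList[?BackupPlanName=='${plan_name}'].BackupPlanId" \
    --output text
}

# Usage (requires AWS credentials with backup:ListBackupPlans):
#   PLAN_ID="$(backup_plan_id production-daily)"
```

The captured `PLAN_ID` can then be passed to `--backup-plan-id` in place of the placeholder string.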
Assign Resources to the Backup Plan
# Assign all resources tagged Lifecycle=production to the backup plan
aws backup create-backup-selection \
--backup-plan-id "PLAN-ID-FROM-PREVIOUS-COMMAND" \
--backup-selection '{
"SelectionName": "production-resources",
"IamRoleArn": "arn:aws:iam::123456789012:role/AWSBackupDefaultServiceRole",
"ListOfTags": [
{
"ConditionType": "STRINGEQUALS",
"ConditionKey": "Lifecycle",
"ConditionValue": "production"
}
]
}'
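Before relying on a tag-based selection, it can help to preview which resources actually carry the tag. The sketch below uses the Resource Groups Tagging API, which is separate from AWS Backup, so the result shows tag matches only, not backup coverage. The helper name `resources_with_tag` is hypothetical.

```bash
# Hypothetical helper: list the ARNs of resources that carry a given tag.
# This previews tag matches only; it does not prove the resources are backed up.
resources_with_tag() {
  tag_key="$1"
  tag_value="$2"
  aws resourcegroupstaggingapi get-resources \
    --tag-filters "Key=${tag_key},Values=${tag_value}" \
    --query 'ResourceTagMappingList[].ResourceARN' \
    --output text
}

# Usage (requires AWS credentials with tag:GetResources):
#   resources_with_tag Lifecycle production
```

A production resource missing from this list is probably untagged and would be skipped by the selection.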
Any resource tagged Lifecycle=production is automatically included in the backup plan, provided its service is opted in to AWS Backup for that Region. No manual configuration is needed when you add new resources.
Cross-Region Backup for Disaster Recovery
If an AI agent destroys resources in one region, cross-region backups ensure you have copies in a completely separate location. The AWS Backup plan above includes a CopyActions rule that copies daily backups to us-west-2.
resource "aws_backup_vault" "primary" {
name = "production-vault"
kms_key_arn = aws_kms_key.backup.arn
}
resource "aws_backup_vault" "dr" {
provider = aws.us_west_2
name = "dr-vault"
kms_key_arn = aws_kms_key.backup_dr.arn
}
resource "aws_backup_plan" "production" {
name = "production-daily"
rule {
rule_name = "DailyBackup"
target_vault_name = aws_backup_vault.primary.name
schedule = "cron(0 3 * * ? *)"
lifecycle {
delete_after = 35
}
copy_action {
destination_vault_arn = aws_backup_vault.dr.arn
lifecycle {
delete_after = 90
}
}
}
rule {
rule_name = "HourlyBackup"
target_vault_name = aws_backup_vault.primary.name
schedule = "cron(0 * * * ? *)"
lifecycle {
delete_after = 7
}
}
}
resource "aws_backup_selection" "production" {
name = "production-resources"
iam_role_arn = aws_iam_role.backup.arn
plan_id = aws_backup_plan.production.id
selection_tag {
type = "STRINGEQUALS"
key = "Lifecycle"
value = "production"
}
}
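A cross-region copy only helps if it is actually producing recovery points. Here is a sketch for listing what has landed in the DR vault. The helper name `dr_recovery_points` is hypothetical; the vault name and Region in the usage line come from the configuration above.

```bash
# Hypothetical helper: list recovery points stored in a vault in a given Region.
# Prints one line per recovery point: creation date, resource type, status, ARN.
dr_recovery_points() {
  vault_name="$1"
  region="$2"
  aws backup list-recovery-points-by-backup-vault \
    --backup-vault-name "$vault_name" \
    --region "$region" \
    --query 'RecoveryPoints[].[CreationDate,ResourceType,Status,RecoveryPointArn]' \
    --output text
}

# Usage (requires AWS credentials with backup:ListRecoveryPointsByBackupVault):
#   dr_recovery_points dr-vault us-west-2
```

An empty result after the first scheduled run suggests the copy action or the destination vault's permissions need attention.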
S3 Versioning and Cross-Region Replication
S3 versioning preserves every version of every object. Cross-region replication copies objects to a bucket in another region automatically.
# Enable versioning on source bucket (required for replication)
aws s3api put-bucket-versioning \
--bucket production-assets \
--versioning-configuration Status=Enabled
# Enable versioning on destination bucket
aws s3api put-bucket-versioning \
--bucket production-assets-replica \
--versioning-configuration Status=Enabled
# Set up replication rule
aws s3api put-bucket-replication \
--bucket production-assets \
--replication-configuration '{
"Role": "arn:aws:iam::123456789012:role/S3ReplicationRole",
"Rules": [
{
"ID": "ReplicateAll",
"Status": "Enabled",
"Priority": 1,
"Filter": {},
"Destination": {
"Bucket": "arn:aws:s3:::production-assets-replica",
"StorageClass": "STANDARD_IA"
},
"DeleteMarkerReplication": {
"Status": "Disabled"
}
}
]
}'
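Replication is asynchronous, so it is worth spot-checking that individual objects have reached the replica. The sketch below reads the replication status that S3 reports on a source object. The helper names are hypothetical, and the status is matched loosely because the completed state may be reported as COMPLETED or COMPLETE.

```bash
# Hypothetical helper: print the replication status of a source object.
# Typical values are PENDING, COMPLETED, and FAILED; "None" means no status is set.
replication_status() {
  aws s3api head-object \
    --bucket "$1" \
    --key "$2" \
    --query 'ReplicationStatus' \
    --output text
}

# Hypothetical helper: succeed only if the object reports a completed replication.
is_replicated() {
  status="$(replication_status "$1" "$2")"
  case "$status" in
    COMPLETE*) return 0 ;;
    *) return 1 ;;
  esac
}

# Usage (requires AWS credentials with s3:GetObject on the source bucket):
#   is_replicated production-assets path/to/object && echo "replicated"
```

Objects that existed before the rule was created are not replicated by it, so they will report no status.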
DeleteMarkerReplication is set to Disabled. This means that if an AI agent deletes objects in the source bucket, the resulting delete markers are not replicated to the destination. Your replica bucket retains all objects even if the source is wiped.
RDS Automated Backups and Point-in-Time Recovery
RDS automated backups take daily snapshots and capture transaction logs continuously. This allows you to restore to any second within the retention period, up to the latest restorable time (typically within the last five minutes).
# Set backup retention to 35 days (maximum)
aws rds modify-db-instance \
--db-instance-identifier production-postgres \
--backup-retention-period 35 \
--preferred-backup-window "03:00-04:00" \
--apply-immediately
# Restore to a specific point in time
aws rds restore-db-instance-to-point-in-time \
--source-db-instance-identifier production-postgres \
--target-db-instance-identifier production-postgres-restored \
--restore-time "2026-03-20T14:30:00Z" \
--db-instance-class db.t3.medium \
--tags Key=Name,Value=production-postgres-restored Key=Lifecycle,Value=recovery
# Restore from the latest automated backup
aws rds restore-db-instance-to-point-in-time \
--source-db-instance-identifier production-postgres \
--target-db-instance-identifier production-postgres-restored \
--use-latest-restorable-time \
--db-instance-class db.t3.medium
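Two checks are useful around a point-in-time restore: how recent a restore point is available, and when the restored instance is ready. This is a minimal sketch with hypothetical helper names, using the instance identifiers from the commands above.

```bash
# Hypothetical helper: print the most recent time the instance can be restored to.
latest_restorable_time() {
  aws rds describe-db-instances \
    --db-instance-identifier "$1" \
    --query 'DBInstances[0].LatestRestorableTime' \
    --output text
}

# Hypothetical helper: block until the given instance reports "available".
wait_until_available() {
  aws rds wait db-instance-available \
    --db-instance-identifier "$1"
}

# Usage (requires AWS credentials with rds:DescribeDBInstances):
#   latest_restorable_time production-postgres
#   wait_until_available production-postgres-restored
```

The restored instance has a new endpoint, so applications must be repointed to it once it is available.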
EC2 AMI Automation
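Regular AMI backups of your EC2 instances let you restore the entire instance (OS, applications, data) quickly. Note that an image created with --no-reboot is taken while the instance is running, so file system consistency is not guaranteed.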
# Create an AMI from a running instance
aws ec2 create-image \
--instance-id i-0abc123def456789 \
--name "production-web-$(date +%Y-%m-%d)" \
--description "Daily backup of production web server" \
--no-reboot \
--tag-specifications 'ResourceType=image,Tags=[{Key=Name,Value=production-web-backup},{Key=Lifecycle,Value=backup}]'
# Use AWS Backup to automate AMI creation (preferred method)
# Or use Amazon Data Lifecycle Manager:
aws dlm create-lifecycle-policy \
--description "Daily AMI backup for production instances" \
--state ENABLED \
--execution-role-arn arn:aws:iam::123456789012:role/AWSDataLifecycleManagerDefaultRole \
--policy-details '{
"PolicyType": "IMAGE_MANAGEMENT",
"ResourceTypes": ["INSTANCE"],
"TargetTags": [{"Key": "Lifecycle", "Value": "production"}],
"Schedules": [{
"Name": "DailyAMI",
"CreateRule": {
"Interval": 24,
"IntervalUnit": "HOURS",
"Times": ["03:00"]
},
"RetainRule": {
"Count": 14
},
"CopyTags": true
}]
}'
Recovery Procedures When an AI Agent Deletes Resources
Despite all guardrails, an incident may occur. Here are step-by-step recovery procedures for each service:
EC2 Instance Recovery
# Find the most recent backup AMI
aws ec2 describe-images --owners self --filters "Name=tag:Name,Values=production-web-backup" --query 'sort_by(Images, &CreationDate)[-1]'
# Launch a replacement instance from that AMI with termination protection enabled
aws ec2 run-instances --image-id ami-xxxx --instance-type t3.medium --disable-api-termination
RDS Database Recovery
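# Restore a new instance from the latest restorable time
aws rds restore-db-instance-to-point-in-time --source-db-instance-identifier production-postgres --target-db-instance-identifier production-postgres-restored --use-latest-restorable-time
# If the source instance itself was deleted, this identifier no longer resolves.
# Restore from a retained automated backup (--source-db-instance-automated-backups-arn),
# a final snapshot, or an AWS Backup recovery point instead.
S3 Object Recovery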
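In a versioned bucket, a plain delete only places a delete marker on top of the object, and earlier versions remain underneath. The sketch below finds the current delete marker for a key and removes it, which makes the previous version current again. The helper names are hypothetical, and the sketch assumes the caller may delete object versions in the bucket.

```bash
# Hypothetical helper: print the version ID of the current delete marker
# for a key, or "None" if the key has no current delete marker.
delete_marker_version() {
  bucket="$1"
  key="$2"
  aws s3api list-object-versions \
    --bucket "$bucket" \
    --prefix "$key" \
    --query "DeleteMarkers[?Key=='${key}' && IsLatest].VersionId | [0]" \
    --output text
}

# Hypothetical helper: restore a deleted object by removing its delete marker.
undelete_object() {
  bucket="$1"
  key="$2"
  marker="$(delete_marker_version "$bucket" "$key")"
  if [ -z "$marker" ] || [ "$marker" = "None" ]; then
    echo "no current delete marker for ${key}" >&2
    return 1
  fi
  # Deleting a specific version ID removes the marker itself.
  aws s3api delete-object \
    --bucket "$bucket" \
    --key "$key" \
    --version-id "$marker"
}

# Usage (requires s3:ListBucketVersions and s3:DeleteObjectVersion):
#   undelete_object production-assets path/to/object
```

This only works if the older versions still exist. Versions deleted explicitly by version ID cannot be recovered this way.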
List the object's versions with aws s3api list-object-versions --bucket production-assets, then restore the object by deleting its delete marker.
Cost Considerations for Backup Strategies
| Strategy | Monthly Cost Estimate | Recovery Speed | Data Loss Risk |
|---|---|---|---|
| S3 Versioning | Storage cost for all versions (~1.5-3x base) | Instant (seconds) | Zero (all versions preserved) |
| S3 Cross-Region Replication | 2x storage + data transfer fees | Instant from replica | Near-zero (seconds of lag) |
| RDS Automated Backups | Free up to DB size; $0.095/GB beyond | 10-30 minutes | Near-zero (transaction log coverage) |
| EC2 AMI (daily) | EBS snapshot cost (~$0.05/GB/month) | 5-15 minutes | Up to 24 hours of changes |
| AWS Backup (hourly) | Snapshot cost + cross-region transfer | 5-30 minutes | Up to 1 hour of changes |
| DynamoDB PITR | ~$0.20/GB/month of table size (us-east-1) | Minutes to hours (depends on size) | Near-zero (continuous backup) |
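The per-GB figures in the table can be turned into rough monthly estimates. The sketch below does the arithmetic for two rows. The function names are hypothetical, and the default rates are the table's figures, which vary by Region and change over time. It also treats the RDS free allowance as equal to one database's provisioned size, which is a simplification.

```bash
# Hypothetical helper: monthly RDS backup charge beyond the free allowance.
# Args: provisioned DB storage (GB), total backup storage (GB), optional rate per GB.
rds_backup_overage_cost() {
  awk -v db="$1" -v backup="$2" -v rate="${3:-0.095}" 'BEGIN {
    extra = backup - db          # only storage beyond the DB size is billed
    if (extra < 0) extra = 0
    printf "%.2f\n", extra * rate
  }'
}

# Hypothetical helper: monthly EBS snapshot storage charge.
# Args: stored snapshot data (GB), optional rate per GB.
snapshot_storage_cost() {
  awk -v gb="$1" -v rate="${2:-0.05}" 'BEGIN { printf "%.2f\n", gb * rate }'
}

# Usage:
#   rds_backup_overage_cost 100 300   # 200 GB billable at $0.095, prints 19.00
#   snapshot_storage_cost 200         # 200 GB at $0.05, prints 10.00
```

EBS snapshots are incremental, so the stored size is usually well below the number of snapshots multiplied by the volume size.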