Deploying AWS RDS: From Console Clicks to Production Infrastructure as Code
Introduction
Amazon RDS represents one of AWS's most mature managed services—a database platform that has evolved from simple "click and deploy" origins into a sophisticated system capable of running mission-critical workloads at global scale. Yet this maturity brings a paradox: while RDS itself has grown more robust, the ways we deploy it have fragmented into dozens of approaches, each with different trade-offs for reliability, repeatability, and operational overhead.
The fundamental question isn't whether you can create an RDS instance—the AWS console makes that trivial. The question is whether you can create it consistently, securely, and reproducibly across development, staging, and production environments while maintaining the discipline needed for enterprise operations.
This document explores the landscape of RDS deployment methods, from anti-patterns that plague many organizations to production-ready infrastructure-as-code solutions. More importantly, it explains why Project Planton standardizes on specific approaches and what that means for teams building reliable database infrastructure.
The Deployment Maturity Spectrum
Level 0: Manual Console Operations (The Anti-Pattern)
The AWS Management Console offers the quickest path to a running database. Click "Create database," choose a few options from dropdown menus, and within minutes you have a functioning RDS instance. For learning AWS or running a proof-of-concept, this is perfectly reasonable.
For anything beyond that single use case, it's a trap.
Why it fails at scale: Manual console operations are fundamentally unrepeatable. When you create a database through the GUI, you're making dozens of configuration decisions—subnet groups, security groups, backup retention, parameter groups, encryption settings—and none of it is recorded in a format you can review, version control, or replicate. Six months later when you need to create a similar database for a new environment, you're starting from scratch, relying on memory or hastily written runbooks that are inevitably incomplete or outdated.
The console also makes dangerous configurations too easy. One checkbox difference—"Publicly accessible: Yes"—can expose your database to the internet. Without infrastructure-as-code reviews, these mistakes slip through. Security research consistently shows that misconfigured databases (publicly exposed instances with weak credentials, unencrypted storage, no backup retention) account for a significant share of cloud data breaches.
Verdict: Acceptable for individual learning and one-off experiments. Unacceptable for any environment that matters or needs to be recreated.
Level 1: AWS CLI and SDK Scripts (Scriptable, but Stateless)
The AWS CLI and SDKs (like Boto3 for Python) represent the first step toward automation. Instead of clicking, you run commands:
aws rds create-db-instance \
--db-instance-identifier production-db \
--engine postgres \
--engine-version 15.4 \
--db-instance-class db.r5.xlarge \
--allocated-storage 100 \
--master-username admin \
--master-user-password "SecurePassword123!" \
--vpc-security-group-ids sg-abc123 \
--db-subnet-group-name prod-subnet-group \
--multi-az \
--storage-encrypted \
--backup-retention-period 7
This is better—it's repeatable, you can save it in a script, and you can version control that script. But it's still imperative and stateless. The CLI doesn't track what already exists. Run the script twice and it fails (the instance already exists). Try to update a configuration? You need a completely different command (modify-db-instance) with different parameters. Delete resources? Yet another command, and you need to remember everything you created.
Operational challenges:
- No drift detection (if someone manually changes the instance, your script doesn't know)
- Complex dependency management (must create subnet groups, parameter groups in correct order)
- Credential handling becomes your problem (passwords in scripts or environment variables)
- State tracking is manual (you maintain lists of what exists where)
For a few instances managed by a single team, CLI scripting can work. For dozens of databases across multiple environments and teams, it becomes unmaintainable.
Verdict: Useful for one-off operations or quick prototypes. Insufficient for production infrastructure requiring lifecycle management, change tracking, and team collaboration.
Level 2: Configuration Management Tools (Ansible, Chef, Puppet)
Ansible and similar tools add declarative intent to the imperative scripting approach. Instead of "run this command," you declare "ensure this database exists with these properties."
An Ansible playbook for RDS might look like:
- name: Ensure production PostgreSQL database exists
  amazon.aws.rds_instance:
    id: production-postgres
    state: present
    engine: postgres
    engine_version: "15.4"
    db_instance_class: db.r5.xlarge
    allocated_storage: 100
    username: admin
    password: "{{ vault_db_password }}"
    multi_az: true
    storage_encrypted: true
    vpc_security_group_ids:
      - sg-abc123
Run this playbook repeatedly and Ansible ensures the database exists with the specified configuration, creating it only if needed. This is idempotent—safe to run multiple times.
Where configuration management falls short for infrastructure:
While Ansible excels at configuring servers (its original purpose), it's less suited for managing cloud infrastructure lifecycles. It lacks native understanding of cloud resource dependencies, state management is add-on rather than core, and it doesn't provide robust drift detection or plan previews before making changes.
Configuration management tools work best when combined with true infrastructure-as-code—Ansible orchestrating application deployment while Terraform/Pulumi manage the underlying RDS infrastructure.
Verdict: Valuable in an overall automation strategy, but typically paired with dedicated IaC tools rather than used alone for database infrastructure.
Level 3: Production Infrastructure as Code (Terraform, Pulumi, CloudFormation)
Modern infrastructure-as-code tools were built specifically to manage cloud resources with the rigor required for production operations. They provide:
- Declarative configuration (define desired state, not steps)
- State tracking (know what exists, detect drift from desired state)
- Dependency graphs (automatically order resource creation/updates)
- Change previewing (see what will change before applying)
- Concurrent safety (locking mechanisms to prevent conflicting changes)
These aren't just conveniences—they're the foundation of safe, scalable infrastructure management.
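To make state tracking and locking concrete, here is a minimal sketch of a Terraform remote backend using S3 with DynamoDB locking; the bucket and table names are hypothetical placeholders, and both resources must already exist:

terraform {
  backend "s3" {
    bucket         = "example-terraform-state"    # hypothetical bucket name
    key            = "rds/production/terraform.tfstate"
    region         = "us-east-1"
    dynamodb_table = "terraform-locks"            # lock table prevents concurrent applies
    encrypt        = true                         # encrypt state at rest
  }
}

With this in place, every plan and apply reads the last known state, detects drift against it, and acquires a lock so two engineers cannot apply conflicting changes at the same time.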
Comparing Production-Ready IaC Tools
When you've moved past manual operations and simple scripts, you're choosing between several mature IaC platforms. Each can successfully manage RDS at production scale, but they differ in philosophy, ecosystem, and operational model.
Terraform/OpenTofu: The Multi-Cloud Standard
Terraform has become the de facto standard for infrastructure-as-code across clouds. You define resources in HCL (HashiCorp Configuration Language):
resource "aws_db_instance" "production" {
identifier = "production-postgres"
engine = "postgres"
engine_version = "15.4"
instance_class = "db.r5.xlarge"
allocated_storage = 100
storage_encrypted = true
db_subnet_group_name = aws_db_subnet_group.main.name
vpc_security_group_ids = [aws_security_group.database.id]
username = "admin"
manage_master_user_password = true # Use AWS Secrets Manager
multi_az = true
backup_retention_period = 7
deletion_protection = true
}
Strengths:
- Massive ecosystem with thousands of community modules for common patterns
- Multi-cloud support (manage RDS alongside GCP Cloud SQL, Azure databases with same tooling)
- Mature state management with remote backends (S3 + DynamoDB locking)
- Battle-tested at scale by enterprises managing thousands of resources
OpenTofu is a community fork maintaining full compatibility with Terraform while ensuring open-source governance after HashiCorp's license change. For users, they're functionally identical—same HCL syntax, same providers, same workflow. OpenTofu represents a commitment to open-source principles while maintaining enterprise-grade capabilities.
Considerations:
- HCL is a domain-specific language (your team learns HCL rather than reusing existing programming-language skills)
- State file management is your responsibility (though well-solved with remote backends)
- Secrets in state files require careful handling (use AWS Secrets Manager integration)
When to choose Terraform/OpenTofu:
- You're building multi-cloud infrastructure or want that future optionality
- Your team values a mature, well-documented ecosystem with extensive community resources
- You prefer declarative configuration over general-purpose programming
Pulumi: Infrastructure as Real Code
Pulumi takes a different approach: write infrastructure code in languages you already know (TypeScript, Python, Go, C#, Java).
import pulumi_aws as aws

# subnet_group and security_group are assumed to be defined earlier in the program
db = aws.rds.Instance("production-postgres",
engine="postgres",
engine_version="15.4",
instance_class="db.r5.xlarge",
allocated_storage=100,
storage_encrypted=True,
db_subnet_group_name=subnet_group.name,
vpc_security_group_ids=[security_group.id],
username="admin",
manage_master_user_password=True,
multi_az=True,
backup_retention_period=7,
deletion_protection=True,
tags={
"Environment": "production",
"ManagedBy": "pulumi"
}
)
Strengths:
- Use familiar languages with full IDE support, type checking, testing frameworks
- Programming language power (loops, conditionals, functions) for complex infrastructure
- Excellent secrets management (encrypted in state by default)
- Multi-cloud support matching Terraform's breadth
Considerations:
- Smaller ecosystem than Terraform (growing rapidly, but fewer pre-built modules)
- Default state backend is Pulumi Cloud (or self-managed S3/Azure Blob)
- Team must choose and standardize on a language
When to choose Pulumi:
- Your team consists of software developers who prefer code over DSLs
- You need complex logic in infrastructure definitions (dynamic resource creation, sophisticated conditionals)
- You want infrastructure tightly coupled with application code in the same language/repository
CloudFormation: AWS-Native Infrastructure as Code
CloudFormation is AWS's original IaC service, predating Terraform. You define resources in YAML or JSON:
Resources:
ProductionDatabase:
Type: AWS::RDS::DBInstance
Properties:
DBInstanceIdentifier: production-postgres
Engine: postgres
EngineVersion: "15.4"
DBInstanceClass: db.r5.xlarge
AllocatedStorage: 100
StorageEncrypted: true
DBSubnetGroupName: !Ref DBSubnetGroup
VPCSecurityGroups:
- !Ref DatabaseSecurityGroup
MasterUsername: admin
ManageMasterUserPassword: true
MultiAZ: true
BackupRetentionPeriod: 7
DeletionProtection: true
Strengths:
- No external state management (AWS tracks stack state)
- Zero additional cost (CloudFormation itself is free)
- Immediate AWS feature support (new RDS features often available in CloudFormation on launch day)
- Deep AWS integration (IAM, service catalogs, change sets for preview)
- Robust rollback on failures
Considerations:
- AWS-only (no multi-cloud portability)
- Verbose templates for complex infrastructure
- Limited modularity compared to Terraform modules (though nested stacks and macros help)
When to choose CloudFormation:
- You're all-in on AWS with no multi-cloud requirements
- You want minimal external tooling dependencies
- You prefer AWS-native solutions for compliance or operational simplicity
AWS CDK: CloudFormation with Code
AWS Cloud Development Kit (CDK) gives you the best of both worlds—code in familiar languages that synthesizes to CloudFormation templates:
import { Duration } from 'aws-cdk-lib';
import * as rds from 'aws-cdk-lib/aws-rds';
import * as ec2 from 'aws-cdk-lib/aws-ec2';

// vpc is assumed to be defined earlier in the stack
const database = new rds.DatabaseInstance(this, 'ProductionDatabase', {
engine: rds.DatabaseInstanceEngine.postgres({
version: rds.PostgresEngineVersion.VER_15_4
}),
instanceType: ec2.InstanceType.of(ec2.InstanceClass.R5, ec2.InstanceSize.XLARGE),
vpc,
vpcSubnets: { subnetType: ec2.SubnetType.PRIVATE_WITH_EGRESS },
credentials: rds.Credentials.fromGeneratedSecret('admin'),
multiAz: true,
storageEncrypted: true,
backupRetention: Duration.days(7),
deletionProtection: true
});
Strengths:
- High-level constructs with sensible defaults (CDK automatically creates subnet groups, parameter groups when needed)
- Type safety and IDE completion from modern languages
- CloudFormation's operational benefits (managed state, rollbacks, change sets)
- AWS-first design often provides simpler APIs than raw CloudFormation
Considerations:
- AWS-only like CloudFormation
- Learning curve for the CDK abstraction layer (even though it outputs CloudFormation)
- Still relatively new compared to Terraform, though rapidly maturing
When to choose CDK:
- You want code-driven infrastructure but prefer AWS-native tooling
- Your team is comfortable with TypeScript/Python/Java and wants infrastructure in the same language as applications
- You value higher-level abstractions over granular control
The Project Planton Choice: Dual-Track IaC Support
Project Planton takes a pragmatic, inclusive approach: support both Terraform and Pulumi as first-class deployment targets.
Why This Decision Makes Sense
Different teams have different needs, histories, and preferences. Rather than forcing a choice, Project Planton's API-driven architecture allows the same declarative specification to be deployed via either tool:
- Define once: Specify your RDS instance configuration in Project Planton's minimal, validated API schema (the 80/20 fields that matter for most use cases)
- Deploy with your preferred tool: Project Planton generates appropriate Terraform HCL or Pulumi code from that specification
- Maintain flexibility: Teams can choose the IaC tool that fits their culture and requirements without changing their database specifications
The Terraform Module (Default)
Terraform serves as Project Planton's default deployment path for several reasons:
- Industry momentum: Terraform remains the most widely adopted IaC tool across industries and cloud providers
- Mature ecosystem: Extensive community resources, training materials, and operational expertise
- Proven at scale: Battle-tested by enterprises managing massive infrastructure footprints
The Terraform module implements production best practices:
- AWS Secrets Manager integration for credential management (avoiding passwords in state)
- Automatic subnet group creation from provided subnet IDs
- Sensible defaults (encryption enabled, deletion protection for production profiles)
- Comprehensive outputs (endpoint, connection information, resource ARNs)
The Pulumi Alternative
Pulumi support recognizes that many development teams prefer infrastructure-as-actual-code:
- Developer-friendly: Teams already proficient in Go, TypeScript, or Python can leverage those skills
- Advanced patterns: Complex conditional logic, dynamic resource creation based on application needs
- Tight integration: Infrastructure code living alongside application code in the same repository, language, and testing framework
Project Planton's Pulumi module provides equivalent functionality to the Terraform version, ensuring feature parity regardless of deployment choice.
What We Don't Support (And Why)
CloudFormation/CDK: While excellent tools for AWS-only workloads, Project Planton's mission includes multi-cloud support. Standardizing on Terraform and Pulumi—both multi-cloud by design—aligns with that vision. Teams preferring CloudFormation can still use Project Planton's specifications as documentation and manually translate to CFN templates.
Ansible/Configuration Management: These tools excel at application deployment and configuration, not infrastructure lifecycle management. For RDS deployment specifically, dedicated IaC tools provide superior state management, drift detection, and change planning.
The 80/20 Configuration Philosophy
AWS RDS exposes hundreds of configuration parameters. Most teams need to configure about 20% of them for 80% of use cases. Project Planton's API reflects this.
The Essential Fields (What Project Planton Exposes)
Networking (Placement and Security)
- Subnet IDs (or subnet group name): Where your database lives—typically private subnets across availability zones
- Security Group IDs: Firewall rules controlling database access
- Publicly Accessible (boolean): Almost always false for production; explicitly required to prevent accidental exposure
Engine and Sizing
- Engine (postgres, mysql, mariadb, oracle-se2, sqlserver-ex, etc.): Database type
- Engine Version: Specific version to run (e.g., 15.4 for PostgreSQL)
- Instance Class: CPU and memory allocation (db.t3.micro for dev, db.r5.xlarge for production)
- Allocated Storage (GiB): Initial disk space
Credentials and Security
- Username: Master database user
- Password: Master password (ideally referenced from secrets management)
- Storage Encrypted (boolean): Enable encryption at rest (default: true in most cases)
- KMS Key ID (optional): Customer-managed encryption key
High Availability
- Multi-AZ (boolean): Deploy standby in different availability zone for automatic failover
Advanced Configuration (Optional)
- Port: Database port (defaults to engine-specific standard)
- Parameter Group Name: Custom database parameters
- Option Group Name: Engine-specific options (primarily Oracle/SQL Server)
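Taken together, these essential fields map almost one-to-one onto a single Terraform resource. The following sketch is illustrative: resource names and values are hypothetical, and production values would come from Project Planton's validated spec.

variable "private_subnet_ids" {
  type = list(string) # private subnets across availability zones
}

variable "database_security_group_ids" {
  type = list(string) # firewall rules controlling database access
}

resource "aws_db_subnet_group" "this" {
  name       = "app-db-subnets" # hypothetical
  subnet_ids = var.private_subnet_ids
}

resource "aws_db_instance" "this" {
  identifier        = "app-postgres" # hypothetical
  engine            = "postgres"     # engine
  engine_version    = "15.4"         # engine version
  instance_class    = "db.t3.micro"  # dev-sized; db.r5.xlarge for production
  allocated_storage = 20             # GiB

  db_subnet_group_name   = aws_db_subnet_group.this.name
  vpc_security_group_ids = var.database_security_group_ids
  publicly_accessible    = false # almost always false

  username                    = "app_admin"
  manage_master_user_password = true # credentials land in Secrets Manager
  storage_encrypted           = true # encryption at rest
  # kms_key_id can reference a customer-managed key when required

  multi_az = false # true for production profiles
  port     = 5432  # defaults to the engine standard if omitted
}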
What We Default or Omit
Many settings have sensible defaults or are managed at the infrastructure platform level:
- Backup retention: Defaulted to 7 days (can be overridden via profiles)
- Maintenance windows: Let AWS choose off-peak times unless specific requirements exist
- Performance Insights: Enabled by default for production profiles
- Auto minor version upgrades: Enabled by default for security patches
- Storage autoscaling: Configured automatically for production databases to prevent space exhaustion
- Enhanced monitoring: Enabled for production, disabled for dev/test
This approach keeps the API simple for common cases while allowing advanced users to specify additional parameters through IaC customization.
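One hypothetical way to express such profile-driven defaults in Terraform; the profile names and values below are illustrative, not Project Planton's actual schema:

variable "profile" {
  type    = string
  default = "production" # or "dev"
}

locals {
  profiles = {
    dev = {
      multi_az                     = false
      backup_retention_period      = 1
      performance_insights_enabled = false
      monitoring_interval          = 0 # enhanced monitoring disabled
    }
    production = {
      multi_az                     = true
      backup_retention_period      = 7
      performance_insights_enabled = true
      monitoring_interval          = 60 # seconds between OS-level metric samples
    }
  }
  db_defaults = local.profiles[var.profile] # merged into the aws_db_instance arguments
}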
Production Deployment Patterns
High Availability: Multi-AZ is Non-Negotiable
For production databases, Multi-AZ deployment is fundamental. AWS maintains a synchronous standby in a different availability zone. On primary failure (instance crash, AZ outage), RDS automatically fails over to the standby—typically within 60-120 seconds for standard RDS, as fast as 30 seconds for Aurora.
What Multi-AZ costs vs. provides:
- Cost: Essentially double instance cost (you pay for the standby)
- Benefit: Automatic failover, reduced downtime during maintenance, protection against AZ failure
When to skip Multi-AZ: Development and test environments where temporary database unavailability is acceptable and cost savings matter.
Backups and Disaster Recovery
Automated backups (enabled by default with 7-day retention) provide point-in-time recovery within the retention window. AWS takes daily snapshots and continuously archives transaction logs.
For production:
- Minimum 7 days retention (14-30 days for critical systems)
- Manual snapshots before major changes (schema migrations, major version upgrades)
- Cross-region snapshot copies for geographic disaster recovery if required
- Deletion protection enabled (prevents accidental database deletion)
Final snapshots: When deleting a database, RDS can create a final snapshot. CloudFormation's default DeletionPolicy for DB instances is Snapshot (safe); Terraform will not destroy the instance unless you either supply a final_snapshot_identifier or set skip_final_snapshot = true, a flag many examples enable, silently discarding the safety net. Project Planton enforces final snapshots for production-profile databases.
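In Terraform, the relevant arguments look roughly like this (a sketch; identifiers are hypothetical):

resource "aws_db_instance" "production" {
  identifier                  = "production-postgres"
  engine                      = "postgres"
  instance_class              = "db.r5.xlarge"
  allocated_storage           = 100
  username                    = "admin"
  manage_master_user_password = true

  deletion_protection       = true                        # block accidental deletion
  skip_final_snapshot       = false                       # Terraform's default, stated explicitly
  final_snapshot_identifier = "production-postgres-final" # required when a final snapshot is taken
}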
Network Isolation: Private Subnets Only
Production databases should never have public IP addresses. They belong in private subnets with security groups allowing access only from:
- Application tier (via security group references)
- Bastion hosts or VPN for administrative access
- Specific CIDR ranges if connecting from on-premises via Direct Connect/VPN
The publicly_accessible = false setting is enforced by default in Project Planton's production profiles.
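A minimal Terraform sketch of that pattern (variable names are hypothetical), granting access via a security-group reference rather than a CIDR range:

variable "vpc_id" {
  type = string
}

variable "app_security_group_id" {
  type = string # the application tier's security group
}

resource "aws_security_group" "database" {
  name   = "database-sg" # hypothetical
  vpc_id = var.vpc_id

  ingress {
    description     = "PostgreSQL from the app tier only"
    from_port       = 5432
    to_port         = 5432
    protocol        = "tcp"
    security_groups = [var.app_security_group_id] # SG reference, not a CIDR block
  }
}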
Encryption: Always, At Rest and In Transit
At-rest encryption (via KMS) should be enabled for all databases containing sensitive data. There's negligible performance impact and it's often required for compliance.
Once created, an unencrypted database cannot be encrypted in place—you must snapshot and restore to an encrypted instance. Start encrypted from day one.
In-transit encryption (TLS/SSL): RDS provides SSL certificates for database connections. Applications should use SSL connection strings, and database parameters can enforce SSL connections (e.g., PostgreSQL's rds.force_ssl parameter).
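For PostgreSQL, enforcing TLS is a one-parameter change in a custom parameter group; a sketch (the group name is hypothetical):

resource "aws_db_parameter_group" "postgres_tls" {
  name   = "postgres15-force-ssl" # hypothetical
  family = "postgres15"

  parameter {
    name  = "rds.force_ssl"
    value = "1" # reject unencrypted connections
  }
}

Attach it to the instance via parameter_group_name and non-TLS clients will be refused.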
Monitoring and Observability
Production databases require comprehensive monitoring:
- CloudWatch metrics (CPU, storage, IOPS, connections): Set alarms for thresholds
- Enhanced Monitoring (OS-level metrics): Identify disk queue depth, memory pressure
- Performance Insights: Query-level performance analysis (which queries consume resources)
- CloudWatch Logs: Export slow query logs, general logs for analysis and alerting
Project Planton's Terraform/Pulumi modules automatically configure these for production-profile databases while keeping dev/test lightweight.
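As one example of the kind of alarm such modules wire up, a CloudWatch CPU alarm in Terraform might look like this (the threshold and SNS topic are hypothetical, and aws_db_instance.production reuses the earlier sketch):

variable "alerts_sns_topic_arn" {
  type = string # hypothetical SNS topic for on-call notifications
}

resource "aws_cloudwatch_metric_alarm" "db_cpu_high" {
  alarm_name          = "production-postgres-cpu-high"
  namespace           = "AWS/RDS"
  metric_name         = "CPUUtilization"
  statistic           = "Average"
  period              = 300 # five-minute windows
  evaluation_periods  = 3   # sustained for fifteen minutes
  threshold           = 80
  comparison_operator = "GreaterThanThreshold"

  dimensions = {
    DBInstanceIdentifier = aws_db_instance.production.identifier
  }

  alarm_actions = [var.alerts_sns_topic_arn]
}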
Credential Management: No Passwords in Code
Hardcoded passwords in infrastructure code are an anti-pattern. Modern approaches:
- AWS Secrets Manager integration: Set manage_master_user_password = true and RDS generates a strong password, stores it in Secrets Manager, and rotates it on schedule
- IAM Database Authentication: For MySQL/PostgreSQL, allow connection using IAM tokens instead of passwords (15-minute validity, no long-lived credentials)
Project Planton's API encourages Secrets Manager integration by default, with password fields available only for specific migration scenarios requiring manual credential control.
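Building on the earlier aws_db_instance.production sketch with manage_master_user_password = true, Terraform exposes where the generated secret lives, so nothing sensitive needs to appear in code or state outputs:

# RDS created and rotates the master password; Terraform only sees its ARN.
output "master_user_secret_arn" {
  value = aws_db_instance.production.master_user_secret[0].secret_arn
}

# Other configuration (or applications at deploy time) can resolve the current value.
data "aws_secretsmanager_secret_version" "db_master" {
  secret_id = aws_db_instance.production.master_user_secret[0].secret_arn
}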
Engine-Specific Considerations
PostgreSQL: Open Source Workhorse
PostgreSQL on RDS has become a go-to choice for modern applications requiring robust features, strong standards compliance, and rich extension support.
Why PostgreSQL:
- Extensions: PostGIS (geospatial), pg_cron (scheduled jobs), full-text search, JSON/JSONB support
- Modern versions: AWS typically supports new major versions within months of community release
- Read replicas: Up to 15 read replicas, including cross-region (see the Terraform sketch after this list)
- Strong ACID guarantees: Ideal for transactional workloads
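A read replica in Terraform is simply a second aws_db_instance pointing back at the primary; engine settings and credentials are inherited (a sketch with hypothetical names):

resource "aws_db_instance" "read_replica" {
  identifier          = "production-postgres-replica-1" # hypothetical
  replicate_source_db = aws_db_instance.production.identifier
  instance_class      = "db.r5.xlarge" # can differ from the primary
  publicly_accessible = false
  skip_final_snapshot = true # replicas are disposable; recreate from the primary
}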
When to consider Aurora PostgreSQL instead:
- Need read scaling with minimal replica lag (Aurora supports up to 15 replicas that read from shared cluster storage)
- Require very fast failover (<30 seconds)
- High I/O workload where Aurora's distributed storage provides better throughput
- Global database requirements (cross-region with low lag)
MySQL: Web Application Standard
MySQL (especially 8.0) remains popular for web applications, content management systems, and high-throughput OLTP workloads.
Why MySQL:
- Ecosystem maturity: Decades of tooling, frameworks, DBA knowledge
- Performance: Modern MySQL 8.0 with InnoDB provides excellent transactional performance
- Compatibility: Many open-source applications assume MySQL
When to consider Aurora MySQL instead:
- Read-heavy workloads benefiting from many replicas
- Scaling beyond single-instance I/O limits (Aurora claims 5× MySQL throughput at scale)
- Serverless workloads (Aurora Serverless v2 auto-scales compute)
MariaDB: The MySQL Fork
MariaDB offers a fully open-source alternative to Oracle-owned MySQL, with some additional features.
When to choose MariaDB:
- Existing MariaDB applications
- Preference for fully open-source stack without Oracle involvement
- Specific MariaDB features (though increasingly, MySQL 8.0 has caught up or diverged)
Note: Aurora does not support MariaDB. If considering Aurora's benefits, stick with MySQL or migrate to PostgreSQL.
Oracle: Enterprise Legacy
RDS for Oracle serves organizations with existing Oracle workloads who want managed infrastructure without full DBA overhead.
Key considerations:
- Licensing complexity: Standard Edition 2 vs. Enterprise Edition; license-included vs. BYOL
- Edition limitations: RAC is not available; Multi-AZ relies on Amazon's own failover technology rather than RAC
- Cost: Oracle licensing makes this the most expensive RDS option per instance
- When to use: Required by vendor applications, existing Oracle expertise, specific Oracle features (PL/SQL, partitioning, Oracle Spatial)
Migration consideration: Many organizations are moving from Oracle to PostgreSQL (Aurora or RDS) for cost savings and open-source benefits. AWS Database Migration Service supports this path.
SQL Server: Windows Ecosystem Database
RDS for SQL Server supports Microsoft's database for Windows-centric applications.
Key considerations:
- License-included only: AWS pricing includes SQL Server licenses (Microsoft ended BYOL for RDS)
- Edition choices: Express, Web, Standard, Enterprise (features and cost vary significantly)
- Windows Authentication: Can integrate with AWS Managed Microsoft AD for domain authentication
- Limitations: SSRS, SSIS, and SSAS are available only as RDS options with notable restrictions; workloads needing the full BI stack may still require SQL Server on EC2
When to use: .NET applications, existing SQL Server expertise, Windows integrated authentication requirements.
Cost Optimization: Running RDS Efficiently
Database costs can quickly spiral without active management. Key optimization strategies:
Right-Sizing Instances
Graviton instances (db.m6g, db.r6g, db.t4g): ~20% cheaper than Intel/AMD equivalents with comparable performance. Use unless specific x86 dependency exists.
Burstable instances (T-class): Great for dev/test or variable workloads, but monitor CPU credit balance. Sustained high CPU exhausts credits, causing throttling.
Scheduled scaling: For non-production environments, stop instances outside business hours. An RDS instance can stay stopped for up to 7 days (after which it auto-starts). Storage is still billed while stopped, but a 12-hours-on daily pattern roughly halves instance costs.
Storage Optimization
GP3 over GP2: AWS's newer gp3 storage provides better cost/performance, with baseline 3,000 IOPS and ability to provision more independently of storage size.
Storage autoscaling: Enable with a reasonable maximum to prevent storage exhaustion (which can corrupt databases) while avoiding massive over-provisioning.
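A Terraform sketch combining both storage recommendations (identifiers are hypothetical):

resource "aws_db_instance" "example" {
  identifier                  = "app-postgres" # hypothetical
  engine                      = "postgres"
  instance_class              = "db.m6g.large" # Graviton for the price/performance win
  username                    = "app_admin"
  manage_master_user_password = true

  storage_type          = "gp3" # baseline 3,000 IOPS regardless of volume size
  allocated_storage     = 100   # starting size in GiB
  max_allocated_storage = 500   # enables storage autoscaling up to this cap
}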
Reserved Instances for Production
Production databases run 24/7 for years. RDS Reserved Instances provide:
- 1-year commitment: ~35-45% savings
- 3-year commitment: ~55-65% savings
For production, RIs are essentially free money if you're confident in instance class and region.
Multi-Environment Strategy
- Production: Right-sized, Multi-AZ, Reserved Instances
- Staging: Right-sized, single-AZ (unless testing failover), on-demand
- Development: Smaller instance classes, single-AZ, stopped outside work hours
Don't run dev environments at production scale unless actively testing scale.
Conclusion: Infrastructure as Code Maturity for Databases
The journey from clicking "Create Database" in the console to sophisticated infrastructure-as-code deployment represents a fundamental shift in how organizations approach operational maturity.
Manual operations might seem faster for a single database, but they scale linearly with human effort—every new environment, every configuration change requires someone to remember the right settings, click the right checkboxes, and hope they didn't miss anything. Infrastructure as code amortizes that effort: the initial investment in defining resources properly pays dividends across every subsequent deployment, update, and disaster recovery scenario.
Project Planton's approach—providing a minimal, validated API focused on essential configuration combined with production-ready Terraform and Pulumi implementations—aims to make that maturity accessible. Teams shouldn't need to become RDS experts, understanding every nuance of parameter groups and option groups, to deploy a secure, well-configured database. They should declare their intent (PostgreSQL 15, production-grade, private network, Multi-AZ) and trust that the infrastructure code translates that intent into AWS API calls with appropriate security defaults, monitoring, and operational best practices.
Whether you choose Terraform for its mature ecosystem, Pulumi for its developer-friendly approach, or maintain CloudFormation for AWS-native operations, the key is moving beyond ephemeral, undocumented manual operations to infrastructure as reviewable, testable, versionable code. Your databases—and your future selves—will thank you.
For detailed implementation guides:
- Terraform Module Implementation (coming soon)
- Pulumi Module Implementation (coming soon)
- Aurora vs RDS Decision Guide (coming soon)