Cost Optimization in AWS: 30% Reduction Without Sacrificing Performance

When I joined the team, we were hitting our AWS budget limits every month. Leadership flagged it as a priority issue that needed immediate attention. The challenge was clear: reduce costs by at least 25% without impacting availability or performance.

What followed was a systematic approach to cloud cost optimization that not only met our target but exceeded it. Here's the real-world case study of how we achieved a 30% cost reduction in just 60 days.

The Challenge: Rising Cloud Costs

Our AWS bill had been steadily climbing month over month, with several red flags:

  • 📈 Monthly costs increasing by 15-20% without proportional growth in usage
  • 🔍 No visibility into cost drivers or unused resources
  • ⚠️ Leadership pressure to control spending without impacting delivery
  • 🎯 Target: Reduce costs by 25% minimum

The Strategy: Three-Pronged Optimization Approach

I implemented a systematic cost optimization strategy focusing on the biggest impact areas:

Figure 1: Complete AWS cost optimization strategy flowchart - from budget constraints to 30% cost reduction through three key optimization areas

🧹 1. Unused EBS Volume Cleanup

The Problem: During my infrastructure audit, I discovered over 100 unattached EBS volumes left behind from previously terminated EC2 instances. These "zombie" volumes were accumulating monthly storage charges with zero business value.

The Impact:

  • Cost: Nearly $1,200 per month in wasted storage
  • Root Cause: Untagged, unmanaged volumes from past deployments
  • Risk: Growing monthly as teams continued deploying without cleanup processes

The Solution:

I implemented an automated cleanup system using AWS native services:

# AWS CLI script to identify unattached volumes
aws ec2 describe-volumes \
  --filters Name=status,Values=available \
  --query 'Volumes[?CreateTime<=`2024-01-01`].[VolumeId,CreateTime,Size]' \
  --output table

# Lambda function for automated cleanup
import boto3
from datetime import datetime, timedelta, timezone

def lambda_handler(event, context):
    ec2 = boto3.client('ec2')

    # Only touch unattached volumes older than 14 days
    cutoff_date = datetime.now(timezone.utc) - timedelta(days=14)

    volumes = ec2.describe_volumes(
        Filters=[{'Name': 'status', 'Values': ['available']}]
    )

    for volume in volumes['Volumes']:
        # The EC2 create-time filter only matches exact timestamps,
        # so the age check is done here instead
        if volume['CreateTime'] > cutoff_date:
            continue

        # Create snapshot before deletion
        snapshot = ec2.create_snapshot(
            VolumeId=volume['VolumeId'],
            Description=f"Auto-backup before cleanup - {volume['VolumeId']}"
        )

        # Send Slack notification
        send_slack_alert(volume['VolumeId'], snapshot['SnapshotId'])

        # Deleting the volume is safe once the snapshot request is accepted;
        # the snapshot captures the volume's data at that point in time
        ec2.delete_volume(VolumeId=volume['VolumeId'])

Implementation Steps:

  1. Discovery: Used AWS CLI and boto3 scripts to identify unattached volumes
  2. Safety: Created automated snapshots before any deletion
  3. Notification: Integrated Slack alerts for transparency (a helper sketch follows this list)
  4. Automation: Set up weekly cleanup via Lambda and CloudWatch Events
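
The send_slack_alert helper referenced in step 3 and used in the Lambda above is just a webhook call. Here is a minimal sketch, assuming the Slack incoming-webhook URL is supplied through a SLACK_WEBHOOK_URL environment variable (the variable name is illustrative, not part of the original setup):

# Hypothetical Slack notification helper for the cleanup Lambda
import json
import os
import urllib.request

def send_slack_alert(volume_id, snapshot_id):
    # SLACK_WEBHOOK_URL is an assumed environment variable holding the
    # Slack incoming-webhook endpoint
    message = {
        'text': f'EBS cleanup: volume {volume_id} snapshotted as {snapshot_id} and queued for deletion.'
    }
    request = urllib.request.Request(
        os.environ['SLACK_WEBHOOK_URL'],
        data=json.dumps(message).encode('utf-8'),
        headers={'Content-Type': 'application/json'},
    )
    # Slack responds with "ok" on success; any HTTP error surfaces to the caller
    urllib.request.urlopen(request)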

Results:

  • ✅ 35% reduction in monthly EBS costs immediately
  • ✅ Zero service impact - only unused resources affected
  • ✅ Ongoing prevention through automated monitoring

πŸ• 2. Scheduled Shutdown of Non-Production Resources

The Problem: Development and staging environments were running 24/7, even though teams used them only during business hours (9 AM - 6 PM). We were paying for EC2 and RDS capacity overnight and on weekends while those environments sat idle.

The Cost Impact:

  • Waste: ~60% of non-prod compute hours were unused
  • Annual Cost: Over $50,000 in unnecessary charges
  • Scope: 40+ EC2 instances and 15+ RDS instances across dev/staging

The Solution:

I implemented tag-driven start/stop scheduling, modeled on AWS Instance Scheduler, using EventBridge rules and Lambda:

# CloudFormation template for Instance Scheduler
AWSTemplateFormatVersion: '2010-09-09'
Resources:
  OfficeHoursSchedule:
    Type: AWS::Events::Rule
    Properties:
      ScheduleExpression: 'cron(0 7 ? * MON-FRI *)' # Start at 7 AM on weekdays
      State: ENABLED
      Targets:
        - Arn: !GetAtt StartInstancesFunction.Arn
          Id: 'StartInstancesTarget'
          Input: '{"action": "start"}'

  EveningShutdown:
    Type: AWS::Events::Rule
    Properties:
      ScheduleExpression: 'cron(0 19 ? * MON-FRI *)' # Stop at 7 PM on weekdays, covering nights and weekends
      State: ENABLED
      Targets:
        - Arn: !GetAtt StopInstancesFunction.Arn
          Id: 'StopInstancesTarget'
          Input: '{"action": "stop"}'

# Lambda function for intelligent start/stop
import boto3

def lambda_handler(event, context):
    ec2 = boto3.client('ec2')
    rds = boto3.client('rds')

    action = event.get('action', 'stop')  # start or stop

    # Find instances with Schedule=OfficeHours tag
    instances = ec2.describe_instances(
        Filters=[
            {'Name': 'tag:Schedule', 'Values': ['OfficeHours']},
            {'Name': 'instance-state-name', 'Values': ['running' if action == 'stop' else 'stopped']}
        ]
    )

    for reservation in instances['Reservations']:
        for instance in reservation['Instances']:
            if action == 'stop':
                ec2.stop_instances(InstanceIds=[instance['InstanceId']])
            else:
                ec2.start_instances(InstanceIds=[instance['InstanceId']])

    # Handle RDS instances similarly
    db_instances = rds.describe_db_instances()
    for db in db_instances['DBInstances']:
        tags = rds.list_tags_for_resource(ResourceName=db['DBInstanceArn'])
        if any(tag['Key'] == 'Schedule' and tag['Value'] == 'OfficeHours'
               for tag in tags['TagList']):
            if action == 'stop':
                rds.stop_db_instance(DBInstanceIdentifier=db['DBInstanceIdentifier'])
            else:
                rds.start_db_instance(DBInstanceIdentifier=db['DBInstanceIdentifier'])

Implementation Process:

  1. Team Coordination: Worked with development teams to confirm safe shutdown windows
  2. Tagging Strategy: Applied Schedule=OfficeHours tags to appropriate resources (a tagging sketch follows this list)
  3. Gradual Rollout: Started with dev environment, then staging
  4. Monitoring: Set up CloudWatch alarms to ensure services restarted properly
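
For step 2, tagging was a one-time bulk operation. A minimal sketch of how the Schedule=OfficeHours tag can be applied with boto3 (the instance IDs are placeholders):

# Illustrative tagging sketch: mark non-production instances for the scheduler
import boto3

ec2 = boto3.client('ec2')

# Placeholder IDs standing in for the dev/staging fleet
instance_ids = ['i-0123456789abcdef0', 'i-0fedcba9876543210']

ec2.create_tags(
    Resources=instance_ids,
    Tags=[{'Key': 'Schedule', 'Value': 'OfficeHours'}]
)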

Results:

  • ✅ 40-45% reduction in monthly EC2 and RDS costs for non-production
  • ✅ Zero complaints from development teams
  • ✅ Improved discipline around resource management

πŸ—‚οΈ 3. S3 Lifecycle Policy Implementation

The Problem: Our S3 buckets had become data graveyards, storing large amounts of log files, backups, and build artifacts that were rarely accessed after 30 days. We were paying S3 Standard pricing for data that belonged in cheaper storage classes.

The Analysis: Using AWS S3 Storage Lens and CloudTrail logs, I discovered:

  • 75% of objects were never accessed after 30 days
  • 90% of objects were never accessed after 90 days
  • Annual waste: ~$25,000 in suboptimal storage costs

The Solution:

I designed intelligent lifecycle policies based on access patterns:

{
  "Rules": [
    {
      "ID": "LogsLifecycle",
      "Status": "Enabled",
      "Filter": {
        "Prefix": "logs/"
      },
      "Transitions": [
        {
          "Days": 30,
          "StorageClass": "STANDARD_IA"
        },
        {
          "Days": 90,
          "StorageClass": "GLACIER"
        },
        {
          "Days": 180,
          "StorageClass": "DEEP_ARCHIVE"
        }
      ],
      "Expiration": {
        "Days": 365
      }
    },
    {
      "ID": "BackupsLifecycle",
      "Status": "Enabled",
      "Filter": {
        "Prefix": "backups/"
      },
      "Transitions": [
        {
          "Days": 7,
          "StorageClass": "STANDARD_IA"
        },
        {
          "Days": 30,
          "StorageClass": "GLACIER"
        }
      ]
    }
  ]
}

Smart Storage Strategy:

  • Standard: Active data (0-30 days)
  • Standard-IA: Infrequently accessed data (30-90 days)
  • Glacier: Archive data (90-180 days)
  • Deep Archive: Long-term retention (180+ days)
  • Intelligent Tiering: For unpredictable access patterns

Implementation Steps:

  1. Audit: Used S3 Storage Lens to analyze access patterns
  2. Categorization: Grouped data by access frequency and business requirements
  3. Policy Design: Created lifecycle rules based on data classification
  4. Testing: Applied policies to test buckets first (a rollout sketch follows this list)
  5. Rollout: Gradual deployment across all S3 buckets
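
For steps 4 and 5, the lifecycle JSON above can also be applied programmatically, which makes the test-bucket trial and the wider rollout repeatable. A minimal sketch, with a placeholder bucket name:

# Illustrative rollout sketch: apply the logs lifecycle rule to a test bucket first
import boto3

s3 = boto3.client('s3')

lifecycle_configuration = {
    'Rules': [
        {
            'ID': 'LogsLifecycle',
            'Status': 'Enabled',
            'Filter': {'Prefix': 'logs/'},
            'Transitions': [
                {'Days': 30, 'StorageClass': 'STANDARD_IA'},
                {'Days': 90, 'StorageClass': 'GLACIER'},
                {'Days': 180, 'StorageClass': 'DEEP_ARCHIVE'},
            ],
            'Expiration': {'Days': 365},
        }
    ]
}

# Placeholder bucket; the same call can then be repeated across the remaining buckets
s3.put_bucket_lifecycle_configuration(
    Bucket='example-lifecycle-test-bucket',
    LifecycleConfiguration=lifecycle_configuration,
)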

Results:

  • ✅ 50% reduction in S3 storage costs
  • ✅ Compliance alignment with data retention policies
  • ✅ Automated management - no manual intervention required

📊 Final Results: 30% Cost Reduction in 60 Days

The three optimization strategies combined delivered exceptional results:

Cost Impact Summary:

  • EBS Optimization: 35% reduction in storage costs
  • Instance Scheduling: 40-45% reduction in non-prod compute costs
  • S3 Lifecycle: 50% reduction in storage costs
  • Overall Impact: 30% total AWS cost reduction

Business Benefits:

  • 💰 $75,000+ annual savings
  • 📈 Budget flexibility for new initiatives
  • 🎯 Performance maintained - zero impact on availability
  • 🔄 Sustainable practices through automation

Cultural Impact:

  • 📋 Cost awareness became part of deployment processes
  • 🏷️ Improved tagging standards across all resources
  • 📊 Regular cost reviews established as team ritual
  • 🎓 Knowledge sharing through documentation and training

πŸ› οΈ Tools and Technologies Used

AWS Native Services:

  • AWS Cost Explorer: Cost analysis and trends (a query sketch follows this list)
  • AWS Trusted Advisor: Cost optimization recommendations
  • CloudWatch: Monitoring and alerting
  • Lambda: Automation functions
  • EventBridge: Scheduling and event-driven automation
  • S3 Storage Lens: Storage usage analytics
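
As a concrete example of the Cost Explorer usage above, here is a minimal sketch that pulls monthly cost per service through the Cost Explorer API (the date range is a placeholder):

# Illustrative Cost Explorer query: monthly unblended cost grouped by service
import boto3

ce = boto3.client('ce')

response = ce.get_cost_and_usage(
    TimePeriod={'Start': '2024-01-01', 'End': '2024-04-01'},  # placeholder range
    Granularity='MONTHLY',
    Metrics=['UnblendedCost'],
    GroupBy=[{'Type': 'DIMENSION', 'Key': 'SERVICE'}],
)

for period in response['ResultsByTime']:
    for group in period['Groups']:
        service = group['Keys'][0]
        amount = float(group['Metrics']['UnblendedCost']['Amount'])
        print(f"{period['TimePeriod']['Start']} {service}: ${amount:.2f}")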

Automation Stack:

  • Python/Boto3: AWS automation scripts
  • CloudFormation: Infrastructure as Code
  • AWS CLI: Command-line operations
  • Slack Integration: Team notifications

Monitoring Tools:

  • CloudWatch Dashboards: Cost and usage visualization
  • Billing Alerts: Proactive cost monitoring
  • Custom Metrics: Resource utilization tracking
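
For the custom metrics item above, the cleanup and scheduling Lambdas can publish simple counters to CloudWatch so waste stays visible on the dashboards. A minimal sketch (the namespace and metric name are illustrative):

# Illustrative custom metric: publish the number of unattached EBS volumes found
import boto3

cloudwatch = boto3.client('cloudwatch')

def publish_unattached_volume_count(count):
    cloudwatch.put_metric_data(
        Namespace='CostOptimization',  # assumed namespace
        MetricData=[{
            'MetricName': 'UnattachedEBSVolumes',
            'Value': count,
            'Unit': 'Count',
        }],
    )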

🎯 Key Lessons Learned

1. Start with the Biggest Impact

Focus on high-cost, low-complexity optimizations first:

  • Unused resources (immediate wins)
  • Scheduling (predictable savings)
  • Storage optimization (long-term benefits)

2. Automation is Essential

Manual cost optimization doesn't scale:

  • Automate discovery and cleanup
  • Build cost awareness into CI/CD
  • Create self-healing cost controls

3. Team Collaboration is Critical

Cost optimization requires buy-in:

  • Educate teams on cloud economics
  • Make cost visibility part of daily workflow
  • Celebrate cost optimization wins

4. Measure and Monitor Continuously

What gets measured gets managed:

  • Set up cost alerts and budgets
  • Regular cost review meetings
  • Track optimization metrics over time

🚀 Next Steps: Building a Cost-Conscious Culture

This project established cost optimization as a core DevOps practice. Here's how we're continuing the journey:

Ongoing Initiatives:

  • Reserved Instance optimization for predictable workloads
  • Spot Instance adoption for fault-tolerant applications
  • Right-sizing analysis using AWS Compute Optimizer
  • Cross-account cost allocation and chargeback systems

Process Improvements:

  • Cost review in all architecture decisions
  • Tagging standards enforced via Service Control Policies
  • Monthly cost optimization sprints
  • Team training on FinOps best practices

💡 Actionable Takeaways

If you're facing similar AWS cost challenges, here's your action plan:

Week 1: Discovery

  • Audit unused EBS volumes and snapshots
  • Identify 24/7 non-production resources
  • Analyze S3 storage patterns with Storage Lens

Week 2-3: Quick Wins

  • Clean up unused resources manually
  • Implement basic scheduling for dev/staging
  • Apply simple S3 lifecycle policies

Week 4-6: Automation

  • Build Lambda functions for automated cleanup
  • Set up EventBridge scheduling rules
  • Implement comprehensive lifecycle policies

Week 7-8: Monitoring

  • Create cost dashboards
  • Set up billing alerts (see the alarm sketch below)
  • Establish regular cost review processes
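
For the billing alerts step, one option is a CloudWatch alarm on the AWS/Billing EstimatedCharges metric. A minimal sketch, assuming billing alerts are enabled for the account and an SNS topic already exists (the threshold and topic ARN are placeholders):

# Illustrative billing alarm: notify when estimated monthly charges cross a threshold
import boto3

# Billing metrics are published only in us-east-1
cloudwatch = boto3.client('cloudwatch', region_name='us-east-1')

cloudwatch.put_metric_alarm(
    AlarmName='monthly-aws-spend-threshold',
    Namespace='AWS/Billing',
    MetricName='EstimatedCharges',
    Dimensions=[{'Name': 'Currency', 'Value': 'USD'}],
    Statistic='Maximum',
    Period=21600,                  # evaluate every 6 hours
    EvaluationPeriods=1,
    Threshold=10000.0,             # placeholder monthly budget in USD
    ComparisonOperator='GreaterThanThreshold',
    AlarmActions=['arn:aws:sns:us-east-1:123456789012:billing-alerts'],  # placeholder ARN
)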

Cost optimization in AWS isn't just about reducing bills; it's about building sustainable, efficient cloud practices that scale with your business. The 30% reduction we achieved was just the beginning of a cultural shift toward cost-conscious engineering.

Remember: every dollar saved on waste is a dollar that can be invested in innovation.

🔗 Want to Learn More?

If you're interested in implementing similar cost optimization strategies or need help with your AWS cost challenges, I'd love to connect and share more detailed implementation guides.

Connect with me on LinkedIn for more DevOps and cloud optimization insights, or explore my other articles on AWS best practices and infrastructure automation.


Have you implemented AWS cost optimization in your organization? Share your experiences and challenges in the comments below!
