ECS Fargate
Container Compute for Backend
Amazon ECS on AWS Fargate runs the Python backend API, the Celery workers, and the Celery Beat scheduler as serverless containers.
Architecture
┌──────────────────────────────────────────────────────────────────┐
│  ECS Cluster: amply-prod                                         │
│                                                                  │
│  ┌────────────────────────────────────────────────────────────┐  │
│  │  Service: amply-api                                        │  │
│  │                                                            │  │
│  │  ┌──────────────┐  ┌──────────────┐  ┌──────────────┐      │  │
│  │  │   Task 1     │  │   Task 2     │  │   Task N     │      │  │
│  │  │    (API)     │  │    (API)     │  │    (API)     │      │  │
│  │  └──────────────┘  └──────────────┘  └──────────────┘      │  │
│  │                                                            │  │
│  └────────────────────────────────────────────────────────────┘  │
│                                                                  │
│  ┌────────────────────────────────────────────────────────────┐  │
│  │  Service: amply-worker                                     │  │
│  │                                                            │  │
│  │  ┌──────────────┐  ┌──────────────┐                        │  │
│  │  │   Task 1     │  │   Task 2     │                        │  │
│  │  │   (Celery)   │  │   (Celery)   │                        │  │
│  │  └──────────────┘  └──────────────┘                        │  │
│  │                                                            │  │
│  └────────────────────────────────────────────────────────────┘  │
│                                                                  │
│  ┌────────────────────────────────────────────────────────────┐  │
│  │  Service: amply-beat                                       │  │
│  │                                                            │  │
│  │  ┌──────────────┐                                          │  │
│  │  │   Task 1     │  (single instance for scheduler)         │  │
│  │  │ (Celery Beat)│                                          │  │
│  │  └──────────────┘                                          │  │
│  │                                                            │  │
│  └────────────────────────────────────────────────────────────┘  │
│                                                                  │
└──────────────────────────────────────────────────────────────────┘
Services
API Service
Runs the FastAPI application.
# Task Definition
family: amply-api
cpu: 512      # 0.5 vCPU
memory: 1024  # 1 GB
containerDefinitions:
  - name: api
    image: xxxx.dkr.ecr.eu-central-1.amazonaws.com/amply-backend:latest
    command: ["uvicorn", "amply.main:app", "--host", "0.0.0.0", "--port", "8000"]
    portMappings:
      - containerPort: 8000
    environment:
      - name: ENVIRONMENT
        value: production
    secrets:
      - name: DATABASE_URL
        valueFrom: arn:aws:secretsmanager:...:amply/database-url
      - name: STRIPE_SECRET_KEY
        valueFrom: arn:aws:secretsmanager:...:amply/stripe-secret
    logConfiguration:
      logDriver: awslogs
      options:
        awslogs-group: /ecs/amply-api
        awslogs-region: eu-central-1
        awslogs-stream-prefix: api
    healthCheck:
      command: ["CMD-SHELL", "curl -f http://localhost:8000/health || exit 1"]
      interval: 30
      timeout: 5
      retries: 3
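ECS injects both the `environment` and the `secrets` entries into the container as environment variables. The following is a minimal sketch of how the application might read them at startup and fail fast when one is missing. The `Settings` class and `load_settings` function are hypothetical names used for illustration, not part of the actual codebase.

```python
import os
from dataclasses import dataclass


@dataclass(frozen=True)
class Settings:
    """Values the API task definition passes to the container (hypothetical holder)."""
    environment: str
    database_url: str
    stripe_secret_key: str


def load_settings(env=os.environ) -> Settings:
    """Read the variables declared in the task definition, failing fast if any is missing."""
    required = ("ENVIRONMENT", "DATABASE_URL", "STRIPE_SECRET_KEY")
    missing = [name for name in required if not env.get(name)]
    if missing:
        raise RuntimeError(f"Missing environment variables: {', '.join(missing)}")
    return Settings(
        environment=env["ENVIRONMENT"],
        database_url=env["DATABASE_URL"],
        stripe_secret_key=env["STRIPE_SECRET_KEY"],
    )
```

Failing at startup keeps a misconfigured task from passing its health check and receiving traffic.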
Service Configuration:
serviceName: amply-api
desiredCount: 2  # Minimum 2 for HA
# launchType is omitted: ECS rejects a service that sets both launchType
# and capacityProviderStrategy
platformVersion: LATEST
networkConfiguration:
  awsvpcConfiguration:
    subnets:
      - subnet-private-1a
      - subnet-private-1b
    securityGroups:
      - sg-ecs
    assignPublicIp: DISABLED
loadBalancers:
  - targetGroupArn: arn:aws:elasticloadbalancing:...:amply-api-tg
    containerName: api
    containerPort: 8000
deploymentConfiguration:
  minimumHealthyPercent: 100
  maximumPercent: 200
capacityProviderStrategy:
  - capacityProvider: FARGATE
    weight: 1
    base: 2    # The first 2 tasks always run on on-demand Fargate
  - capacityProvider: FARGATE_SPOT
    weight: 3  # Beyond the base, about 3 of every 4 tasks run on Spot
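To see what the `base` and `weight` values in the capacity provider strategy mean in practice, here is a rough sketch of the split for a given task count. ECS does not document its exact rounding, so the leftover handling below is an assumption, and `split_tasks` is a hypothetical helper rather than an AWS API.

```python
def split_tasks(desired: int, providers: list[dict]) -> dict[str, int]:
    """Approximate how ECS spreads `desired` tasks across capacity providers."""
    counts = {p["name"]: 0 for p in providers}
    remaining = desired

    # Step 1: satisfy each provider's base first.
    for p in providers:
        take = min(p.get("base", 0), remaining)
        counts[p["name"]] += take
        remaining -= take

    # Step 2: spread what is left in proportion to the weights.
    total_weight = sum(p["weight"] for p in providers)
    if remaining and total_weight:
        shares = [(remaining * p["weight"] / total_weight, p["name"]) for p in providers]
        for share, name in shares:
            counts[name] += int(share)
        leftover = remaining - sum(int(share) for share, _ in shares)
        # Assumption: rounding leftovers go to the largest fractional shares.
        by_fraction = sorted(shares, key=lambda s: s[0] - int(s[0]), reverse=True)
        for _, name in by_fraction[:leftover]:
            counts[name] += 1
    return counts


strategy = [
    {"name": "FARGATE", "weight": 1, "base": 2},
    {"name": "FARGATE_SPOT", "weight": 3},
]
print(split_tasks(2, strategy))   # {'FARGATE': 2, 'FARGATE_SPOT': 0}
print(split_tasks(10, strategy))  # {'FARGATE': 4, 'FARGATE_SPOT': 6}
```

At the minimum of 2 tasks the service runs entirely on on-demand Fargate, so a Spot interruption cannot take the API below its HA floor.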
Worker Service
Runs Celery workers for background jobs.
# Task Definition
family: amply-worker
cpu: 512
memory: 1024
containerDefinitions:
  - name: worker
    image: xxxx.dkr.ecr.eu-central-1.amazonaws.com/amply-backend:latest
    command: ["celery", "-A", "amply.jobs.celery_app", "worker", "--loglevel=info"]
    environment:
      - name: ENVIRONMENT
        value: production
    secrets:
      - name: DATABASE_URL
        valueFrom: arn:aws:secretsmanager:...:amply/database-url
    logConfiguration:
      logDriver: awslogs
      options:
        awslogs-group: /ecs/amply-worker
        awslogs-region: eu-central-1
        awslogs-stream-prefix: worker  # Fargate requires a prefix; "worker" is an assumed value
Service Configuration:
serviceName: amply-worker
desiredCount: 2
launchType: FARGATE
# No load balancer - workers don't receive HTTP traffic
deploymentConfiguration:
  minimumHealthyPercent: 50  # Workers can scale down during deploy
  maximumPercent: 200
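The two percentages bound how many tasks may run while a rolling deployment is in progress. As a small worked example, the sketch below computes those bounds for the API and worker settings shown above. ECS rounds the lower bound up and the upper bound down; `deployment_bounds` is a hypothetical helper name.

```python
def deployment_bounds(desired_count: int, minimum_healthy_percent: int, maximum_percent: int):
    """Return (minimum running tasks, maximum total tasks) during a rolling deployment."""
    # Lower bound rounds up, upper bound rounds down (integer arithmetic avoids float error).
    min_running = -(-desired_count * minimum_healthy_percent // 100)
    max_total = desired_count * maximum_percent // 100
    return min_running, max_total


print(deployment_bounds(2, 100, 200))  # amply-api:    (2, 4)
print(deployment_bounds(2, 50, 200))   # amply-worker: (1, 4)
```

With these values the API never drops below 2 running tasks during a deploy, while the worker service may briefly run with a single task.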
Beat Service
Runs Celery Beat for scheduled tasks.
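# Task Definition
family: amply-beat
cpu: 256  # Minimal resources
memory: 512
containerDefinitions:
  - name: beat
    image: xxxx.dkr.ecr.eu-central-1.amazonaws.com/amply-backend:latest
    command: ["celery", "-A", "amply.jobs.celery_app", "beat", "--loglevel=info"]
    logConfiguration:
      logDriver: awslogs
      options:
        awslogs-group: /ecs/amply-beat
        awslogs-region: eu-central-1
        awslogs-stream-prefix: beat  # Assumed prefix, following the API task's pattern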
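Service Configuration:
serviceName: amply-beat
desiredCount: 1  # MUST be exactly 1 to avoid duplicate schedules
launchType: FARGATE
deploymentConfiguration:
  minimumHealthyPercent: 0  # Stop the old task before starting the new one, so two
  maximumPercent: 100       # schedulers never overlap (the defaults of 100/200 would allow it)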
Auto Scaling
API Service
# Target Tracking Scaling
scalableTarget:
  serviceNamespace: ecs
  resourceId: service/amply-prod/amply-api
  scalableDimension: ecs:service:DesiredCount
  minCapacity: 2
  maxCapacity: 10
scalingPolicy:
  policyType: TargetTrackingScaling
  targetTrackingScalingPolicyConfiguration:
    targetValue: 70.0
    predefinedMetricSpecification:
      predefinedMetricType: ECSServiceAverageCPUUtilization
    scaleInCooldown: 300
    scaleOutCooldown: 60
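Target tracking adjusts the task count to hold average CPU near the 70% target. AWS does not publish the exact algorithm, so the proportional rule below is only a mental model for estimating where the service will settle. It ignores the cooldowns, and `proportional_desired_count` is a hypothetical helper.

```python
import math


def proportional_desired_count(current_tasks: int, average_cpu: float,
                               target_cpu: float = 70.0,
                               min_capacity: int = 2, max_capacity: int = 10) -> int:
    """Estimate the task count that would bring average CPU back to the target."""
    raw = math.ceil(current_tasks * average_cpu / target_cpu)
    # Clamp to the scalable target's minCapacity and maxCapacity.
    return max(min_capacity, min(max_capacity, raw))


print(proportional_desired_count(2, 95.0))   # 3
print(proportional_desired_count(4, 30.0))   # 2
print(proportional_desired_count(8, 100.0))  # 10 (capped by maxCapacity)
```

The asymmetric cooldowns (60 s out, 300 s in) mean the service adds tasks quickly under load but removes them slowly, which avoids flapping.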
Worker Service
# Scale based on SQS queue depth
scalingPolicy:
  policyType: TargetTrackingScaling
  customizedMetricSpecification:
    metricName: ApproximateNumberOfMessagesVisible
    namespace: AWS/SQS
    dimensions:
      - name: QueueName
        value: amply-celery-queue
    statistic: Average
  targetValue: 100.0  # Workers are added or removed to hold the queue near 100 visible messages
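A fixed target on total queue depth does not account for how many workers are already running. One common alternative is to reason about backlog per worker instead. The sketch below illustrates that calculation under assumed numbers: the acceptable backlog of 50 messages per worker and the ceiling of 10 workers are hypothetical, and only the floor of 2 workers comes from the service configuration.

```python
import math


def workers_for_backlog(visible_messages: int, acceptable_backlog_per_worker: int,
                        min_workers: int = 2, max_workers: int = 10) -> int:
    """Estimate how many workers keep each worker's share of the queue acceptable."""
    if acceptable_backlog_per_worker <= 0:
        raise ValueError("acceptable_backlog_per_worker must be positive")
    needed = math.ceil(visible_messages / acceptable_backlog_per_worker)
    return max(min_workers, min(max_workers, needed))


print(workers_for_backlog(100, 50))   # 2
print(workers_for_backlog(450, 50))   # 9
print(workers_for_backlog(5000, 50))  # 10 (capped)
```

A reasonable backlog per worker can be derived from the latency a job may tolerate divided by the average time one job takes to process.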
Docker Configuration
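# Dockerfile
FROM python:3.12-slim
WORKDIR /app
# curl is used by the container health check; the slim image does not ship it
RUN apt-get update \
    && apt-get install -y --no-install-recommends curl \
    && rm -rf /var/lib/apt/lists/*
# Copy project metadata and source, then install
# (pip install . needs the package source to be present; assumes a src/ layout)
COPY pyproject.toml .
COPY src/ src/
RUN pip install --no-cache-dir .
# Create non-root user
RUN useradd -m appuser
USER appuser
# Default command (overridden per service)
CMD ["uvicorn", "amply.main:app", "--host", "0.0.0.0", "--port", "8000"]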
ECR Repository
# Repository
aws ecr create-repository --repository-name amply-backend
# Lifecycle policy (keep last 10 images)
aws ecr put-lifecycle-policy \
--repository-name amply-backend \
--lifecycle-policy-text '{
"rules": [{
"rulePriority": 1,
"description": "Keep last 10 images",
"selection": {
"tagStatus": "any",
"countType": "imageCountMoreThan",
"countNumber": 10
},
"action": { "type": "expire" }
}]
}'
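The lifecycle rule above keeps the 10 most recently pushed images and expires the rest. As an illustration of that selection, here is a small sketch. `images_to_expire` is a hypothetical helper that mimics the rule locally, and the digests and dates are made up.

```python
from datetime import datetime, timedelta


def images_to_expire(images, keep=10):
    """Given (digest, pushed_at) pairs, return the digests beyond the newest `keep`."""
    newest_first = sorted(images, key=lambda image: image[1], reverse=True)
    return [digest for digest, _ in newest_first[keep:]]


base = datetime(2025, 1, 1)
# Twelve images pushed on consecutive days; index 0 is the oldest.
images = [(f"sha256:{i:04d}", base + timedelta(days=i)) for i in range(12)]
print(images_to_expire(images))  # ['sha256:0001', 'sha256:0000']
```

Because the rule uses `tagStatus: any`, an image that is still deployed can be expired once 10 newer images have been pushed, which matters if a rollback to an old revision is ever needed.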
Deployment
Rolling Deployment
deploymentController:
  type: ECS  # Rolling update; set CODE_DEPLOY instead for blue/green
deploymentConfiguration:
  minimumHealthyPercent: 100
  maximumPercent: 200
  deploymentCircuitBreaker:  # Stops a failing rollout and reverts to the last stable deployment
    enable: true
    rollback: true
Deployment Process
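# 1. Authenticate Docker to ECR, then build and push the image
aws ecr get-login-password --region eu-central-1 \
  | docker login --username AWS --password-stdin xxxx.dkr.ecr.eu-central-1.amazonaws.com
docker build -t amply-backend .
docker tag amply-backend:latest xxxx.dkr.ecr.eu-central-1.amazonaws.com/amply-backend:v1.2.3
docker push xxxx.dkr.ecr.eu-central-1.amazonaws.com/amply-backend:v1.2.3
# 2. Register a new task definition revision that references the new image tag
aws ecs register-task-definition --cli-input-json file://task-definition.json
# 3. Point the service at the new revision (42 stands for the revision number returned by step 2)
aws ecs update-service \
  --cluster amply-prod \
  --service amply-api \
  --task-definition amply-api:42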
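Step 2 needs a `task-definition.json` whose image points at the newly pushed tag. The following is a minimal sketch of how that file could be produced from an existing definition. `with_image_tag` is a hypothetical helper, and it assumes the image reference already ends in a `:tag`.

```python
import copy
import json


def with_image_tag(task_definition: dict, container_name: str, tag: str) -> dict:
    """Return a copy of the task definition with one container's image tag replaced."""
    updated = copy.deepcopy(task_definition)
    for container in updated["containerDefinitions"]:
        if container["name"] == container_name:
            # Split off the existing tag; assumes the image reference ends in ":tag".
            repository = container["image"].rsplit(":", 1)[0]
            container["image"] = f"{repository}:{tag}"
            return updated
    raise ValueError(f"No container named {container_name!r} in task definition")


current = {
    "family": "amply-api",
    "containerDefinitions": [
        {"name": "api",
         "image": "xxxx.dkr.ecr.eu-central-1.amazonaws.com/amply-backend:latest"},
    ],
}
new_definition = with_image_tag(current, "api", "v1.2.3")
print(json.dumps(new_definition, indent=2))
```

Pinning each revision to an immutable version tag rather than `latest` makes it clear which build a given task definition revision runs.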
Health Checks
Container Health Check
healthCheck:
  command: ["CMD-SHELL", "curl -f http://localhost:8000/health || exit 1"]
  interval: 30
  timeout: 5
  retries: 3
  startPeriod: 60
ALB Health Check
healthCheckPath: /health
healthCheckIntervalSeconds: 30
healthCheckTimeoutSeconds: 5
healthyThresholdCount: 2
unhealthyThresholdCount: 3
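The intervals and thresholds above determine roughly how long it takes for a task to change health state. The arithmetic can be sketched as follows. The figures are approximations that ignore the timeout and exact probe timing, and `HealthCheckTiming` is a hypothetical helper class.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class HealthCheckTiming:
    interval_seconds: int
    failures_to_unhealthy: int
    successes_to_healthy: int = 1
    start_period_seconds: int = 0

    def seconds_to_unhealthy(self) -> int:
        """Approximate time for consecutive failures to mark the target unhealthy."""
        return self.interval_seconds * self.failures_to_unhealthy

    def seconds_to_healthy(self) -> int:
        """Approximate time for consecutive successes to mark the target healthy."""
        return self.interval_seconds * self.successes_to_healthy


alb = HealthCheckTiming(interval_seconds=30, failures_to_unhealthy=3, successes_to_healthy=2)
container = HealthCheckTiming(interval_seconds=30, failures_to_unhealthy=3,
                              start_period_seconds=60)

print(alb.seconds_to_healthy())    # 60
print(alb.seconds_to_unhealthy())  # 90
# A container that never becomes healthy: grace period plus three failed checks.
print(container.start_period_seconds + container.seconds_to_unhealthy())  # 150
```

These numbers give a rough lower bound on how long a broken deployment takes to be detected before the circuit breaker can react.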
Health Endpoint
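# In FastAPI app
from fastapi import Depends, HTTPException
from sqlalchemy import text
from sqlalchemy.ext.asyncio import AsyncSession

# app, get_db and redis are the application's own FastAPI instance,
# session dependency and async Redis client, defined elsewhere

@app.get("/health")
async def health_check(db: AsyncSession = Depends(get_db)):
    try:
        # Check database
        await db.execute(text("SELECT 1"))
        # Check Redis
        await redis.ping()
        return {"status": "healthy"}
    except Exception as e:
        # A 503 fails both the container health check and the ALB health check
        raise HTTPException(status_code=503, detail=str(e))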
Logging
All containers log to CloudWatch Logs:
/ecs/amply-api
/ecs/amply-worker
/ecs/amply-beat
Log format (JSON for structured logging):
{
"timestamp": "2025-01-15T14:30:00Z",
"level": "INFO",
"logger": "amply.api.donations",
"message": "Donation created",
"donation_id": "don_abc123",
"amount": 5000,
"request_id": "req_xyz789"
}
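One way to emit log lines in this shape is a custom formatter for the standard `logging` module. The sketch below is an assumption about how such a formatter could look, not the application's actual logging setup, and `JsonFormatter` is a hypothetical name.

```python
import json
import logging
from datetime import datetime, timezone

# Attribute names every LogRecord carries; anything else was passed via `extra=`.
_STANDARD_ATTRS = set(logging.LogRecord("", 0, "", 0, "", (), None).__dict__) | {"message", "asctime"}


class JsonFormatter(logging.Formatter):
    """Render each log record as a single JSON object."""

    def format(self, record: logging.LogRecord) -> str:
        payload = {
            "timestamp": datetime.fromtimestamp(record.created, tz=timezone.utc)
                                 .strftime("%Y-%m-%dT%H:%M:%SZ"),
            "level": record.levelname,
            "logger": record.name,
            "message": record.getMessage(),
        }
        # Copy the extra fields (donation_id, request_id, ...) into the payload.
        for key, value in record.__dict__.items():
            if key not in _STANDARD_ATTRS:
                payload[key] = value
        return json.dumps(payload, default=str)


handler = logging.StreamHandler()
handler.setFormatter(JsonFormatter())
logger = logging.getLogger("amply.api.donations")
logger.addHandler(handler)
logger.setLevel(logging.INFO)

logger.info("Donation created",
            extra={"donation_id": "don_abc123", "amount": 5000, "request_id": "req_xyz789"})
```

Writing one JSON object per line to stdout or stderr lets the `awslogs` driver ship it unchanged, and CloudWatch Logs Insights can then query the individual fields.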
Monitoring
CloudWatch Metrics
- CPU utilization
- Memory utilization
- Running task count
- Request count (via ALB)
- Response time (via ALB)
Alarms
# High CPU
alarmName: amply-api-high-cpu
metricName: CPUUtilization
namespace: AWS/ECS
threshold: 80
evaluationPeriods: 3
# Unhealthy tasks
alarmName: amply-api-unhealthy
metricName: UnhealthyHostCount
namespace: AWS/ApplicationELB
threshold: 1
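The `threshold` and `evaluationPeriods` settings combine as follows: the alarm fires only when the metric breaches the threshold for the required number of consecutive periods. The sketch below is a simplified model of that evaluation. It assumes a greater-than comparison and that every period must breach, neither of which the definitions above state, and `alarm_state` is a hypothetical helper.

```python
def alarm_state(datapoints: list[float], threshold: float, evaluation_periods: int) -> str:
    """Evaluate the most recent datapoints the way a simple threshold alarm would."""
    recent = datapoints[-evaluation_periods:]
    if len(recent) < evaluation_periods:
        return "INSUFFICIENT_DATA"
    # Assumption: strictly-greater-than comparison, all periods must breach.
    return "ALARM" if all(value > threshold for value in recent) else "OK"


print(alarm_state([60, 85, 90, 95], threshold=80, evaluation_periods=3))  # ALARM
print(alarm_state([85, 90, 70], threshold=80, evaluation_periods=3))      # OK
print(alarm_state([95], threshold=80, evaluation_periods=3))              # INSUFFICIENT_DATA
```

Requiring three consecutive breaches keeps a short CPU spike from paging anyone, at the cost of a slower alert.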
Cost Optimisation
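- Fargate Spot: Use for workloads that tolerate interruptions, such as workers
- Right-sizing: Start with small task sizes and adjust based on CPU and memory metrics
- Savings Plans: Commit to a baseline of compute usage in exchange for discounted rates
- Scale to zero: Set the desired count to 0 in non-prod environments during off-hours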