Step 5: Deploy & Scale
Your image generator works locally. Now you need to put it into production with Docker, protect it with rate limiting, control API costs, and serve images efficiently through a CDN.
Dockerize the Application
Create a Dockerfile in the project root:
# Dockerfile
FROM python:3.11-slim
WORKDIR /app
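# Install system dependencies for Pillow, plus curl for the Compose health check
# (the slim base image does not ship curl)
RUN apt-get update && apt-get install -y --no-install-recommends \
    curl \
    libjpeg-dev \
    libpng-dev \
    && rm -rf /var/lib/apt/lists/*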
# Install Python dependencies
COPY requirements.txt .
RUN pip install --no-cache-dir -r requirements.txt
# Copy application code
COPY . .
# Create images directory
RUN mkdir -p static/images
# Expose port
EXPOSE 8000
# Run with uvicorn
CMD ["uvicorn", "main:app", "--host", "0.0.0.0", "--port", "8000", "--workers", "2"]
Create a docker-compose.yml for the full stack with Nginx as a reverse proxy:
# docker-compose.yml
version: "3.8"
services:
  app:
    build: .
    container_name: ai-image-gen
    env_file: .env
    volumes:
      - image_data:/app/static/images
    restart: unless-stopped
    healthcheck:
      test: ["CMD", "curl", "-f", "http://localhost:8000/health"]
      interval: 30s
      timeout: 10s
      retries: 3
  nginx:
    image: nginx:alpine
    container_name: ai-image-gen-nginx
    ports:
      - "80:80"
      - "443:443"
    volumes:
      - ./nginx.conf:/etc/nginx/conf.d/default.conf
      - image_data:/var/www/images:ro
      - ./certbot/conf:/etc/letsencrypt:ro
    depends_on:
      - app
    restart: unless-stopped
volumes:
  image_data:
Nginx Configuration
Create nginx.conf to serve static images directly (bypassing FastAPI) and proxy API requests:
# nginx.conf
upstream app {
    server app:8000;
}
server {
    listen 80;
    server_name your-domain.com;
    # Serve generated images directly from nginx (much faster)
    location /static/images/ {
        alias /var/www/images/;
        expires 30d;
        add_header Cache-Control "public, immutable";
        add_header X-Content-Type-Options "nosniff";
    }
    # Proxy API requests to FastAPI
    location /api/ {
        proxy_pass http://app;
        proxy_set_header Host $host;
        proxy_set_header X-Real-IP $remote_addr;
        proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
        proxy_read_timeout 120s;  # Image generation can take time
    }
    # Proxy all other requests to FastAPI
    location / {
        proxy_pass http://app;
        proxy_set_header Host $host;
        proxy_set_header X-Real-IP $remote_addr;
    }
    # Limit upload size for img2img and inpainting
    client_max_body_size 10M;
}
Build and Run
# Build and start everything
docker-compose up -d --build
# Check logs
docker-compose logs -f app
# Check health
curl http://localhost/health
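The containers can take a few seconds to start accepting requests, so a single curl call right after startup may fail. As a minimal sketch, the polling loop below retries the same health URL until it answers. The function name `wait_until_healthy` and its default values are assumptions made for illustration; they are not part of the project.

```python
import time
import urllib.error
import urllib.request


def wait_until_healthy(url: str, attempts: int = 10, delay: float = 3.0) -> bool:
    """Poll `url` until it answers with HTTP 200, or give up after `attempts` tries."""
    for attempt in range(1, attempts + 1):
        try:
            with urllib.request.urlopen(url, timeout=5) as response:
                if response.status == 200:
                    return True
        except (urllib.error.URLError, OSError):
            # Connection refused, timeout, or an error status:
            # the stack is probably still starting.
            pass
        if attempt < attempts:
            time.sleep(delay)
    return False
```

Calling `wait_until_healthy("http://localhost/health")` after `docker-compose up -d --build` returns True once Nginx can reach the app, and False if the endpoint never responds within the allowed attempts.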
Rate Limiting
Without rate limiting, a single user could drain your API budget in minutes. Add rate limiting with the slowapi library:
# Add to requirements.txt:
# slowapi==0.1.9
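# rate_limiter.py
from fastapi import Request
from fastapi.responses import JSONResponse
from slowapi import Limiter
from slowapi.errors import RateLimitExceeded
from slowapi.util import get_remote_address
def client_ip(request: Request) -> str:
    """Rate-limit key: the client IP that Nginx forwards in X-Real-IP."""
    # Behind the Nginx proxy, request.client.host is the proxy's address,
    # so keying on it alone would put every user in one shared bucket.
    return request.headers.get("X-Real-IP") or get_remote_address(request)
limiter = Limiter(key_func=client_ip)
async def rate_limit_handler(request: Request, exc: RateLimitExceeded):
    return JSONResponse(
        status_code=429,
        content={
            "detail": "Too many requests. Please wait before generating more images.",
            # exc.detail describes the limit that was hit (for example "10 per 1 minute")
            "limit": str(exc.detail),
        },
    )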
Apply rate limits to the generation endpoints in main.py:
# main.py (updated)
from fastapi import FastAPI
from slowapi.errors import RateLimitExceeded
from rate_limiter import limiter, rate_limit_handler
app = FastAPI(title="AI Image Generator")
# slowapi looks up the limiter on app.state
app.state.limiter = limiter
app.add_exception_handler(RateLimitExceeded, rate_limit_handler)
And in your router:
from fastapi import Request
from rate_limiter import limiter
# slowapi needs the explicit `request` argument, and the route decorator
# must sit above the limit decorator.
@router.post("/generate", response_model=GenerateResponse)
@limiter.limit("10/minute")  # Max 10 images per minute per IP
async def generate_image(request: Request, body: GenerateRequest):
    # ... existing code ...
@router.post("/batch-generate")
@limiter.limit("2/minute")  # Batches are expensive, so limit them more tightly
async def batch_generate(request: Request, ...):
    # ... existing code ...
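To make a limit string such as "10/minute" concrete, here is a minimal in-memory sketch of a per-key sliding-window limiter. It is not how slowapi is implemented; the class `SlidingWindowLimiter` and its `allow` method are hypothetical names used only for illustration, and the state lives in a single process.

```python
import time
from collections import defaultdict, deque
from typing import Optional


class SlidingWindowLimiter:
    """Allow at most `limit` requests per `window` seconds for each key."""

    def __init__(self, limit: int, window: float):
        self.limit = limit
        self.window = window
        self._hits = defaultdict(deque)  # key -> timestamps of recent requests

    def allow(self, key: str, now: Optional[float] = None) -> bool:
        """Return True and record the request if `key` is under its limit."""
        now = time.monotonic() if now is None else now
        hits = self._hits[key]
        # Drop timestamps that have left the window.
        while hits and now - hits[0] >= self.window:
            hits.popleft()
        if len(hits) >= self.limit:
            return False
        hits.append(now)
        return True
```

With `SlidingWindowLimiter(limit=10, window=60)`, the eleventh call to `allow("203.0.113.7")` within a minute returns False while other IPs are unaffected. Because counters like these are held per process, running two uvicorn workers with in-memory storage can let a client exceed the nominal limit; a shared store is the usual way around that.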
Cost Management
API costs can spiral quickly. Implement a cost tracking system:
# services/cost_tracker.py
import json
from datetime import date
from pathlib import Path
COST_FILE = Path("data/costs.json")
COST_FILE.parent.mkdir(parents=True, exist_ok=True)
# Cost per image by provider (approximate, in USD)
COSTS = {
    "stability": 0.004,  # ~$0.004 per image
    "stability-img2img": 0.005,
    "stability-inpaint": 0.005,
    "stability-upscale": 0.003,
    "replicate": 0.005,  # ~$0.005 per image
    "replicate-upscale": 0.003,
}
# Daily budget limit in USD
DAILY_BUDGET = 5.00
class CostTracker:
    # Each uvicorn worker holds its own CostTracker, so state is re-read from
    # disk before every update or check. There is no file locking: concurrent
    # writes can still lose an update, which is acceptable for a soft budget.
    def __init__(self):
        self.costs = self._load()
    def _load(self) -> dict:
        if COST_FILE.exists():
            return json.loads(COST_FILE.read_text())
        return {"daily": {}, "total": 0.0}
    def _save(self):
        # Write to a temp file, then rename, so readers never see a partial file
        tmp = COST_FILE.with_suffix(".tmp")
        tmp.write_text(json.dumps(self.costs, indent=2))
        tmp.replace(COST_FILE)
    def record(self, provider: str) -> float:
        """Record a generation cost. Returns the cost."""
        # Unknown providers fall back to the highest listed price
        cost = COSTS.get(provider, 0.005)
        today = date.today().isoformat()
        self.costs = self._load()
        self.costs["daily"][today] = self.costs["daily"].get(today, 0.0) + cost
        self.costs["total"] += cost
        self._save()
        return cost
    def get_daily_spend(self) -> float:
        """Get total spend for today."""
        today = date.today().isoformat()
        self.costs = self._load()
        return self.costs["daily"].get(today, 0.0)
    def check_budget(self) -> bool:
        """Return True if still within daily budget."""
        return self.get_daily_spend() < DAILY_BUDGET
    def get_stats(self) -> dict:
        """Return cost statistics."""
        today_spend = self.get_daily_spend()
        return {
            "today": round(today_spend, 4),
            "total": round(self.costs["total"], 4),
            "daily_budget": DAILY_BUDGET,
            "budget_remaining": round(DAILY_BUDGET - today_spend, 4),
        }
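A quick back-of-the-envelope calculation shows why the daily cap matters. The sketch below reuses the approximate $0.004 per-image price and the $5.00 budget from the listing above, plus the 10 requests per minute limit from the rate-limiting section. The function names are illustrative assumptions, not part of the project.

```python
import math

COST_PER_IMAGE = 0.004   # approximate "stability" price from the COSTS table (USD)
DAILY_BUDGET = 5.00      # USD
LIMIT_PER_MINUTE = 10    # per-IP rate limit on /generate


def images_within_budget(budget: float, cost: float) -> int:
    """Number of whole images that fit in the budget."""
    # Round first so float noise (e.g. 1249.9999999) does not lose an image.
    return math.floor(round(budget / cost, 6))


def worst_case_daily_spend(limit_per_minute: int, cost: float, clients: int = 1) -> float:
    """Spend in USD if `clients` IPs each generate at the rate limit for 24 hours."""
    return round(limit_per_minute * 60 * 24 * cost * clients, 2)
```

With these numbers, about 1,250 images fit in the daily budget, while a single client generating at the rate limit around the clock would cost roughly $57.60. The rate limit alone therefore does not cap spend; the budget check does.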
Add a budget check before each generation:
# In routers/generate.py
from fastapi import HTTPException, Request
from rate_limiter import limiter
from services.cost_tracker import CostTracker
cost_tracker = CostTracker()
@router.post("/generate", response_model=GenerateResponse)
@limiter.limit("10/minute")
async def generate_image(request: Request, body: GenerateRequest):
    # Check budget before generating
    if not cost_tracker.check_budget():
        raise HTTPException(
            status_code=429,
            detail="Daily generation budget exceeded. Try again tomorrow.",
        )
    # ... generate image ...
    # Record the cost after successful generation
    cost_tracker.record(body.provider)
    return result
@router.get("/costs")
async def get_costs():
    """Return current cost statistics."""
    return cost_tracker.get_stats()
CDN for Image Serving
For production, serve images through a CDN like Cloudflare or AWS CloudFront. This reduces latency and bandwidth costs on your server.
Option 1: Cloudflare (Free Tier)
- Point your domain DNS to Cloudflare
- Enable caching for /static/images/*
- Set cache TTL to 30 days (images never change once generated)
- Optionally enable image optimization (Polish) for automatic WebP conversion; note that Polish is offered on Cloudflare's paid plans, not the free tier
Option 2: Upload to S3 + CloudFront
# services/storage_service.py
import asyncio
import os
import boto3
class S3Storage:
    def __init__(self):
        self.s3 = boto3.client("s3")
        self.bucket = os.getenv("S3_BUCKET", "my-image-gen")
        self.cdn_url = os.getenv("CDN_URL", "https://cdn.example.com")
    async def upload(self, local_path: str, filename: str) -> str:
        """Upload an image to S3 and return the CDN URL."""
        # boto3 is blocking, so run the upload in a worker thread
        # to keep the event loop free for other requests
        await asyncio.to_thread(
            self.s3.upload_file,
            local_path,
            self.bucket,
            f"images/{filename}",
            ExtraArgs={
                "ContentType": "image/png",
                "CacheControl": "public, max-age=2592000, immutable",
            },
        )
        return f"{self.cdn_url}/images/{filename}"
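The `max-age=2592000` value in the upload call is the same 30 days that the Nginx `expires 30d` directive sets, expressed in seconds. As a small sketch, the helper below derives the header value from a number of days so the two settings are easier to keep in sync. The name `cache_control` is an assumption for illustration.

```python
def cache_control(days: int, immutable: bool = True) -> str:
    """Build a public Cache-Control header value that lasts `days` days."""
    max_age = days * 24 * 60 * 60  # convert days to seconds
    parts = ["public", f"max-age={max_age}"]
    if immutable:
        # Tells caches the file will never change at this URL.
        parts.append("immutable")
    return ", ".join(parts)
```

`cache_control(30)` produces the exact string used in the `ExtraArgs` above.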
Production Checklist
Before going live, verify every item on this list:
| Category | Item | Priority |
|---|---|---|
| Security | API keys in environment variables, not in code | Required |
| Security | CORS restricted to your domain only | Required |
| Security | Rate limiting on all generation endpoints | Required |
| Security | Input validation on all user inputs | Required |
| Security | HTTPS enabled via Let's Encrypt | Required |
| Cost | Daily budget limits configured | Required |
| Cost | Cost tracking and alerting set up | Recommended |
| Performance | Nginx serving static files directly | Required |
| Performance | CDN configured for image delivery | Recommended |
| Reliability | Health check endpoint responding | Required |
| Reliability | Docker restart policy set to unless-stopped | Required |
| Reliability | Log aggregation configured | Recommended |
| Storage | Image cleanup cron job for old files | Recommended |
| Storage | Disk space monitoring and alerts | Recommended |
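The image cleanup item can be covered by a small script run from cron. This is a minimal sketch, assuming images live in a single flat directory such as static/images and that files older than a chosen number of days are safe to delete. The function name `delete_old_images` and the retention period are illustrative choices, not part of the project.

```python
import time
from pathlib import Path
from typing import List, Optional


def delete_old_images(
    directory: Path, max_age_days: float, now: Optional[float] = None
) -> List[Path]:
    """Delete files in `directory` last modified more than `max_age_days` ago.

    Returns the list of paths that were removed. Subdirectories are skipped.
    """
    now = time.time() if now is None else now
    cutoff = now - max_age_days * 24 * 60 * 60
    removed = []
    for path in directory.iterdir():
        if path.is_file() and path.stat().st_mtime < cutoff:
            path.unlink()
            removed.append(path)
    return removed
```

A daily cron entry could call `delete_old_images(Path("static/images"), 30)`. If images are cached for 30 days, keeping them at least that long on disk avoids serving URLs that no longer resolve at the origin.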
To deploy, run docker-compose up -d on any server with Docker installed. In the final lesson, you will add content moderation and user accounts, and explore monetization options.
Lilly Tech Systems