Advanced

MLOps & Deployment

Bridge the gap from notebook to production. Learn model versioning, serving, containerization, CI/CD for ML, monitoring, and cloud deployment.

What is MLOps?

MLOps (Machine Learning Operations) applies DevOps principles to ML systems. It addresses the challenges specific to productionizing ML: data dependencies, model versioning, training pipelines, monitoring for data drift, and continuous retraining. Industry surveys commonly report that only a small fraction of ML projects reach production (figures around 10% are often quoted); MLOps aims to close this gap.

Model Versioning

Track experiments, models, and data to ensure reproducibility:

MLflow

Python (MLflow)

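import mlflow
import mlflow.sklearn
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.metrics import accuracy_score, f1_score, roc_auc_score

# Assumes X_train, X_test, y_train, y_test already exist (binary target).
# GradientBoostingClassifier is an assumed choice that matches the parameters below.
params = {"n_estimators": 200, "max_depth": 5, "learning_rate": 0.1}

# Select (or create) an experiment
mlflow.set_experiment("churn-prediction")

with mlflow.start_run() as run:
    # Log parameters
    mlflow.log_params(params)

    # Train model
    model = GradientBoostingClassifier(**params)
    model.fit(X_train, y_train)

    # Log metrics computed on held-out data
    y_pred = model.predict(X_test)
    y_proba = model.predict_proba(X_test)[:, 1]
    mlflow.log_metric("accuracy", accuracy_score(y_test, y_pred))
    mlflow.log_metric("f1_score", f1_score(y_test, y_pred))
    mlflow.log_metric("auc", roc_auc_score(y_test, y_proba))

    # Log model artifact
    mlflow.sklearn.log_model(model, "model")

# Load a logged model by its run ID
loaded_model = mlflow.sklearn.load_model(f"runs:/{run.info.run_id}/model")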

DVC (Data Version Control)

DVC tracks large data files and ML pipelines alongside Git. It stores metadata in Git while the actual data lives in remote storage (S3, GCS, etc.):

dvc init — Initialize DVC in a Git repo
dvc add data/train.csv — Track a data file
dvc push / dvc pull — Sync data with remote storage
dvc repro — Reproduce a pipeline
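
To illustrate the split between Git-tracked metadata and externally stored data, the sketch below mimics the idea using only the standard library. It is not DVC's implementation: the function names, the `.ptr.json` suffix, and the `cache` directory (a stand-in for remote storage) are hypothetical and chosen only for illustration. The point is that Git versions a small pointer containing a content hash, while the large file is stored elsewhere under that hash.

```python
import hashlib
import json
import shutil
from pathlib import Path


def write_pointer(data_path: Path, cache_dir: Path) -> Path:
    """Copy a data file into a content-addressed cache and write a small pointer file.

    Illustration only: names and file layout are hypothetical, not DVC's format.
    """
    digest = hashlib.sha256(data_path.read_bytes()).hexdigest()

    # The large file is stored under its hash (stand-in for remote storage).
    cache_dir.mkdir(parents=True, exist_ok=True)
    shutil.copyfile(data_path, cache_dir / digest)

    # The pointer is tiny; this is the file that would be committed to Git.
    pointer_path = data_path.with_name(data_path.name + ".ptr.json")
    pointer_path.write_text(json.dumps({"path": data_path.name, "sha256": digest}))
    return pointer_path


def restore(pointer_path: Path, cache_dir: Path) -> Path:
    """Recreate the data file next to the pointer, using only the pointer and the cache."""
    meta = json.loads(pointer_path.read_text())
    target = pointer_path.with_name(meta["path"])
    shutil.copyfile(cache_dir / meta["sha256"], target)
    return target
```

Checking out an older commit would bring back an older pointer, and `restore` would then fetch the matching version of the data, which is roughly the workflow that `dvc pull` automates.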

Model Serving

Expose your trained model as an API so applications can make predictions:

FastAPI (Recommended)

Python (FastAPI)

from fastapi import FastAPI
from pydantic import BaseModel
import joblib
import pandas as pd

app = FastAPI()
# Assumes model.joblib is a scikit-learn Pipeline that includes preprocessing
# (e.g., encoding of contract and payment_method) and was trained on a
# DataFrame with the column names used below.
model = joblib.load("model.joblib")

class PredictionRequest(BaseModel):
    tenure: float
    monthly_charges: float
    total_charges: float
    contract: str
    payment_method: str

class PredictionResponse(BaseModel):
    prediction: int
    probability: float

@app.post("/predict", response_model=PredictionResponse)
def predict(request: PredictionRequest):
    # Build a one-row DataFrame so the pipeline can select columns by name
    data = pd.DataFrame([dict(request)])
    prediction = model.predict(data)[0]
    # Probability of the predicted class
    probability = model.predict_proba(data)[0].max()
    return PredictionResponse(
        prediction=int(prediction),
        probability=float(probability)
    )

# Run: uvicorn app:app --host 0.0.0.0 --port 8000
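
For completeness, here is a minimal sketch of how an application might call the endpoint above, using only the standard library. The feature values in `example` are hypothetical, and the helper names `build_request` and `call_predict` are introduced here for illustration; the categorical strings would need to match whatever the model was trained on.

```python
import json
import urllib.request


def build_request(base_url: str, features: dict) -> urllib.request.Request:
    """Create a JSON POST request for the /predict endpoint."""
    body = json.dumps(features).encode("utf-8")
    return urllib.request.Request(
        url=f"{base_url}/predict",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def call_predict(base_url: str, features: dict, timeout: float = 5.0) -> dict:
    """Send the features to the service and return the decoded JSON response."""
    request = build_request(base_url, features)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))


# Hypothetical feature values, for illustration only.
example = {
    "tenure": 12.0,
    "monthly_charges": 70.5,
    "total_charges": 846.0,
    "contract": "Month-to-month",
    "payment_method": "Electronic check",
}
```

With the service running locally, `call_predict("http://localhost:8000", example)` should return a dict with the `prediction` and `probability` fields defined by `PredictionResponse`.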

Containerization with Docker

Dockerfile

FROM python:3.11-slim

WORKDIR /app
COPY requirements.txt .
RUN pip install --no-cache-dir -r requirements.txt

COPY app.py model.joblib ./
EXPOSE 8000
CMD ["uvicorn", "app:app", "--host", "0.0.0.0", "--port", "8000"]

CI/CD for ML

Automate testing, training, and deployment of ML models:

Code tests: Unit tests for data processing, feature engineering, and model predictions.
Data validation: Check data schema, distributions, and quality before training. Tools: Great Expectations, Pandera.
Model validation: Ensure new models meet minimum performance thresholds before deployment.
Automated retraining: Trigger retraining when new data arrives or performance degrades.
Blue/green and canary deployment: Run the new model alongside the old one, then either switch all traffic at once (blue/green) or shift it gradually (canary).
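
Two of the checks above, data validation and model validation, can be sketched as plain functions that a CI job runs before deployment. This is a simplified illustration under stated assumptions: the function names, metric names, and thresholds are hypothetical, and dedicated tools such as Great Expectations or Pandera offer far richer checks.

```python
from typing import Mapping, Sequence


def schema_errors(rows: Sequence[Mapping], schema: Mapping[str, type]) -> list:
    """Return readable schema violations (missing columns or wrong types)."""
    errors = []
    for i, row in enumerate(rows):
        for column, expected_type in schema.items():
            if column not in row:
                errors.append(f"row {i}: missing column '{column}'")
            elif not isinstance(row[column], expected_type):
                errors.append(f"row {i}: '{column}' is not {expected_type.__name__}")
    return errors


def passes_gate(candidate: Mapping[str, float],
                production: Mapping[str, float],
                minimums: Mapping[str, float],
                tolerance: float = 0.0) -> bool:
    """Approve a candidate only if it clears absolute floors and does not regress."""
    # Absolute floors: every listed metric must reach its minimum.
    for metric, floor in minimums.items():
        if candidate.get(metric, float("-inf")) < floor:
            return False
    # Relative check: no metric may fall below production by more than the tolerance.
    for metric, prod_value in production.items():
        if candidate.get(metric, float("-inf")) < prod_value - tolerance:
            return False
    return True


# Hypothetical metric values, for illustration only.
approved = passes_gate(
    candidate={"f1_score": 0.82, "auc": 0.90},
    production={"f1_score": 0.80},
    minimums={"auc": 0.85},
)
```

In a CI job, a failed gate would typically end the step with a non-zero exit code so that the deployment stage does not run.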

Monitoring and Drift Detection

ML models degrade over time as the real world changes. Monitor for:

Data drift: Input data distribution changes (e.g., customer demographics shift).
Concept drift: The relationship between features and target changes (e.g., buying patterns change).
Performance drift: Model accuracy degrades over time.

Tools: Evidently AI, WhyLabs, Arize, custom dashboards with Grafana.
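
As a rough illustration of a data drift check, the sketch below computes the Population Stability Index (PSI) for a single numeric feature with the standard library. PSI is one common drift statistic among several; the equal-width binning, the `eps` smoothing, and the synthetic Gaussian data are assumptions made for this example, and the tools listed above provide more complete implementations.

```python
import math
import random


def population_stability_index(reference, current, n_bins=10, eps=1e-6):
    """PSI between two samples of one numeric feature.

    Bins are equal-width and derived from the reference sample.
    """
    lo, hi = min(reference), max(reference)
    width = (hi - lo) / n_bins
    if width == 0:
        width = 1.0  # constant reference feature: everything lands in one bin

    def proportions(values):
        counts = [0] * n_bins
        for v in values:
            # Values outside the reference range are clipped into the edge bins.
            idx = min(max(int((v - lo) / width), 0), n_bins - 1)
            counts[idx] += 1
        # eps avoids log(0) and division by zero for empty bins.
        return [max(c / len(values), eps) for c in counts]

    ref_p = proportions(reference)
    cur_p = proportions(current)
    return sum((c - r) * math.log(c / r) for r, c in zip(ref_p, cur_p))


# Synthetic data: a training sample, a fresh sample from the same
# distribution, and a sample whose mean has shifted by one standard deviation.
rng = random.Random(0)
training = [rng.gauss(0.0, 1.0) for _ in range(5000)]
same = [rng.gauss(0.0, 1.0) for _ in range(5000)]
shifted = [rng.gauss(1.0, 1.0) for _ in range(5000)]

psi_same = population_stability_index(training, same)        # expected to be near 0
psi_shifted = population_stability_index(training, shifted)  # expected to be clearly larger
```

A widely used rule of thumb treats PSI below 0.1 as stable and above 0.25 as a significant shift, but these cutoffs are conventions and should be tuned per feature.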

A/B Testing Models

Compare new models against the current production model by splitting live traffic. The new model must demonstrate statistically significant improvement before full rollout. This is the gold standard for model evaluation in production.
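
The two ingredients described above, splitting traffic and testing for significance, can be sketched as follows. This assumes a binary success metric per request (for example, whether a prediction led to the desired outcome) and uses a two-proportion z-test, which is one reasonable choice rather than the only one. The function names and the 50/50 default split are hypothetical.

```python
import hashlib
import math


def assign_variant(user_id: str, treatment_share: float = 0.5) -> str:
    """Deterministically route a user to 'control' or 'treatment'.

    Hashing the ID keeps each user on the same model across requests.
    """
    digest = hashlib.sha256(user_id.encode("utf-8")).hexdigest()
    bucket = int(digest[:8], 16) / 0x100000000  # uniform value in [0, 1)
    return "treatment" if bucket < treatment_share else "control"


def two_proportion_z_test(successes_a, n_a, successes_b, n_b):
    """Two-sided z-test for a difference between two success rates.

    Returns (z, p_value); a positive z means variant B has the higher rate.
    """
    p_a, p_b = successes_a / n_a, successes_b / n_b
    pooled = (successes_a + successes_b) / (n_a + n_b)
    se = math.sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    if se == 0:
        return 0.0, 1.0  # no variation at all: nothing to distinguish
    z = (p_b - p_a) / se
    p_value = math.erfc(abs(z) / math.sqrt(2))  # two-sided p-value under N(0, 1)
    return z, p_value


# Hypothetical counts: 100/1000 successes for production, 150/1000 for the new model.
z, p_value = two_proportion_z_test(100, 1000, 150, 1000)
rollout = z > 0 and p_value < 0.05
```

The significance level (0.05 here) and the sample size should be fixed before the experiment starts; checking the p-value repeatedly and stopping at the first significant result inflates the false-positive rate.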

Cloud Deployment

Platform	Key Features	Best For
AWS SageMaker	End-to-end ML platform, built-in algorithms, auto-scaling endpoints	Enterprise AWS shops
GCP Vertex AI	AutoML, managed pipelines, integrated with BigQuery	Google ecosystem, AutoML needs
Azure ML	Designer UI, automated ML, enterprise integration	Microsoft ecosystem
Hugging Face Inference	One-click deployment for Transformers	NLP models, quick deployment

Start simple: Begin with a Flask/FastAPI app in a Docker container. Add MLflow for experiment tracking. Add monitoring as traffic grows. Do not over-engineer your MLOps stack before you have a model that provides business value.
