Beginner

Installation & Setup

Get MLflow up and running — from a simple pip install to a production-ready tracking server.

Installing MLflow

Shell — Install MLflow
# Basic installation
pip install mlflow

# With specific extras (quoted so zsh doesn't treat the brackets as a glob)
pip install 'mlflow[extras]'        # Includes scikit-learn, boto3, etc.
pip install 'mlflow[gateway]'       # For the MLflow AI Gateway (LLM support)

# Verify installation
mlflow --version
python -c "import mlflow; print(mlflow.__version__)"

Starting the Tracking Server

Local (Default)

Shell — Start local tracking server
# Start with default settings (file-based backend store and artifacts in ./mlruns)
mlflow server --host 127.0.0.1 --port 5000

# Or simply use the UI command
mlflow ui --port 5000

# Access the UI at http://127.0.0.1:5000
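
The default file store works, but run metadata is easier to query and migrate if it lives in a database. A middle ground between the file store and a full PostgreSQL setup is pointing the server at a SQLite file; a minimal sketch (the mlflow.db filename is an arbitrary choice):

Shell — Local server with SQLite backend
```shell
# Persist run metadata to a local SQLite file instead of ./mlruns;
# artifacts still go to the local filesystem
mlflow server \
  --backend-store-uri sqlite:///mlflow.db \
  --default-artifact-root ./mlruns \
  --host 127.0.0.1 \
  --port 5000
```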

SQLite vs PostgreSQL Backend

Backend | Best For | Concurrency | Setup
SQLite | Local dev, single user | Limited | Zero config
PostgreSQL | Teams, production | Full | Requires PostgreSQL server
MySQL | Teams, production | Full | Requires MySQL server
Shell — Start with PostgreSQL backend
# Install PostgreSQL driver
pip install psycopg2-binary

# Start server with PostgreSQL backend
mlflow server \
  --backend-store-uri postgresql://user:password@localhost:5432/mlflow \
  --default-artifact-root s3://my-mlflow-bucket/artifacts \
  --host 0.0.0.0 \
  --port 5000
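
Once the server is up, clients need to know where to send runs. Instead of calling mlflow.set_tracking_uri() in every script, you can set the standard MLFLOW_TRACKING_URI environment variable, which the MLflow client reads automatically (the hostname below is a placeholder for wherever your server runs):

Shell — Point clients at the server
```shell
# Read automatically by the MLflow client library and CLI;
# equivalent to mlflow.set_tracking_uri(...) in code
export MLFLOW_TRACKING_URI=http://localhost:5000
```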

Artifact Storage

Artifacts (models, plots, data files) can be stored in various locations:

Shell — Artifact storage options
# Local filesystem (default)
--default-artifact-root ./mlruns

# Amazon S3
--default-artifact-root s3://my-bucket/mlflow-artifacts

# Google Cloud Storage
--default-artifact-root gs://my-bucket/mlflow-artifacts

# Azure Blob Storage
--default-artifact-root wasbs://container@account.blob.core.windows.net/mlflow

# HDFS
--default-artifact-root hdfs://namenode:8020/mlflow
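
Note that with an S3 artifact root, the client logging artifacts talks to S3 directly, so both the server and every client need AWS credentials. MLflow uses boto3, which reads the standard credential environment variables (or ~/.aws/credentials); a sketch, with placeholder values:

Shell — Credentials for S3 artifact storage
```shell
# Standard AWS credentials, read by boto3 on both server and clients
export AWS_ACCESS_KEY_ID=...
export AWS_SECRET_ACCESS_KEY=...

# Only needed for S3-compatible stores such as MinIO (hostname is an example)
export MLFLOW_S3_ENDPOINT_URL=http://minio:9000
```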

Docker Setup

Dockerfile — MLflow tracking server
FROM python:3.11-slim

RUN pip install mlflow psycopg2-binary boto3

EXPOSE 5000

CMD ["mlflow", "server", \
     "--backend-store-uri", "postgresql://mlflow:mlflow@db:5432/mlflow", \
     "--default-artifact-root", "s3://mlflow-artifacts", \
     "--host", "0.0.0.0", \
     "--port", "5000"]
YAML — docker-compose.yml
version: '3.8'
services:
  db:
    image: postgres:15
    environment:
      POSTGRES_USER: mlflow
      POSTGRES_PASSWORD: mlflow
      POSTGRES_DB: mlflow
    volumes:
      - postgres_data:/var/lib/postgresql/data

  mlflow:
    build: .
    ports:
      - "5000:5000"
    depends_on:
      - db
    environment:
      - AWS_ACCESS_KEY_ID=${AWS_ACCESS_KEY_ID}
      - AWS_SECRET_ACCESS_KEY=${AWS_SECRET_ACCESS_KEY}

volumes:
  postgres_data:

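With both files in place, the stack is built and started with the usual Compose commands (the service name mlflow matches the compose file above):

Shell — Run the Compose stack
```shell
# Build the image and start PostgreSQL plus the tracking server in the background
docker compose up -d --build

# Tail the tracking server logs to confirm it started cleanly
docker compose logs -f mlflow
```
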
Cloud Deployments

Databricks

MLflow is built into Databricks, with a managed tracking server, model registry, and model serving. No setup is required.

AWS

Deploy on ECS/EKS with an RDS (PostgreSQL) backend and an S3 artifact store, or use SageMaker's managed MLflow integration.

GCP

Deploy on Cloud Run or GKE with Cloud SQL backend and GCS artifact store.

Your First Experiment

Python — First MLflow experiment
import mlflow
from sklearn.datasets import load_iris
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split
from sklearn.metrics import accuracy_score

# Point to the tracking server (omit to log to the local ./mlruns directory)
mlflow.set_tracking_uri("http://127.0.0.1:5000")

# Create or set experiment
mlflow.set_experiment("my-first-experiment")

# Load data
iris = load_iris()
X_train, X_test, y_train, y_test = train_test_split(
    iris.data, iris.target, test_size=0.2, random_state=42
)

# Start a run
with mlflow.start_run(run_name="first-run"):
    # Log parameters
    n_estimators = 100
    mlflow.log_param("n_estimators", n_estimators)
    mlflow.log_param("random_state", 42)

    # Train model
    model = RandomForestClassifier(n_estimators=n_estimators, random_state=42)
    model.fit(X_train, y_train)

    # Log metrics
    accuracy = accuracy_score(y_test, model.predict(X_test))
    mlflow.log_metric("accuracy", accuracy)

    # Log model
    mlflow.sklearn.log_model(model, "model")

    print(f"Accuracy: {accuracy:.4f}")
    print(f"Run ID: {mlflow.active_run().info.run_id}")
Try it now: Run the code above, then open http://127.0.0.1:5000 in your browser to see your experiment in the MLflow UI. You can compare runs, view metrics, and download artifacts.
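
From here, the logged model can be loaded back in Python with mlflow.sklearn.load_model("runs:/<run_id>/model"), or served as a local REST endpoint straight from the CLI. A sketch, where <run_id> is a placeholder for the run ID printed by the script above:

Shell — Serve the logged model
```shell
# Serve the model from the run's artifact store on port 5001;
# --env-manager local reuses the current Python environment
mlflow models serve -m "runs:/<run_id>/model" -p 5001 --env-manager local
```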