MLflow Models & Registry
MLflow Models and the Model Registry together make up ~30% of the certification exam. This lesson covers model flavors, signatures, input examples, the Model Registry workflow, stage transitions, and includes practice questions.
MLflow Model Flavors
A flavor is MLflow's way of packaging a model so it can be loaded and used by different tools. Each ML framework has its own flavor, plus there is a universal pyfunc flavor that works with any framework.
import mlflow
from sklearn.ensemble import RandomForestClassifier
import tensorflow as tf
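# Log a scikit-learn model (sklearn flavor)
# X_train and y_train are assumed to be defined earlier
with mlflow.start_run():
    model = RandomForestClassifier().fit(X_train, y_train)
    mlflow.sklearn.log_model(model, "model")
# Log a TensorFlow/Keras model (tensorflow flavor)
with mlflow.start_run():
    tf_model = tf.keras.Sequential([...])
    # Keras requires compile() before fit(); the loss here is a placeholder
    tf_model.compile(optimizer="adam", loss="binary_crossentropy")
    tf_model.fit(X_train, y_train)
    mlflow.tensorflow.log_model(tf_model, "model")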
# Common flavors and their modules:
flavors = {
    "sklearn": "mlflow.sklearn",
    "tensorflow": "mlflow.tensorflow",
    "pytorch": "mlflow.pytorch",
    "xgboost": "mlflow.xgboost",
    "lightgbm": "mlflow.lightgbm",
    "spark": "mlflow.spark",
    "onnx": "mlflow.onnx",
    "pyfunc": "mlflow.pyfunc",  # Universal flavor
}
# Load a model using its native flavor
sk_model = mlflow.sklearn.load_model("runs:/<run_id>/model")
# Load ANY model as a generic pyfunc (universal interface)
pyfunc_model = mlflow.pyfunc.load_model("runs:/<run_id>/model")
predictions = pyfunc_model.predict(X_test)
# EXAM TIP: pyfunc is the universal flavor - it can load any model
# and provides a consistent .predict() interface regardless of framework
Model Signature and Input Examples
A model signature defines the expected input and output schema. An input example provides sample data for documentation and validation.
import mlflow
from mlflow.models import infer_signature
import pandas as pd
import numpy as np
# Infer signature automatically from training data and predictions
X_train = pd.DataFrame({"feature1": [1.0, 2.0], "feature2": [3.0, 4.0]})
predictions = model.predict(X_train)
signature = infer_signature(X_train, predictions)
# Log model with signature and input example
with mlflow.start_run():
    mlflow.sklearn.log_model(
        model,
        "model",
        signature=signature,
        input_example=X_train.iloc[:3],  # Up to the first 3 rows as example
    )
# Manual signature creation
from mlflow.types.schema import Schema, ColSpec
input_schema = Schema([
    ColSpec("double", "feature1"),
    ColSpec("double", "feature2"),
])
output_schema = Schema([ColSpec("long")])
signature = mlflow.models.ModelSignature(
    inputs=input_schema,
    outputs=output_schema,
)
# EXAM TIP: infer_signature() is the easiest way to create a signature
# Signatures enable input validation when model is served
# Input examples are stored as artifacts and used for documentation
Model Registry
The Model Registry is a centralized model store that provides model versioning, stage transitions, and annotations. It is a critical component for production ML workflows.
import mlflow
from mlflow.tracking import MlflowClient
client = MlflowClient()
# Method 1: Register a model during logging
with mlflow.start_run():
    mlflow.sklearn.log_model(
        model, "model",
        registered_model_name="fraud-detector",  # Auto-registers
    )
# Method 2: Register an existing logged model
result = mlflow.register_model(
    model_uri="runs:/<run_id>/model",
    name="fraud-detector",
)
# Returns ModelVersion object with version number
# Method 3: Using the client
client.create_registered_model("fraud-detector")
client.create_model_version(
    name="fraud-detector",
    source="runs:/<run_id>/model",
    run_id="<run_id>",
)
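# List all registered models
# Note: latest_versions holds only the newest version in each stage, not every version
models = client.search_registered_models()
for m in models:
    print(f"{m.name}: latest version per stage = {[v.version for v in m.latest_versions]}")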
# Get a specific model version
version = client.get_model_version("fraud-detector", version=1)
print(f"Stage: {version.current_stage}")
print(f"Source: {version.source}")
# EXAM TIP: Three ways to register: during log_model(), with
# mlflow.register_model(), or via MlflowClient methods
Stage Transitions
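The Model Registry supports stage-based lifecycle management: each model version moves between stages as it progresses from development to production. Stages have been deprecated since MLflow 2.9 in favor of model version aliases, so expect deprecation warnings when running this code on recent releases.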
import mlflow  # needed for mlflow.pyfunc.load_model below
from mlflow.tracking import MlflowClient
client = MlflowClient()
# Model stages in MLflow:
stages = ["None", "Staging", "Production", "Archived"]
# None -> initial state when registered
# Staging -> model under testing/validation
# Production -> model serving live traffic
# Archived -> retired model
# Transition a model version to a new stage
client.transition_model_version_stage(
    name="fraud-detector",
    version=2,
    stage="Staging",
)
# Promote to Production (archive existing Production version)
client.transition_model_version_stage(
    name="fraud-detector",
    version=2,
    stage="Production",
    archive_existing_versions=True,  # Archives current Production version
)
# Add description to a model version
client.update_model_version(
    name="fraud-detector",
    version=2,
    description="Improved model with feature engineering v2. AUC=0.95.",
)
# Add tags to a model version
client.set_model_version_tag(
    name="fraud-detector",
    version=2,
    key="validation_status",
    value="approved",
)
# Load model by stage (models:/ URI scheme)
production_model = mlflow.pyfunc.load_model("models:/fraud-detector/Production")
staging_model = mlflow.pyfunc.load_model("models:/fraud-detector/Staging")
version_model = mlflow.pyfunc.load_model("models:/fraud-detector/2")
# EXAM TIP: Know the models:/ URI scheme:
# models:/<model_name>/<stage> - load by stage
# models:/<model_name>/<version> - load by version number
# archive_existing_versions=True is important for Production transitions
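To illustrate the semantics of archive_existing_versions and of the models:/ URI without a registry backend, here is a toy in-memory model of a single registered model. ToyRegistry and its methods are hypothetical names invented for this sketch; it is not MLflow code and only mimics the behavior described above, assuming that a stage URI resolves to the newest version in that stage.

```python
import re
from dataclasses import dataclass, field

STAGES = ("None", "Staging", "Production", "Archived")


@dataclass
class ToyRegistry:
    """In-memory stand-in for one registered model (hypothetical, not MLflow)."""

    name: str
    stages: dict = field(default_factory=dict)  # version number -> stage

    def register(self) -> int:
        version = len(self.stages) + 1
        self.stages[version] = "None"  # new versions start in "None"
        return version

    def transition(self, version: int, stage: str,
                   archive_existing_versions: bool = False) -> None:
        if stage not in STAGES:
            raise ValueError(f"unknown stage: {stage}")
        if archive_existing_versions:
            # Move every other version already in the target stage to Archived
            for other, current in self.stages.items():
                if other != version and current == stage:
                    self.stages[other] = "Archived"
        self.stages[version] = stage

    def resolve(self, uri: str) -> int:
        """Map models:/<name>/<stage or version> to a version number."""
        match = re.fullmatch(r"models:/([^/]+)/([^/]+)", uri)
        if match is None or match.group(1) != self.name:
            raise ValueError(f"cannot resolve {uri!r}")
        suffix = match.group(2)
        if suffix.isdigit():  # numeric suffix: an explicit version
            if int(suffix) not in self.stages:
                raise LookupError(f"no version {suffix}")
            return int(suffix)
        candidates = [v for v, s in self.stages.items() if s == suffix]
        if not candidates:
            raise LookupError(f"no version in stage {suffix}")
        return max(candidates)  # newest version in that stage


registry = ToyRegistry("fraud-detector")
v1, v2 = registry.register(), registry.register()
registry.transition(v1, "Production")
registry.transition(v2, "Production", archive_existing_versions=True)
print(registry.stages)
print(registry.resolve("models:/fraud-detector/Production"))
```

Without archive_existing_versions=True, the second transition would leave both versions in Production, and a stage-based URI would silently pick one of them.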
Practice Questions
Test your understanding of MLflow Models and Registry with these exam-style questions.
Question 1
What is the universal MLflow model flavor that can load any model regardless of the original framework?
A) mlflow.sklearn
B) mlflow.universal
C) mlflow.pyfunc
D) mlflow.generic
Answer: C) mlflow.pyfunc. The pyfunc (Python function) flavor is the universal interface. It loads any MLflow model and provides a consistent .predict() method regardless of the original framework used to train the model.
Question 2
Which function automatically creates a model signature from training data and predictions?
A) mlflow.models.create_signature()
B) mlflow.models.infer_signature()
C) mlflow.models.auto_signature()
D) mlflow.schema.infer()
Answer: B) mlflow.models.infer_signature(). This function takes input data and (optionally) model predictions and automatically creates a ModelSignature object with the correct schema.
Question 3
What are the four stages in the MLflow Model Registry lifecycle?
A) Draft, Review, Approved, Deployed
B) None, Staging, Production, Archived
C) Dev, Test, Staging, Production
D) None, Testing, Live, Retired
Answer: B) None, Staging, Production, Archived. These are the four built-in stages. "None" is the default when a model version is first registered. Versions typically flow: None -> Staging -> Production -> Archived.
Question 4
Which URI would you use to load the Production version of a registered model named "fraud-detector"?
A) "runs:/fraud-detector/Production"
B) "registry:/fraud-detector/Production"
C) "models:/fraud-detector/Production"
D) "model://fraud-detector?stage=Production"
Answer: C) "models:/fraud-detector/Production". The models:/ URI scheme is used to load models from the registry. The format is models:/<model_name>/<stage_or_version>. Do not confuse it with runs:/, which loads from a specific run.
Question 5
When transitioning a model to Production, what does archive_existing_versions=True do?
A) Deletes all previous Production versions permanently
B) Moves the current Production version to the Archived stage
C) Creates a backup copy of the current Production version
D) Prevents any further changes to the model version
Answer: B) Setting archive_existing_versions=True automatically moves any version currently in Production to the Archived stage, so only the newly promoted version remains in Production. Archived versions are not deleted and can be transitioned back later.
Question 6
Which of the following is NOT a valid way to register a model in the MLflow Model Registry?
A) Using registered_model_name parameter in log_model()
B) Using mlflow.register_model() with a model URI
C) Using MlflowClient().create_model_version()
D) Using mlflow.push_model()
Answer: D) mlflow.push_model() does not exist in the MLflow API. The three valid ways are (A) the registered_model_name parameter of log_model(), (B) mlflow.register_model(), and (C) the MlflowClient methods.
Key Takeaways
- Model flavors are framework-specific packaging formats; pyfunc is the universal flavor
- Use infer_signature() to automatically create model signatures from data
- The Model Registry provides centralized versioning with four stages: None, Staging, Production, Archived
- Three ways to register: log_model(registered_model_name=...), mlflow.register_model(), or MlflowClient
- Use models:/name/stage URI to load models by stage, or models:/name/version by version number
- Set archive_existing_versions=True when promoting to Production to auto-archive the current version