Intermediate
BentoML
Package and deploy ML models with BentoML for developer-friendly containerized serving, adaptive batching, and seamless Kubernetes deployment.
Defining a BentoML Service
# service.py
import bentoml
import numpy as np
from bentoml.io import JSON, NumpyNdarray

# Load the saved model from the local BentoML model store and wrap it in a runner
model_ref = bentoml.sklearn.get("iris_classifier:latest")
model_runner = model_ref.to_runner()

svc = bentoml.Service("iris_classifier", runners=[model_runner])

@svc.api(input=NumpyNdarray(), output=JSON())
async def classify(input_data: np.ndarray) -> dict:
    # Runner calls are awaited so the server can batch concurrent requests
    prediction = await model_runner.predict.async_run(input_data)
    return {"prediction": prediction.tolist()}
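The service above expects an iris_classifier model to already exist in the local BentoML model store. A minimal sketch of training and saving one with scikit-learn is shown below; the signatures block marking predict as batchable is an assumption about how you want adaptive batching configured, not something required by the service code.
# save_model.py (sketch; assumes scikit-learn is installed and you train locally)
import bentoml
from sklearn.datasets import load_iris
from sklearn.svm import SVC

# Train a simple classifier on the iris dataset
X, y = load_iris(return_X_y=True)
model = SVC()
model.fit(X, y)

# Save into the local BentoML model store; marking predict as batchable
# lets the runner apply adaptive batching across concurrent requests
bentoml.sklearn.save_model(
    "iris_classifier",
    model,
    signatures={"predict": {"batchable": True, "batch_dim": 0}},
)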
Building and Containerizing
# bentofile.yaml
service: "service:svc"
include:
  - "*.py"
python:
  packages:
    - scikit-learn
    - numpy
docker:
  python_version: "3.11"
# Build and containerize
bentoml build
bentoml containerize iris_classifier:latest --image-tag my-registry/iris:v1
docker push my-registry/iris:v1
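Before pushing, it is worth exercising the endpoint locally, for example with bentoml serve iris_classifier:latest. A minimal sketch using requests, assuming the default port 3000 and that the route name matches the classify API function:
# smoke_test.py (sketch; assumes the service is running locally on port 3000)
import requests

# NumpyNdarray input is sent as a JSON array; the route is named after the API function
resp = requests.post(
    "http://localhost:3000/classify",
    json=[[5.1, 3.5, 1.4, 0.2]],
)
resp.raise_for_status()
print(resp.json())  # e.g. {"prediction": [0]}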
Deploying to Kubernetes
apiVersion: apps/v1
kind: Deployment
metadata:
  name: iris-classifier
spec:
  replicas: 3
  selector:
    matchLabels:
      app: iris-classifier
  template:
    metadata:
      labels:
        app: iris-classifier
    spec:
      containers:
        - name: bento-service
          image: my-registry/iris:v1
          ports:
            - containerPort: 3000
          resources:
            limits:
              cpu: "2"
              memory: "4Gi"
          readinessProbe:
            httpGet:
              path: /healthz
              port: 3000
---
apiVersion: v1
kind: Service
metadata:
  name: iris-classifier-svc
spec:
  selector:
    app: iris-classifier
  ports:
    - port: 80
      targetPort: 3000
BentoML advantage: BentoML streamlines model packaging by handling dependency management, containerization, and API generation automatically. Use it when you want a fast path from a trained model to a production endpoint.