Flux for ML Infrastructure (Intermediate)

Flux is an open, extensible set of continuous delivery tools for Kubernetes. Unlike ArgoCD's more monolithic design, Flux is composed of specialized controllers (source, kustomize, helm, notification, image reflector, and image automation) that you can combine to build the GitOps workflow your ML platform needs.

Installing Flux

Bootstrap Flux into your cluster, connecting it to your Git repository:

Bash
# Install the Flux CLI
curl -s https://fluxcd.io/install.sh | sudo bash

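# Bootstrap Flux with your GitHub repository.
# Requires a GitHub personal access token in the GITHUB_TOKEN environment variable.
flux bootstrap github \
  --owner=your-org \
  --repository=ml-infrastructure \
  --branch=main \
  --path=clusters/production
# Add --personal only when the owner is a user account rather than an organization.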

Source Controllers for ML Artifacts

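The Flux source controller watches Git repositories, Helm repositories, OCI registries, and S3-compatible buckets for changes and makes their contents available to the other controllers. For ML infrastructure, you typically configure several sources. A GitRepository for the platform manifests looks like this: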

YAML
apiVersion: source.toolkit.fluxcd.io/v1
kind: GitRepository
metadata:
  name: ml-infra
  namespace: flux-system
spec:
  interval: 1m
  url: https://github.com/org/ml-infrastructure
  ref:
    branch: main
  secretRef:
    name: ml-infra-auth
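
A second source often sits alongside the Git repository, for example a Helm repository that hosts charts for model-serving components. The following is a minimal sketch only: the name `ml-charts` and the URL are hypothetical placeholders, not values from this lesson.

```yaml
apiVersion: source.toolkit.fluxcd.io/v1
kind: HelmRepository
metadata:
  name: ml-charts                 # hypothetical name
  namespace: flux-system
spec:
  interval: 10m                   # how often to re-fetch the chart index
  url: https://charts.example.com/ml   # placeholder URL, replace with your chart repository
```

Each source is reconciled independently, so a slow or failing chart repository does not block updates coming from Git.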

Kustomize Overlays for ML Environments

Use Kustomize overlays to manage environment-specific configurations for your ML workloads. A common pattern is to have a base configuration with overlays for dev, staging, and production:

  • Base: Common ML deployment configuration (container image, ports, health checks)
  • Dev overlay: Single GPU, reduced replicas, debug logging enabled
  • Staging overlay: Multi-GPU, canary testing configuration, synthetic load
  • Production overlay: Full GPU allocation, auto-scaling, production monitoring
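
The overlay pattern above can be sketched as two separate files, shown here in one listing. The first is a dev overlay that patches a base Deployment; the second is a Flux Kustomization that tells Flux to apply that overlay. The directory layout (`base/`, `overlays/dev/`), the Deployment and container name `model-server`, and the Kustomization name `ml-serving-dev` are all assumptions made for illustration.

```yaml
# File: overlays/dev/kustomization.yaml (hypothetical path)
apiVersion: kustomize.config.k8s.io/v1beta1
kind: Kustomization
resources:
  - ../../base                    # shared ML deployment configuration
patches:
  - patch: |-
      apiVersion: apps/v1
      kind: Deployment
      metadata:
        name: model-server        # hypothetical Deployment defined in the base
      spec:
        replicas: 1               # reduced replicas for dev
        template:
          spec:
            containers:
              - name: model-server
                resources:
                  limits:
                    nvidia.com/gpu: 1   # single GPU for dev
---
# File: clusters/dev/ml-serving.yaml (hypothetical path)
apiVersion: kustomize.toolkit.fluxcd.io/v1
kind: Kustomization
metadata:
  name: ml-serving-dev
  namespace: flux-system
spec:
  interval: 10m
  sourceRef:
    kind: GitRepository
    name: ml-infra                # the GitRepository source defined earlier
  path: ./overlays/dev            # overlay directory inside the repository
  prune: true                     # delete cluster objects that were removed from Git
```

Staging and production would follow the same shape, each with its own overlay directory and a Flux Kustomization whose `spec.path` points at it.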

Image Automation for Model Deployments

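Flux's image reflector and image automation controllers scan a container registry and update image tags in Git when new model images are pushed. This is especially useful for ML model serving, where new model versions are published frequently. These controllers are not part of the default installation; enable them at bootstrap with --components-extra=image-reflector-controller,image-automation-controller. The ImagePolicy below selects the newest 1.x tag: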

YAML
apiVersion: image.toolkit.fluxcd.io/v1beta2
kind: ImagePolicy
metadata:
  name: model-serving
spec:
  imageRepositoryRef:
    name: model-registry
  policy:
    semver:
      range: ">=1.0.0 <2.0.0"
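
The policy above references an ImageRepository named `model-registry` and only selects a tag; writing that tag back to Git is a separate step. A minimal sketch of the two companion objects might look like the following, assuming everything lives in the `flux-system` namespace. The registry path, commit author, and update path are hypothetical, and the API version should match whatever your Flux release serves.

```yaml
apiVersion: image.toolkit.fluxcd.io/v1beta2
kind: ImageRepository
metadata:
  name: model-registry
  namespace: flux-system
spec:
  image: registry.example.com/ml/model-server   # placeholder image path
  interval: 5m                                  # how often to scan for new tags
---
apiVersion: image.toolkit.fluxcd.io/v1beta2
kind: ImageUpdateAutomation
metadata:
  name: model-serving
  namespace: flux-system
spec:
  interval: 30m
  sourceRef:
    kind: GitRepository
    name: ml-infra              # must use credentials with write access
  git:
    checkout:
      ref:
        branch: main
    commit:
      author:
        name: fluxcdbot                              # hypothetical bot identity
        email: fluxcdbot@users.noreply.github.com
      messageTemplate: "Update model-serving image"
    push:
      branch: main
  update:
    path: ./overlays/production   # hypothetical directory to scan for markers
    strategy: Setters
```

Flux only rewrites image fields that carry a marker comment, for example `image: registry.example.com/ml/model-server:1.0.0 # {"$imagepolicy": "flux-system:model-serving"}` on the relevant line of the Deployment manifest.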

Flux vs ArgoCD for ML

Feature          | Flux                                      | ArgoCD
Architecture     | Composable controllers                    | Monolithic application
UI               | CLI-first (Weave GitOps UI optional)      | Rich built-in web UI
Image automation | Native controllers                        | Requires Argo CD Image Updater
Multi-tenancy    | Namespaced resources with Kubernetes RBAC | Project-based RBAC

Best of Both: Some organizations use Flux for infrastructure management (controllers, operators) and ArgoCD for application-level ML deployments, leveraging the strengths of each tool.

Ready to Write ML Manifests?

The next lesson covers designing Kubernetes manifests specifically for ML training jobs, model servers, and feature pipelines.
