Learn DVC

Master Data Version Control (DVC), the open-source tool that brings Git-like versioning to data, models, and ML pipelines for reproducible machine learning.


6 Lessons

✍

Hands-On Examples

🕑

Self-Paced

100% Free

Your Learning Path

Follow these lessons in order, or jump to any topic that interests you.

Beginner

◈

1. Introduction

What is DVC, why Git alone isn't enough for ML, and how DVC solves data versioning.

Start here →

Beginner

⚡

2. Setup & Configuration

Install DVC, initialize a project, and configure remote storage (S3, GCS, or Azure).

10 min read →

Intermediate

⚙

3. Data Versioning

Track data with dvc add, push to and pull from remotes, and switch between data versions with Git.

15 min read →

Intermediate

✎

4. Pipelines

Define reproducible ML pipelines with dvc.yaml, manage dependencies, and run stages.

15 min read →

Intermediate

★

5. Experiments

Track experiments, compare metrics, manage parameters, and version model outputs.

12 min read →

Advanced

☆

6. Best Practices

CI/CD integration, team workflows, storage optimization, and production deployment.

10 min read →

What You'll Learn

By the end of this course, you'll be able to:

🧠

Version Data Like Code

Use Git-like commands to version datasets and models, with storage on S3, GCS, or Azure.

💻

Build Pipelines

Create reproducible ML pipelines that automatically track dependencies and outputs.

🛠

Run Experiments

Track, compare, and manage ML experiments with metrics and parameter versioning.

🎯

Automate with CI/CD

Integrate DVC into GitHub Actions and other CI/CD systems for automated ML workflows.