Learn Azure GPU VMs & HPC

Master Azure's GPU virtual machines and high-performance computing infrastructure for AI. From N-Series VMs and InfiniBand networking to Azure Batch and CycleCloud for large-scale distributed training.

6
Lessons
Hands-On Labs
🕑
Self-Paced
100%
Free

Your Learning Path

Follow these lessons in order, or jump to any topic that interests you.

What You'll Learn

By the end of this course, you'll be able to:

💻

Choose GPU VMs

Select the right N-Series VM family and size for your specific AI training and inference workloads.

Build HPC Clusters

Set up distributed training clusters with InfiniBand RDMA for near-linear multi-node scaling.

🔄

Orchestrate Jobs

Use Azure Batch and CycleCloud to manage large-scale GPU compute jobs with auto-scaling.

📈

Optimize Performance

Tune NCCL, GPU drivers, and network settings for maximum distributed training throughput.