Optimization for Machine Learning
Master the algorithms that train machine learning models. From gradient descent fundamentals to modern optimizers like Adam, learn how to find good model parameters efficiently. Understand convex optimization, learning rate schedules, and hyperparameter tuning.
What You'll Learn
By the end of this course, you'll understand how optimization algorithms train ML models and how to tune them.
Gradient Descent
Master the foundational algorithm in its batch, stochastic, and mini-batch variants, plus momentum.
Modern Optimizers
Understand Adam, AdaGrad, RMSProp, and when to use each optimizer for different ML tasks.
Convex Optimization
Learn convexity, duality, and constrained optimization that underpin classical ML algorithms.
Hyperparameter Tuning
Systematic approaches to finding optimal learning rates, batch sizes, and architecture choices.
Course Lessons
Follow the lessons in order or jump to any topic you need.
1. Introduction
Why optimization is the core of ML training. Overview of the optimization landscape and key challenges.
2. Gradient Descent
The foundational algorithm: batch, stochastic, mini-batch GD. Momentum, learning rates, and convergence.
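To preview what this lesson covers, here is a minimal NumPy sketch of mini-batch gradient descent with momentum on a made-up noiseless least-squares problem (the data, batch size, and hyperparameters are all illustrative, not from the course):

```python
import numpy as np

# Hypothetical objective: f(w) = 0.5 * ||X @ w - y||^2 (least squares).
rng = np.random.default_rng(0)
X = rng.normal(size=(100, 3))
true_w = np.array([1.0, -2.0, 0.5])
y = X @ true_w                               # noiseless targets

def grad(w, Xb, yb):
    # Gradient of the mean squared residual over one mini-batch.
    return Xb.T @ (Xb @ w - yb) / len(yb)

w = np.zeros(3)
velocity = np.zeros(3)
lr, beta = 0.1, 0.9                          # learning rate, momentum coefficient
for epoch in range(200):
    idx = rng.permutation(100)               # reshuffle each epoch
    for start in range(0, 100, 20):          # mini-batches of 20
        batch = idx[start:start + 20]
        velocity = beta * velocity + grad(w, X[batch], y[batch])
        w -= lr * velocity

print(np.round(w, 3))                        # converges toward true_w
```

Setting the batch size to 100 recovers batch GD, and to 1 recovers pure SGD; the lesson discusses how this choice trades gradient noise against per-step cost.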
3. Adam & Optimizers
Adaptive optimizers: AdaGrad, RMSProp, Adam, AdamW. How they work and when to use each one.
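As a taste of the adaptive-optimizer material, the Adam update rule can be written in a few lines of NumPy. The toy objective and hyperparameters below are illustrative choices, not prescribed by the course:

```python
import numpy as np

def adam_step(w, g, m, v, t, lr=1e-3, beta1=0.9, beta2=0.999, eps=1e-8):
    """One Adam update; returns the new (w, m, v)."""
    m = beta1 * m + (1 - beta1) * g          # first-moment (mean) estimate
    v = beta2 * v + (1 - beta2) * g * g      # second-moment estimate
    m_hat = m / (1 - beta1 ** t)             # bias correction for zero init
    v_hat = v / (1 - beta2 ** t)
    w = w - lr * m_hat / (np.sqrt(v_hat) + eps)
    return w, m, v

# Minimize the toy objective f(w) = (w - 3)^2 starting from w = 0.
w = np.array(0.0)
m = v = np.array(0.0)
for t in range(1, 5001):
    g = 2 * (w - 3)                          # analytic gradient
    w, m, v = adam_step(w, g, m, v, t, lr=0.05)
print(float(w))                              # approaches 3.0
```

Dividing by the per-coordinate second-moment estimate is what makes Adam "adaptive": coordinates with consistently large gradients get smaller effective steps. AdamW differs only in applying weight decay directly to `w` rather than through the gradient.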
4. Convex Optimization
Convex functions, duality, KKT conditions, and constrained optimization for ML.
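For orientation, the two central definitions of this lesson can be stated compactly (standard formulations, written here in LaTeX):

```latex
% f is convex iff for all x, y and all t in [0, 1]:
f\bigl(t x + (1 - t) y\bigr) \le t\, f(x) + (1 - t)\, f(y)

% KKT conditions for  min f(x)  s.t.  g_i(x) \le 0,\; h_j(x) = 0:
\nabla f(x^*) + \sum_i \lambda_i \nabla g_i(x^*) + \sum_j \nu_j \nabla h_j(x^*) = 0
    \quad \text{(stationarity)}
\lambda_i \ge 0, \qquad \lambda_i\, g_i(x^*) = 0
    \quad \text{(dual feasibility, complementary slackness)}
```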
5. Hyperparameter Tuning
Grid search, random search, Bayesian optimization, learning rate schedules, and automated tuning.
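Random search, one of the simplest techniques in this lesson, fits in a few lines. Here the "validation metric" is just the final loss of gradient descent on a toy quadratic; everything in this sketch is a stand-in for a real training run:

```python
import numpy as np

# Hypothetical tuning target: final loss of GD on f(w) = w^2,
# a stand-in for a real validation metric.
def train(lr, steps=50):
    w = 5.0
    for _ in range(steps):
        w -= lr * 2 * w          # gradient of w^2 is 2w
    return w * w                 # final loss

rng = np.random.default_rng(42)
# Sample learning rates log-uniformly over [1e-4, 1e0]: log scale matters
# because a learning rate's effect is multiplicative.
candidates = 10 ** rng.uniform(-4, 0, size=20)
losses = [train(lr) for lr in candidates]
best = candidates[int(np.argmin(losses))]
print(f"best lr ~ {best:.4g}, loss = {min(losses):.3g}")
```

Grid search replaces the random samples with a fixed lattice; Bayesian optimization replaces them with samples chosen by a surrogate model of the loss surface.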
6. Best Practices
Training recipes, debugging optimization, learning rate warmup, weight decay, and practical tips.
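One recipe from this lesson, linear warmup followed by cosine decay, can be sketched as a schedule function (the step counts and peak rate below are assumed example values, not recommendations):

```python
import math

def lr_at(step, total_steps=1000, warmup=100, peak_lr=3e-4):
    """Linear warmup to peak_lr, then cosine decay to zero."""
    if step < warmup:
        return peak_lr * (step + 1) / warmup             # linear ramp-up
    progress = (step - warmup) / (total_steps - warmup)  # in [0, 1]
    return peak_lr * 0.5 * (1 + math.cos(math.pi * progress))

# Rate rises during warmup, peaks, then decays smoothly toward zero.
print(lr_at(0), lr_at(99), lr_at(500), lr_at(999))
```

Warmup avoids large, noisy updates while the optimizer's statistics (and the model) are still uninitialized; the cosine tail lets training settle into a minimum.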
Prerequisites
What you need before starting this course.
- Understanding of derivatives and gradients (see the Calculus for ML course)
- Basic linear algebra knowledge (vectors, matrices)
- Python with NumPy and PyTorch installed
- Recommended: complete the Linear Algebra, Calculus, and Probability courses first
Lilly Tech Systems