Best Practices Intermediate

Applying calculus correctly in ML requires practical knowledge beyond the theory. This lesson covers gradient checking, numerical stability, leveraging autograd frameworks, and the most common mistakes practitioners make.

Gradient Checking

Always verify your analytical gradients against numerical approximations when implementing custom layers:

Python

import numpy as np

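def gradient_check(f, grad_f, x, eps=1e-5):
    """Compare an analytical gradient with a central-difference approximation.

    Expects a scalar-valued f and a 1-D array x; returns (passed, relative_error).
    """
    x = np.asarray(x, dtype=float)  # integer arrays cannot take the eps step
    analytical = grad_f(x)
    numerical = np.zeros_like(x)
    for i in range(len(x)):
        x_plus = x.copy()
        x_plus[i] += eps
        x_minus = x.copy()
        x_minus[i] -= eps
        # Central difference: truncation error is O(eps^2)
        numerical[i] = (f(x_plus) - f(x_minus)) / (2 * eps)

    # Relative error; the 1e-8 term guards against division by zero
    diff = np.linalg.norm(analytical - numerical)
    diff /= np.linalg.norm(analytical) + np.linalg.norm(numerical) + 1e-8
    return diff < 1e-5, diff  # first value is True when the gradients agree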
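
As a usage sketch, the same central-difference check can be run against a function whose gradient is known in closed form. Everything named here (`relative_error`, `f`, `grad_correct`, `grad_buggy`, `x0`) is hypothetical and chosen only for illustration; the helper restates the relative-error computation so the block runs on its own.

```python
import numpy as np

def relative_error(f, grad_f, x, eps=1e-5):
    """Relative error between grad_f(x) and a central-difference estimate."""
    x = np.asarray(x, dtype=float)
    analytical = grad_f(x)
    numerical = np.zeros_like(x)
    for i in range(x.size):
        step = np.zeros_like(x)
        step.flat[i] = eps  # perturb one coordinate at a time
        numerical.flat[i] = (f(x + step) - f(x - step)) / (2 * eps)
    num = np.linalg.norm(analytical - numerical)
    den = np.linalg.norm(analytical) + np.linalg.norm(numerical) + 1e-8
    return num / den

# Illustrative function: f(x) = sum(x^2) + sum(sin(x)), so grad f = 2x + cos(x)
def f(x):
    return np.sum(x ** 2) + np.sum(np.sin(x))

def grad_correct(x):
    return 2 * x + np.cos(x)

def grad_buggy(x):
    return 2 * x - np.cos(x)  # deliberate sign error

x0 = np.array([0.5, -1.2, 2.0])
err_ok = relative_error(f, grad_correct, x0)
err_bad = relative_error(f, grad_buggy, x0)
print(f"correct gradient: {err_ok:.2e}, buggy gradient: {err_bad:.2e}")
```

The correct gradient should land far below the 1e-5 threshold, while the sign error produces a relative error many orders of magnitude larger, which is the kind of gap the check is meant to expose.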

Practical Tips

Use autograd frameworks
PyTorch, JAX, and TensorFlow compute gradients automatically. Only implement manual gradients for custom operations or learning purposes.
Monitor gradient norms
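Track the L2 norm of gradients during training. Norms that blow up (say, above 1000) or collapse toward zero (below about 1e-7) usually indicate architecture or hyperparameter issues; treat these thresholds as rough guides, since reasonable values depend on the model.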
Use gradient clipping
Clip gradient norms to a maximum value (typically 1.0 or 5.0) to prevent exploding gradients, especially in RNNs.
Choose activations wisely
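ReLU has a gradient of exactly 1 for positive inputs, so it does not shrink gradients there, but its gradient is 0 for negative inputs. Sigmoid and tanh saturate for large-magnitude inputs, which can cause vanishing gradients in deep networks.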
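
The monitoring and clipping tips above can be sketched in plain NumPy. This is a minimal illustration, assuming gradients are held as a list of arrays; `global_grad_norm` and `clip_by_global_norm` are hypothetical helper names, and the scaling rule is similar in spirit to what frameworks provide (for example `torch.nn.utils.clip_grad_norm_` in PyTorch), not a reproduction of any library's implementation.

```python
import numpy as np

def global_grad_norm(grads):
    """L2 norm taken over all gradient arrays as if they were one vector."""
    return float(np.sqrt(sum(np.sum(g ** 2) for g in grads)))

def clip_by_global_norm(grads, max_norm=1.0):
    """Rescale all gradients by one factor so the global norm is at most max_norm."""
    total = global_grad_norm(grads)
    # Scale is 1.0 when the norm is already small enough; 1e-6 avoids division by zero
    scale = min(1.0, max_norm / (total + 1e-6))
    return [g * scale for g in grads], total

# Two made-up gradient arrays with global norm sqrt(9 + 16 + 144) = 13
grads = [np.array([3.0, 4.0]), np.array([[12.0]])]
clipped, norm_before = clip_by_global_norm(grads, max_norm=1.0)
norm_after = global_grad_norm(clipped)
print(f"norm before: {norm_before:.3f}, after: {norm_after:.3f}")
```

Using a single scale factor for every array preserves the direction of the overall gradient, and returning the pre-clip norm makes it easy to log the value at each training step.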

Common Pitfalls

Watch Out For:

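Forgetting to zero gradients: PyTorch accumulates gradients across backward() calls by default, so call optimizer.zero_grad() before each backward pass.
In-place operations: Modifying a tensor in place can overwrite values that autograd needs for the backward pass, which breaks the computational graph.
Detaching when you should not: .detach() returns a tensor cut off from the graph and .item() returns a plain Python number, so no gradient flows through either.
Wrong loss reduction: With sum reduction the gradient scales with batch size, so switching between mean and sum effectively changes the learning rate. Make sure the reduction matches what your hyperparameters assume.
NaN gradients: Often caused by log(0), division by zero, or overflow. Add a small epsilon to denominators and to the arguments of log, or clamp inputs to a safe range.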
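
To make the NaN pitfall concrete, here is a minimal NumPy sketch of a binary cross-entropy loss with and without clamping. The function names `bce_unsafe` and `bce_safe`, the `eps` value of 1e-7, and the sample arrays are assumptions for illustration, not values prescribed by this lesson.

```python
import numpy as np

def bce_unsafe(p, y):
    """Binary cross-entropy with no guard: log(0) yields -inf, and 0 * -inf yields NaN."""
    return -np.mean(y * np.log(p) + (1 - y) * np.log(1 - p))

def bce_safe(p, y, eps=1e-7):
    """Same loss, with probabilities clamped away from 0 and 1 before the log."""
    p = np.clip(p, eps, 1 - eps)
    return -np.mean(y * np.log(p) + (1 - y) * np.log(1 - p))

# Predictions that hit exactly 0 and 1, as can happen after a saturated sigmoid
p = np.array([0.0, 1.0, 0.7])
y = np.array([0.0, 1.0, 1.0])

with np.errstate(divide="ignore", invalid="ignore"):  # silence the expected warnings
    unsafe = bce_unsafe(p, y)
safe = bce_safe(p, y)
print(f"unsafe: {unsafe}, safe: {safe:.4f}")
```

Note that the unsafe version fails even though the predictions at 0 and 1 are correct: the term multiplied by zero still evaluates log(0) first. Clamping changes the loss only negligibly for well-behaved inputs.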

Course Complete!

You have completed the Calculus for ML course. Continue with Probability for AI to round out your mathematical foundation.
