Math Coding in ML Interviews
ML interviews test math differently from math exams. You will not be asked to prove theorems or derive formulas on a whiteboard. Instead, you will be asked to implement mathematical concepts in code. This lesson explains what to expect, what level of math is required, and how to prepare efficiently.
How Math Is Tested in ML Interviews
ML interview math questions fall into three categories, and this course focuses on the third — the one most candidates underestimate:
| Category | Format | Example |
|---|---|---|
| Conceptual | Verbal explanation | "Explain what eigenvalues represent geometrically" |
| Derivation | Whiteboard math | "Derive the gradient of cross-entropy loss" |
| Implementation | Write working code | "Implement matrix multiplication without NumPy" |
Implementation questions are the hardest because they require both mathematical understanding and programming skill. You need to know the algorithm, handle edge cases, manage numerical precision, and write clean code under time pressure.
What Level of Math Is Expected?
You do not need a PhD in mathematics. The math tested in ML interviews corresponds to undergraduate-level courses in:
- Linear Algebra: Matrix operations, eigenvalues, decompositions (SVD, QR), projections. This is the most heavily tested area because it underpins neural networks, PCA, recommender systems, and almost every ML algorithm.
- Calculus: Derivatives, partial derivatives, chain rule, gradients, Jacobians, Hessians. This is how models learn — backpropagation is just the chain rule applied systematically.
- Optimization: Gradient descent and its variants, convergence conditions, learning rate schedules. Every ML model is an optimization problem.
- Probability & Statistics: Bayes' theorem, distributions, sampling methods, Monte Carlo estimation. Essential for generative models, Bayesian methods, and understanding model uncertainty.
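As a small taste of how these pillars turn into code, here is a minimal sketch of checking an analytic derivative against a central-difference approximation (the function and helper names are illustrative, not from the course problems):

```python
def numerical_grad(f, x, h=1e-5):
    """Central-difference approximation of df/dx at the point x."""
    return (f(x + h) - f(x - h)) / (2 * h)

# f(x) = x^2 has analytic derivative 2x
f = lambda x: x * x
x = 3.0
approx = numerical_grad(f, x)
exact = 2 * x
print(abs(approx - exact) < 1e-6)  # True: the approximations agree
```

This tiny check is the same idea behind gradient checking in backpropagation: compare the derivative your code computes against a numerical estimate.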
The Gap Between Theory and Implementation
Knowing the formula for matrix multiplication is not the same as implementing it correctly in code. Here is a simple example that illustrates the gap:
```python
# Matrix multiplication: C = A * B
# Theory: C[i][j] = sum(A[i][k] * B[k][j] for all k)
# Seems simple. But implementation requires:
def matrix_multiply(A, B):
    # Edge case: empty matrices (checked before indexing row 0)
    if not A or not B:
        return []

    rows_A, cols_A = len(A), len(A[0])
    rows_B, cols_B = len(B), len(B[0])

    # Edge case: dimension mismatch
    if cols_A != rows_B:
        raise ValueError(f"Cannot multiply {rows_A}x{cols_A} by {rows_B}x{cols_B}")

    # Initialize result matrix with zeros
    C = [[0.0] * cols_B for _ in range(rows_A)]

    # The triple nested loop
    for i in range(rows_A):
        for j in range(cols_B):
            total = 0.0
            for k in range(cols_A):
                total += A[i][k] * B[k][j]
            C[i][j] = total
    return C

# Test
A = [[1, 2], [3, 4]]
B = [[5, 6], [7, 8]]
print(matrix_multiply(A, B))
# [[19.0, 22.0], [43.0, 50.0]]
```
This is a straightforward O(n^3) implementation. But in an interview, the follow-up questions are where it gets interesting: "Can you handle sparse matrices efficiently?", "What about numerical stability for large matrices?", "How would you parallelize this?"
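One possible answer to the sparse-matrix follow-up, sketched here with a simple dict-of-keys representation (the representation and the `sparse_multiply` name are illustrative choices, not the only approach):

```python
def sparse_multiply(A, B):
    """Multiply sparse matrices stored as {(i, j): value} dicts,
    skipping zero entries entirely."""
    # Group B's entries by row so each entry of A needs one lookup
    B_rows = {}
    for (k, j), v in B.items():
        B_rows.setdefault(k, []).append((j, v))

    C = {}
    for (i, k), a in A.items():
        for j, b in B_rows.get(k, []):
            C[(i, j)] = C.get((i, j), 0.0) + a * b
    return C

# Same matrices as above, in sparse form
A = {(0, 0): 1, (0, 1): 2, (1, 0): 3, (1, 1): 4}
B = {(0, 0): 5, (0, 1): 6, (1, 0): 7, (1, 1): 8}
print(sparse_multiply(A, B))
# {(0, 0): 19.0, (0, 1): 22.0, (1, 0): 43.0, (1, 1): 50.0}
```

The work here scales with the number of nonzero products rather than with n^3, which is the point interviewers are probing for.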
Course Structure and Problem Counts
This course contains 27 implementation problems organized into five topic areas. Each problem includes a clear problem statement, a from-scratch Python solution, complexity analysis, and discussion of edge cases and numerical issues.
Matrix Operations (6 problems)
Multiply, transpose, inverse, rank, trace, and Hadamard product. These are the building blocks you will use in every other topic.
Eigenvalues & SVD (5 problems)
Power iteration, QR algorithm, SVD, low-rank approximation, and PCA. The decompositions that power dimensionality reduction and recommender systems.
Calculus & Gradients (6 problems)
Numerical gradients, automatic differentiation, chain rule, Jacobian, Hessian, and gradient checking. The math behind backpropagation.
Optimization (5 problems)
Gradient descent, Newton's method, Adam, L-BFGS, and constrained optimization. The algorithms that train every ML model.
Probability & Sampling (5 problems)
Bayes' theorem, sampling from distributions, and Monte Carlo estimation. The probabilistic tools behind generative models and uncertainty quantification.
How to Use This Course
For each problem in this course, follow this approach:
- Read the problem statement and try to solve it yourself for 15-20 minutes before looking at the solution.
- Study the solution line by line. Make sure you understand why each step is necessary.
- Implement it from memory. Close the solution and write it again. If you get stuck, that tells you which part you did not truly understand.
- Test with edge cases. Empty matrices, single-element matrices, non-square matrices, matrices with very large or very small values.
- Know the NumPy equivalent. In real interviews, they may ask you to implement from scratch AND show you know the library function.
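For the matrix-multiplication example above, that pairing might look like this (assuming NumPy is available in the interview environment):

```python
import numpy as np

A = [[1, 2], [3, 4]]
B = [[5, 6], [7, 8]]

# From-scratch result (same computation as the triple loop earlier)
C_scratch = [[sum(A[i][k] * B[k][j] for k in range(2)) for j in range(2)]
             for i in range(2)]

# Library equivalent interviewers expect you to know
C_numpy = np.array(A) @ np.array(B)

print(np.allclose(C_scratch, C_numpy))  # True
```

Being able to produce both, and confirm they agree, signals that you understand the math and know the tooling.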
What Makes Math Coding Different
Math coding problems differ from standard algorithm problems in several important ways:
- Floating-point precision matters. You cannot compare floats with == because 0.1 + 0.2 != 0.3 in floating-point arithmetic. You must use tolerances (epsilon comparisons).
- Numerical stability is critical. A mathematically correct formula can give wildly wrong results when implemented naively with floating-point numbers. You will learn to recognize and fix these situations.
- Edge cases are mathematical. Division by zero, singular matrices, negative eigenvalues, ill-conditioned systems — these are not just programming edge cases, they reflect fundamental mathematical properties.
- Verification is built-in. You can always check your answer: does A * A_inverse equal the identity matrix? Does the gradient match the numerical approximation? Math gives you free test cases.
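Two of these points in miniature: tolerance-based comparison instead of ==, and using the math itself as a free test case (a sketch assuming NumPy is available; the 2x2 matrix is arbitrary but invertible):

```python
import numpy as np

# Floating-point: exact equality fails where a tolerance succeeds
print(0.1 + 0.2 == 0.3)               # False
print(abs((0.1 + 0.2) - 0.3) < 1e-9)  # True

# Built-in verification: A times its inverse should be the identity
A = np.array([[4.0, 7.0], [2.0, 6.0]])
A_inv = np.linalg.inv(A)
print(np.allclose(A @ A_inv, np.eye(2)))  # True
```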
Key Takeaways
- ML interviews test math through implementation, not just theory. You must be able to code the formulas you know.
- The four pillars of ML math are linear algebra, calculus, optimization, and probability. Linear algebra is the most heavily tested.
- Floating-point precision and numerical stability are critical skills that separate passing from failing solutions.
- This course covers 27 problems across five topic areas spanning all four pillars, with complete from-scratch Python implementations.
- For each problem, implement it from scratch, test with edge cases, and know the NumPy equivalent.
What Is Next
In the next lesson, we dive into Matrix Operations — six fundamental problems that form the foundation of all linear algebra coding. You will implement matrix multiplication, transpose, inverse, rank computation, trace, and the Hadamard product entirely from scratch.
Lilly Tech Systems