Beginner

Trees & Graphs in AI/ML

Trees and graphs are not just interview topics — they are the backbone of modern AI systems. Decision trees split data for classification, computation graphs enable automatic differentiation in PyTorch and TensorFlow, DAGs orchestrate ML pipelines, and knowledge graphs power reasoning engines. This lesson maps the territory before we dive into problems.

Why Trees Matter for AI/ML

Trees appear everywhere in machine learning, from the models themselves to the infrastructure that trains and serves them.

Decision Trees & Random Forests

A decision tree recursively partitions feature space using binary splits. Each internal node tests a feature condition, each branch represents an outcome, and each leaf holds a prediction. Random forests and gradient-boosted trees (XGBoost, LightGBM) are ensembles of decision trees and remain the top-performing models for tabular data.

Computation Graphs

PyTorch and TensorFlow represent neural networks as directed acyclic graphs (DAGs) where nodes are operations (matmul, relu, softmax) and edges carry tensors. Backpropagation traverses this graph in reverse topological order to compute gradients. Understanding tree/graph traversal is essential for debugging and optimizing neural networks.

Abstract Syntax Trees (ASTs)

Code-generating LLMs produce programs that are parsed into ASTs. Tree algorithms help validate generated code structure, detect patterns, and perform transformations. AI-powered code analysis tools traverse ASTs to find bugs and suggest refactorings.

Hierarchical Clustering

Agglomerative and divisive clustering build tree structures (dendrograms) that organize data points by similarity. Traversing these trees at different cut levels produces clusters of varying granularity, a key technique in unsupervised learning and customer segmentation.

Why Graphs Matter for AI/ML

Knowledge Graphs

Knowledge graphs represent entities and relationships (e.g., "Python" —isA→ "Programming Language"). Google's Knowledge Graph, Wikidata, and enterprise knowledge bases use graph structures. Graph neural networks (GNNs) learn embeddings from these structures for link prediction and entity classification.

ML Pipeline DAGs

Tools like Airflow, Kubeflow, and Dagster model ML pipelines as DAGs: data ingestion → feature engineering → training → evaluation → deployment. Topological sort determines execution order. Understanding DAGs is critical for building production ML infrastructure.

Social Network Analysis

Recommendation systems use graph algorithms on social networks: community detection (connected components), influence propagation (BFS/DFS), and link prediction (common neighbors). Graph-based features often improve ML model accuracy significantly.

Graph Neural Networks

GNNs generalize convolutions from grids (images) to arbitrary graphs. They learn node and edge representations for tasks like molecular property prediction, traffic forecasting, and fraud detection. The message-passing framework is fundamentally a graph traversal operation.

Core Data Structures

Before solving problems, you need to be comfortable with the standard Python representations for trees and graphs.

Binary Tree Node

class TreeNode:
    """Standard binary tree node used in most coding problems."""
    def __init__(self, val=0, left=None, right=None):
        self.val = val
        self.left = left
        self.right = right

# Example: Build a simple tree
#       1
#      / \
#     2   3
#    / \
#   4   5
root = TreeNode(1)
root.left = TreeNode(2)
root.right = TreeNode(3)
root.left.left = TreeNode(4)
root.left.right = TreeNode(5)

Graph Representations

from collections import defaultdict

# Adjacency List (most common for coding problems)
# Unweighted graph
graph = defaultdict(list)
edges = [(0, 1), (0, 2), (1, 3), (2, 3), (3, 4)]
for u, v in edges:
    graph[u].append(v)
    graph[v].append(u)  # Remove for directed graph

# Weighted graph
weighted_graph = defaultdict(list)
weighted_edges = [(0, 1, 4), (0, 2, 1), (1, 3, 2), (2, 3, 5)]
for u, v, w in weighted_edges:
    weighted_graph[u].append((v, w))
    weighted_graph[v].append((u, w))  # Remove for directed

# Adjacency Matrix (useful for dense graphs, Floyd-Warshall)
n = 5
adj_matrix = [[0] * n for _ in range(n)]
for u, v in edges:
    adj_matrix[u][v] = 1
    adj_matrix[v][u] = 1  # Remove for directed

# ML Context: Feature similarity graph
# Nodes = data points, edges = similarity > threshold
# Used in spectral clustering and label propagation
import numpy as np

def build_similarity_graph(features, threshold=0.8):
    """Build adjacency list from feature vectors using cosine similarity."""
    n = len(features)
    graph = defaultdict(list)
    for i in range(n):
        for j in range(i + 1, n):
            sim = np.dot(features[i], features[j]) / (
                np.linalg.norm(features[i]) * np.linalg.norm(features[j])
            )
            if sim >= threshold:
                graph[i].append(j)
                graph[j].append(i)
    return graph

Complexity Cheat Sheet

OperationAdjacency ListAdjacency Matrix
Add edgeO(1)O(1)
Remove edgeO(E)O(1)
Check edge existsO(degree)O(1)
Get all neighborsO(degree)O(V)
SpaceO(V + E)O(V²)
BFS/DFSO(V + E)O(V²)
💡
When to use which: Use adjacency lists (default for 90% of problems) when the graph is sparse (E << V²). Use adjacency matrices when you need O(1) edge lookup, the graph is dense, or the algorithm requires matrix operations (Floyd-Warshall, spectral methods). In ML, similarity/kernel matrices are adjacency matrices of fully-connected graphs.

What This Course Covers

This course takes you from fundamental tree traversals through advanced graph algorithms, with every problem grounded in AI/ML applications:

  • Lesson 2 – Binary Tree Problems: Traversals (inorder, preorder, postorder, level-order), max depth, symmetric tree, path sum
  • Lesson 3 – BST Problems: Validate BST, kth smallest, LCA, insert/delete, sorted array to BST
  • Lesson 4 – BFS & DFS: Islands, clone graph, course schedule, word ladder, shortest path, connected components
  • Lesson 5 – Topological Sort & DAGs: Course order, task scheduling, build order, alien dictionary, DAG shortest path
  • Lesson 6 – Dijkstra & Advanced: Weighted shortest path, network delay, cheapest flights, MST, union-find
  • Lesson 7 – Tree Construction: Build from traversals, serialize/deserialize, trie, suffix tree basics
  • Lesson 8 – Patterns & Tips: Cheat sheet, visualization, FAQ

Key Takeaways

💡
  • Trees are specialized graphs — every tree algorithm is a graph algorithm, but trees have no cycles and exactly one path between any two nodes
  • Decision trees, computation graphs, ASTs, and dendrograms make trees essential to ML
  • Knowledge graphs, ML pipeline DAGs, social networks, and GNNs make graphs essential to ML
  • Default to adjacency lists for coding problems; use matrices for dense graphs and matrix-based algorithms
  • Every problem in this course includes ML context so you understand why the algorithm matters beyond the interview