Greedy Algorithms
Greedy algorithms make the locally optimal choice at each step, hoping to reach a globally optimal solution. This lesson covers 5 classic greedy problems with complete Python solutions and complexity analysis.
Problem 1: Non-overlapping Intervals
Given an array of intervals [[start, end], ...], find the minimum number of intervals you need to remove to make the rest non-overlapping.
Brute Force (Check All Subsets): O(2^n) Time
```python
def erase_overlap_brute(intervals):
    """
    Try all subsets, find largest non-overlapping subset.
    Exponential - only works for tiny inputs.
    """
    from itertools import combinations

    def is_valid(subset):
        subset.sort()
        for i in range(1, len(subset)):
            if subset[i][0] < subset[i - 1][1]:
                return False
        return True

    n = len(intervals)
    for size in range(n, 0, -1):
        for combo in combinations(intervals, size):
            if is_valid(list(combo)):
                return n - size
    return 0  # empty input: nothing to remove
```
Optimal (Greedy - Sort by End Time): O(n log n) Time, O(1) Space
```python
def erase_overlap_intervals(intervals):
    """
    Greedy: Sort by end time, always keep the interval that
    ends earliest (leaves the most room for future intervals).
    For each interval:
    - If it starts after or when the previous one ends: keep it
    - Otherwise: remove it (increment counter)
    Why sort by end time? The earlier an interval ends, the more
    room it leaves for subsequent intervals. This greedy choice
    is provably optimal.
    """
    if not intervals:
        return 0
    intervals.sort(key=lambda x: x[1])  # Sort by end time
    removals = 0
    prev_end = intervals[0][1]
    for i in range(1, len(intervals)):
        if intervals[i][0] < prev_end:
            # Overlap detected - remove this interval
            removals += 1
        else:
            # No overlap - update end time
            prev_end = intervals[i][1]
    return removals

# Test
print(erase_overlap_intervals([[1, 2], [2, 3], [3, 4], [1, 3]]))  # 1
print(erase_overlap_intervals([[1, 2], [1, 2], [1, 2]]))  # 2
print(erase_overlap_intervals([[1, 2], [2, 3]]))  # 0
```
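A natural question is why we sort by end time rather than start time. The following self-contained sketch (the helper names are ours, not from any library) shows a case where the start-sorted greedy removes more intervals than necessary:

```python
def removals_by_start(intervals):
    # Naive greedy: sort by START, always keep the earlier-starting interval
    intervals = sorted(intervals, key=lambda x: x[0])
    removals, prev_end = 0, intervals[0][1]
    for start, end in intervals[1:]:
        if start < prev_end:
            removals += 1  # overlap: drop the later-starting interval
        else:
            prev_end = end
    return removals

def removals_by_end(intervals):
    # Correct greedy: sort by END, keep the earliest-ending interval
    intervals = sorted(intervals, key=lambda x: x[1])
    removals, prev_end = 0, intervals[0][1]
    for start, end in intervals[1:]:
        if start < prev_end:
            removals += 1
        else:
            prev_end = end
    return removals

# [1, 10] starts first, so the start-sorted greedy keeps it and drops
# all three short intervals; the end-sorted greedy drops only [1, 10]
case = [[1, 10], [2, 3], [4, 5], [6, 7]]
print(removals_by_start(case))  # 3 (suboptimal)
print(removals_by_end(case))    # 1 (optimal)
```

The long interval [1, 10] "blocks" everything that follows, which is exactly what sorting by end time avoids.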
AI/ML context: Interval scheduling directly maps to GPU time-slot allocation in shared training clusters. When multiple training jobs compete for GPU time, the greedy “earliest end time” strategy maximizes the number of jobs completed.
Problem 2: Task Scheduler
Given an array of tasks and a cooldown value n, find the minimum number of intervals the CPU needs to finish all tasks. The same task must have at least n intervals between two executions.
Optimal (Greedy - Math Formula): O(m) Time (m = number of tasks), O(1) Space
```python
from collections import Counter

def least_interval(tasks, n):
    """
    Greedy insight: The most frequent task determines the structure.
    If the most frequent task appears f times, we need at least
    (f - 1) chunks of size (n + 1), plus the last partial chunk.
    Example: tasks = [A,A,A,B,B,B], n = 2
        A _ _ A _ _ A
        A B _ A B _ A B
    Intervals = (3-1) * (2+1) + 2 = 8
    Formula:
    - max_freq = frequency of most common task
    - max_count = number of tasks with max_freq
    - result = (max_freq - 1) * (n + 1) + max_count
    - But if we have many tasks, we might not need idle slots:
      result = max(len(tasks), formula_result)
    """
    freq = Counter(tasks)
    max_freq = max(freq.values())
    max_count = sum(1 for v in freq.values() if v == max_freq)
    # Minimum intervals from the formula
    formula_result = (max_freq - 1) * (n + 1) + max_count
    # If we have enough different tasks, no idle time needed
    return max(len(tasks), formula_result)

# Test
print(least_interval(["A", "A", "A", "B", "B", "B"], 2))  # 8
print(least_interval(["A", "A", "A", "B", "B", "B"], 0))  # 6
print(least_interval(["A", "A", "A", "A", "A", "A",
                      "B", "C", "D", "E", "F", "G"], 2))  # 16
```
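The closed-form formula can be cross-checked with a direct greedy simulation: always run the task with the most remaining instances, parking just-run tasks in a cooldown queue. This is a sketch of our own (standard library only), not part of the formula solution:

```python
import heapq
from collections import Counter, deque

def least_interval_sim(tasks, n):
    # Max-heap of remaining counts (negated, since heapq is a min-heap)
    heap = [-c for c in Counter(tasks).values()]
    heapq.heapify(heap)
    cooldown = deque()  # (negated remaining count, tick when runnable again)
    time = 0
    while heap or cooldown:
        time += 1
        if heap:
            count = heapq.heappop(heap) + 1  # run one instance
            if count < 0:
                cooldown.append((count, time + n))
        # else: no runnable task, the CPU idles this tick
        if cooldown and cooldown[0][1] == time:
            heapq.heappush(heap, cooldown.popleft()[0])
    return time

print(least_interval_sim(["A", "A", "A", "B", "B", "B"], 2))  # 8, same as the formula
```

The simulation costs O(m log k) for m tasks and k distinct labels, so the O(m) formula is preferable; the simulation is mainly useful as a correctness check.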
AI/ML context: Task scheduling with cooldowns models GPU kernel scheduling where certain operations cannot run back-to-back (e.g., memory-intensive operations that need cooling periods). It also models rate-limited API calls when using multiple AI services in a pipeline.
Problem 3: Jump Game
Given an array nums where nums[i] is the maximum jump length from index i, determine whether you can reach the last index starting from index 0.
Brute Force (Backtracking): O(2^n) Time
```python
def can_jump_brute(nums):
    """Try all possible jumps recursively."""
    def helper(pos):
        if pos >= len(nums) - 1:
            return True
        max_jump = min(pos + nums[pos], len(nums) - 1)
        for next_pos in range(pos + 1, max_jump + 1):
            if helper(next_pos):
                return True
        return False
    return helper(0)
```
Optimal (Greedy): O(n) Time, O(1) Space
```python
def can_jump(nums):
    """
    Track the farthest position we can reach.
    At each position, update the farthest reachable index.
    If we ever reach a position beyond our farthest reach,
    we are stuck and cannot proceed.
    Greedy choice: always extend the farthest reach.
    """
    farthest = 0
    for i in range(len(nums)):
        if i > farthest:
            return False  # Cannot reach this position
        farthest = max(farthest, i + nums[i])
        if farthest >= len(nums) - 1:
            return True  # Can reach the end
    return True

# Test
print(can_jump([2, 3, 1, 1, 4]))  # True (e.g., index 0 -> 1 -> 4)
print(can_jump([3, 2, 1, 0, 4]))  # False (stuck at index 3)
print(can_jump([0]))  # True (already at end)
print(can_jump([2, 0, 0]))  # True (jump from index 0 straight to index 2)
```
Jump Game II: Minimum Jumps
```python
def jump(nums):
    """
    Find MINIMUM number of jumps to reach the end.
    BFS-like greedy: treat each "level" as a range of indices
    reachable with the current number of jumps.
    At each level, find the farthest we can reach in one more jump.
    When we exhaust the current level, increment jump count.
    """
    if len(nums) <= 1:
        return 0
    jumps = 0
    current_end = 0  # End of current jump's reach
    farthest = 0     # Farthest we can reach with one more jump
    for i in range(len(nums) - 1):
        farthest = max(farthest, i + nums[i])
        if i == current_end:
            jumps += 1
            current_end = farthest
            if current_end >= len(nums) - 1:
                break
    return jumps

# Test
print(jump([2, 3, 1, 1, 4]))  # 2 (jump to index 1, then jump to end)
print(jump([2, 3, 0, 1, 4]))  # 2
```
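As a sanity check, the greedy answer can be compared against a straightforward O(n²) dynamic program (a helper of our own) that relaxes every index reachable from each position:

```python
def min_jumps_dp(nums):
    # O(n^2) DP cross-check: dp[i] = minimum jumps needed to reach index i
    INF = float("inf")
    dp = [INF] * len(nums)
    dp[0] = 0
    for i, max_step in enumerate(nums):
        if dp[i] == INF:
            continue  # index i is unreachable
        for step in range(1, max_step + 1):
            if i + step < len(nums):
                dp[i + step] = min(dp[i + step], dp[i] + 1)
    return dp[-1]

print(min_jumps_dp([2, 3, 1, 1, 4]))  # 2, matches jump()
print(min_jumps_dp([2, 3, 0, 1, 4]))  # 2
```

The DP makes the greedy's "levels" explicit: dp values are non-decreasing along the array, and each greedy level corresponds to one dp value.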
AI/ML context: The jump game's greedy approach mirrors beam search in sequence generation, where you greedily extend the most promising candidates. It also models layer-skipping in neural architecture search, where you determine which layers to skip for efficient inference.
Problem 4: Minimum Platforms
Given the arrival and departure times of all trains that reach a station, find the minimum number of platforms required so that no train has to wait.
Brute Force: O(n²) Time
```python
def min_platforms_brute(arrivals, departures):
    """For each train, count overlapping trains."""
    n = len(arrivals)
    max_platforms = 1
    for i in range(n):
        count = 1
        for j in range(n):
            if i != j:
                # Train j overlaps with train i
                if arrivals[j] <= departures[i] and departures[j] >= arrivals[i]:
                    count += 1
        max_platforms = max(max_platforms, count)
    return max_platforms
```
Optimal (Sort + Sweep): O(n log n) Time, O(1) Space
```python
def min_platforms(arrivals, departures):
    """
    Sort arrivals and departures separately.
    Use two pointers to simulate a timeline sweep.
    - If next event is an arrival: need one more platform
    - If next event is a departure: free one platform
    - Track the maximum platforms needed at any point
    Why this works: We only care about how many trains are
    present simultaneously, not which specific trains.
    Note: the <= below treats a train arriving at the exact
    minute another departs as overlapping, so it needs its
    own platform.
    """
    arrivals.sort()
    departures.sort()
    platforms = 0
    max_platforms = 0
    i, j = 0, 0
    while i < len(arrivals):
        if arrivals[i] <= departures[j]:
            # Train arrives before the next departure
            platforms += 1
            max_platforms = max(max_platforms, platforms)
            i += 1
        else:
            # A train departs, freeing a platform
            platforms -= 1
            j += 1
    return max_platforms

# Test
arrivals = [900, 940, 950, 1100, 1500, 1800]
departures = [910, 1200, 1120, 1130, 1900, 2000]
print(min_platforms(arrivals, departures))  # 3

arrivals2 = [200, 210, 300, 320, 350, 500]
departures2 = [230, 340, 320, 430, 400, 520]
print(min_platforms(arrivals2, departures2))  # 3 (arrival at 320 coincides with a departure)
```
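The same sweep can be phrased as a single sorted list of (time, type) events, which makes the tie-breaking rule explicit. This variant is a sketch of our own; an arrival at time t is deliberately counted before a departure at the same t, matching the <= in the two-pointer version:

```python
def min_platforms_events(arrivals, departures):
    # Type 0 = arrival, 1 = departure; tuples sort by time, then type,
    # so an arrival at time t is processed before a departure at time t.
    events = sorted([(t, 0) for t in arrivals] + [(t, 1) for t in departures])
    platforms = max_platforms = 0
    for _, kind in events:
        platforms += 1 if kind == 0 else -1
        max_platforms = max(max_platforms, platforms)
    return max_platforms

print(min_platforms_events([900, 940, 950, 1100, 1500, 1800],
                           [910, 1200, 1120, 1130, 1900, 2000]))  # 3
```

Flipping the tuple's type values (departures first) would model the other convention, where a platform can be reused in the same minute it is vacated.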
AI/ML context: This problem directly models GPU allocation in training clusters. Each “train” is a training job with a start and end time, and each “platform” is a GPU. The minimum platforms needed tells you how many GPUs your cluster requires to avoid job queuing.
Problem 5: Activity Selection (Maximum Non-overlapping Activities)
Given n activities with start and end times, select the maximum number of non-overlapping activities that a single person can perform.
Optimal (Greedy - Earliest Finish): O(n log n) Time, O(1) Space
```python
def activity_selection(activities):
    """
    Classic greedy algorithm: select the activity that finishes earliest.
    Proof of optimality: By always choosing the earliest-finishing
    activity, we leave the maximum remaining time for future activities.
    Any other choice cannot do better.
    Steps:
    1. Sort activities by end time
    2. Select first activity
    3. For each remaining activity:
       - If it starts after the last selected activity ends, select it
    """
    if not activities:
        return []
    # Sort by end time
    activities.sort(key=lambda x: x[1])
    selected = [activities[0]]
    last_end = activities[0][1]
    for i in range(1, len(activities)):
        start, end = activities[i]
        if start >= last_end:
            selected.append(activities[i])
            last_end = end
    return selected

# Test
activities = [(1, 4), (3, 5), (0, 6), (5, 7), (3, 9), (5, 9),
              (6, 10), (8, 11), (8, 12), (2, 14), (12, 16)]
result = activity_selection(activities)
print(f"Maximum activities: {len(result)}")
# Maximum activities: 4
for start, end in result:
    print(f"  Activity: [{start}, {end})")
# Activity: [1, 4)
# Activity: [5, 7)
# Activity: [8, 11)
# Activity: [12, 16)
```
Variant: Weighted Activity Selection (DP approach for comparison)
```python
import bisect

def weighted_activity_selection(activities):
    """
    When activities have different weights (profits), greedy
    does NOT work. We need dynamic programming.
    activities: list of (start, end, weight)
    This shows when greedy fails and DP is needed.
    Time: O(n log n), Space: O(n)
    """
    # Sort by end time
    activities.sort(key=lambda x: x[1])
    n = len(activities)
    # dp[i] = max weight using activities[0..i]
    dp = [0] * n
    dp[0] = activities[0][2]  # weight of first activity
    # End times for binary search (sorted, since we sorted by end)
    ends = [a[1] for a in activities]
    for i in range(1, n):
        # Option 1: Don't include activity i
        exclude = dp[i - 1]
        # Option 2: Include activity i
        include = activities[i][2]
        # Latest earlier activity whose end <= this start (intervals
        # are half-open, matching the greedy's start >= last_end test)
        j = bisect.bisect_right(ends, activities[i][0], 0, i) - 1
        if j >= 0:
            include += dp[j]
        dp[i] = max(exclude, include)
    return dp[n - 1]

# Test
weighted = [(1, 3, 5), (2, 5, 6), (4, 6, 5), (6, 7, 4), (5, 8, 11), (7, 9, 2)]
print(f"Maximum weight: {weighted_activity_selection(weighted)}")
# Maximum weight: 17
```
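Because weighted selection is easy to get subtly wrong (the binary-search boundary in particular), a tiny exhaustive checker is useful for validation on small inputs. This exponential helper is our own sketch, for testing only:

```python
from itertools import combinations

def weighted_brute(activities):
    # Try every subset; intervals are half-open, so an activity may
    # begin exactly when the previous one ends.
    best = 0
    for size in range(1, len(activities) + 1):
        for combo in combinations(activities, size):
            ordered = sorted(combo, key=lambda a: a[1])
            if all(ordered[k][0] >= ordered[k - 1][1] for k in range(1, size)):
                best = max(best, sum(a[2] for a in ordered))
    return best

weighted = [(1, 3, 5), (2, 5, 6), (4, 6, 5), (6, 7, 4), (5, 8, 11), (7, 9, 2)]
print(weighted_brute(weighted))  # 17, agreeing with the DP
```

Here the best subset is {(2, 5, 6), (5, 8, 11)} with weight 17, even though the greedy earliest-finish rule would start with (1, 3, 5) instead.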
AI/ML context: Activity selection models experiment scheduling in ML research. Each experiment has a time window and expected value (information gain). Weighted activity selection helps decide which experiments to run when resources are limited, maximizing total scientific insight.
Summary: Complexity Comparison
| Problem | Brute Force | Optimal | Greedy Strategy |
|---|---|---|---|
| Non-overlapping Intervals | O(2^n) | O(n log n) | Sort by end time, keep earliest-ending |
| Task Scheduler | O(m · n) (simulation) | O(m), m = number of tasks | Fill slots around most frequent task |
| Jump Game | O(2^n) | O(n) | Track farthest reachable position |
| Minimum Platforms | O(n²) | O(n log n) | Sort + sweep line with two pointers |
| Activity Selection | O(2^n) | O(n log n) | Earliest finish time first |
Lilly Tech Systems