Intermediate
Merge K Patterns
The K-way merge pattern uses a min-heap of size K to efficiently merge multiple sorted sequences. Instead of merging two lists at a time (O(nK)), the heap approach processes all K sources simultaneously in O(n log K). This pattern is used in distributed sorting (merge-sort of sharded data), external sorting, and merging ranked results from multiple ML models.
Problem 1: Merge K Sorted Lists
Problem: You are given an array of
Example:
k linked lists, each sorted in ascending order. Merge all the linked lists into one sorted linked list and return it.Example:
lists = [[1,4,5],[1,3,4],[2,6]] → Output: [1,1,2,3,4,4,5,6]import heapq
from typing import Optional
class ListNode:
def __init__(self, val=0, next=None):
self.val = val
self.next = next
def mergeKLists(lists: list[Optional[ListNode]]) -> Optional[ListNode]:
"""Merge k sorted linked lists using a min-heap.
Strategy:
1. Push the head of each list into a min-heap
2. Pop the smallest node, add to result
3. Push the next node from that list (if exists)
4. Repeat until heap is empty
Time: O(N log K) where N = total nodes, K = number of lists
Space: O(K) for the heap
ML Application: Merging ranked results from K different
recommendation models (ensemble ranking).
"""
heap = []
counter = 0 # tiebreaker for equal values
# Initialize heap with head of each list
for lst in lists:
if lst:
heapq.heappush(heap, (lst.val, counter, lst))
counter += 1
dummy = ListNode(0)
current = dummy
while heap:
val, _, node = heapq.heappop(heap)
current.next = node
current = current.next
if node.next:
heapq.heappush(heap, (node.next.val, counter, node.next))
counter += 1
return dummy.next
# Helper to test with arrays
def to_linked_list(arr):
dummy = ListNode(0)
curr = dummy
for val in arr:
curr.next = ListNode(val)
curr = curr.next
return dummy.next
def to_array(head):
result = []
while head:
result.append(head.val)
head = head.next
return result
# Test
lists = [to_linked_list([1,4,5]), to_linked_list([1,3,4]), to_linked_list([2,6])]
print(to_array(mergeKLists(lists))) # [1, 1, 2, 3, 4, 4, 5, 6]
Problem 2: Smallest Range Covering Elements from K Lists
Problem: You have
Example:
k sorted lists of integers. Find the smallest range [a, b] that includes at least one number from each of the k lists.Example:
nums = [[4,10,15,24,26],[0,9,12,20],[5,18,22,30]] → Output: [20, 24]import heapq
def smallestRange(nums: list[list[int]]) -> list[int]:
"""Find smallest range covering at least one element from each list.
Strategy:
1. Push first element from each list into a min-heap
2. Track the current maximum across all heap elements
3. The range is [heap_min, current_max]
4. Pop min, push its next element, update max
5. Track the smallest range seen
Time: O(N log K) where N = total elements
Space: O(K)
ML Application: Finding the tightest confidence interval
that covers predictions from K different models.
"""
heap = []
current_max = float('-inf')
# Initialize: push first element from each list
for i, lst in enumerate(nums):
heapq.heappush(heap, (lst[0], i, 0))
current_max = max(current_max, lst[0])
best_range = [float('-inf'), float('inf')]
while heap:
current_min, list_idx, elem_idx = heapq.heappop(heap)
# Update best range if current range is smaller
if current_max - current_min < best_range[1] - best_range[0]:
best_range = [current_min, current_max]
# If we've exhausted one list, we can't cover all K lists
if elem_idx + 1 >= len(nums[list_idx]):
break
# Push next element from the same list
next_val = nums[list_idx][elem_idx + 1]
heapq.heappush(heap, (next_val, list_idx, elem_idx + 1))
current_max = max(current_max, next_val)
return best_range
# Test
nums = [[4,10,15,24,26], [0,9,12,20], [5,18,22,30]]
print(smallestRange(nums)) # [20, 24]
Problem 3: Kth Smallest Element in a Sorted Matrix
Problem: Given an
Example:
n x n matrix where each row and column is sorted in ascending order, return the kth smallest element.Example:
matrix = [[1,5,9],[10,11,13],[12,13,15]], k = 8 → Output: 13import heapq
def kthSmallest(matrix: list[list[int]], k: int) -> int:
"""Find kth smallest element in a row/col sorted matrix.
Strategy: Treat each row as a sorted list and do K-way merge.
1. Push first element of each row into min-heap
2. Pop the smallest k times
3. Each time we pop from row i, push the next element in row i
Time: O(K log N) where N = number of rows
Space: O(N)
ML Application: In collaborative filtering, the user-item
score matrix is often partially sorted; finding top-K items
for a user is a variant of this problem.
"""
n = len(matrix)
heap = []
# Push first element of each row (up to k rows)
for i in range(min(n, k)):
heapq.heappush(heap, (matrix[i][0], i, 0))
result = 0
for _ in range(k):
result, row, col = heapq.heappop(heap)
if col + 1 < n:
heapq.heappush(heap, (matrix[row][col + 1], row, col + 1))
return result
# Test
matrix = [[1,5,9],[10,11,13],[12,13,15]]
print(kthSmallest(matrix, 8)) # 13
print(kthSmallest(matrix, 1)) # 1
Problem 4: Find K Pairs with Smallest Sums
Problem: Given two sorted arrays
Example:
nums1 and nums2, and an integer k, return the k pairs (u, v) with the smallest sums where u is from nums1 and v is from nums2.Example:
nums1 = [1,7,11], nums2 = [2,4,6], k = 3 → Output: [[1,2],[1,4],[1,6]]import heapq
def kSmallestPairs(nums1: list[int], nums2: list[int], k: int) -> list[list[int]]:
"""Find k pairs with smallest sums from two sorted arrays.
Strategy: Think of it as K-way merge of sorted sequences:
- Sequence 0: (nums1[0]+nums2[0]), (nums1[0]+nums2[1]), ...
- Sequence 1: (nums1[1]+nums2[0]), (nums1[1]+nums2[1]), ...
Push the first pair from each "row" and expand lazily.
Time: O(K log K) - at most K pushes and pops
Space: O(K)
ML Application: Finding the K best feature-weight combinations
when combining features from two independently scored sets.
"""
if not nums1 or not nums2:
return []
heap = []
result = []
visited = set()
# Start with the smallest possible sum
heapq.heappush(heap, (nums1[0] + nums2[0], 0, 0))
visited.add((0, 0))
while heap and len(result) < k:
total, i, j = heapq.heappop(heap)
result.append([nums1[i], nums2[j]])
# Expand: try (i+1, j) and (i, j+1)
if i + 1 < len(nums1) and (i + 1, j) not in visited:
heapq.heappush(heap, (nums1[i + 1] + nums2[j], i + 1, j))
visited.add((i + 1, j))
if j + 1 < len(nums2) and (i, j + 1) not in visited:
heapq.heappush(heap, (nums1[i] + nums2[j + 1], i, j + 1))
visited.add((i, j + 1))
return result
# Test
print(kSmallestPairs([1,7,11], [2,4,6], 3)) # [[1,2],[1,4],[1,6]]
print(kSmallestPairs([1,1,2], [1,2,3], 2)) # [[1,1],[1,1]]
Merge-K Pattern Summary: The core template is always the same: (1) initialize heap with one element from each source, (2) pop the smallest, (3) push the next element from the same source. This gives O(N log K) for N total elements across K sources. The key insight is that the heap size stays at K, not N.
Lilly Tech Systems