Intermediate

Merge K Patterns

The K-way merge pattern uses a min-heap of size K to efficiently merge multiple sorted sequences. Instead of merging two lists at a time (O(nK)), the heap approach processes all K sources simultaneously in O(n log K). This pattern is used in distributed sorting (merge-sort of sharded data), external sorting, and merging ranked results from multiple ML models.

Problem 1: Merge K Sorted Lists

📝
Problem: You are given an array of k linked lists, each sorted in ascending order. Merge all the linked lists into one sorted linked list and return it.

Example: lists = [[1,4,5],[1,3,4],[2,6]] → Output: [1,1,2,3,4,4,5,6]
import heapq
from typing import Optional

class ListNode:
    def __init__(self, val=0, next=None):
        self.val = val
        self.next = next

def mergeKLists(lists: list[Optional[ListNode]]) -> Optional[ListNode]:
    """Merge k sorted linked lists using a min-heap.

    Strategy:
    1. Push the head of each list into a min-heap
    2. Pop the smallest node, add to result
    3. Push the next node from that list (if exists)
    4. Repeat until heap is empty

    Time: O(N log K) where N = total nodes, K = number of lists
    Space: O(K) for the heap

    ML Application: Merging ranked results from K different
    recommendation models (ensemble ranking).
    """
    heap = []
    counter = 0  # tiebreaker for equal values

    # Initialize heap with head of each list
    for lst in lists:
        if lst:
            heapq.heappush(heap, (lst.val, counter, lst))
            counter += 1

    dummy = ListNode(0)
    current = dummy

    while heap:
        val, _, node = heapq.heappop(heap)
        current.next = node
        current = current.next

        if node.next:
            heapq.heappush(heap, (node.next.val, counter, node.next))
            counter += 1

    return dummy.next

# Helper to test with arrays
def to_linked_list(arr):
    dummy = ListNode(0)
    curr = dummy
    for val in arr:
        curr.next = ListNode(val)
        curr = curr.next
    return dummy.next

def to_array(head):
    result = []
    while head:
        result.append(head.val)
        head = head.next
    return result

# Test
lists = [to_linked_list([1,4,5]), to_linked_list([1,3,4]), to_linked_list([2,6])]
print(to_array(mergeKLists(lists)))  # [1, 1, 2, 3, 4, 4, 5, 6]

Problem 2: Smallest Range Covering Elements from K Lists

📝
Problem: You have k sorted lists of integers. Find the smallest range [a, b] that includes at least one number from each of the k lists.

Example: nums = [[4,10,15,24,26],[0,9,12,20],[5,18,22,30]] → Output: [20, 24]
import heapq

def smallestRange(nums: list[list[int]]) -> list[int]:
    """Find smallest range covering at least one element from each list.

    Strategy:
    1. Push first element from each list into a min-heap
    2. Track the current maximum across all heap elements
    3. The range is [heap_min, current_max]
    4. Pop min, push its next element, update max
    5. Track the smallest range seen

    Time: O(N log K) where N = total elements
    Space: O(K)

    ML Application: Finding the tightest confidence interval
    that covers predictions from K different models.
    """
    heap = []
    current_max = float('-inf')

    # Initialize: push first element from each list
    for i, lst in enumerate(nums):
        heapq.heappush(heap, (lst[0], i, 0))
        current_max = max(current_max, lst[0])

    best_range = [float('-inf'), float('inf')]

    while heap:
        current_min, list_idx, elem_idx = heapq.heappop(heap)

        # Update best range if current range is smaller
        if current_max - current_min < best_range[1] - best_range[0]:
            best_range = [current_min, current_max]

        # If we've exhausted one list, we can't cover all K lists
        if elem_idx + 1 >= len(nums[list_idx]):
            break

        # Push next element from the same list
        next_val = nums[list_idx][elem_idx + 1]
        heapq.heappush(heap, (next_val, list_idx, elem_idx + 1))
        current_max = max(current_max, next_val)

    return best_range

# Test
nums = [[4,10,15,24,26], [0,9,12,20], [5,18,22,30]]
print(smallestRange(nums))  # [20, 24]

Problem 3: Kth Smallest Element in a Sorted Matrix

📝
Problem: Given an n x n matrix where each row and column is sorted in ascending order, return the kth smallest element.

Example: matrix = [[1,5,9],[10,11,13],[12,13,15]], k = 8 → Output: 13
import heapq

def kthSmallest(matrix: list[list[int]], k: int) -> int:
    """Find kth smallest element in a row/col sorted matrix.

    Strategy: Treat each row as a sorted list and do K-way merge.
    1. Push first element of each row into min-heap
    2. Pop the smallest k times
    3. Each time we pop from row i, push the next element in row i

    Time: O(K log N) where N = number of rows
    Space: O(N)

    ML Application: In collaborative filtering, the user-item
    score matrix is often partially sorted; finding top-K items
    for a user is a variant of this problem.
    """
    n = len(matrix)
    heap = []

    # Push first element of each row (up to k rows)
    for i in range(min(n, k)):
        heapq.heappush(heap, (matrix[i][0], i, 0))

    result = 0
    for _ in range(k):
        result, row, col = heapq.heappop(heap)
        if col + 1 < n:
            heapq.heappush(heap, (matrix[row][col + 1], row, col + 1))

    return result

# Test
matrix = [[1,5,9],[10,11,13],[12,13,15]]
print(kthSmallest(matrix, 8))  # 13
print(kthSmallest(matrix, 1))  # 1

Problem 4: Find K Pairs with Smallest Sums

📝
Problem: Given two sorted arrays nums1 and nums2, and an integer k, return the k pairs (u, v) with the smallest sums where u is from nums1 and v is from nums2.

Example: nums1 = [1,7,11], nums2 = [2,4,6], k = 3 → Output: [[1,2],[1,4],[1,6]]
import heapq

def kSmallestPairs(nums1: list[int], nums2: list[int], k: int) -> list[list[int]]:
    """Find k pairs with smallest sums from two sorted arrays.

    Strategy: Think of it as K-way merge of sorted sequences:
    - Sequence 0: (nums1[0]+nums2[0]), (nums1[0]+nums2[1]), ...
    - Sequence 1: (nums1[1]+nums2[0]), (nums1[1]+nums2[1]), ...
    Push the first pair from each "row" and expand lazily.

    Time: O(K log K) - at most K pushes and pops
    Space: O(K)

    ML Application: Finding the K best feature-weight combinations
    when combining features from two independently scored sets.
    """
    if not nums1 or not nums2:
        return []

    heap = []
    result = []
    visited = set()

    # Start with the smallest possible sum
    heapq.heappush(heap, (nums1[0] + nums2[0], 0, 0))
    visited.add((0, 0))

    while heap and len(result) < k:
        total, i, j = heapq.heappop(heap)
        result.append([nums1[i], nums2[j]])

        # Expand: try (i+1, j) and (i, j+1)
        if i + 1 < len(nums1) and (i + 1, j) not in visited:
            heapq.heappush(heap, (nums1[i + 1] + nums2[j], i + 1, j))
            visited.add((i + 1, j))
        if j + 1 < len(nums2) and (i, j + 1) not in visited:
            heapq.heappush(heap, (nums1[i] + nums2[j + 1], i, j + 1))
            visited.add((i, j + 1))

    return result

# Test
print(kSmallestPairs([1,7,11], [2,4,6], 3))  # [[1,2],[1,4],[1,6]]
print(kSmallestPairs([1,1,2], [1,2,3], 2))   # [[1,1],[1,1]]
Merge-K Pattern Summary: The core template is always the same: (1) initialize heap with one element from each source, (2) pop the smallest, (3) push the next element from the same source. This gives O(N log K) for N total elements across K sources. The key insight is that the heap size stays at K, not N.