patternModerate
Heap - Give an $O(n \lg k)$ time algorithm to merge $k$ sorted lists into one sorted list
Viewed 0 times
mergeintogivetimeheapalgorithmonelistssortedlist
Problem
Most probably, this question is asked before. It's from CLRS (2nd Ed) problem 6.5-8 --
Give an $O(n \lg k)$ time algorithm to merge $k$ sorted lists into one sorted list, where $n$ is the total number of elements in all the input lists. (Hint: Use a min-heap for $k$-way merging.)
As there are $k$ sorted lists and total of $n$ values, let us assume each list contains $\frac{n}{k}$ numbers, moreover each of the lists are sorted in strictly ascending order, and the results will also be stored in the ascending order.
My pseudo-code looks like this --
My overall complexity becomes $O(k) + O(k) + O(n(k + 2 \lg k)) \approx O(nk+n \lg k) \approx O(nk)$. I could not find any way to avoid the $O(k)$ loop inside the $O(n)$ loop to find the next minimum element from k lists. Is there any other way around? How to get an $O(n \lg k)$ algorithm?
Give an $O(n \lg k)$ time algorithm to merge $k$ sorted lists into one sorted list, where $n$ is the total number of elements in all the input lists. (Hint: Use a min-heap for $k$-way merging.)
As there are $k$ sorted lists and total of $n$ values, let us assume each list contains $\frac{n}{k}$ numbers, moreover each of the lists are sorted in strictly ascending order, and the results will also be stored in the ascending order.
My pseudo-code looks like this --
list[k] ; k sorted lists
heap[k] ; an auxiliary array to hold the min-heap
result[n] ; array to store the sorted list
for i := 1 to k ; O(k)
do
heap[i] := GET-MIN(list[i]) ; pick the first element
; and keeps track of the current index - O(1)
done
BUILD-MIN-HEAP(heap) ; build the min-heap - O(k)
for i := 1 to n
do
array[i] := EXTRACT-MIN(heap) ; store the min - O(logk)
nextMin := GET-MIN(list[1]) ; get the next element from the list 1 - O(1)
; find the minimum value from the top of k lists - O(k)
for j := 2 to k
do
if GET-MIN(list[j]) < nextMin
nextMin := GET-MIN(list[j])
done
; insert the next minimum into the heap - O(logk)
MIN-HEAP-INSERT(heap, nextMin)
doneMy overall complexity becomes $O(k) + O(k) + O(n(k + 2 \lg k)) \approx O(nk+n \lg k) \approx O(nk)$. I could not find any way to avoid the $O(k)$ loop inside the $O(n)$ loop to find the next minimum element from k lists. Is there any other way around? How to get an $O(n \lg k)$ algorithm?
Solution
The purpose of the heap is to give you the minimum, so I'm not sure what the purpose of this for-loop is -
My take on the pseudo-code:
The total time complexity is thus $O(k \log k + n 2 \log k) = O(n \log k)$
You can also, instead of
Example:
(using value rather than index and list index and heap represented as a sorted array for clarity)
for j := 2 to k.My take on the pseudo-code:
lists[k][?] // input lists
c = 0 // index in result
result[n] // output
heap[k] // stores index and applicable list and uses list value for comparison
// if i is the index and k is the list
// it has functions - insert(i, k) and deleteMin() which returns i,k
// the reason we use the index and the list, rather than just the value
// is so that we can get the successor of any value
// populate the initial heap
for i = 1:k // runs O(k) times
heap.insert(0, k) // O(log k)
// keep doing this - delete the minimum, insert the next value from that list into the heap
while !heap.empty() // runs O(n) times
i,k = heap.deleteMin(); // O(log k)
result[c++] = lists[k][i]
i++
if (i < lists[k].length) // insert only if not end-of-list
heap.insert(i, k) // O(log k)The total time complexity is thus $O(k \log k + n 2 \log k) = O(n \log k)$
You can also, instead of
deleteMin and insert, have a getMin ($O(1)$) and an incrementIndex ($O(\log k)$), which will reduce the constant factor, but not the complexity.Example:
(using value rather than index and list index and heap represented as a sorted array for clarity)
Input: [1, 10, 15], [4, 5, 6], [7, 8, 9]
Initial heap: [1, 4, 7]
Delete 1, insert 10
Result: [1]
Heap: [4, 7, 10]
Delete 4, insert 5
Result: [1, 4]
Heap: [5, 7, 10]
Delete 5, insert 6
Result: [1, 4, 5]
Heap: [6, 7, 10]
Delete 6, insert nothing
Result: [1, 4, 5, 6]
Heap: [7, 10]
Delete 7, insert 8
Result: [1, 4, 5, 6, 7]
Heap: [8, 10]
Delete 8, insert 9
Result: [1, 4, 5, 6, 7, 8]
Heap: [9, 10]
Delete 9, insert nothing
Result: [1, 4, 5, 6, 7, 8, 9]
Heap: [10]
Delete 10, insert 15
Result: [1, 4, 5, 6, 7, 8, 9, 10]
Heap: [15]
Delete 15, insert nothing
Result: [1, 4, 5, 6, 7, 8, 9, 10, 15]
Heap: []
DoneCode Snippets
lists[k][?] // input lists
c = 0 // index in result
result[n] // output
heap[k] // stores index and applicable list and uses list value for comparison
// if i is the index and k is the list
// it has functions - insert(i, k) and deleteMin() which returns i,k
// the reason we use the index and the list, rather than just the value
// is so that we can get the successor of any value
// populate the initial heap
for i = 1:k // runs O(k) times
heap.insert(0, k) // O(log k)
// keep doing this - delete the minimum, insert the next value from that list into the heap
while !heap.empty() // runs O(n) times
i,k = heap.deleteMin(); // O(log k)
result[c++] = lists[k][i]
i++
if (i < lists[k].length) // insert only if not end-of-list
heap.insert(i, k) // O(log k)Input: [1, 10, 15], [4, 5, 6], [7, 8, 9]
Initial heap: [1, 4, 7]
Delete 1, insert 10
Result: [1]
Heap: [4, 7, 10]
Delete 4, insert 5
Result: [1, 4]
Heap: [5, 7, 10]
Delete 5, insert 6
Result: [1, 4, 5]
Heap: [6, 7, 10]
Delete 6, insert nothing
Result: [1, 4, 5, 6]
Heap: [7, 10]
Delete 7, insert 8
Result: [1, 4, 5, 6, 7]
Heap: [8, 10]
Delete 8, insert 9
Result: [1, 4, 5, 6, 7, 8]
Heap: [9, 10]
Delete 9, insert nothing
Result: [1, 4, 5, 6, 7, 8, 9]
Heap: [10]
Delete 10, insert 15
Result: [1, 4, 5, 6, 7, 8, 9, 10]
Heap: [15]
Delete 15, insert nothing
Result: [1, 4, 5, 6, 7, 8, 9, 10, 15]
Heap: []
DoneContext
StackExchange Computer Science Q#12853, answer score: 15
Revisions (0)
No revisions yet.