snippetpythonMinor
Classic merge sort
Viewed 0 times
sortclassicmerge
Problem
Here is my code for classic merge sort. My question is, each time I create a new list
Besides my above major question, any advice for code improvement, bugs, performance (algorithm time complexity perspective) is highly appreciated.
result to merge sub-parts, is there a way to save the space -- not creating a new list result?Besides my above major question, any advice for code improvement, bugs, performance (algorithm time complexity perspective) is highly appreciated.
def merge_sort(numbers, start, end):
if not numbers:
return None
if start == end:
return [numbers[start]]
mid = (start + end) // 2
sub_part1 = merge_sort(numbers, start, mid)
sub_part2 = merge_sort(numbers, mid+1, end)
result = []
i = 0
j = 0
while i < len(sub_part1) and j < len(sub_part2):
if sub_part1[i] <= sub_part2[j]:
result.append(sub_part1[i])
i += 1
else:
result.append(sub_part2[j])
j += 1
if i < len(sub_part1) and j == len(sub_part2):
result.extend(sub_part1[i:])
if j < len(sub_part2) and i == len(sub_part1):
result.extend(sub_part2[j:])
return result
if __name__ == "__main__":
print merge_sort([1,4,2,5,6,8,3,4,0], 0, 8)Solution
Step 1:
I would change the computation of
Although python won't give you the wrong answer if you use yours, you theoretically lose performance with lots of elements (>263).
Also, the code when you are doing the final part of the merge can be simplified to:
Which is just cleaner. This can be further improved to:
because the cost of the if statement is more than the cost of appending nothing.
Step 2:
One of the tricks most implementations of mergesort use is they call a slightly different version of themselves at the beginning who's signature is:
The advantage of this is that you can alternate which list is your numbers and which list stores the result to avoid allocation or appending to arrays. (because you merge onto an existing result array).
Finally, if you want a fast implementation that feels like cheating (why I love Python), you can use this version from wikibooks.
I would change the computation of
mid to:mid = start + (start - end) // 2Although python won't give you the wrong answer if you use yours, you theoretically lose performance with lots of elements (>263).
Also, the code when you are doing the final part of the merge can be simplified to:
if i < len(sub_part1):
result.extend(sub_part1[i:])
else:
result.extend(sub_part2[j:])Which is just cleaner. This can be further improved to:
result.extend(sub_part1[i:])
result.extend(sub_part2[j:])because the cost of the if statement is more than the cost of appending nothing.
Step 2:
One of the tricks most implementations of mergesort use is they call a slightly different version of themselves at the beginning who's signature is:
def merge_sort(numbers, start, end, result):
...The advantage of this is that you can alternate which list is your numbers and which list stores the result to avoid allocation or appending to arrays. (because you merge onto an existing result array).
Finally, if you want a fast implementation that feels like cheating (why I love Python), you can use this version from wikibooks.
from heapq import merge
def mergesort(w):
if len(w)<2:
return w
else:
mid=len(w)//2
return merge(mergesort(w[:mid]), mergesort(w[mid:]))Code Snippets
mid = start + (start - end) // 2if i < len(sub_part1):
result.extend(sub_part1[i:])
else:
result.extend(sub_part2[j:])result.extend(sub_part1[i:])
result.extend(sub_part2[j:])def merge_sort(numbers, start, end, result):
...from heapq import merge
def mergesort(w):
if len(w)<2:
return w
else:
mid=len(w)//2
return merge(mergesort(w[:mid]), mergesort(w[mid:]))Context
StackExchange Code Review Q#149597, answer score: 4
Revisions (0)
No revisions yet.