snippetpythonMinor
Mergesort vs Quicksort Timing in Python
Viewed 0 times
pythontimingmergesortquicksort
Problem
I have implemented Mergesort and Quicksort in Python and have noticed that Quicksort runs two times as fast as Mergesort. Is this a property of the algorithm or is Mergesort coded inefficiently? Are there any other suggestions?
(I am aware that the timing includes the creation of the random list.)
Quicksort:
Mergesort:
(I am aware that the timing includes the creation of the random list.)
Quicksort:
import time, random
def quicksort_benchmark(n=2**24):
def quicksort(array, low, high):
if low = pivot:
break
while True:
j -= 1
if array[j] = j:
return j
temp = array[i]
array[i] = array[j]
array[j] = temp
start = time.time()
# initialize array
array = [random.randint(0, n) for i in range(0,n)]
quicksort(array, 0, len(array)-1)
end = time.time()
elapsed = end - start
return(elapsed)Mergesort:
def mergesort_benchmark(n=2**24):
def mergesort(m):
if len(m) <= 1:
return m
left, right = [], []
for i in range(0, len(m)):
if i < len(m)//2:
left.append(m[i])
else:
right.append(m[i])
left = mergesort(left)
right = mergesort(right)
return(merge(left,right))
def merge(left,right):
result = []
i, j = 0, 0
while i < len(left) and j < len(right):
if left[i] <= right[j]:
result.append(left[i])
i += 1
else:
result.append(right[j])
j += 1
for i in range(i, len(left)):
result.append(left[i])
for i in range(j, len(right)):
result.append(right[i])
return result
start = time.time()
m = [random.randint(0,n) for i in range(0,n)]
m = mergesort(m)
end = time.time()
elapsed = end - start
return(elapsed)Solution
The tl;dr answer to your questions are: Timing differences are related to your coding, and not the algorithm itself. Mostly due to how you handle arrays in-place, or methods for copying.
With that out of the question, let us review the code and in particular the code smells:
-
Try to avoid
do this:
Note that the initial values has changed, and the condition has been flipped around. But the end result is the same, and the latter is easier to read and understand.
-
Swapping of variables – The technique you use for swapping the array elements, can be replaced by a neater statement in Python:
Behind the scenes, this will work exactly like your swapping, but it slightly easier on the eye, and avoids the unnecessary
-
Avoid multiple
-
Looping vs slicing – Python has some really nice slice operators which means that stuff like:
can be replaced with:
Note that I've also switched to
Speed implications
When trying to compare the various implementations I had to make some slight changes to original code, so here are the simple changes with the new names of the main sorting method:
The timing of these using timeit for
As can be seen the quicksort was improved quite a bit with even relatively small changes in your code, but better implementations does exist. For the mergesort I didn't look for alternative implementations, but the changes I made it ran in half the time, resulting in a faster time compared to your
With that out of the question, let us review the code and in particular the code smells:
- In general nicely written code – Let us start with the positive. Your code is mostly clean and easy to read. Good spacing, and reasonable names for variables and methods.
- Functions within functions – Hiding the actual sorting methods within the benchmark methods, removes the potential for using your methods outside of the actual benchmark methods. Not wise. I could understand hiding the
partition()method as that is intrinsic to thequicksort()implementation, and similar for themerge()to be included withinmergesort(). As that would still expose the main sorting functions.
- In-place vs new arrays – Your
quicksort()changes the array in-place, and doesn't use extra storage, whilst themergesort()creates a bunch of extra arrays whilst it is dividing and conquering. This gives thequicksort()and unfair advantage, and renders thequicksort()with a nasty sideeffect of changing the array. This could be a wanted effect in some cases, but mostly you do not want to change the array passed into the function.
-
Try to avoid
While True: – In general, a lot of these can be avoided by flipping and turning the statements a little. Especially when used to loop to a certain point, where you use break to exit the loop, it is a lot nicer to flip it around. Instead of this:i = low - 1
while True:
i += 1
if array[i] >= pivot
breakdo this:
i = low
while array[i] < pivot:
i += 1Note that the initial values has changed, and the condition has been flipped around. But the end result is the same, and the latter is easier to read and understand.
-
Swapping of variables – The technique you use for swapping the array elements, can be replaced by a neater statement in Python:
array[i], array[j] = array[j], array[i]Behind the scenes, this will work exactly like your swapping, but it slightly easier on the eye, and avoids the unnecessary
temp variable.-
Avoid multiple
len(m) on the same array – There is actually time to spare by not doing this operation multiple times when the array in question doesn't change.range()vsxrange()– In Python 2, there is a bit of a difference between these two. The first generates the list, whilst the latter is a generator giving each number as we move on. This has a memory and performance impact when the range is getting large, like for2**24...
-
Looping vs slicing – Python has some really nice slice operators which means that stuff like:
for i in range(j, len(left)):
result.append(left[i])can be replaced with:
result.extend(left[j:len(left)]Note that I've also switched to
extend, and not append, to proper handle the case if the slice is empty, and not actually add the slice to result.Speed implications
When trying to compare the various implementations I had to make some slight changes to original code, so here are the simple changes with the new names of the main sorting method:
- quicksort – Your original code but with a duplication of the array to be sorted, to even out the playing field, and giving all sorters the same array to sort.
- hquicksort – My implementation flipping the
while Truestatements around, and implementing the slightly simpler swap method. I also removed references toiandj, withlowandright. Code included at end.
- sort – An quicksort implementation as implemented in Quicksort with Python. A rather fast recursive, new array, no indices implementation.
- SOquicksort – The in-place solution of the same thread (linked above).
- mergesort – Your implementation of mergesort.
- hmergesort – Your implementation with replacing of the
len(m)with precalculated values, and usingxrange()instead ofrange(). Code not included.
- imergesort – Taking it even one step further, using slicing instead of the loop appending of the results. Code included at end.
- sorted – Using the Python default sorting methods of
sorted().
The timing of these using timeit for
n=2**14 with 10 testruns gave the following (when run using repl.it (Do note that the code given there are messy and more of a sandbox rather than a good example of coding :-)) result:quicksort : 0.645922899246
hquicksort : 0.563888788223
sort : 0.527070045471
SOquicksort: 0.455115795135
mergesort : 1.35985612869
hmergesort : 0.970368146896
imergesort : 0.714356899261
sorted : 0.0575048923492
As can be seen the quicksort was improved quite a bit with even relatively small changes in your code, but better implementations does exist. For the mergesort I didn't look for alternative implementations, but the changes I made it ran in half the time, resulting in a faster time compared to your
Code Snippets
i = low - 1
while True:
i += 1
if array[i] >= pivot
breaki = low
while array[i] < pivot:
i += 1array[i], array[j] = array[j], array[i]for i in range(j, len(left)):
result.append(left[i])result.extend(left[j:len(left)]Context
StackExchange Code Review Q#162121, answer score: 6
Revisions (0)
No revisions yet.