snippetpythonMinor
Number of Inversions found in Selection sort vs Exchange sort
Viewed 0 times
numberselectioninversionsexchangefoundsort
Problem
Exchange sort is similar to selection sort, just swaps "overly eagerly" instead of finding the minimum and doing only one swap. And the swaps affect how often the
Why is it so much more often? Does someone have a good explanation? I did expect some noticeable difference, but not this much.
What's the expected number of trues for each?. For exchange sort it looks quadratic, for selection sort maybe linearithmic?
Code (Try it online!):
This was inspired by another question, from which I also took the code.
if ... is true. For example for 8000 random numbers, in exchange sort it's true ~270 times more often:trues in trues in
n exchangeSort selectionSort ratio
----------------------------------------
1000 244828 5368 45.6
2000 1002555 12341 81.2
4000 4017151 27691 145.1
8000 16359262 60394 270.9Why is it so much more often? Does someone have a good explanation? I did expect some noticeable difference, but not this much.
What's the expected number of trues for each?. For exchange sort it looks quadratic, for selection sort maybe linearithmic?
Code (Try it online!):
def exchangeSort(nums):
trues = 0
for i in range(len(nums)-1):
for j in range(i + 1, len(nums)):
if nums[j] < nums[i]:
trues += 1
nums[j], nums[i] = nums[i], nums[j]
return trues
def selectionSort(nums):
trues = 0
for i in range(len(nums)-1):
idx_min = i
for j in range(i + 1, len(nums)):
if nums[j] < nums[idx_min]:
trues += 1
idx_min = j
nums[idx_min], nums[i] = nums[i], nums[idx_min]
return trues
from random import random
print(' trues in trues in')
print(' n exchangeSort selectionSort ratio')
print('----------------------------------------')
for n in 1000, 2000, 4000, 8000:
nums = [random() for _ in range(n)]
trues1 = exchangeSort(nums.copy())
trues2 = selectionSort(nums.copy())
print(f'{n} {trues1:10} {trues2:10} {trues1/trues2:10.1f}')This was inspired by another question, from which I also took the code.
Solution
Here's another perspective. Less mathematical, but I think also insightful. Using a fabulous sort visualizer (taken from this), here's random data and how the data looks midway through sorting with the two algorithms.
It visualizes sorting many arrays in parallel, each array is a column, index 0 is on top, and dark colors are "smaller" than bright colors. Try it yourself, select the sorting algorithms on the right, then click "Start".
For selection sort, we see what John L. said: The still-to-be-sorted part
Look at the exchange sort visualization instead. There we see that the still-to-be-sorted bottom half already looks somewhat (reverse-)sorted. A lot of bright colors, i.e., large values, near index
Now imagine that for example the block of numbers containing values from 31 to infinity actually contains the number 31. And that the other two later blocks similarly contain the number 21 and 11, respectively. Before the swaps, we see the order 30-31-20-21-10-11. So going up-down-up-down-up. But after the swaps, we see the order 10-31-30-21-20-11. So going up-down-down-down-down-down. The part after the 10 is all going down. We've taken the subsequence 30-20 and shifted it to the right so that it "falls more in line" (order-wise) with the rest, so that the remainder is now more sorted (descendingly) than before. And these longer descending subsequences are then found and handled in later rounds (for larger
So to summarize: Selection sort keeps the remainder unsorted, which is good. While exchange sort shoots itself in the foot by partially sorting the remainder in a way that's detrimental for it (causes more inversions / more work).
It visualizes sorting many arrays in parallel, each array is a column, index 0 is on top, and dark colors are "smaller" than bright colors. Try it yourself, select the sorting algorithms on the right, then click "Start".
For selection sort, we see what John L. said: The still-to-be-sorted part
nums[i:n] looks completely random. And it is: It is at the very beginning, and what's done inside the outer loop keeps this property. If nums[i] is the smallest in nums[i:n], then we don't really swap and nums[i+1:n] remains as it is, i.e., remains completely random. If on the other hand nums[i] isn't the smallest, then its value plays no role whatsoever in determining idx_min. That index is entirely determined by wherever the smallest element is. And since nums[i] plays no role for this, its relationship with all other values in nums[i+1:n] (other than the smallest) can be anything. Hence swapping it it to idx_min makes nums[i+1:n] completely random as well.Look at the exchange sort visualization instead. There we see that the still-to-be-sorted bottom half already looks somewhat (reverse-)sorted. A lot of bright colors, i.e., large values, near index
i. And getting darker going down. Why is that? Consider what the inner loop does. It goes through the remaining nums[i:n], puts the smallest value at nums[i], and whenever it finds an even smaller value, it swaps nums[i] there. That finds a descending subsequence in nums[i:n] and "rotates" it to the right. Here's an example where the descending subsequence is 30, 20, 10 (and what the other values between/after might be):Now imagine that for example the block of numbers containing values from 31 to infinity actually contains the number 31. And that the other two later blocks similarly contain the number 21 and 11, respectively. Before the swaps, we see the order 30-31-20-21-10-11. So going up-down-up-down-up. But after the swaps, we see the order 10-31-30-21-20-11. So going up-down-down-down-down-down. The part after the 10 is all going down. We've taken the subsequence 30-20 and shifted it to the right so that it "falls more in line" (order-wise) with the rest, so that the remainder is now more sorted (descendingly) than before. And these longer descending subsequences are then found and handled in later rounds (for larger
i), causing more work there.So to summarize: Selection sort keeps the remainder unsorted, which is good. While exchange sort shoots itself in the foot by partially sorting the remainder in a way that's detrimental for it (causes more inversions / more work).
Context
StackExchange Computer Science Q#151297, answer score: 6
Revisions (0)
No revisions yet.