patternswiftMinor
Quickselect algorithm in Swift
Viewed 0 times
swiftalgorithmquickselect
Problem
I recently answered a question
on Stack Overflow about finding the k-largest element in an array, and
present my implementation of the Quickselect algorithm in Swift for review.
This is essentially the iterative version described in
Wikipedia: Quickselect, only that the
paritioning does not move the pivot element to the front of the second partition.
That saves some swap operations but requires an additional check when updating the lower bound.
The language is Swift 3, compiled with Xcode 8.1.
Examples:
Feedback on all aspects of the code is welcome, such as (but not limited to):
in consideration of the
Swift API Design Guidelines?
an infinite loop can occur, e.g. for an array with all elements equal.
Is there a better solution?
Remark: To make this code compile with Swift 4 (or later), just replace the line
with
on Stack Overflow about finding the k-largest element in an array, and
present my implementation of the Quickselect algorithm in Swift for review.
This is essentially the iterative version described in
Wikipedia: Quickselect, only that the
paritioning does not move the pivot element to the front of the second partition.
That saves some swap operations but requires an additional check when updating the lower bound.
The language is Swift 3, compiled with Xcode 8.1.
extension Array where Element: Comparable {
public func kSmallest(_ k: Int) -> Element {
precondition(1 1 {
// Choose random pivot element:
let pivotElement = a[low + Int(arc4random_uniform(UInt32(high - low)))]
// Partition elements such that:
// a[i] = pivotElement for pivotIndex Element {
return kSmallest(count + 1 - k)
}
}Examples:
let a = [ 9, 7, 6, 3, 4, 2, 5, 1, 8 ]
for k in 1 ... a.count {
print(a.kSmallest(k))
}
// 1 2 3 4 5 6 7 8 9
let b = [ "b", "a", "c", "h" ]
print(b.kLargest(2))
// "c"Feedback on all aspects of the code is welcome, such as (but not limited to):
- Possible improvements (speed, clarity, swiftyness, ...).
- Naming. In particular: what would be a better name for
kSmallest(_ k: Int)
in consideration of the
Swift API Design Guidelines?
- The check
if a[low] == pivotElementlooks artificial, but without that
an infinite loop can occur, e.g. for an array with all elements equal.
Is there a better solution?
Remark: To make this code compile with Swift 4 (or later), just replace the line
swap(&a[pivotIndex], &a[i])with
a.swapAt(pivotIndex, i)Solution
Let's start by updating this line to Swift 4.2 :
In my tests,
Efficiency
1st improvement
There is a small improvement to this algorithm, but it still gives a performance gain by rearranging the conditions from the most probable to the least, in order to take advantage of shortcut execution:
The first two conditions are equiprobable. The least probable case is left for last.
Benchmarks
The benchmarking code is the following:
It prints the average time for looking up one kth smallest element.
In the worst case, in my tests,
2nd improvement
The following improvement concerns arrays with duplicates, and avoids unnecessary loops:
Instead of hopping by one index alone:
Benchmarks
The following code was used:
Here are the results:
In an array with a high number of duplicates, the original code is now ~
3rd improvement (a fix?)
There are unnecessary loops where the the random index is that of a pivot element which is already in its rightful place/order. These elements aren't swapped by the code, since they are already well-placed. These elements are the ones that fall in the case of
public func nthSmallest(_ n: Int) -> Element {
precondition(count > 0, "No elements to choose from")
let pivotElement = a[Int.random(in: low..<high)]In my tests,
Int.random(in:) is way faster than Int(arc4random_uniform), and the comparisons won't take that into consideration.Efficiency
1st improvement
There is a small improvement to this algorithm, but it still gives a performance gain by rearranging the conditions from the most probable to the least, in order to take advantage of shortcut execution:
if k pivotIndex + 1 {
// k-smallest element is in the second partition
low = pivotIndex
if a[low] == pivotElement {
low += 1
}
} else {
// Pivot element is the k-smallest:
return pivotElement
}The first two conditions are equiprobable. The least probable case is left for last.
Benchmarks
The benchmarking code is the following:
let a = Array(1...10_000).shuffled()
var timings: [Double] = []
for k in 1 ... a.count {
let start = mach_absolute_time()
_ = a.kSmallest(k)
let end = mach_absolute_time()
timings.append(Double(end - start)/Double(1e3))
}
let average: Double = timings.reduce(0, +) / Double(timings.count)
print(average, "us")
var timings2: [Double] = []
for k in 1 ... a.count {
let start = mach_absolute_time()
_ = a.kSmallest2(k)
let end = mach_absolute_time()
timings2.append(Double(end - start)/Double(1e3))
}
let average2: Double = timings2.reduce(0, +) / Double(timings2.count)
print(average2, "us")It prints the average time for looking up one kth smallest element.
kSmallest is the original, kSmallest2 is the new one. They both operate on the same array a to ensure fairness. kSmallest2 is up to 7μs faster per lookup. The fluctuation is due to the randomness of the arrangement of the elements of the array. Which translates into up to ~70ms execution time gain for a 10.000-element array:kSmallest 1.215636265 s (total time)
kSmallest2 1.138085315 s (total time)In the worst case, in my tests,
kSmallest2 may rarely be 2μs slower per lookup, and it is to be blamed on the randomness of choosing a pivot. Comparisons should probabilistically favor the second version.2nd improvement
The following improvement concerns arrays with duplicates, and avoids unnecessary loops:
while a[low] == pivotElement, k - low > 1 {
low += 1
}Instead of hopping by one index alone:
if a[low] == pivotElement {
low += 1
}Benchmarks
The following code was used:
//As suggested by Tim Vermeulen
let a = (0..<100).flatMap { Array(repeating: $0, count: Int.random(in: 10..<30)) }
.shuffled()
var timings1: [Double] = []
for k in 1 ... a.count {
let start = mach_absolute_time()
_ = a.kSmallest(k)
let end = mach_absolute_time()
timings1.append(Double(end - start)/Double(1e6))
}
let average1: Double = timings1.reduce(0, +) / Double(timings1.count)
print("kSmallest", average1, "ms")
var timings2: [Double] = []
for k in 1 ... a.count {
let start = mach_absolute_time()
_ = a.kSmallest2(k)
let end = mach_absolute_time()
timings2.append(Double(end - start)/Double(1e6))
}
let average2: Double = timings2.reduce(0, +) / Double(timings2.count)
print("kSmallest2", average2, "ms")
var timings3: [Double] = []
for k in 1 ... a.count {
let start = mach_absolute_time()
_ = a.kSmallest3(k)
let end = mach_absolute_time()
timings3.append(Double(end - start)/Double(1e6))
}
let average3: Double = timings3.reduce(0, +) / Double(timings3.count)
print("kSmallest3", average3, "ms")kSmallest3 has both the 1st and 2nd improvements.Here are the results:
kSmallest 0.0272 ms
kSmallest2 0.0267 ms
kSmallest3 0.0236 msIn an array with a high number of duplicates, the original code is now ~
13% faster by implementing both improvements. That percentage will grow with the richness in duplicates, and a higher array count. If the array has unique elements, kSmallest2 is naturally the fastest since it'll be avoiding unnecessary checks.3rd improvement (a fix?)
There are unnecessary loops where the the random index is that of a pivot element which is already in its rightful place/order. These elements aren't swapped by the code, since they are already well-placed. These elements are the ones that fall in the case of
k > pivotIndex + 1 and the low index is equal to pivotIndex. An endless loop may occur if Int.random(in: low..
nth instead of n denotes ordinality
- Since the comparison predicate isn't provided,
nthSmallest(_ n: Int) would be preferable to nthElement(_ n: Int) since it means that we'll be comparing elements in an ascending order.
Readability/Alternative approach
If readability and conciseness are paramount, here is an alternative that uses the partition(by:) method applied to the array slice mutableArrayCopy[low..<high] :
``public func nthSmallest(_ n: Int) -> Element {
precondition(count > 0, "No elements to choose from")
Code Snippets
let pivotElement = a[Int.random(in: low..<high)]if k <= pivotIndex {
// k-smallest element is in the first partition:
high = pivotIndex
} else if k > pivotIndex + 1 {
// k-smallest element is in the second partition
low = pivotIndex
if a[low] == pivotElement {
low += 1
}
} else {
// Pivot element is the k-smallest:
return pivotElement
}let a = Array(1...10_000).shuffled()
var timings: [Double] = []
for k in 1 ... a.count {
let start = mach_absolute_time()
_ = a.kSmallest(k)
let end = mach_absolute_time()
timings.append(Double(end - start)/Double(1e3))
}
let average: Double = timings.reduce(0, +) / Double(timings.count)
print(average, "us")
var timings2: [Double] = []
for k in 1 ... a.count {
let start = mach_absolute_time()
_ = a.kSmallest2(k)
let end = mach_absolute_time()
timings2.append(Double(end - start)/Double(1e3))
}
let average2: Double = timings2.reduce(0, +) / Double(timings2.count)
print(average2, "us")kSmallest 1.215636265 s (total time)
kSmallest2 1.138085315 s (total time)while a[low] == pivotElement, k - low > 1 {
low += 1
}Context
StackExchange Code Review Q#145877, answer score: 3
Revisions (0)
No revisions yet.