patternpythonMinor
Binary search: first/last/random occurrence
Viewed 0 times
randomlastsearchfirstbinaryoccurrence
Problem
I've written some code to "binary search" a list, and return the first occurrence of the target:
Is there a more elegant/cleaner way to give this function the ability to find the first, last, or random occurrence of the target, without use of a string to represent the cases (as I've done)? Suggestions would be great.
def bsearch(a, left, right, target, first_or_last = 'first'):
"""
Use binary search to find the first, last, or random index
of target.
>>> a = [1,1,1,1,1,1]
>>> bsearch(a, 0, len(a)-1, 1, 'first')
0
"""
if left > right:
return -1
mid = (left + right)//2
val = a[mid]
if val == target:
first_occur = mid
if first_or_last == 'first'
later_occur = bsearch(a, left, mid - 1, target) # check if there's an earlier occurrence in the left sub-list
if later_occur != -1:
return later_occur
else:
return first_occur
elif first_or_last == 'last':
later_occur = bsearch(a, mid + 1, right, target)
if later_occur != -1:
return later_occur
else:
return first_occur
else:
return first_occur
if target val:
return bsearch(a, mid + 1, right, target)Is there a more elegant/cleaner way to give this function the ability to find the first, last, or random occurrence of the target, without use of a string to represent the cases (as I've done)? Suggestions would be great.
Solution
You want to have the ability to choose between finding the first, last and a random occurrence of an element in a list. Since you have three options, you cannot use true/false, and numbers aren't very intuitive (and they usually end up magic numbers). The remaining options are strings, functions or some other data structure. Since you don't like strings, and other data structures wouldn't make much sense (why use a complex object when a simple one will suffice?), then let's stick to functions.
But wait, you don't want any code duplication. That's perfectly ok. We note that all 3 of these options involve variations of calling the basic binary search.
Let's create a basic binary search that outputs a few extras:
Then, let's create the new functions:
This has the advantage of being readable, not susceptible to spelling errors ("First" vs "first" vs "fist") and not having much in the way of duplicated code.
As a fun fact: this entire problem can be easily solved by making wrappers around
Update:
But wait, you don't want any code duplication. That's perfectly ok. We note that all 3 of these options involve variations of calling the basic binary search.
Let's create a basic binary search that outputs a few extras:
def _binary_search(array, element, low, high):
"""Binary search that returns (index, low, high) of the final search step"""
while low element:
high = mid - 1
return (-1, 0, 0)Then, let's create the new functions:
def lower_bound(array, element):
"""Returns the index of the first occurrence of element in array"""
index = -1
first_index, low, high = _binary_search(array, element, 0, len(array)-1)
index = first_index
while first_index != -1:
index = first_index
first_index, low, high = _binary_search(array, element, low, first_index-1)
return index
def upper_bound(array, element):
"""Returns the index of the last occurence of element in array"""
index = -1
first_index, low, high = _binary_search(array, element, 0, len(array)-1)
index = first_index
while first_index != -1:
index = first_index
first_index, low, high = _binary_search(array, element, first_index+1, high)
return index
def binary_search(array, element):
"""Basic binary search"""
lower_bound = 0
upper_bound = len(array) - 1
return _binary_search(array, element, lower_bound, upper_bound)[0]This has the advantage of being readable, not susceptible to spelling errors ("First" vs "first" vs "fist") and not having much in the way of duplicated code.
As a fun fact: this entire problem can be easily solved by making wrappers around
bisect_left and bisect_right from the bisect module like this very convenient SortedCollection Python recipe has done here.Update:
- Renamed functions to comply with standards as per comments
- Changed algorithms to not create new arrays and instead do an in-place binary search
- Changed algorithms to produce the correct results
- Added reference to an easy to use Python recipe that implements this, and other features.
Code Snippets
def _binary_search(array, element, low, high):
"""Binary search that returns (index, low, high) of the final search step"""
while low <= high:
mid = (low + high) // 2
current_element = array[mid]
if current_element == element:
return (mid, low, high)
elif current_element < element:
low = mid + 1
elif current_element > element:
high = mid - 1
return (-1, 0, 0)def lower_bound(array, element):
"""Returns the index of the first occurrence of element in array"""
index = -1
first_index, low, high = _binary_search(array, element, 0, len(array)-1)
index = first_index
while first_index != -1:
index = first_index
first_index, low, high = _binary_search(array, element, low, first_index-1)
return index
def upper_bound(array, element):
"""Returns the index of the last occurence of element in array"""
index = -1
first_index, low, high = _binary_search(array, element, 0, len(array)-1)
index = first_index
while first_index != -1:
index = first_index
first_index, low, high = _binary_search(array, element, first_index+1, high)
return index
def binary_search(array, element):
"""Basic binary search"""
lower_bound = 0
upper_bound = len(array) - 1
return _binary_search(array, element, lower_bound, upper_bound)[0]Context
StackExchange Code Review Q#59855, answer score: 4
Revisions (0)
No revisions yet.