HiveBrain v1.2.0
Get Started
← Back to all entries
patternrubyMinor

BinarySearch Kata in Ruby

Submitted by: @import:stackexchange-codereview··
0
Viewed 0 times
katarubybinarysearch

Problem

I've just completed this binary search kata - which has some weird requirements - and I spent a little while "cleaning" up my code and making it more compact. I would appreciate some feedback on the readability of the BinarySearch class. I've packed a lot of logic into some of the lines and I'd like to know if that makes it more or less readable.

A Ruby Kata for Binary Search of Arrays:

```
Binary Search Kata
Create a binary search class that is initialized with an array of integers
- detail: Do not use built in array functions
- detail: The initial array is [1,3]
- detail: return the index of the search value if found
- example: searching for 3 returns 1

completed (Y|n):

Successfully deal with values that are not in the data set
- example: searching for 5 indicates to the caller the value is not found

completed (Y|n):

Binary Search handles an odd number of elements in the data set
- detail: Allows the data set to be redefined with [1,3,5,7,9]
- example: Searching for 5 returns 2
- example: Searching for 7 returns 3

completed (Y|n):

Handles more than trivial sized data set
- detail: consider a data set of the first 10000 integers
- example: Searching for 899 returns 898

completed (Y|n):

Supports duplicate elements in the data set
- detail: Use [1,3,5,5,7,9] as the data set
- example: Searching for 3 returns 1
- example: Searching for 5 returns 2

completed (Y|n):

Handles expired call for duplicate elements
- detail: Use [1,3,5,5,7,9] as the data set
- example: Third call to search for 5 is handled like missing element

completed (Y|n):

Supports sorted array of strings
- detail: Use ["a", "b", "c", "d"] as the data set
- example: Searching for "c" returns 2

completed (Y|n):

Supports duplicate strings
- detail: Use ["a", "b", "c", "c", "d"] as the data set
- example: First search for "c" returns 2
- example: Se

Solution

This is pretty good code. It is a little compact, but that only hurts readability in a few places. I will make some
suggestions that fluff it out just a bit, and might make it more
understandable for some readers. Most of what I bring up here is
quite minor. The important stuff is at the end, where I suggest
reorganizing #find, and describe a bug.

Prefer {...} for single-line blocks

{...} is preferred for single-line blocks; begin...end for
multi-line. So, instead of this:

2.times do binary_search.find('c') end


this:

2.times { binary_search.find(5) }


Implicit vs. explicit return

The value of the last executed expression in a method becomes the
return value of the method; this allows you to omit many instances of
return. So, instead of this:

...
  return pivot
end


you can do this:

...
  pivot
end


The value of an "if" expression is the value of the last expression it
executed, so instead of this:

if data[pivot] > target
  return bsearch(target, data[0..pivot - 1])
else
  offset = bsearch(target, data[pivot..-1])
  return offset ? pivot + offset : nil
end


you can do this:

if data[pivot] > target
  bsearch(target, data[0..pivot - 1])
else
  offset = bsearch(target, data[pivot..-1])
  offset ? pivot + offset : nil
end


Typo in a "describe" description

I think this is just a typo. Since find is an instance method,
this:

describe '.find' do


should instead be:

describe '#find' do


Testing that the constructor does not die

Since every one of the specs implicitly tests that the constructor
does not die, and because that is the normal expectation for a
constructor, there is no need to implicitly test that the constructor
does not raise an exception. This can safely be removed from the
spec:

describe "#initialize" do
  it "instantiates" do
    expect { binary_search }.to_not raise_exception
  end
end


Names

-
In class BinarySearch, the instance variable @search_hash needs
a better name. This one is a bit tough (naming is hard!), but until
a truly good name is found, I suggest @found_indices as a name
that gives the reader a better clue about what the hash is being
used for.

-
A better name for BinarySearch#find might be index. This is
consistent with the built-in Array#index, and so less surprising.

-
Similarly, consider renaming BinarySearch#first_find to
first_index.

Creating a hash

It is more usual, when creating an empty hash, to do this:

@found_indices = {}


rather than this:

@found_indices = Hash.new


Use && to eliminate a use of the trinary operator

This:

offset ? pivot + offset : nil


may be more succinctly stated as:

offset && pivot + offset


Expanding BinarySearch#find

I had trouble following the flow of control here:

def find(target)
  return @search_hash[target] = first_find(target) unless @search_hash[target]
  return @data[@search_hash[target] + 1] == target ? @search_hash[target] += 1 : nil
end


We can make it easier to follow, and solve another problem: It isn't
obvious at first that this binary search handles duplicate values by
return successive indices. Let's draw it in crayon:

def find(target)
  unless @found_indices[target]
    find_first(target)
  else
    find_next(target)
  end
end

private

def find_first(target)
  @found_indices[target] = first_index(target)
end

def find_next(target)
  next_index = @found_indices[target] + 1
  return nil unless @data[next_index] == target
  @found_indices[target] = next_index
  next_index
end


The result of @found_indices[target] = next_index is next_index, so the last line that explicitly returns next_index is, strictly speaking, unnecessary. However, it does make the code more clear.

A bug with repeating values

If the array has only one distinct value, whether repeated or not,
then there's a bug. This spec exposes it:

context 'array of identical values' do
  let(:data) { [1, 1] }
  specify do
    binary_search.find(1).should eq 0  # => Expected 0; got -2
    binary_search.find(1).should eq 1
    binary_search.find(1).should be_nil
  end
end


The problem is here:

def first_index(target)
  return nil unless pivot = bsearch(target, @data)
  pivot -= 1 while @data[pivot - 1] == target        # The bug is here
  pivot
end


When pivot is 0, the expression @data[pivot - 1] becomes
@data[-1], which gets the last value in @data. The code ends up
wrappting from the beginning to the end of the array. The fix is to
check that pivot is greater than 0 before decrementing it:

pivot -= 1 while pivot > 0 && @data[pivot - 1] == target

Code Snippets

2.times do binary_search.find('c') end
2.times { binary_search.find(5) }
...
  return pivot
end
...
  pivot
end
if data[pivot] > target
  return bsearch(target, data[0..pivot - 1])
else
  offset = bsearch(target, data[pivot..-1])
  return offset ? pivot + offset : nil
end

Context

StackExchange Code Review Q#44912, answer score: 2

Revisions (0)

No revisions yet.