HiveBrain v1.2.0
Get Started
← Back to all entries
patternMinor

Binary search a 2D array for multiple matching criteria fields

Submitted by: @import:stackexchange-codereview··
0
Viewed 0 times
searchcriteriaarrayfieldsbinaryformultiplematching

Problem

Here is what I came up with for a binary search of an array. It allows for searching one or many columns to match. It assumes that array being searched is sorted with the same sort priorities as the order in which the parameters are submitted.

It seems to work well but any advice on performance or structure would be appreciated.

```
Option Explicit
Option Compare Text

Private Function BinFind(inArray As Variant, hasHeader As Boolean, ParamArray WhatValueWhatColumn() As Variant) As Long
'== =============================================================================================================
'== conducts a binary tree search in a 1 or 2 dimensional array
'==
'== looks for matches defined in 'WhatValueWhatColumn'
'== supports matching of multiple columns.
'==
'== each ParamArray pair submitted must be an array stating what is being seeked and in which column it is found
'== i.e. BinFind(arr1, True, Array(firstName, 2), Array(lastName, 1), Array(birthdate, 6))
'==
'== the array being searched, 'inArray' must be sorted in ascending order
'== where the sort priority matches the order of the 'WhatValueWhatColumn' submissions
'==
'== {TODO} - test that array is sorted correctly prior to execution, raise error if not.
'== =============================================================================================================

Dim High As Long 'highest boundary of range being searched
Dim Low As Long 'lowest boundary of range being searched
Dim Mid As Long 'current split point
Dim v As Variant 'WhatValueWhatColumn variant
Dim found As Boolean

'== initialize lower and upper bounds
High = UBound(inArray)
Low = LBound(inArray) - hasHeader

'== test to see if the primary column is outside the lower or upper bounds of the entire test array
v = WhatValueWhatColumn(0)
If v(0) inArray(High, v(1)) Then
BinFind = -1
Exit Function
End If

Solution

Option Compare Text - I don't like it. I also don't like Option Base 1 for the same reason. It causes excel to behave in a way other than the default way, which makes maintenance more difficult. I don't really see the need for the Text when the default Binary would be fine if you pass your arguments through UCase for instance. It's much easier to follow and explicitly shows the reader that case is irrelevant.
Naming

Your naming leaves something to be desired. It's hard to follow when you could use a more descriptive name that would allow easier identification.

  • If High is "highest boundary of range being searched" why not call it highSearchBoundary or something similar? It doesn't cost anything and it removes the need for explanatory comments. Same goes for Low



  • v does nothing to describe itself. Try to avoid single letter variables other than (maybe) i for iteration. What is v - WhatValueColumnVariant? I'm not sure what that would be.



  • found is boolean, just call it isFound similar to your hasHeader boolean



  • On the same token, inArray looks like it should be a boolean, not a variant.



  • Your function can be called BinaryTreeSearch if that's what it is



  • Your inArray and hasHeader arguments are being passed ByRef when they could be passed ByVal



  • I think ParamArray is always passed ByVal - but I'm not too familiar with it



Speaking of comments

Comments - "code tell you how, comments tell you why". The code should speak for itself, if it needs a comment, it might need to be made more clear. If not, the comment should describe why you're doing something rather than how you're doing it. Here are a few reasons to avoid comments all together. You have a lot of commentary which should indicate that your code isn't as clean as it could be. I understand some UDF documentation up top, but other than that it seems superfluous.

But, with comments, I see twice that you return -1 - but I don't know why. Maybe that needs an explanation.

The comment "the array being searched, 'inArray' must be sorted in ascending order" tells me that your function will break with unexpected input. If you need it in ascending order, write a sub to do so. That may be difficult though, considering how ParamArray would define the sort order. All of that leaves a lot of room for unhandled errors, I think. That may not be the best way to put it.

You may need to handle any errors where a passed ParamArray has a higher column that the count of columns in the searchArray though.
ParamArray

I'm no expert in using this, but it seems to work pretty well in your application of it. But the naming could be improved findValueInWhichColumn because that's what it does, right? It finds each value in whichever column of the inArray is specified?
Loop

this information might not be 100% accurate because of a misinterpretation

Your loop is hard to follow.

Using a For Each on an object is less than ideal - try to use a For Next instead.

Then you take each ParamArray and compare it to the mid of the searchArray and iterate down or up. The ParamArray already specifies a column in the searchArray, so why not pull the column into an array and find it there in a refactored function? Maybe even use a Scripting.Dictionary's exists's for eliminating the loop within the refactored function?

Private Function SearchArrayColumn(ByVal searchColumn As Variant, ByVal searchValue As String) As Boolean
'Do your thing
End Function


Then for each ParamArray you can use the function, something similar to

For Each v in WhatValueWhatColumn
isfound = SearchArrayColumn(Application.WorksheetFunction.Index(searcharray, , v(0)), v(1))
if isfound then 'your result
Next


Now, maybe you could find a way to get around the sort condition by passing sorted columns or sorting the column as passed. You could keep iterating through the refactored function until not isFound and then exit your binary search as not found, otherwise go to the next ParamArray and if isFound never goes False you have your True condition.

Just spitballing here, but perhaps if you index all the rows of your array that match the first condition, you can pass entire rows to the refactored function and check all of the conditions at the same time, either passing or failing and stop when it passes or exhausts the matches. You could utilize Find and FindNext to pass each possible match.

Overall, I think this is a clever function and a great question. I'm sure you'll get some better answers than mine.

Code Snippets

Private Function SearchArrayColumn(ByVal searchColumn As Variant, ByVal searchValue As String) As Boolean
'Do your thing
End Function
For Each v in WhatValueWhatColumn
isfound = SearchArrayColumn(Application.WorksheetFunction.Index(searcharray, , v(0)), v(1))
if isfound then 'your result
Next

Context

StackExchange Code Review Q#139785, answer score: 4

Revisions (0)

No revisions yet.