patternMinor
Filtering a list based on conditions
Viewed 0 times
conditionslistbasedfiltering
Problem
I'm currently filtering a list based on several conditions, which may or may not be present:
For instance,
While this is readable and works, it seems like a poor solution and probably won't be very performant on a large dataset. Is there a Groovy idiom for this kind of series of selective filters?
def requests = Request
.list()
.findAll { if (exportFrom) { it.tsCreated.millis >= exportFrom.millis } else { it } }
.findAll { if (exportTo) { it.tsCreated.millis <= exportTo.millis } else { it } }
.findAll { if (clientId) { it.resolvedClient?.clientId == clientId } else { it } }
.findAll { if (params.exportType) { it.typeName == params.exportType } else { it } }
.findAll { if (params.exportStatus) { it.status.toString() == params.exportStatus } else { it } }For instance,
exportFrom is a date coming from the user interface. If it's present, we only want to export requests that occurred on that date or later. If not, we can disregard that test.it will never be 0 or "", as it from a list of objects that have been loaded from the database and so have already been validated.While this is readable and works, it seems like a poor solution and probably won't be very performant on a large dataset. Is there a Groovy idiom for this kind of series of selective filters?
Solution
Few problems here.
Point 1: Fragility
Just return true (oh, and not { true }). That's what you want. By returning the object you risk some odd circumstance where the object evaluates to false. Why expose yourself to that risk? And returning true is just cheaper.
Point 2: Repeated iteration
Each call to findAll is returning a new collection, which you iterate over with the next call. That's a waste. You can do it with one findAll and one closure. The simplest (and most naive) way to do that would be this:
That will do just what your code does only without creating four intermediate copies of the list. It is, however, ugly. So one approach would be to turn each of those criteria into a function which takes a parameter and returns true or false. So you would have
Which is at least cleaner. However, let's consider
Point 3: Duplication of a pattern
You have a set of basic conditions. Each, if true, has a matching predicate with which you test the list item for validity. If the condition is not valid/present, you accept the list item. So...
Imagine you create a Criteria class. It should have two properties
The Criteria class should also have a reject method which might look like this
(I used Object because I'm not sure what type is actually in your list)
Now you can create a list of Criteria objects, each containing a criterion and a matching test. Lets say you call it criteriaList. Now the main bit of code can look like this:
If none of the criteria objects reject the item, then the find call returns null (empty list) and ! null evaluates to true. No rejections, so we like this item.
On the other hand, if a list item is going to fail one of the tests, then find will return the first Criteria object in the list which rejected the item (and not try any others after that). So in that case ! criteriaList.find will evaluate to false and the item will be discarded.
Can you see why I used a reject method (which returns true if the test fails) rather than an accept method (which would return true if the test succeeds)? I did that because find returns as soon as it finds a match, so we want to return as soon as one Criteria object rejects the list item.
There are other ways to do this (e.g. closure composition, although that's a more functional approach which you said isn't your style) but I hope this is a good example which will help you think of your own solutions.
- Fragility and possible inefficiency by returning {it} rather than true
- Iterating over the collection more often than is necessary (even though the later iterations may be over smaller sets).
- Duplication of the same pattern
Point 1: Fragility
Just return true (oh, and not { true }). That's what you want. By returning the object you risk some odd circumstance where the object evaluates to false. Why expose yourself to that risk? And returning true is just cheaper.
Point 2: Repeated iteration
Each call to findAll is returning a new collection, which you iterate over with the next call. That's a waste. You can do it with one findAll and one closure. The simplest (and most naive) way to do that would be this:
def requests = Request
.list()
.findAll { (exportFrom ? it.tsCreated.millis >= exportFrom.millis : true)
&& (exportTo ? it.tsCreated.millis <= exportTo.millis : true)
&& (clientId ? it.resolvedClient?.clientId == clientId : true)
&& (params.exportType ? it.typeName == params.exportType : true)
&& (params.exportStatus ? it.status.toString() == params.exportStatus : true)
&& true }That will do just what your code does only without creating four intermediate copies of the list. It is, however, ugly. So one approach would be to turn each of those criteria into a function which takes a parameter and returns true or false. So you would have
def requests = Request
.list()
.findAll { cond1(it) && cond2(it) && cond3(it) && cond4(it) && cond5(it) }Which is at least cleaner. However, let's consider
Point 3: Duplication of a pattern
You have a set of basic conditions. Each, if true, has a matching predicate with which you test the list item for validity. If the condition is not valid/present, you accept the list item. So...
Imagine you create a Criteria class. It should have two properties
- criterion: the condition which, if true, requires a list element to be tested. Should probably be a closure.
- predicate: the test (another closure) to apply to the list element
The Criteria class should also have a reject method which might look like this
boolean reject (Object item) {
if (this.criterion(item)) ! this.predicate(item) else false
}(I used Object because I'm not sure what type is actually in your list)
Now you can create a list of Criteria objects, each containing a criterion and a matching test. Lets say you call it criteriaList. Now the main bit of code can look like this:
def requests = Request
.list()
.findAll { x -> ! criteriaList.find { c -> c.reject(x) } }If none of the criteria objects reject the item, then the find call returns null (empty list) and ! null evaluates to true. No rejections, so we like this item.
On the other hand, if a list item is going to fail one of the tests, then find will return the first Criteria object in the list which rejected the item (and not try any others after that). So in that case ! criteriaList.find will evaluate to false and the item will be discarded.
Can you see why I used a reject method (which returns true if the test fails) rather than an accept method (which would return true if the test succeeds)? I did that because find returns as soon as it finds a match, so we want to return as soon as one Criteria object rejects the list item.
There are other ways to do this (e.g. closure composition, although that's a more functional approach which you said isn't your style) but I hope this is a good example which will help you think of your own solutions.
Code Snippets
def requests = Request
.list()
.findAll { (exportFrom ? it.tsCreated.millis >= exportFrom.millis : true)
&& (exportTo ? it.tsCreated.millis <= exportTo.millis : true)
&& (clientId ? it.resolvedClient?.clientId == clientId : true)
&& (params.exportType ? it.typeName == params.exportType : true)
&& (params.exportStatus ? it.status.toString() == params.exportStatus : true)
&& true }def requests = Request
.list()
.findAll { cond1(it) && cond2(it) && cond3(it) && cond4(it) && cond5(it) }boolean reject (Object item) {
if (this.criterion(item)) ! this.predicate(item) else false
}def requests = Request
.list()
.findAll { x -> ! criteriaList.find { c -> c.reject(x) } }Context
StackExchange Code Review Q#83211, answer score: 6
Revisions (0)
No revisions yet.