HiveBrain v1.2.0
Get Started
← Back to all entries
patternpythonMinor

Find values in list which sum to a given value within tolerance

Submitted by: @import:stackexchange-codereview··
0
Viewed 0 times
findwhichvaluewithintolerancesumvalueslistgiven

Problem

This is really a follow on from a question I asked earlier this year on Stack Overflow. I received a great answer to the following problem:


I'm trying to code up something simple and pythonic to identify
combinations of values from a list which sum to a defined value,
within some tolerance.


For example:


if A=[0.4,2,3,1.4,2.6,6.3] and the target value is 5 +/- 0.5, then the
output I want is (2,3), (1.4,2.6), (2,2.6), (0.4,2,3), (0.4,3,1.4)
etc. if no combinations are found then the function should return 0 or
none or something similar.

I implemented the suggestion in my code. However, the method has quickly become the performance limiting step in my code. It's fairly quick to run each iteration but it is being run many, many times.

Can you see any way to optimise this function or replace it with something much faster?

def findSum(self, dataArray, target, tolerance=0.5):
    for i in xrange(1, len(dataArray)+1):

        results = [list(comb) for comb in list(itertools.combinations(dataArray, i)) 
                   if target-tolerance < sum(map(float, comb)) < target+tolerance]

        if len(results) != 0:
            return results

Solution

In keeping the same algorithm, there's a bunch of extra work you're doing that's slowing you down. I am using the word slow loosely, since the one example you provided is still more or less instantaneous and the timings I performed were based on 100,000 runs.

Leave iterables alone

You wrap your combinations() call with list(). This means that you have to do actually create all the combinations up front, then iterate over them. But you never actually need all of them at once. You just need them one at a time. Changing your body to:

results = [list(comb) for comb in itertools.combinations(dataArray, i)
           if target-tolerance < sum(map(float, comb)) < target+tolerance]


drops runtime from 2.64s to 2.45s. Furthermore, comb is going to be a tuple, so that's fine as is too:

results = [comb for comb in itertools.combinations(dataArray, i)
           if target-tolerance < sum(map(float, comb)) < target+tolerance]


Down to 2.24s. Another 10%.

Building lists for no reason

Our numbers are already summable, no need to cast them to float first. You're making a whole bunch of extra lists that you're immediately discarding - just to be able to do a sum. But you can already do a sum:

results = [comb for comb in itertools.combinations(dataArray, i)
           if target-tolerance < sum(comb) < target+tolerance]


That alone brings us down to 1.10s. An enormous savings.

Comparisons

One comparison is faster than two comparisons, so let's just do the one:

results = [comb for comb in itertools.combinations(dataArray, i)
           if abs(sum(comb) - target) < tolerance]


Down to 1.04s.

What do you actually want

This doesn't yield all of the potential partial sums. It will yield some sums if any exist, and will return None otherwise - but it won't give you all of them. If you want all of them, you'll have to loop through all of the is. If you just want the first one, then this:

for i in xrange(1, len(dataArray)+1):
    for comb in itertools.combinations(dataArray, i): 
       if abs(sum(comb) - target) < tolerance:
        return comb


just got reduced to 0.67s.

As Joe Wallis suggests, you probably want:

for i in xrange(1, len(dataArray)+1):
    for comb in itertools.combinations(dataArray, i): 
       if abs(sum(comb) - target) < tolerance:
        yield comb


That way the caller can decide if one is sufficient or if all of them are needed.

Code Snippets

results = [list(comb) for comb in itertools.combinations(dataArray, i)
           if target-tolerance < sum(map(float, comb)) < target+tolerance]
results = [comb for comb in itertools.combinations(dataArray, i)
           if target-tolerance < sum(map(float, comb)) < target+tolerance]
results = [comb for comb in itertools.combinations(dataArray, i)
           if target-tolerance < sum(comb) < target+tolerance]
results = [comb for comb in itertools.combinations(dataArray, i)
           if abs(sum(comb) - target) < tolerance]
for i in xrange(1, len(dataArray)+1):
    for comb in itertools.combinations(dataArray, i): 
       if abs(sum(comb) - target) < tolerance:
        return comb

Context

StackExchange Code Review Q#107729, answer score: 6

Revisions (0)

No revisions yet.