patternpythonMinor
Find values in list which sum to a given value within tolerance
Viewed 0 times
findwhichvaluewithintolerancesumvalueslistgiven
Problem
This is really a follow on from a question I asked earlier this year on Stack Overflow. I received a great answer to the following problem:
I'm trying to code up something simple and pythonic to identify
combinations of values from a list which sum to a defined value,
within some tolerance.
For example:
if A=[0.4,2,3,1.4,2.6,6.3] and the target value is 5 +/- 0.5, then the
output I want is (2,3), (1.4,2.6), (2,2.6), (0.4,2,3), (0.4,3,1.4)
etc. if no combinations are found then the function should return 0 or
none or something similar.
I implemented the suggestion in my code. However, the method has quickly become the performance limiting step in my code. It's fairly quick to run each iteration but it is being run many, many times.
Can you see any way to optimise this function or replace it with something much faster?
I'm trying to code up something simple and pythonic to identify
combinations of values from a list which sum to a defined value,
within some tolerance.
For example:
if A=[0.4,2,3,1.4,2.6,6.3] and the target value is 5 +/- 0.5, then the
output I want is (2,3), (1.4,2.6), (2,2.6), (0.4,2,3), (0.4,3,1.4)
etc. if no combinations are found then the function should return 0 or
none or something similar.
I implemented the suggestion in my code. However, the method has quickly become the performance limiting step in my code. It's fairly quick to run each iteration but it is being run many, many times.
Can you see any way to optimise this function or replace it with something much faster?
def findSum(self, dataArray, target, tolerance=0.5):
for i in xrange(1, len(dataArray)+1):
results = [list(comb) for comb in list(itertools.combinations(dataArray, i))
if target-tolerance < sum(map(float, comb)) < target+tolerance]
if len(results) != 0:
return resultsSolution
In keeping the same algorithm, there's a bunch of extra work you're doing that's slowing you down. I am using the word slow loosely, since the one example you provided is still more or less instantaneous and the timings I performed were based on 100,000 runs.
Leave iterables alone
You wrap your
drops runtime from 2.64s to 2.45s. Furthermore,
Down to 2.24s. Another 10%.
Building lists for no reason
Our numbers are already summable, no need to cast them to
That alone brings us down to 1.10s. An enormous savings.
Comparisons
One comparison is faster than two comparisons, so let's just do the one:
Down to 1.04s.
What do you actually want
This doesn't yield all of the potential partial sums. It will yield some sums if any exist, and will return
just got reduced to 0.67s.
As Joe Wallis suggests, you probably want:
That way the caller can decide if one is sufficient or if all of them are needed.
Leave iterables alone
You wrap your
combinations() call with list(). This means that you have to do actually create all the combinations up front, then iterate over them. But you never actually need all of them at once. You just need them one at a time. Changing your body to:results = [list(comb) for comb in itertools.combinations(dataArray, i)
if target-tolerance < sum(map(float, comb)) < target+tolerance]drops runtime from 2.64s to 2.45s. Furthermore,
comb is going to be a tuple, so that's fine as is too:results = [comb for comb in itertools.combinations(dataArray, i)
if target-tolerance < sum(map(float, comb)) < target+tolerance]Down to 2.24s. Another 10%.
Building lists for no reason
Our numbers are already summable, no need to cast them to
float first. You're making a whole bunch of extra lists that you're immediately discarding - just to be able to do a sum. But you can already do a sum:results = [comb for comb in itertools.combinations(dataArray, i)
if target-tolerance < sum(comb) < target+tolerance]That alone brings us down to 1.10s. An enormous savings.
Comparisons
One comparison is faster than two comparisons, so let's just do the one:
results = [comb for comb in itertools.combinations(dataArray, i)
if abs(sum(comb) - target) < tolerance]Down to 1.04s.
What do you actually want
This doesn't yield all of the potential partial sums. It will yield some sums if any exist, and will return
None otherwise - but it won't give you all of them. If you want all of them, you'll have to loop through all of the is. If you just want the first one, then this:for i in xrange(1, len(dataArray)+1):
for comb in itertools.combinations(dataArray, i):
if abs(sum(comb) - target) < tolerance:
return combjust got reduced to 0.67s.
As Joe Wallis suggests, you probably want:
for i in xrange(1, len(dataArray)+1):
for comb in itertools.combinations(dataArray, i):
if abs(sum(comb) - target) < tolerance:
yield combThat way the caller can decide if one is sufficient or if all of them are needed.
Code Snippets
results = [list(comb) for comb in itertools.combinations(dataArray, i)
if target-tolerance < sum(map(float, comb)) < target+tolerance]results = [comb for comb in itertools.combinations(dataArray, i)
if target-tolerance < sum(map(float, comb)) < target+tolerance]results = [comb for comb in itertools.combinations(dataArray, i)
if target-tolerance < sum(comb) < target+tolerance]results = [comb for comb in itertools.combinations(dataArray, i)
if abs(sum(comb) - target) < tolerance]for i in xrange(1, len(dataArray)+1):
for comb in itertools.combinations(dataArray, i):
if abs(sum(comb) - target) < tolerance:
return combContext
StackExchange Code Review Q#107729, answer score: 6
Revisions (0)
No revisions yet.