patternModerate
Finding the sum of all the multiples of 3 or 5 below 1000, using list comprehension
Viewed 0 times
comprehensiontheallfindingmultiplesusingbelowsum1000list
Problem
I'm trying to compare my own implementation with another solution of Project Euler problem #1, which uses a list comprehension:
Mine is directly inspired by user nicocarlos' PHP solution on the PE site. It avoids using hard-coded constants, instead computing most of the values (with as much algebraic simplification as possible):
I'm new to Haskell and inexperienced with functional programming; I'd like suggestions for improving this code's performance and style.
Currently criterion is showing similar mean benchmark times for each solution, calculated for
Updated with current code:
This solves the bug mentioned in the accepted answer, and takes much of its advice on style (as well as discarding the simplified algebra.)
```
-- Using arithmetic progression formula for the Nth element aN, we can determin
module Progression1 (sumProgressions) where
import Prelude
sumProgressions :: Integer -> Integer -> Integer -> Integer
sumProgressions d1 d2 limit =
sum [n | n <- [1 .. limit], (n `mod` d1) == 0 || (n `mod` d2) == 0]Mine is directly inspired by user nicocarlos' PHP solution on the PE site. It avoids using hard-coded constants, instead computing most of the values (with as much algebraic simplification as possible):
module Progression2 (sumProgressions2) where
aN :: Integer -> Integer -> Integer -> Integer
pSum :: Integer -> Integer -> Integer -> Integer
sumProgressions2 :: Integer -> Integer -> Integer -> Integer
aN a1 d n = a1 + (n - 1) * d -- Calculating the value of the Nth term
pSum 0 d lim = pSum d d lim -- a1 == 0 produces invalid pSum() result
pSum a1 d lim =
let n1 = (lim `div` d) -- Using simplified method for finding n
n2 = ((lim - a1) `div` d + 1) -- Use non-simplified formula for calculation
in if (a1 == d)
then (d * n1 * (n1 + 1)) `div` 2 -- Assumes n1 is simplified
else (n2 `div` 2) * (a1 + aN a1 d n2) -- Non-simplified calculation
sumProgressions2 d1 d2 lim = let d3 = d1 * d2
in (pSum d1 d1 lim) + (pSum d2 d2 lim) - (pSum d3 d3 lim)I'm new to Haskell and inexperienced with functional programming; I'd like suggestions for improving this code's performance and style.
Currently criterion is showing similar mean benchmark times for each solution, calculated for
d1 = 3, d2 = 5, limit = 100000-1. Updated with current code:
This solves the bug mentioned in the accepted answer, and takes much of its advice on style (as well as discarding the simplified algebra.)
```
-- Using arithmetic progression formula for the Nth element aN, we can determin
Solution
Correctness:
Many reviewers will not bother to comment on code that is not correct.
I might have kept this brief, myself, except that it was only in the process of considering optimization that I discovered that this code contains two bugs that prevent correct answers.
It shares one of those bugs with the simpler sumProgressions. The benefit of the simplicity of sumProgressions is that the bug has fewer places to hide there.
Hint: Why does sumProgressions 3 5 1000 give the same answer as sumProgressions 3 5 1001?
Nothing about this bug changes my comments below about style or optimization, so I might have pointed it out and moved on, even if I had caught it up front.
The more serious bug is unique to sumProgressions2.
Hint: Why does sumProgressions 3 6 20 give a different answer from sumProgressions2 3 6 20?
If I had noticed this earlier, my answer might have ended here.
But it seemed a shame to throw the rest of this away, so, (mostly) ignoring the cases in which the code fails...
Style:
I don't see the benefit to the function type declarations all at the top separated from their implementations where they could have had some minor self-documentation effect.
You seem to be following the Haskell tradition of using one-letter symbols and minimal explanatory comments, as if it was obvious from the context that a1 is the first element of a progression and d is its delta and n is the number of elements. To me, these things were not all obvious, especially when n is either n1 or n2 and the '1' in n1 has nothing to do with the '1' in a1. I think that n' and n would have been a more idiomatic naming than n1 and n2, since they are variants of each other. Function header comments describing the input arguments would not have been out of place, nor would spelling out "delta" and "limit", but that may be an "m o" (meaning "minority opinion" -- obviously).
The formulas used in pSum are sufficiently complicated and their syntax not quite algebraic, so their intent is obscured. A comment explaining the intent in plain language like "multiplying ... first and last elements..." or pure algebra would help.
Since both aN and pSum are short and used only within pSum and sumProgressions2, respectively, I'd have opted for defining them under a where clause. In the case of pSum, that allows lim to be referenced in the parent scope so it doesn't need to be explicitly passed. This makes the structure of the code more readable, making it clear that aN and pSum serve sumProgressions2 and sumProgressions2 only. I'm guessing (mostly from the presence of a1) that this may not have been your intent, in which case you should have made that intent clear by exporting these functions or showing other uses within the module.
In fact, the purpose of a1 is confounded by the fact that a1 is only ever passed the same value as d by the 3 callers of pSum, and since pSum isn't exported from the module, those seem to be the only cases. From that perspective, a1 is only used once, in an equality test that always passes, so it seems that you could eliminate it and with it half the complexity of pSum.
Even if your intent is to provide pSum as a useful function outside of sumProgressions2, the usage of pSum in sumProgressions2 is so specialized and falls so cleanly into the "simplified" case that it still warrants its own separate truly simplified pSum function that takes no a1 argument.
Whether the existence of another more general pSum implementation is justified and whether its a1 == d case is worth optimizing -- or (as I suspect) similar simplification applies equally to the general case -- is outside the scope of the current problem.
Yet I think that there is a point to be made here about over-generalization.
You could argue that you are doing a good thing by providing and using a more general solution that does not require a1 to equal d, but I don't think that argument holds up. It sounds like the same argument that justifies writing sumProgressions2 as a general solution that does not depend on the values 3, 5, and 1000. But consider the following:
or even
Do you consider these to have the same benefits of generality as your original solution?
To me,
There is danger in over-generalization. Testing and correctness is more difficult for a general function than for a more constrained or hard-co
Many reviewers will not bother to comment on code that is not correct.
I might have kept this brief, myself, except that it was only in the process of considering optimization that I discovered that this code contains two bugs that prevent correct answers.
It shares one of those bugs with the simpler sumProgressions. The benefit of the simplicity of sumProgressions is that the bug has fewer places to hide there.
Hint: Why does sumProgressions 3 5 1000 give the same answer as sumProgressions 3 5 1001?
Nothing about this bug changes my comments below about style or optimization, so I might have pointed it out and moved on, even if I had caught it up front.
The more serious bug is unique to sumProgressions2.
Hint: Why does sumProgressions 3 6 20 give a different answer from sumProgressions2 3 6 20?
If I had noticed this earlier, my answer might have ended here.
But it seemed a shame to throw the rest of this away, so, (mostly) ignoring the cases in which the code fails...
Style:
I don't see the benefit to the function type declarations all at the top separated from their implementations where they could have had some minor self-documentation effect.
You seem to be following the Haskell tradition of using one-letter symbols and minimal explanatory comments, as if it was obvious from the context that a1 is the first element of a progression and d is its delta and n is the number of elements. To me, these things were not all obvious, especially when n is either n1 or n2 and the '1' in n1 has nothing to do with the '1' in a1. I think that n' and n would have been a more idiomatic naming than n1 and n2, since they are variants of each other. Function header comments describing the input arguments would not have been out of place, nor would spelling out "delta" and "limit", but that may be an "m o" (meaning "minority opinion" -- obviously).
The formulas used in pSum are sufficiently complicated and their syntax not quite algebraic, so their intent is obscured. A comment explaining the intent in plain language like "multiplying ... first and last elements..." or pure algebra would help.
Since both aN and pSum are short and used only within pSum and sumProgressions2, respectively, I'd have opted for defining them under a where clause. In the case of pSum, that allows lim to be referenced in the parent scope so it doesn't need to be explicitly passed. This makes the structure of the code more readable, making it clear that aN and pSum serve sumProgressions2 and sumProgressions2 only. I'm guessing (mostly from the presence of a1) that this may not have been your intent, in which case you should have made that intent clear by exporting these functions or showing other uses within the module.
In fact, the purpose of a1 is confounded by the fact that a1 is only ever passed the same value as d by the 3 callers of pSum, and since pSum isn't exported from the module, those seem to be the only cases. From that perspective, a1 is only used once, in an equality test that always passes, so it seems that you could eliminate it and with it half the complexity of pSum.
Even if your intent is to provide pSum as a useful function outside of sumProgressions2, the usage of pSum in sumProgressions2 is so specialized and falls so cleanly into the "simplified" case that it still warrants its own separate truly simplified pSum function that takes no a1 argument.
Whether the existence of another more general pSum implementation is justified and whether its a1 == d case is worth optimizing -- or (as I suspect) similar simplification applies equally to the general case -- is outside the scope of the current problem.
Yet I think that there is a point to be made here about over-generalization.
You could argue that you are doing a good thing by providing and using a more general solution that does not require a1 to equal d, but I don't think that argument holds up. It sounds like the same argument that justifies writing sumProgressions2 as a general solution that does not depend on the values 3, 5, and 1000. But consider the following:
sumProgressions3_5_1000 3 5 1000 = correctAnswerForEuler1
sumProgressions3_5_1000 d1 d2 lim = let d3 = d1 * d2
in (pSum d1 d1 lim) + (pSum d2 d2 lim) - (pSum d3 d3 lim)or even
sumProgressions3 3 d2 lim = specialCaseOptimizedFor3 d2 lim
sumProgressions3 d1 d2 lim = let d3 = d1 * d2
in (pSum d1 d1 lim) + (pSum d2 d2 lim) - (pSum d3 d3 lim)Do you consider these to have the same benefits of generality as your original solution?
To me,
if (a1==d) in pSum defeats the purpose of the generalized solution just as much as this last example does. Your pSum just hard-codes for an arbitrary argument relationship rather than for an arbitrary argument value, but it's still arbitrary hard-coding for an arbitrary case.There is danger in over-generalization. Testing and correctness is more difficult for a general function than for a more constrained or hard-co
Code Snippets
sumProgressions3_5_1000 3 5 1000 = correctAnswerForEuler1
sumProgressions3_5_1000 d1 d2 lim = let d3 = d1 * d2
in (pSum d1 d1 lim) + (pSum d2 d2 lim) - (pSum d3 d3 lim)sumProgressions3 3 d2 lim = specialCaseOptimizedFor3 d2 lim
sumProgressions3 d1 d2 lim = let d3 = d1 * d2
in (pSum d1 d1 lim) + (pSum d2 d2 lim) - (pSum d3 d3 lim)Context
StackExchange Code Review Q#7629, answer score: 10
Revisions (0)
No revisions yet.