patternMinor
Efficient Prime Sieve - Haskell
Viewed 0 times
primeefficientsievehaskell
Problem
I'm new to Haskell, and am trying to learn the basics with a simple function that will calculate the list of primes up to a given value (using the Sieve of Eratosthenes). Here is my code so far, which has the problem of becoming very slow for large inputs
I am aware of this resource which gives implementations of Eratosthenes' Sieve, however, their methods are beyond me as I'm only just beginning. I'm also aware of this this question, but they have used entirely different methods again. I would like to know what it is about my code that is so inefficient.
Example timings:
I suspect the problem lies in checking if
nmax:primes :: Integer -> [Integer]
primes nmax
| nmax [Integer]
sieve [] = []
sieve [a] = [a]
sieve (p:xs) = p : sieve [multiple*p+offset | multiple <- [1..(last xs) `quot` p], offset <-[1..(p-1)], multiple*p+offset `elem` xs]I am aware of this resource which gives implementations of Eratosthenes' Sieve, however, their methods are beyond me as I'm only just beginning. I'm also aware of this this question, but they have used entirely different methods again. I would like to know what it is about my code that is so inefficient.
Example timings:
primes 100takes0.01sand uses0 bytesof RAM.
primes 1000takes0.41sand uses ~38.5MBof RAM.
primes 2000takes3.15sand uses ~153MBof RAM.
I suspect the problem lies in checking if
multiple*p+offset is an element of the list xs, but I'm not sure how to circumvent this check.Solution
The underlying complexity
I suspect the problem lies …
Ah, but we don't like to suspect. Just as our compiler tells us that we're missing a semicolon, our profiler will tell us where we lose time. Let's add a
Now, it's easier to measure:
Before we have a look at the profile, we already see that something is amiss. For twice the maximum, your code takes almost 7 times the time: 0.375s for numbers up to 1000, and 2.297 seconds for 2000. That indicates that your algorithm doesn't have the right complexity to begin with. Let's check 4000:
For twice the elements, we have 8 times the time. This indicates that you have \$\mathcal O(n^3)\$ complexity.
An unhelpful profile
Let's have a look at the generated profile. In case you don't know GHC's profiling feature yet:
This will create a
```
Sun Sep 11 08:24 2016 Time and Allocation Profiling Report (Final)
Primes +RTS -s -p -RTS
total time = 15.19 secs (15193 ticks @ 1000 us, 1 processor)
total alloc = 185,477,264 bytes (excludes profiling overheads)
COST CENTRE MODULE %time %alloc
primes.sieve Main 100.0 99.9
individual inherited
COST CENTRE MODULE no. entries %time %alloc %time %alloc
MAIN MAIN 44 0 0.0 0.0 100.0 100.0
main Main 89 0 0.0 0.0 100.0 100.0
primes Main 91 1 0.0 0
I suspect the problem lies …
Ah, but we don't like to suspect. Just as our compiler tells us that we're missing a semicolon, our profiler will tell us where we lose time. Let's add a
main first:main :: IO ()
main = readLn >>= print . length . primesNow, it's easier to measure:
$ ghc -prof -auto-all -O2 Primes.hs
$ echo 1000 | ./Primes +RTS -s -p
168
23,717,912 bytes allocated in the heap
1,235,856 bytes copied during GC
111,776 bytes maximum residency (2 sample(s))
26,880 bytes maximum slop
1 MB total memory in use (0 MB lost due to fragmentation)
Tot time (elapsed) Avg pause Max pause
Gen 0 44 colls, 0 par 0.000s 0.002s 0.0000s 0.0002s
Gen 1 2 colls, 0 par 0.000s 0.001s 0.0005s 0.0008s
INIT time 0.000s ( 0.002s elapsed)
MUT time 0.375s ( 0.372s elapsed)
GC time 0.000s ( 0.003s elapsed)
RP time 0.000s ( 0.000s elapsed)
PROF time 0.000s ( 0.000s elapsed)
EXIT time 0.000s ( 0.000s elapsed)
Total time 0.375s ( 0.377s elapsed)
%GC time 0.0% (0.8% elapsed)
Alloc rate 63,247,765 bytes per MUT second
Productivity 100.0% of total user, 99.5% of total elapsed
$ echo 2000 | ./Primes +RTS -s -p
303
84,381,928 bytes allocated in the heap
9,612,184 bytes copied during GC
137,048 bytes maximum residency (2 sample(s))
28,592 bytes maximum slop
1 MB total memory in use (0 MB lost due to fragmentation)
Tot time (elapsed) Avg pause Max pause
Gen 0 161 colls, 0 par 0.062s 0.015s 0.0001s 0.0005s
Gen 1 2 colls, 0 par 0.000s 0.001s 0.0003s 0.0004s
INIT time 0.000s ( 0.001s elapsed)
MUT time 2.297s ( 2.399s elapsed)
GC time 0.062s ( 0.015s elapsed)
RP time 0.000s ( 0.000s elapsed)
PROF time 0.000s ( 0.000s elapsed)
EXIT time 0.000s ( 0.000s elapsed)
Total time 2.359s ( 2.416s elapsed)
%GC time 2.6% (0.6% elapsed)
Alloc rate 36,737,710 bytes per MUT second
Productivity 97.4% of total user, 95.1% of total elapsedBefore we have a look at the profile, we already see that something is amiss. For twice the maximum, your code takes almost 7 times the time: 0.375s for numbers up to 1000, and 2.297 seconds for 2000. That indicates that your algorithm doesn't have the right complexity to begin with. Let's check 4000:
$ echo 4000 | ./Primes +RTS -s -p
550
302,844,912 bytes allocated in the heap
70,455,392 bytes copied during GC
431,144 bytes maximum residency (37 sample(s))
36,976 bytes maximum slop
2 MB total memory in use (0 MB lost due to fragmentation)
Tot time (elapsed) Avg pause Max pause
Gen 0 546 colls, 0 par 0.016s 0.093s 0.0002s 0.0006s
Gen 1 37 colls, 0 par 0.031s 0.012s 0.0003s 0.0013s
INIT time 0.000s ( 0.001s elapsed)
MUT time 16.766s ( 17.134s elapsed)
GC time 0.047s ( 0.105s elapsed)
RP time 0.000s ( 0.000s elapsed)
PROF time 0.000s ( 0.000s elapsed)
EXIT time 0.000s ( 0.000s elapsed)
Total time 16.812s ( 17.240s elapsed)
%GC time 0.3% (0.6% elapsed)
Alloc rate 18,063,443 bytes per MUT second
Productivity 99.7% of total user, 97.2% of total elapsedFor twice the elements, we have 8 times the time. This indicates that you have \$\mathcal O(n^3)\$ complexity.
An unhelpful profile
Let's have a look at the generated profile. In case you don't know GHC's profiling feature yet:
-prof enables profiling during compilation, but you still need to tell GHC what to profile (in your code). -auto-all enables profiling of your functions without changing your code. And +RTS -p enables profiling at runtime.This will create a
.prof file. However, it's not very helpful, since most of your logic is in a list comprehension, and that's not profiled:```
Sun Sep 11 08:24 2016 Time and Allocation Profiling Report (Final)
Primes +RTS -s -p -RTS
total time = 15.19 secs (15193 ticks @ 1000 us, 1 processor)
total alloc = 185,477,264 bytes (excludes profiling overheads)
COST CENTRE MODULE %time %alloc
primes.sieve Main 100.0 99.9
individual inherited
COST CENTRE MODULE no. entries %time %alloc %time %alloc
MAIN MAIN 44 0 0.0 0.0 100.0 100.0
main Main 89 0 0.0 0.0 100.0 100.0
primes Main 91 1 0.0 0
Code Snippets
main :: IO ()
main = readLn >>= print . length . primes$ ghc -prof -auto-all -O2 Primes.hs
$ echo 1000 | ./Primes +RTS -s -p
168
23,717,912 bytes allocated in the heap
1,235,856 bytes copied during GC
111,776 bytes maximum residency (2 sample(s))
26,880 bytes maximum slop
1 MB total memory in use (0 MB lost due to fragmentation)
Tot time (elapsed) Avg pause Max pause
Gen 0 44 colls, 0 par 0.000s 0.002s 0.0000s 0.0002s
Gen 1 2 colls, 0 par 0.000s 0.001s 0.0005s 0.0008s
INIT time 0.000s ( 0.002s elapsed)
MUT time 0.375s ( 0.372s elapsed)
GC time 0.000s ( 0.003s elapsed)
RP time 0.000s ( 0.000s elapsed)
PROF time 0.000s ( 0.000s elapsed)
EXIT time 0.000s ( 0.000s elapsed)
Total time 0.375s ( 0.377s elapsed)
%GC time 0.0% (0.8% elapsed)
Alloc rate 63,247,765 bytes per MUT second
Productivity 100.0% of total user, 99.5% of total elapsed
$ echo 2000 | ./Primes +RTS -s -p
303
84,381,928 bytes allocated in the heap
9,612,184 bytes copied during GC
137,048 bytes maximum residency (2 sample(s))
28,592 bytes maximum slop
1 MB total memory in use (0 MB lost due to fragmentation)
Tot time (elapsed) Avg pause Max pause
Gen 0 161 colls, 0 par 0.062s 0.015s 0.0001s 0.0005s
Gen 1 2 colls, 0 par 0.000s 0.001s 0.0003s 0.0004s
INIT time 0.000s ( 0.001s elapsed)
MUT time 2.297s ( 2.399s elapsed)
GC time 0.062s ( 0.015s elapsed)
RP time 0.000s ( 0.000s elapsed)
PROF time 0.000s ( 0.000s elapsed)
EXIT time 0.000s ( 0.000s elapsed)
Total time 2.359s ( 2.416s elapsed)
%GC time 2.6% (0.6% elapsed)
Alloc rate 36,737,710 bytes per MUT second
Productivity 97.4% of total user, 95.1% of total elapsed$ echo 4000 | ./Primes +RTS -s -p
550
302,844,912 bytes allocated in the heap
70,455,392 bytes copied during GC
431,144 bytes maximum residency (37 sample(s))
36,976 bytes maximum slop
2 MB total memory in use (0 MB lost due to fragmentation)
Tot time (elapsed) Avg pause Max pause
Gen 0 546 colls, 0 par 0.016s 0.093s 0.0002s 0.0006s
Gen 1 37 colls, 0 par 0.031s 0.012s 0.0003s 0.0013s
INIT time 0.000s ( 0.001s elapsed)
MUT time 16.766s ( 17.134s elapsed)
GC time 0.047s ( 0.105s elapsed)
RP time 0.000s ( 0.000s elapsed)
PROF time 0.000s ( 0.000s elapsed)
EXIT time 0.000s ( 0.000s elapsed)
Total time 16.812s ( 17.240s elapsed)
%GC time 0.3% (0.6% elapsed)
Alloc rate 18,063,443 bytes per MUT second
Productivity 99.7% of total user, 97.2% of total elapsedSun Sep 11 08:24 2016 Time and Allocation Profiling Report (Final)
Primes +RTS -s -p -RTS
total time = 15.19 secs (15193 ticks @ 1000 us, 1 processor)
total alloc = 185,477,264 bytes (excludes profiling overheads)
COST CENTRE MODULE %time %alloc
primes.sieve Main 100.0 99.9
individual inherited
COST CENTRE MODULE no. entries %time %alloc %time %alloc
MAIN MAIN 44 0 0.0 0.0 100.0 100.0
main Main 89 0 0.0 0.0 100.0 100.0
primes Main 91 1 0.0 0.1 100.0 100.0
primes.sieve Main 92 550 100.0 99.9 100.0 99.9
CAF GHC.Read 76 0 0.0 0.0 0.0 0.0
CAF GHC.IO.Encoding.CodePage 73 0 0.0 0.0 0.0 0.0
CAF GHC.IO.Encoding 68 0 0.0 0.0 0.0 0.0
CAF Text.Read.Lex 61 0 0.0 0.0 0.0 0.0
CAF GHC.IO.Handle.Text 57 0 0.0 0.0 0.0 0.0
CAF GHC.IO.Handle.FD 56 0 0.0 0.0 0.0 0.0
CAF:main1 Main 52 0 0.0 0.0 0.0 0.0
main Main 88 1 0.0 0.0 0.0 0.0
CAF:lvl5_r3TO Main 51 0 0.0 0.0 0.0 0.0
main Main 90 0 0.0 0.0 0.0 0.0sieve (p:xs) = p : sieve [multiple*p+offset
| multiple <- {-# SCC multiple #-} [1..(last xs) `quot` p]
, offset <- {-# SCC offset #-} [1..(p-1)]
, {-# SCC elem #-} multiple*p+offset `elem` xs]Context
StackExchange Code Review Q#141020, answer score: 7
Revisions (0)
No revisions yet.