patternMinor

Project Euler #14 (Longest Collatz Sequence) in Haskell

Submitted by: @import:stackexchange-codereview·Mar 10, 2026·

Viewed 0 times

sequenceprojecteulerhaskelllongestcollatz

Problem

You can check the problem here:
http://projecteuler.net/problem=14

My first approach in Haskell was this:

import Data.Ord
import Data.List

computeCollatzSequenceLength n = let
    compute l n
        | n == 1 = l+1
        | even n = compute (l+1) (n `div` 2)
        | otherwise = compute (l+1) (3*n+1)
    in generate 0 n

main = print $ fst $ maximumBy (comparing snd) $ zip [1..1000000] $ map computeCollatzSequenceLength [1..1000000]

Too slow it was, so I tried to use a sort of memoization by using an unboxed array:

import Data.Ord
import Data.List
import Data.Int
import Data.Array.Unboxed

computeCollatzSequenceLength :: (UArray Int64 Int64) -> Int64 -> (Int64, [Int64])
computeCollatzSequenceLength a n = let
    compute :: Int64 -> [Int64] -> Int64 -> (Int64, [Int64])
    compute l s n'
        | n' == 1 = (l+1, reverse s)
        | v > 0 = (l+v, reverse s)
        | even n' = compute (l+1) (n':s) (n' `div` 2)
        | otherwise = compute (l+1) (n':s) (3*n'+1)
        where v = if n' > (snd $ bounds a) then 0 else a!n'
    in compute 0 [] n

computeMax m = let
    compute :: (Int64,Int64) -> Int64 -> (UArray Int64 Int64) -> (Int64,Int64)
    compute candidate n a
        | n == (m+1) = candidate
        | otherwise = let
            (l, u) = computeCollatzSequenceLength a n
            a' = a//(filter (\(n',_) -> n' <= m) (zip u [l,l-1..]))
            in if (snd candidate) < l then compute (n,l) (n+1) a' else compute candidate (n+1) a' 
    in compute (0,0) 1 (array (1,m) [ (i,0) | i <- [1..m] ]) 

main = print $ fst $ computeMax 1000000

But, this was also very slow...(Both took more than a few minutes on my machine. Actually, the latter seems to take much longer...) I don't know what I've done wrong(I'm still a novice in Haskell.) In theory, the memoized version should be faster. I want to avoid using one of existing memoization packages out there.

What are issues with the current approach and how can I optimize this while still keeping i

Solution

The main reason why this is slow is that every update to your array is making a new copy of it. The version you linked to on the wiki avoids this problem by using a boxed array instead, which allows you to define the array in one go and let lazy evaluation take care of evaluating the elements in an appropriate order and modifying the array behind the scenes.

A boxed array is less memory efficient since it's basically an array of pointers to individually-allocated integers, but for the size of this problem it's not too bad. If you want to use an unboxed array for improved space efficiency, you would probably want to use a mutable one in the ST monad instead. There's an example of how how to do that in an old answer of mine on Stack Overflow.

Regarding your second attempt:

First off, let's get this code to compile.

-
Since the else branch of this if is returning a statement in the ST monad, so must the then branch. A quick fix is to add return:

v  n then return 0 else (readArray seqLengths n')
                    ^^^^^^

-
And there shouldn't be a $ when passing the comparison function
to maximumBy here:

main = print $ fst $ maximumBy  (comparing snd) $ assocs $ getCollatzSequenceLengthUpto 1000000
                               ^

With those small changes, it compiles and it runs in about 4 seconds on my netbook. Not too bad!

However, it's a bit messy, let's see if we can fix that. First of all, let's reduce that ball of nested if's.

-
We can get rid of the outermost if by using a separate equation for when n' is 1:

compute l s 1  = return (l+1, reverse s)

-
The two branches of the innermost if are almost the same. We can move the if inside to the only place they differ. Also `do out

Here is the full working code after these adjustments. I still think the way
you're using a list to deal with the updates is somewhat complicated, but I think fixing that in a satisfactory way would require more substantial changes to your algorithm, so I've left it the way it is.

import Control.Monad (forM_)
import Control.Monad.ST
import Data.Array.ST
import Data.Array.Unboxed
import Data.Int (Int64)
import Data.List (maximumBy)
import Data.Ord (comparing)

getCollatzSequenceLengthUpto :: Int64 -> (UArray Int64 Int64)
getCollatzSequenceLengthUpto n = runSTUArray $ do
    seqLengths  do
        let compute l s 1  = return (l+1, reverse s)
            compute l s n' = do
                v  n then return 0 else readArray seqLengths n'
                if v > 0
                    then return (l+v, reverse s)
                    else compute (l+1) (n':s) (if even n' then n' `div` 2 else (3*n'+1))
        (l, u)  n'  do
            writeArray seqLengths ix e 
    return seqLengths

main = print $ fst $ maximumBy (comparing snd) $ assocs $ getCollatzSequenceLengthUpto 1000000

Code Snippets

v <- if n' > n then return 0 else (readArray seqLengths n')
                    ^^^^^^

main = print $ fst $ maximumBy  (comparing snd) $ assocs $ getCollatzSequenceLengthUpto 1000000
                               ^

compute l s 1  = return (l+1, reverse s)

compute l s n' = do
    v <- if n' > n then return 0 else readArray seqLengths n'
    if v > 0
        then return (l+v, reverse s)
        else compute (l+1) (n':s) (if even n' then n' `div` 2 else (3*n'+1))

import Control.Monad (forM_)
import Control.Monad.ST
import Data.Array.ST
import Data.Array.Unboxed
import Data.Int (Int64)
import Data.List (maximumBy)
import Data.Ord (comparing)

getCollatzSequenceLengthUpto :: Int64 -> (UArray Int64 Int64)
getCollatzSequenceLengthUpto n = runSTUArray $ do
    seqLengths <- newArray (1,n) 0
    forM_ [1..n] $ \i -> do
        let compute l s 1  = return (l+1, reverse s)
            compute l s n' = do
                v <- if n' > n then return 0 else readArray seqLengths n'
                if v > 0
                    then return (l+v, reverse s)
                    else compute (l+1) (n':s) (if even n' then n' `div` 2 else (3*n'+1))
        (l, u) <- compute 0 [] i
        forM_ (filter (\(n',_) -> n' <= n) $ zip u [l,l-1..]) $ \(ix, e) -> do
            writeArray seqLengths ix e 
    return seqLengths

main = print $ fst $ maximumBy (comparing snd) $ assocs $ getCollatzSequenceLengthUpto 1000000

Context

StackExchange Code Review Q#27059, answer score: 3

Revisions (0)

No revisions yet.