HiveBrain v1.2.0
Get Started
← Back to all entries
patternMinor

Ordinary Data Processing Task in Haskell: Vague Misgivings

Submitted by: @import:stackexchange-codereview··
0
Viewed 0 times
vaguemisgivingsordinaryhaskellprocessingdatatask

Problem

This is an ordinary data processing task: read a list of dates and amounts (for example, deposits and withdrawals from a bank account) and report the date on which the lowest balance was recorded and what that balance was. While I do get the correct answer--always a virtue--with the code below, I can't help thinking that it could be done better (e.g. more concisely). I've included the preliminary input parsing code so that the whole thing can be compiled and run as is.

module Main where

import Text.ParserCombinators.Parsec
import Data.Time
import Numeric
import Control.Applicative ((), (), (*>))
import Control.Monad.State as CM

--The purpose is to sum up a list of dated values (example: transactions in a bank
--account) and produce the date and amount of the lowest recorded balance.
--For the three lines below, the output would be: 
--     DtdBal {runningBal = 4859.96, lowDt = 2013-01-23, lowBal = 4859.96} 

--12/1/12,  6844.79
--1/1/13,   992.41
--1/23/13,  -2977.24

data DtdVal = DtdVal { dt :: Day
                     , amt :: Double 
                     } deriving Show

data DtdBal = DtdBal { runningBal :: Double
                     , lowDt :: Day
                     , lowBal :: Double
                     } deriving Show

main = do res  putStrLn . show $ process l
            Left _ -> error "Oops"
    where process (x:xs) = execState (minBal xs) (initState x) 
          initState x = DtdBal { runningBal = amt x, lowDt = dt x, lowBal = amt x }

minBal :: [DtdVal] -> CM.State DtdBal ()
minBal [] = return ()
minBal (x:xs) = do
  rBal  get
  lBal  get
  lDt   get
  if amt x + rBal  dts  (char ',' >> spaces *> readSgnd)

dts :: CharParser () Day
dts = toDay  many digit `sepBy` char '/'
    where toDay [x, y, z] = fromGregorian (2000 + read z) (read x) (read y)

readSgnd :: CharParser () Double
readSgnd = do
  optional (char '+')
  inp <- getInput
  let [(num, inp')] = readSigned readFloat inp
  setInput inp'
  return num

Solution


  • recursive part of minBal is mapM_



  • multiple get is better to refactor into one



  • get - process - put part of minBal is modify



  • state monad is not necessary - you can use plain foldr instead of mapM_ and modify



  • runningBal is better to implement using scanr. Then zip running balances and source list and foldr the resulting list of tuples



  • use minimum to find the minimum



Here is a more idiomatic approach to minBal:

runningBal' = scanl1 (+) . map amt

rb l = zip (runningBal' l) l

mb l = minimumBy (compare `on` fst) l

minDay = mb . rb

minBal' l = DtdBal { 
    runningBal = fst $ last runningBals, 
    lowDt = dt $ snd md, 
    lowBal = fst md } 
    where
        runningBals = rb l
        md = minDay l


You can inline all the small functions for the final code but that's the order I created them in. Note that rb is calculated twice, but I was not sure if you really need the final balance. In your algorithm it was a byproduct.

main can be rewritten too:

main' = (parseFromFile sched >=> process >=> print) "escrow.txt" where
    process (Right l) = return $ minBal' l
    process (Left _) = error "Oops"


Basically, putStrLn . show is print and composition is your friend. >=> is monadic variant of .. It's advantage over >>= is associativity - it lets you refactor your code more easily.

Here is a shorter version of minBal' which also saves a few calculations, but last is still O(N):

minBal' l = DtdBal { 
    runningBal = last runningBals, 
    lowDt = dt $ snd md, 
    lowBal = fst md } 
    where
        md = minimumBy (compare `on` fst) $ zip runningBals l
        runningBals = scanl1 (+) $ map amt l


Let me know how this works for very large files.

Code Snippets

runningBal' = scanl1 (+) . map amt

rb l = zip (runningBal' l) l

mb l = minimumBy (compare `on` fst) l

minDay = mb . rb

minBal' l = DtdBal { 
    runningBal = fst $ last runningBals, 
    lowDt = dt $ snd md, 
    lowBal = fst md } 
    where
        runningBals = rb l
        md = minDay l
main' = (parseFromFile sched >=> process >=> print) "escrow.txt" where
    process (Right l) = return $ minBal' l
    process (Left _) = error "Oops"
minBal' l = DtdBal { 
    runningBal = last runningBals, 
    lowDt = dt $ snd md, 
    lowBal = fst md } 
    where
        md = minimumBy (compare `on` fst) $ zip runningBals l
        runningBals = scanl1 (+) $ map amt l

Context

StackExchange Code Review Q#16212, answer score: 2

Revisions (0)

No revisions yet.