HiveBrain v1.2.0
Get Started
← Back to all entries
patternMinor

In a list of pairs, replace snds with median of all snds with the same fst

Submitted by: @import:stackexchange-codereview··
0
Viewed 0 times
samethesndswithallreplacelistmedianpairsfst

Problem

Given is a list of pairs. I want to replace the second element of all pairs with the median of the second elements of all pairs with the same first element.

For example:

[("a",1.0),("b",3.0),("a",2.0),("a",4.0)]
->
[("a",2.0),("b",3.0),("a",2.0),("a",2.0)]


The order of the elements in the resulting list is not important. So the following result would be equally valid:

[("a",2.0),("a",2.0),("a",2.0),("b",3.0)]


My current solution looks as follows:

import Control.Arrow
import qualified Data.Map as M

median :: Fractional a => [a] -> a
median xs | null xs  = error "empty list"
          | odd  len = xs !! mid
          | even len = meanMedian
                where  len = length xs
                       mid = len `div` 2
                       meanMedian = (xs !! mid + xs !! (mid+1)) / 2

solve :: [(String, Float)] -> [(String, Float)]
solve pairs = map replaceWithMedian pairs
    where
        medianMap = map (mapSnd (:[])) >>> M.fromListWith (++) >>> M.toList
                        >>> map (mapSnd median) >>> M.fromList
        mapSnd f (x, y) = (x, f y)
        replaceWithMedian (x, y) = (x, M.findWithDefault 0 x (medianMap pairs))

main :: IO ()
main = print $ solve [("a",1.0),("b",3.0),("a",2.0),("a",4.0)]


solve feels unnecessary long to me. I'm looking for more elegant/terse solution, not character-count-wise but with a simpler algorithm.

Solution

Fixing things first

First of all, median is wrong:

median [1,2,3,4,5] /= median [3,1,4,5,2]


Both should return 3, but on the first list median returns 3 and on the latter 4. Make sure to sort your list, or state in the (currently missing) documentation that your list has to be sorted.
Also, using !! twice is rather slow. Instead, drop the first elements up to mid and then take the mean:

median :: (Fractional a, Ord a) => [a] -> a
median [] = error "median: empty list"
median xs
 | odd len  = xs' !! mid
 | otherwise = mean $ take 2 $ drop mid $ xs'
 where 
     len = length xs
     mid = len `div` 2
     xs' = sort xs

mean :: (Fractional a) => [a] -> Maybe a
mean xs = -- exercise; it's not trivial to get this right

-- Alternatively, just use meanDuo in median:
meanDuo :: Fractional a => [a] -> a
meanDuo [x,y] = -- ...


Also, if you use error, make sure to include the function's name in the error message; best practice include the completely qualified name, i.e.

error "Data.TupleMagic.median: empty list"


After all, there are (usually) no stack traces, and it's really nice to know where things went wrong. Note that you should add this behaviour in a documentation, since the type doesn't really tell what would happen on an empty list. Since Fractional a is also an instance of Num, any fixed number could be feasible. Alternatively use Maybe a as return type or NonEmpty as input.

Use the tools you have at hand

mapSnd is second from Control.Arrow. Also, Data.Map is an instance of Functor, which means you can replace

M.toList >>> map (second f) >>> M.fromList


with

fmap f


Furthermore, be honest. Although medianMap is called *Map, it's actually a function. While GHC might call it only once, it's not guaranteed, so remember the map. And last, since you know that every key x will be contained in toMap pairs, you can use Data.Map.! instead of Data.Map.findWithDefault:

solve :: [(String, Float)] -> [(String, Float)]
solve pairs = map replaceWithMedian pairs
  where
    toMap     = map (second pure) >>> M.fromListWith (++) >>> fmap median
    medianMap = toMap pairs
    replaceWithMedian (x, _) = (x, medianMap M.! x)


Furthermore, there's usually no reason to use Float nowadays, unless you've checked

  • that it's faster,



  • that it's reduced accuracy fulfils your needs, and



  • you're having memory constraints (which usually also contradicts String and lists).



Instead, Double is used most of the time.

Code Snippets

median [1,2,3,4,5] /= median [3,1,4,5,2]
median :: (Fractional a, Ord a) => [a] -> a
median [] = error "median: empty list"
median xs
 | odd len  = xs' !! mid
 | otherwise = mean $ take 2 $ drop mid $ xs'
 where 
     len = length xs
     mid = len `div` 2
     xs' = sort xs

mean :: (Fractional a) => [a] -> Maybe a
mean xs = -- exercise; it's not trivial to get this right

-- Alternatively, just use meanDuo in median:
meanDuo :: Fractional a => [a] -> a
meanDuo [x,y] = -- ...
error "Data.TupleMagic.median: empty list"
M.toList >>> map (second f) >>> M.fromList
solve :: [(String, Float)] -> [(String, Float)]
solve pairs = map replaceWithMedian pairs
  where
    toMap     = map (second pure) >>> M.fromListWith (++) >>> fmap median
    medianMap = toMap pairs
    replaceWithMedian (x, _) = (x, medianMap M.! x)

Context

StackExchange Code Review Q#119549, answer score: 3

Revisions (0)

No revisions yet.