patternMinor
In a list of pairs, replace snds with median of all snds with the same fst
Viewed 0 times
samethesndswithallreplacelistmedianpairsfst
Problem
Given is a list of pairs. I want to replace the second element of all pairs with the median of the second elements of all pairs with the same first element.
For example:
The order of the elements in the resulting list is not important. So the following result would be equally valid:
My current solution looks as follows:
For example:
[("a",1.0),("b",3.0),("a",2.0),("a",4.0)]
->
[("a",2.0),("b",3.0),("a",2.0),("a",2.0)]The order of the elements in the resulting list is not important. So the following result would be equally valid:
[("a",2.0),("a",2.0),("a",2.0),("b",3.0)]My current solution looks as follows:
import Control.Arrow
import qualified Data.Map as M
median :: Fractional a => [a] -> a
median xs | null xs = error "empty list"
| odd len = xs !! mid
| even len = meanMedian
where len = length xs
mid = len `div` 2
meanMedian = (xs !! mid + xs !! (mid+1)) / 2
solve :: [(String, Float)] -> [(String, Float)]
solve pairs = map replaceWithMedian pairs
where
medianMap = map (mapSnd (:[])) >>> M.fromListWith (++) >>> M.toList
>>> map (mapSnd median) >>> M.fromList
mapSnd f (x, y) = (x, f y)
replaceWithMedian (x, y) = (x, M.findWithDefault 0 x (medianMap pairs))
main :: IO ()
main = print $ solve [("a",1.0),("b",3.0),("a",2.0),("a",4.0)]solve feels unnecessary long to me. I'm looking for more elegant/terse solution, not character-count-wise but with a simpler algorithm.Solution
Fixing things first
First of all,
Both should return
Also, using
Also, if you use
After all, there are (usually) no stack traces, and it's really nice to know where things went wrong. Note that you should add this behaviour in a documentation, since the type doesn't really tell what would happen on an empty list. Since
Use the tools you have at hand
with
Furthermore, be honest. Although
Furthermore, there's usually no reason to use
Instead,
First of all,
median is wrong:median [1,2,3,4,5] /= median [3,1,4,5,2]Both should return
3, but on the first list median returns 3 and on the latter 4. Make sure to sort your list, or state in the (currently missing) documentation that your list has to be sorted.Also, using
!! twice is rather slow. Instead, drop the first elements up to mid and then take the mean:median :: (Fractional a, Ord a) => [a] -> a
median [] = error "median: empty list"
median xs
| odd len = xs' !! mid
| otherwise = mean $ take 2 $ drop mid $ xs'
where
len = length xs
mid = len `div` 2
xs' = sort xs
mean :: (Fractional a) => [a] -> Maybe a
mean xs = -- exercise; it's not trivial to get this right
-- Alternatively, just use meanDuo in median:
meanDuo :: Fractional a => [a] -> a
meanDuo [x,y] = -- ...Also, if you use
error, make sure to include the function's name in the error message; best practice include the completely qualified name, i.e. error "Data.TupleMagic.median: empty list"After all, there are (usually) no stack traces, and it's really nice to know where things went wrong. Note that you should add this behaviour in a documentation, since the type doesn't really tell what would happen on an empty list. Since
Fractional a is also an instance of Num, any fixed number could be feasible. Alternatively use Maybe a as return type or NonEmpty as input.Use the tools you have at hand
mapSnd is second from Control.Arrow. Also, Data.Map is an instance of Functor, which means you can replaceM.toList >>> map (second f) >>> M.fromListwith
fmap fFurthermore, be honest. Although
medianMap is called *Map, it's actually a function. While GHC might call it only once, it's not guaranteed, so remember the map. And last, since you know that every key x will be contained in toMap pairs, you can use Data.Map.! instead of Data.Map.findWithDefault:solve :: [(String, Float)] -> [(String, Float)]
solve pairs = map replaceWithMedian pairs
where
toMap = map (second pure) >>> M.fromListWith (++) >>> fmap median
medianMap = toMap pairs
replaceWithMedian (x, _) = (x, medianMap M.! x)Furthermore, there's usually no reason to use
Float nowadays, unless you've checked - that it's faster,
- that it's reduced accuracy fulfils your needs, and
- you're having memory constraints (which usually also contradicts
Stringand lists).
Instead,
Double is used most of the time.Code Snippets
median [1,2,3,4,5] /= median [3,1,4,5,2]median :: (Fractional a, Ord a) => [a] -> a
median [] = error "median: empty list"
median xs
| odd len = xs' !! mid
| otherwise = mean $ take 2 $ drop mid $ xs'
where
len = length xs
mid = len `div` 2
xs' = sort xs
mean :: (Fractional a) => [a] -> Maybe a
mean xs = -- exercise; it's not trivial to get this right
-- Alternatively, just use meanDuo in median:
meanDuo :: Fractional a => [a] -> a
meanDuo [x,y] = -- ...error "Data.TupleMagic.median: empty list"M.toList >>> map (second f) >>> M.fromListsolve :: [(String, Float)] -> [(String, Float)]
solve pairs = map replaceWithMedian pairs
where
toMap = map (second pure) >>> M.fromListWith (++) >>> fmap median
medianMap = toMap pairs
replaceWithMedian (x, _) = (x, medianMap M.! x)Context
StackExchange Code Review Q#119549, answer score: 3
Revisions (0)
No revisions yet.