patternMinor
String splitting function in Haskell
Viewed 0 times
functionstringsplittinghaskell
Problem
I need to write a function which selects a run of repeated characters from the start of a string, with the run comprising at most nine characters.
I can't define any new functions (except helper ones), however, I can use all of the haskell built-in ones. Is there a nicer way to write the
Examples
chomp :: String -> String
chomp str = takeWhile (== head str) str
munch :: String -> String
munch = take 9 . chomp
runs :: String -> [String]
runs string = extract string [] where
extract str xs
| length str == 0 = xs
| otherwise = extract (drop (length (munch str)) str) (xs ++ [(munch str)])I can't define any new functions (except helper ones), however, I can use all of the haskell built-in ones. Is there a nicer way to write the
runs function?Examples
runs "aaaaabbbbcc"
["aaaaa","bbbb","cc"]
runs "aaaaaaaaaaaaaaabbbbcc"
["aaaaaaaaa","aaaaaa","bbbb","cc"]
runs "dddddddddddd"
["ddddddddd","ddd"]
runs "aaabbbaaa"
["aaa","bbb","aaa"]Solution
Another variant of
Let us start with the finished product first, and then I will show you how to get there:
Use pattern matching
In your
Use bindings to keep repetition low
However, we repeat ourselves here and use
Our line got a little bit shorter. Great. Now let us have a look at your accumulator.
Avoid repeated
Your results looks like this:
However, with parentheses, we actually have
Since
At that point
Replace
Other functions
You could rewrite
Then, you could rewrite
If you do that,
However, the types of
runsLet us start with the finished product first, and then I will show you how to get there:
runs :: String -> [String]
runs "" = []
runs xs = munched : runs (drop (length munched) xs)
where munched = munch xsUse pattern matching
In your
extract function, you've checked str's length. However, that is not necessary and is very expensive. If you want to check whether a list is empty either use null or pattern matching. With pattern matching, we end up withextract [] xs = xs
extract str xs = extract (drop (length (munch str)) str) (xs ++ [(munch str)])Use bindings to keep repetition low
However, we repeat ourselves here and use
munch str twice. Let's get rid of that:extract [] xs = xs
extract str xs = extract (drop (length munched) str) (xs ++ [munched])
where
munched = munch strOur line got a little bit shorter. Great. Now let us have a look at your accumulator.
Avoid repeated
++Your results looks like this:
munch str ++ munch str' ++ munch str'' ++ munch str''' …However, with parentheses, we actually have
(…(((munch str ++ munch str') ++ munch str'') ++ munch str''') …)Since
++ is linear in its first argument, that is going to be slow. However, we don't need to use ++ at all, since we're constructing a list element-wise from the first to the last. So instead of an accumulator, let us have extract return the list as soon as possible:extract [] = []
extract str = munched : extract (drop (length munched) str)
where munched = munch strAt that point
extract has exactly the same type as runs, so we can get rid of it. We end up withruns :: String -> [String]
runs [] = []
runs str = munched : runs (drop (length munched) str)
where munched = munch strReplace
str with xs and the first [] with "", and you have my first variant. We're done.Other functions
You could rewrite
chomp in a way that returns already groups of characters, e.g.chomp "aaaaabbbbbccccc" = ["aaaaa","bbbbb","ccccc"]Then, you could rewrite
munch so that it splits a string after 9 characters:munch "aaaaaaaaaa" = ["aaaaaaaa","aa"]If you do that,
runs gets a lot simpler:runs = concatMap munch . chompHowever, the types of
chomp and munch would differ in that case. By the way, this variant of chomp is in the standard library, and munch is also easy to implement.Code Snippets
runs :: String -> [String]
runs "" = []
runs xs = munched : runs (drop (length munched) xs)
where munched = munch xsextract [] xs = xs
extract str xs = extract (drop (length (munch str)) str) (xs ++ [(munch str)])extract [] xs = xs
extract str xs = extract (drop (length munched) str) (xs ++ [munched])
where
munched = munch strmunch str ++ munch str' ++ munch str'' ++ munch str''' …(…(((munch str ++ munch str') ++ munch str'') ++ munch str''') …)Context
StackExchange Code Review Q#158183, answer score: 2
Revisions (0)
No revisions yet.