principleModerate
Approach to string split by character in Haskell
Viewed 0 times
approachcharactersplithaskellstring
Problem
I'm trying to learn Haskell and the mindset of programming functionally. I have started off by trying to understand the basics by writing code without any
So far, I think I'm getting used to the type system and how the basics works.
I thought to myself, why not try to get some feedback on what I have written to learn even more. So hopefully I will get some valuable information!
I have written a function which iterates a
Here's what I have got so far:
With output:
So the main idea is to only show groups which have values in them. I bet there is:
I welcome all types of feedback except for feedback like "there is a library function which does all that for you". I'd like feedback on the
Monads in it. So far, I think I'm getting used to the type system and how the basics works.
I thought to myself, why not try to get some feedback on what I have written to learn even more. So hopefully I will get some valuable information!
I have written a function which iterates a
String and splits it into [String] by using given Char as a delimiter. Similar to what PHP's explode and Python's split does. Here's what I have got so far:
main :: IO()
main = do
print $ groupBy "/hejsan/asdas" '/' []
print $ groupBy "/hejsan/asdas/hoho" '/' []
print $ groupBy "/hejsan/asdas/hahsdhashdas" '/' []
print $ groupBy "/hejsan/99/hahsdhashdas" '/' []
print $ groupBy "/123/asdas/hahsdhashdas///" '/' []
print $ groupBy "/123/asdas/hahsdhashdas//20////hej" '/' []
groupBy :: String -> Char -> [String] -> [String]
groupBy "" _ [] = []
groupBy (x:xs) c []
| x == c = groupBy xs c []
| otherwise = groupBy xs c [[x]]
groupBy "" _ (r:rs) = reverse result
where result = if r == "" then rs else [r] ++ rs
groupBy (x:xs) c (r:rs)
| x == c = groupBy xs c $ nextGroup ++ rs
| otherwise = groupBy xs c $ [r ++ [x]] ++ rs
where nextGroup = if r == "" then [r] else [""] ++ [r]With output:
["hejsan","asdas"]
["hejsan","asdas","hoho"]
["hejsan","asdas","hahsdhashdas"]
["hejsan","99","hahsdhashdas"]
["123","asdas","hahsdhashdas"]
["123","asdas","hahsdhashdas","20","hej"]So the main idea is to only show groups which have values in them. I bet there is:
- Some library function which can simplify the functions internals
- Some way to eliminate the 3rd argument so that it will be transparent to the user (Currying?)
I welcome all types of feedback except for feedback like "there is a library function which does all that for you". I'd like feedback on the
Solution
Well, sorry, but it is really tempting to note that there is in fact a library routine that can do that, and it is even called the same as your function:
This code with
But let's look at your implementation - two things jump out:
-
You are clearly using the first element in the
-
Why do you even need an accumulation parameter for your groups? Once you append something to
With the two changes and a bit of refactoring (replacing your
This can be improved further:
-
This has clearly two modes of operation depending on
-
Building a list using
That's really all the difference to the "perfect" library-level implementation (see the library code). Note that generally, initialization of accumulation parameters is simply done by defining a wrapper function, even though we can easily put this function without even needing it.
Hope this helps.
groupBy (\a b -> b /= '/') "/hejsan/asdas"This code with
groupBy from Data.List will give you ["/hejsan","/asdas"]. Even though it should probably be noted that this is taking advantage of how groupBy is implemented internally, which might not be the best idea.But let's look at your implementation - two things jump out:
-
You are clearly using the first element in the
rs list differently from the rest. Why not make it an additional parameter so you can skip de- and recomposing the list all the time? -
Why do you even need an accumulation parameter for your groups? Once you append something to
rs, you already know that it will be reversed in the end and end up at the start of the return value. So you can simply prepend the group to the result of the recursive function call.With the two changes and a bit of refactoring (replacing your
if by pattern matches), we get the following version:groupBy :: String -> Char -> String -> [String]
groupBy "" _ "" = []
groupBy "" _ r = [r]
groupBy (x:xs) c ""
| x == c = groupBy xs c ""
| otherwise = groupBy xs c [x]
groupBy (x:xs) c r
| x == c = r : groupBy xs c ""
| otherwise = groupBy xs c (r ++ [x])This can be improved further:
-
This has clearly two modes of operation depending on
r == "". When calling you always know whether that is the case, so you could easily split it into two functions. One of them can get rid of the r parameter completely, so you end up with a nice function to call from outside.-
Building a list using
++ is inefficient, as it creates a copy of the list each time you append an element, leading O(n²) complexity. It's better to use x:r and then reverse once in the end, which is a more efficient O(n). A bit more involved refactoring enables you to construct the string in the right form right away (see span).That's really all the difference to the "perfect" library-level implementation (see the library code). Note that generally, initialization of accumulation parameters is simply done by defining a wrapper function, even though we can easily put this function without even needing it.
Hope this helps.
Code Snippets
groupBy (\a b -> b /= '/') "/hejsan/asdas"groupBy :: String -> Char -> String -> [String]
groupBy "" _ "" = []
groupBy "" _ r = [r]
groupBy (x:xs) c ""
| x == c = groupBy xs c ""
| otherwise = groupBy xs c [x]
groupBy (x:xs) c r
| x == c = r : groupBy xs c ""
| otherwise = groupBy xs c (r ++ [x])Context
StackExchange Code Review Q#6992, answer score: 10
Revisions (0)
No revisions yet.