String-splitting function
function string splitting
Problem
This function was hard to write as a Clojure newbie, and I don't like the result.
Can you help me find a better (more readable) way to do it?
(defn split-seq
  "Splits a seq into blocks defined by start-fn and stop-fn.
  Returns a lazy seq of seqs"
  [start-fn stop-fn lines]
  (let [step (fn [c state]
               (when-let [s (seq c)]
                 (if (stop-fn (first s))
                   (cons state (split-seq start-fn stop-fn (rest s)))
                   (recur (rest s)
                          (if (start-fn (first s))
                            '()
                            (cons (first s) state))))))]
    (lazy-seq (step lines '()))))

(defn post-start? [l] (.startsWith l "#ENTRY_START"))
(defn post-end? [l] (.startsWith l "#ENTRY_END"))

(defn split-lines
  "Split a line-seq into entries based on #ENTRY_START and #ENTRY_END."
  [data]
  (split-seq post-start? post-end? data))
;
; Test
;
(def test-data
  ["Header line"
   "#ENTRY_START"
   "entry line 1 "
   "entry line 2"
   "#ENTRY_END"
   "This line should be filtered out"
   "#ENTRY_START Having data here shouldn't make a difference."
   "entry line 1 "
   "entry line 2"
   "#ENTRY_END"
   "This should be gone too"])
(split-lines test-data)
; yields (("entry line 2" "entry line 1 ") ("entry line 2" "entry line 1 "))
; The order of elements doesn't matter in my case because I'm making a map with this data

With Arthur's help, the final code looks like this:

(defn entry-seq [data]
  (let [[f r] (split-with #(not (post-end? %))
                          (rest (drop-while #(not (post-start? %)) data)))]
    (when (not-empty f) (lazy-seq (cons f (entry-seq r))))))

Still lazy and a lot more readable!
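The laziness claim is easy to check at the REPL by feeding the function an endless input. The sketch below repeats the definitions from the question so that it runs on its own. `endless-lines` is a made-up name for illustration, and the `^String` hints are only added to avoid reflection.

```clojure
;; Predicates as in the question, with type hints added to avoid reflection.
(defn post-start? [^String l] (.startsWith l "#ENTRY_START"))
(defn post-end? [^String l] (.startsWith l "#ENTRY_END"))

;; The split-with based version from the question.
(defn entry-seq [data]
  (let [[f r] (split-with #(not (post-end? %))
                          (rest (drop-while #(not (post-start? %)) data)))]
    (when (not-empty f) (lazy-seq (cons f (entry-seq r))))))

;; Hypothetical endless input: the same five lines repeated forever.
(def endless-lines
  (cycle ["noise" "#ENTRY_START" "a" "b" "#ENTRY_END"]))

;; Terminates even though the input never ends.
(take 2 (entry-seq endless-lines))
;; => (("a" "b") ("a" "b"))
```

Since `split-with` and `drop-while` both return lazy seqs, roughly only as much input is consumed as the requested entries need.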
Solution
So the idea in this new approach is to write two expressions:
- one that extracts the next expression:
    (take-while #(not= "#ENTRY_END" %)
                (rest (drop-while #(not= "#ENTRY_START" %) data)))
- one that extracts everything after the next expression:
    (rest (drop-while #(not= "#ENTRY_END" %)
                      (rest (drop-while #(not= "#ENTRY_START" %) data))))

and then wrap them up into a lazy sequence.

First, let's expand the test data to include some additional edge cases:
user> (def test-data [
        "Header line"
        "#ENTRY_START"
        "entry line 1 "
        "entry line 2"
        "#ENTRY_END"
        "not part of an entry"
        "also not part of an entry"
        "#ENTRY_START"
        "entry line 1 "
        "entry line 2"
        "#ENTRY_END"
        "footer1"
        "footer2"])

Then wrap our two expressions into a function:
user> (defn entry-seq [data]
        (let [f (take-while #(not= "#ENTRY_END" %)
                            (rest (drop-while #(not= "#ENTRY_START" %) data)))
              r (rest (drop-while #(not= "#ENTRY_END" %)
                                  (rest (drop-while #(not= "#ENTRY_START" %) data))))]
          (when (not-empty f) (lazy-seq (cons f (entry-seq r))))))
#'user/entry-seq

And we test it:
user> (take 4 (entry-seq test-data))
(("entry line 1 " "entry line 2") ("entry line 1 " "entry line 2"))
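One difference from the question is worth noting: the expressions above compare whole lines with `not=`, so a marker line with trailing text, such as `#ENTRY_START Having data here shouldn't make a difference.` from the question's test data, is not treated as a start marker. On that data the function above would return only the first entry. Below is a minimal sketch of the same approach taking predicates instead, assuming prefix matching is what is wanted. `entry-seq-by` and `question-data` are hypothetical names, not part of the answer.

```clojure
;; Prefix-matching predicates as in the question (type hints avoid reflection).
(defn post-start? [^String l] (.startsWith l "#ENTRY_START"))
(defn post-end? [^String l] (.startsWith l "#ENTRY_END"))

(defn entry-seq-by
  "Hypothetical variant of entry-seq: the same two expressions, but the
  markers are recognised by the predicates start? and end?."
  [start? end? data]
  (lazy-seq
    (let [body  (rest (drop-while (complement start?) data)) ; lines after the start marker
          entry (take-while (complement end?) body)          ; the next entry
          more  (rest (drop-while (complement end?) body))]  ; everything after its end marker
      ;; As in the version above, an empty entry ends the sequence.
      (when (seq entry)
        (cons entry (entry-seq-by start? end? more))))))

;; The test data from the question, including the start marker with trailing text.
(def question-data
  ["Header line"
   "#ENTRY_START"
   "entry line 1 "
   "entry line 2"
   "#ENTRY_END"
   "This line should be filtered out"
   "#ENTRY_START Having data here shouldn't make a difference."
   "entry line 1 "
   "entry line 2"
   "#ENTRY_END"
   "This should be gone too"])

(entry-seq-by post-start? post-end? question-data)
;; => (("entry line 1 " "entry line 2") ("entry line 1 " "entry line 2"))
```

Computing `body` once also avoids repeating the inner `drop-while` in both expressions.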
Context
StackExchange Code Review Q#18071, answer score: 4