patternMinor
Find the common characters between two strings without using set operations
Viewed 0 times
withouttheoperationstwobetweenusingfindcharactersstringscommon
Problem
Below are two implementations of the problem "find the characters in common between two strings" - in Clojure without using collections / set operations.
Are these approaches sufficiently idiomatic and could the be more concise? (aside from using collections functions).
Also, is there a Clojure library that wraps or provides an alternative to
Are these approaches sufficiently idiomatic and could the be more concise? (aside from using collections functions).
Also, is there a Clojure library that wraps or provides an alternative to
java.util.BitSet?;; Approach #1 BitSets
;; - use a bitsets for each string,
;; - flip the bit for each character in each string
;; - then "and" the two bitsets. Theoretically O(N + M) performance
(defn string-to-charbits [s]
(loop [[f & rest] s
bits (java.util.BitSet.)]
(if f (recur rest (do (.set bits (int f)) bits))
bits)))
(defn process-bitset [bitset]
(loop [bitset bitset
count 0
result '[]]
(if ( (.indexOf s2 (int f)) -1)
(recur rest s2 (conj result f))
(recur rest s2 result))
(distinct result))))
;; Might need to sort the results of each function call to make this pass
(assert (= (find-matching-chars "abcd" "cde")
(find-matching-chars-brute "abcd" "cde"))
"expected [\\c \\d]")
Solution
With your restrictions, yes, that seems fairly idiomatic. I am not aware of a good bitset implementation yet -- wrapping java.util.BitSet wouldn't work that well, since a clojure version would want to be persistent. That said, your restrictions are not idiomatic at all. Presumably, you don't want to use clojure.set/intersection (and completely trivializing the problem), but something like
really would be far more idiomatic. I could see using something like your BitSet code if
In all honesty, my probable course in a situation like this would be to build filter and linked lists from scratch if I couldn't use the built-in versions, but at this point, I am probably straying away from idiomatic code myself.
(defn find-matching-chars [s1 s2]
(distinct (filter (into #{} s1) s2)))really would be far more idiomatic. I could see using something like your BitSet code if
find-matching-chars was actually causing significant performance issues, but short of that, some form of collections manipulation really is the idiomatic way to solve this issue.In all honesty, my probable course in a situation like this would be to build filter and linked lists from scratch if I couldn't use the built-in versions, but at this point, I am probably straying away from idiomatic code myself.
Code Snippets
(defn find-matching-chars [s1 s2]
(distinct (filter (into #{} s1) s2)))Context
StackExchange Code Review Q#13521, answer score: 2
Revisions (0)
No revisions yet.