HiveBrain v1.2.0
Get Started
← Back to all entries
patternMinor

Find the common characters between two strings without using set operations

Submitted by: @import:stackexchange-codereview··
0
Viewed 0 times
withouttheoperationstwobetweenusingfindcharactersstringscommon

Problem

Below are two implementations of the problem "find the characters in common between two strings" - in Clojure without using collections / set operations.

Are these approaches sufficiently idiomatic and could the be more concise? (aside from using collections functions).

Also, is there a Clojure library that wraps or provides an alternative to java.util.BitSet?

;; Approach #1 BitSets
;; - use a bitsets for each string,
;; - flip the bit for each character in each string
;; - then "and" the two bitsets. Theoretically O(N + M) performance

(defn string-to-charbits [s]
(loop [[f & rest] s
bits (java.util.BitSet.)]
(if f (recur rest (do (.set bits (int f)) bits))
bits)))

(defn process-bitset [bitset]
(loop [bitset bitset
count 0
result '[]]
(if ( (.indexOf s2 (int f)) -1)
(recur rest s2 (conj result f))
(recur rest s2 result))
(distinct result))))

;; Might need to sort the results of each function call to make this pass
(assert (= (find-matching-chars "abcd" "cde")
(find-matching-chars-brute "abcd" "cde"))
"expected [\\c \\d]")

Solution

With your restrictions, yes, that seems fairly idiomatic. I am not aware of a good bitset implementation yet -- wrapping java.util.BitSet wouldn't work that well, since a clojure version would want to be persistent. That said, your restrictions are not idiomatic at all. Presumably, you don't want to use clojure.set/intersection (and completely trivializing the problem), but something like

(defn find-matching-chars [s1 s2]
  (distinct (filter (into #{} s1) s2)))


really would be far more idiomatic. I could see using something like your BitSet code if find-matching-chars was actually causing significant performance issues, but short of that, some form of collections manipulation really is the idiomatic way to solve this issue.

In all honesty, my probable course in a situation like this would be to build filter and linked lists from scratch if I couldn't use the built-in versions, but at this point, I am probably straying away from idiomatic code myself.

Code Snippets

(defn find-matching-chars [s1 s2]
  (distinct (filter (into #{} s1) s2)))

Context

StackExchange Code Review Q#13521, answer score: 2

Revisions (0)

No revisions yet.