HiveBrain v1.2.0
Get Started
← Back to all entries
patternrubyMinor

Check if a string contains a set of characters

Submitted by: @import:stackexchange-codereview··
0
Viewed 0 times
containscharacterscheckstringset

Problem

I have written this piece of code to check if a string contains a set of characters, regardless of the order.

Eg: 'ruby is best' / 'bysbe' => true

Code:

def find_chars(s1,s2)
s1.chars.sort.join =~ Regexp.new(s2.chars.sort.join(".*")) ? true : false
end


I need to optimize it to reduce executing time. How? What is slowing the performance, the regex, the methods, the ternary, or all together?

EDIT:

The last one provided by @Flambino performs significantly better, and this one performs best

def find_chars(subject, characters)
  characters.chars.uniq.all?{|i| characters.count(i) <= subject.count(i)} 
end


Now I only need to investigate why :)

Solution

Using regex for this is overkill, but so is sorting both strings. I'd just do:

def find_chars(subject, characters)
  characters.chars.all? { |char| subject.include?(char) }
end


Just doing a very cursory benchmarking, but it's an order of magnitude faster, at least for small strings like your example.

Edit: I should point out that this implementation is different from the original in that it doesn't care about multiples of the same character.

The original code will return false if given, say, ruby and rr, since there's only one "r" in the string. The implementation above, however, just matches the first letter twice - or N times, if need be.

Whether this is correct for your use-case, I don't know.

Incidentally, you could consider uniqing the character set to avoid redundant matching, but it'll probably be much slower for all but the most extreme input.

Edit 2: Responding to comments. This isn't very pretty, but should get the job done:

def find_chars(subject, characters)
  letters = characters.chars
  subject.chars.each do |letter|
    i = letters.index(letter)
    next if i.nil?
    letters.delete_at(i)
    return true if letters.empty?
  end
  false
end


Unfortunately, something simple like (characters.chars - subject.chars).empty? won't work as it too deletes duplicates.

In the above, it doesn't actually matter if you loop through the subject and search the characters, or vice-versa. However, it might matter performance-wise. I'd suggest looping over the longer string, and searching the shorter one (which'll get shorter the more stuff gets matched and deleted):

def find_chars(subject, characters)
  letters, repertoire = [subject, characters].sort_by(&:length)
  repertoire.each do |letter|
    i = letters.index(letter)
    next if i.nil?
    letters.delete_at(i)
    return true if letters.empty?
  end
  false
end


Now, I don't know how this fares in terms of performance. You'll have to test it out.

You could also return to sorting the characters string, and do binary search; maybe quicker than index.

By the way, I'd consider monkey-patching this method into String. It'd be nicer to write "hello".covers?("lelo") or something. But whether stand-alone or monkey-patched, find_chars isn't a great name.

Edit 3: The above can also be written as so, though performance is likely similar:

def find_chars(subject, characters)
  letters, repertoire = [subject, characters].sort_by(&:length)
  repertoire.each_char do |letter|
    letters.sub!(letter, '')
    return true if letters.empty?
  end
  false
end


Just for the heck of it, here's another approach. But I doubt it'll be fast:

def find_chars(subject, characters)
  repertoire = subject.chars.group_by { |char| char } # use &:itself for Ruby 2.2+
  letters = characters.chars.group_by { |char| char }

  letters.all? do |char, list|
    repertoire[char] && repertoire[char].count >= list.count
  end
end


It's basically frequency analysis and comparison. Again, it might be worth to pick the shortest of the two strings as the one to loop.

Code Snippets

def find_chars(subject, characters)
  characters.chars.all? { |char| subject.include?(char) }
end
def find_chars(subject, characters)
  letters = characters.chars
  subject.chars.each do |letter|
    i = letters.index(letter)
    next if i.nil?
    letters.delete_at(i)
    return true if letters.empty?
  end
  false
end
def find_chars(subject, characters)
  letters, repertoire = [subject, characters].sort_by(&:length)
  repertoire.each do |letter|
    i = letters.index(letter)
    next if i.nil?
    letters.delete_at(i)
    return true if letters.empty?
  end
  false
end
def find_chars(subject, characters)
  letters, repertoire = [subject, characters].sort_by(&:length)
  repertoire.each_char do |letter|
    letters.sub!(letter, '')
    return true if letters.empty?
  end
  false
end
def find_chars(subject, characters)
  repertoire = subject.chars.group_by { |char| char } # use &:itself for Ruby 2.2+
  letters = characters.chars.group_by { |char| char }

  letters.all? do |char, list|
    repertoire[char] && repertoire[char].count >= list.count
  end
end

Context

StackExchange Code Review Q#138254, answer score: 2

Revisions (0)

No revisions yet.