Removing list of words from a text file in Ruby
Problem
I have two files.
- File 1 has a list of all the dictionary words.
- File 2 has a list of all prepositions.
I want to remove all the prepositions from the dictionary.
I want to reduce the number of lines in my code and also make it more elegant, idiomatic and readable.

#!/usr/bin/env ruby
path = "/Users/../Desktop/";
file_original_wordlist = File.open("#{path}" + "dictionary.txt", "r")
file_remove_wordlist = File.open("#{path}" + "prepositions.txt", "r")
# Need to initialize the variables else I get errors
delete_word = false
word_orig = ''
word_rem = ''
count = 0
file_original_wordlist.each_line do |line1|
  file_remove_wordlist.each_line do |line2|
    word_orig = line1
    word_rem = line2
    if word_orig.eql?(word_rem)
      puts "Deleting the word " + word_rem
      delete_word = true
      count += 1 # Ruby has no ++ operator
    end
  end
  if delete_word == false
    File.open(path + "scrubbed_list.txt", "a") {|f| f.write(word_orig) }
  end
  # Need to reopen the file, otherwise the next iteration does not start from the beginning
  file_remove_wordlist = File.open("#{path}" + "prepositions.txt", "r")
  delete_word = false
end
# count is an Integer, so convert it before concatenating
puts "Deleted " + count.to_s + " words in total"

Solution
Some notes:
- I guess you come from imperative languages. Try to write in a more functional style (more expressions, fewer statements).
- Use libraries (File) to manipulate paths.
- This double each_line is bad news for performance: O(n*m). Avoid it by building a data structure that has O(1) checks for inclusion. I'd create a set of the prepositions (it's the smaller set). The overall performance is now O(n).

I'd write:

# Read each file into an array of lines (trailing newlines are kept)
prepositions = File.readlines(File.join(path, "prepositions.txt"))
words = File.readlines(File.join(path, "dictionary.txt"))
filtered_words = words - prepositions
File.write("dictionary_without_prepositions.txt", filtered_words.join)

If the input file dictionary.txt is very, very large, this is a lazier approach:

require 'set'
# File.foreach reads one line at a time, so the dictionary is never held in memory
prepositions = File.foreach(File.join(path, "prepositions.txt")).to_set
File.open("dictionary_without_prepositions.txt", "w") do |output|
  File.foreach(File.join(path, "dictionary.txt")) do |line|
    unless prepositions.include?(line)
      output.write(line)
    end
  end
end

Context
StackExchange Code Review Q#42539, answer score: 4
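
The answer's point about O(1) inclusion checks can be sketched as a small, self-contained example. The helper name reject_listed_words and the sample word lists are hypothetical and not part of the original answer. The normalization with strip and downcase is also an assumption: the answer compares raw lines, so a final line without a trailing newline, or a capitalized word, would not match there.

```ruby
require 'set'

# Hypothetical helper (not from the original answer): returns the lines
# whose normalized form does not appear in stop_words.
def reject_listed_words(lines, stop_words)
  # Assumption: strip whitespace and ignore case so "Under\n" matches "under".
  blocked = stop_words.map { |word| word.strip.downcase }.to_set
  # Set#include? is a hash lookup, so each line is checked in O(1) on average.
  lines.reject { |line| blocked.include?(line.strip.downcase) }
end

# Made-up sample data standing in for the two files.
dictionary   = ["apple\n", "of\n", "banana\n", "Under\n", "in"]
prepositions = ["of\n", "in\n", "under\n"]

kept = reject_listed_words(dictionary, prepositions)
puts "Kept #{kept.size} of #{dictionary.size} words"
```

Because the helper only needs something enumerable, File.foreach(File.join(path, "dictionary.txt")) could probably be passed as lines as well. Note that reject builds the whole result array in memory, so for a very large dictionary the streaming loop from the answer remains the better fit.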