patternrubyMinor
Patching Strings with Ed Scripts
Viewed 0 times
withscriptsstringspatching
Problem
I'm using Ed scripts to track changes in a web app I'm developing. Ed scripts are probably the easiest kind of patch file for a computer to parse: detailed explanation of Ed scripts.
I quickly put together the Ruby code below with extremely little thought. I'd like to know what kind of high-level patterns I could use for a problem like this. I'm aware of state machines in an electrical engineering sense, but I'm not aware of the generally-accepted way to apply them in Ruby.
```
def self.apply_patch(old_text, patch)
new_text_array = old_text.split("\n")
patch_array = patch.split("\n")
i=0
current_line = 0
while i < patch_array.length
#grab the command
patch_array[i].match(/(\d),?(\d)(\w)/)
first_line = $1.to_i
last_line = $2.to_i
command = $3
if first_line == 0
first_line = current_line
end
if last_line == 0
last_line = first_line
end
case command
when "a"
current_line = first_line
i+=1
while patch_array[i] != "."
if patch_array[i] == ".."
if patch_array[i+2] == "s/.//"
new_text_array.insert(current_line, ".")
else
new_text_array.insert(current_line, "..")
end
else
new_text_array.insert(current_line, patch_array[i])
end
current_line+=1
i+=1
end
when "d"
length = last_line - first_line
for ii in 0..length
new_text_array.delete_at(first_line-1)
end
when "c"
length = last_line - first_line
for ii in 0..length
new_text_array.delete_at(first_line-1)
end
current_line = first_line-1
i+=1
while patch_array[i] != "."
if patch_array[i] == ".."
if patch_array[i+2] == "s/.//"
new_text_array.insert(current_line, ".")
else
new_text_array.insert(current_line, "..")
end
else
new_text_array.insert(current_line, patch_array[i])
I quickly put together the Ruby code below with extremely little thought. I'd like to know what kind of high-level patterns I could use for a problem like this. I'm aware of state machines in an electrical engineering sense, but I'm not aware of the generally-accepted way to apply them in Ruby.
```
def self.apply_patch(old_text, patch)
new_text_array = old_text.split("\n")
patch_array = patch.split("\n")
i=0
current_line = 0
while i < patch_array.length
#grab the command
patch_array[i].match(/(\d),?(\d)(\w)/)
first_line = $1.to_i
last_line = $2.to_i
command = $3
if first_line == 0
first_line = current_line
end
if last_line == 0
last_line = first_line
end
case command
when "a"
current_line = first_line
i+=1
while patch_array[i] != "."
if patch_array[i] == ".."
if patch_array[i+2] == "s/.//"
new_text_array.insert(current_line, ".")
else
new_text_array.insert(current_line, "..")
end
else
new_text_array.insert(current_line, patch_array[i])
end
current_line+=1
i+=1
end
when "d"
length = last_line - first_line
for ii in 0..length
new_text_array.delete_at(first_line-1)
end
when "c"
length = last_line - first_line
for ii in 0..length
new_text_array.delete_at(first_line-1)
end
current_line = first_line-1
i+=1
while patch_array[i] != "."
if patch_array[i] == ".."
if patch_array[i+2] == "s/.//"
new_text_array.insert(current_line, ".")
else
new_text_array.insert(current_line, "..")
end
else
new_text_array.insert(current_line, patch_array[i])
Solution
Strategy
The code would be more elegant if you made the
If your
To make that work, you would have to get rid of the
Command parsing
This feels like a lot of lines of code to parse the command line, especially to handle the fallback line numbers:
You could simplify
For stricter validation, the regexp should be anchored by
If the command is unsupported, you should raise some kind of exception rather than raising a
Text manipulation
The name
Instead of deleting a line at a time using
There is code duplication between the
Suggested solution
This is about half the original line count.
The code would be more elegant if you made the
apply_patch function more like a language interpreter. Though it may be unavoidable that you have to treat the text as an array, you can treat the patch as a stream of instructions. That would relieve you of the annoyance of having to increment i.If your
patch is already in string form, then you can turn it back into a stream using StringIO. Chances are, you would be acquiring the patch from an input stream (such as STDIN), so you would use that stream directly instead.To make that work, you would have to get rid of the
if patch_array[i+2] == "s/.//" special case, which involves looking ahead by two lines. In any case, you would be better off implementing support for s/.// as the string substitution command that it is meant to represent.Command parsing
This feels like a lot of lines of code to parse the command line, especially to handle the fallback line numbers:
#grab the command
patch_array[i].match(/(\d*),?(\d*)(\w)/)
first_line = $1.to_i
last_line = $2.to_i
command = $3
if first_line == 0
first_line = current_line
end
if last_line == 0
last_line = first_line
endYou could simplify
first_line and last_line by assigning the values correctly the first time. To do that, you would need a regexp that produces a nil capture group when the number is missing.For stricter validation, the regexp should be anchored by
\A and \Z. I suggest using a using the regexp to specify exactly what commands are supported, for validation.If the command is unsupported, you should raise some kind of exception rather than raising a
String object. The exception message should include the bad command, to aid debugging.Text manipulation
The name
new_text_array is a bit cumbersome. How about just text? You don't need to assign the result to a new_text variable at the end of the function.Instead of deleting a line at a time using
delete_at, you can knock out an entire slice at once, using a Range for array indexing.There is code duplication between the
a command and the c command. The key insight is that a c (change) is just a combination of d (deletion) and a (append).Suggested solution
This is about half the original line count.
require 'stringio'
def self.apply_patch(old_text, patch)
text = old_text.split("\n")
patch = StringIO.new(patch)
current_line = 1
while patch_line = patch.gets
# Grab the command
m = %r{\A(?:(\d+))?(?:,(\d+))?([acd]|s/\.//)\Z}.match(patch_line)
raise ArgumentError.new("Invalid ed command: #{patch_line.chomp}") if m.nil?
first_line = (m[1] || current_line).to_i
last_line = (m[2] || first_line).to_i
command = m[3]
case command
when "s/.//"
(first_line..last_line).each { |i| text[i - 1].sub!(/./, '') }
else
if ['d', 'c'].include?(command)
text[first_line - 1 .. last_line - 1] = []
end
if ['a', 'c'].include?(command)
current_line = first_line - 1
while (patch_line = patch.gets) && patch_line.chomp! != '.'
text.insert(current_line, patch_line)
current_line += 1
end
end
end
end
text.join("\n")
endCode Snippets
#grab the command
patch_array[i].match(/(\d*),?(\d*)(\w)/)
first_line = $1.to_i
last_line = $2.to_i
command = $3
if first_line == 0
first_line = current_line
end
if last_line == 0
last_line = first_line
endrequire 'stringio'
def self.apply_patch(old_text, patch)
text = old_text.split("\n")
patch = StringIO.new(patch)
current_line = 1
while patch_line = patch.gets
# Grab the command
m = %r{\A(?:(\d+))?(?:,(\d+))?([acd]|s/\.//)\Z}.match(patch_line)
raise ArgumentError.new("Invalid ed command: #{patch_line.chomp}") if m.nil?
first_line = (m[1] || current_line).to_i
last_line = (m[2] || first_line).to_i
command = m[3]
case command
when "s/.//"
(first_line..last_line).each { |i| text[i - 1].sub!(/./, '') }
else
if ['d', 'c'].include?(command)
text[first_line - 1 .. last_line - 1] = []
end
if ['a', 'c'].include?(command)
current_line = first_line - 1
while (patch_line = patch.gets) && patch_line.chomp! != '.'
text.insert(current_line, patch_line)
current_line += 1
end
end
end
end
text.join("\n")
endContext
StackExchange Code Review Q#117043, answer score: 2
Revisions (0)
No revisions yet.