How to create a 'tee' command in Ruby for a Unix-like pipeline?
Problem
I came across the question about creating a memory-efficient Ruby pipe class with lazy evaluation. Some code is given there to create a pipeline of commands using lazy enumerators. I have been messing around with it and have implemented a tee command, shown below. I feel like there is a better way to do it, though.
class Pipe
  class Tee
    def initialize(pipe)
      @pipe = pipe
      @buffer = []
      @pipe.source = self.to_enum
    end
    def ...
... StringIO.new("testing\na new\nline"))
# pipe.add(:grep, pattern: /line/)
pipe.add(:tee, :other => another_pipe)
pipe.add(:upcase)
pipe.add(:out)
pipe.run
puts "================================================="
another_pipe.run
Here's the output:
TESTING
A NEW
LINE
=================================================
testing
Does anyone see a smarter way of doing this?
Solution
I know it's an old question, but it's an interesting one.
First, a couple of small things:
- In Pipe#run:
  enum = method(command).call(enum, options)
  Calling method builds a Method object every time, which is needless overhead here; send(command, enum, options) does the same thing directly.
- Also in Pipe#run:
  enum.each {}
  The purpose of this is to cause the final object to be enumerated. We can do the same thing cheaper (even an empty block takes CPU cycles) with enum.to_a, which Enumerator::Lazy also aliases as force. Bear in mind that to_a collects every result into an array, so each {} may still be preferable for very large inputs.
Those things aside, the first thing I noticed is that there's a lot of repetition.
grep and upcase, for example, differ on only one line. We can extract the shared code into another method, which I'll call enum_lazy:

def upcase(enum, options)
  enum_lazy(enum) do |line, yielder|
    yielder << line.upcase
  end
end

def grep(enum, options)
  enum_lazy(enum) do |line, yielder|
    yielder << line if line.match(options[:pattern])
  end
end

private

def enum_lazy(enum, &block)
  Enumerator.new do |yielder|
    enum.each do |line|
      yield(line, yielder)
    end
  end.lazy
end

...but this made me realize that some of these methods don't need the yielder pattern at all. For example, upcase takes one element at a time, performs an operation on it, and yields one value each time; it's just a lazy map, and this is equivalent:

def upcase(enum, options)
  enum.lazy.map(&:upcase)
end

Since we called lazy, this will return an Enumerator::Lazy as expected. We can simplify cut and out in the same way:

def out(enum, options)
  enum.lazy.map do |line|
    puts line
    line
  end
end

def cut(enum, options)
  enum.lazy.map do |line|
    fields = line.chomp.split(options[:delimiter])
    fields[ options[:field] ]
  end
end

grep is different, because we don't want output for every input line. As it turns out, though, grep is just a lazy select:

def grep(enum, options)
  enum.lazy.select {|line| line =~ options[:pattern] }
end

That just leaves cat. I have a strong inkling that there's a similar optimization we could do for cat, but I haven't yet been able to figure it out. Perhaps someone else can help.

Here's the final code:
class Pipe
  class Tee
    # (no changes here)
  end

  attr_writer :source

  def initialize
    @commands = []
    @source = nil
  end

  def add(command, opts = {})
    @commands << [command, opts]
    self
  end

  def run
    enum = @source
    @commands.each do |command, options|
      enum = send(command, enum, options)
    end
    enum.force
    enum
  end

  def cat(enum, options)
    Enumerator.new do |yielder|
      # each, not map: map on a lazy enumerator would never run the block
      enum.each { |line| yielder << line } if enum
      options[:input].each { |line| yielder << line }
    end.lazy
  end

  def out(enum, options)
    enum.lazy.map do |line|
      puts line
      line
    end
  end

  def cut(enum, options)
    enum.lazy.map do |line|
      fields = line.chomp.split(/#{options[:delimiter]}/)
      fields[ options[:field] ]
    end
  end

  def upcase(enum, options)
    enum.lazy.map(&:upcase)
  end

  def tee(enum, options)
    teed = Tee.new(options.fetch(:other))
    enum_lazy(enum) do |line, yielder|
      yielder << line
      teed << line
    end
  end

  def grep(enum, options)
    enum.lazy.select {|line| line =~ options[:pattern] }
  end

  private

  def enum_lazy(enum, &block)
    Enumerator.new do |yielder|
      enum.each do |line|
        yield(line, yielder)
      end
    end.lazy
  end
end
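On the open question of cat: one possibility, offered as a sketch rather than a tested drop-in, is to chain the upstream enumerator and the input together instead of feeding a yielder by hand. This assumes Ruby 2.6 or later, where Enumerator::Chain is available, and that options[:input] responds to each (as IO and StringIO do). The standalone cat below mirrors the signature of Pipe#cat but is defined outside the class purely for illustration.

```ruby
require "stringio"

# Sketch of a chain-based cat (assumes Ruby 2.6+ for Enumerator::Chain).
# enum is the upstream stage, or nil when cat is the first command;
# options[:input] is anything that responds to #each.
def cat(enum, options)
  sources = [enum, options[:input]].compact
  # Iterate the sources one after another, upstream first, and keep it lazy.
  Enumerator::Chain.new(*sources).lazy
end

io = StringIO.new("testing\na new\nline\n")
p cat(nil, input: io).map(&:upcase).first(1)  # => ["TESTING\n"]
p io.gets                                     # => "a new\n"
```

The second line of output is the point: only the first input line was consumed, so the pipeline stays lazy, and the upstream-then-input ordering of the original cat is preserved.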
Context
StackExchange Code Review Q#53928, answer score: 6