How to create a 'tee' command in Ruby for a Unix like pipeline?

Submitted by: @import:stackexchange-codereview

Problem

I came across the question about creating a memory-efficient Ruby pipe class with lazy evaluation. It gives some code that builds a pipeline of commands using lazy enumerators. I have been experimenting with it and have implemented a tee command, shown below. I feel like there is a better way to do it, though.

class Pipe

  class Tee

    def initialize(pipe)
      @pipe = pipe
      @buffer = []
      @pipe.source = self.to_enum
    end

    def  StringIO.new("testing\na new\nline"))
# pipe.add(:grep, pattern: /line/)
pipe.add(:tee, :other => another_pipe)
pipe.add(:upcase)
pipe.add(:out)
pipe.run

puts "================================================="
another_pipe.run


Here's the output:

TESTING
A NEW
LINE
=================================================
testing


Does anyone see a smarter way of doing this?

Solution

I know it's an old question, but it's an interesting one.

First, a couple small things:

- In Pipe#run:

enum = method(command).call(enum, options)


method(command) allocates a Method object on every iteration only to call it immediately. send(command, enum, options) dispatches to the same method without that extra step.

- Also in Pipe#run:

enum.each {}


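The purpose of this is to force the final enumerator to run. We can drop the empty block (even an empty block takes CPU cycles) by calling enum.to_a, which Enumerator::Lazy also exposes as enum.force. The trade-off is that to_a collects every result into an array, which matters if the input is large.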
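
To make both points concrete, here is a minimal sketch. Dispatcher is a hypothetical stand-in for Pipe, not code from the question; it only exists to compare the two dispatch styles and to show force driving a lazy enumerator.

```ruby
# Hypothetical stand-in for Pipe, used only to compare dispatch styles.
class Dispatcher
  def upcase(enum, _options)
    enum.lazy.map(&:upcase)
  end
end

dispatcher = Dispatcher.new
lines = ["a\n", "b\n"]

# method(...) builds a Method object first, then calls it.
via_method = dispatcher.method(:upcase).call(lines, {})
# send dispatches by name directly.
via_send = dispatcher.send(:upcase, lines, {})

# Both are lazy enumerators; nothing has been evaluated yet.
# force and to_a both drive the enumeration and return an array.
forced_method = via_method.force
forced_send = via_send.to_a
```

One thing to keep in mind: send ignores method visibility, so it can also reach private helpers. If the command names come from outside the class, public_send is the stricter choice.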

Those things aside, the first thing I noticed is that there's a lot of repetition. grep and upcase, for example, only differ on one line. We can extract the shared code out into another method, which I'll call enum_lazy:

def upcase(enum, options)
  enum_lazy(enum) do |line, yielder|
    yielder << line.upcase
  end
end

def grep(enum, options)
  enum_lazy(enum) do |line, yielder|
    yielder << line if line.match(options[:pattern])
  end
end

private
def enum_lazy(enum, &block)
  Enumerator.new do |yielder|
    enum.each do |line|
      yield(line, yielder)
    end
  end.lazy
end


...but this made me realize that some of these methods don't need the yielder pattern at all. For example, upcase takes one element at a time, performs an operation on it, and yields one value each time. That is just a lazy map, so this is equivalent:

def upcase(enum, options)
  enum.lazy.map(&:upcase)
end


Since we called lazy, this will return an Enumerator::Lazy as expected. We can simplify cut and out in the same way:

def out(enum, options)
  enum.lazy.map do |line|
    puts line
    line
  end
end

def cut(enum, options)
  enum.lazy.map do |line|
    fields = line.chomp.split(options[:delimiter])
    fields[ options[:field] ]
  end
end
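
A small sketch of why chained lazy maps still behave like a pipeline: each line travels through every stage before the next line is read. The events array and the variable names are illustrative assumptions, not part of the original code.

```ruby
events = []
lines = ["one\n", "two\n", "three\n"]

# Stage 1: same shape as upcase above, plus a log entry.
upcased = lines.lazy.map do |line|
  events << "upcase #{line.chomp}"
  line.upcase
end

# Stage 2: same shape as out above, logging instead of printing.
printed = upcased.map do |line|
  events << "out #{line.chomp}"
  line
end

# Building the chain runs nothing.
events_before = events.dup

# Ask for two lines only; the third is never touched.
first_two = printed.first(2)
```

After this runs, events alternates between the two stages line by line, which is the memory-efficient behaviour the original pipeline was after.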


grep is different, because we don't want output for every input line. As it turns out, though, grep is just a lazy select:

def grep(enum, options)
  enum.lazy.select {|line| line =~ options[:pattern] }
end
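
As a quick check, the yielder-based version and the select-based version can be compared side by side. The method names below are made up for the comparison. The third variant relies on Enumerator::Lazy#grep, which matches with ===; it is offered as a possible further shortening, not something the original code uses.

```ruby
# Yielder-based version, written out in full.
def grep_with_yielder(enum, options)
  Enumerator.new do |yielder|
    enum.each { |line| yielder << line if line.match(options[:pattern]) }
  end.lazy
end

# Lazy select, as in the answer.
def grep_with_select(enum, options)
  enum.lazy.select { |line| line =~ options[:pattern] }
end

# Built-in lazy grep (pattern === line).
def grep_with_builtin(enum, options)
  enum.lazy.grep(options[:pattern])
end

lines = ["testing\n", "a new\n", "line\n"]
opts = { pattern: /line/ }

from_yielder = grep_with_yielder(lines, opts).to_a
from_select = grep_with_select(lines, opts).to_a
from_builtin = grep_with_builtin(lines, opts).to_a
```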


That just leaves cat. I have a strong inkling that there's a similar optimization for cat, but I haven't been able to work it out yet. Perhaps someone else can help.
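
One possible direction, offered as an assumption rather than as the answer's solution: treat cat as a lazy concatenation of the upstream enumerator (if any) and the input, using Enumerator::Lazy#flat_map. This relies on the input including Enumerable (true of IO and StringIO), so that calling lazy on it works.

```ruby
require "stringio"

# Sketch only: not part of the final code below.
def cat(enum, options)
  # compact drops a nil upstream enum, the case when cat is the first stage.
  # flat_map splices each source's lines into one lazy stream.
  [enum, options[:input]].compact.lazy.flat_map(&:lazy)
end

upstream = ["from upstream\n"].each
input = StringIO.new("first\nsecond\n")

with_upstream = cat(upstream, input: input).to_a

input.rewind
without_upstream = cat(nil, input: input).to_a
```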

Here's the final code:

class Pipe
  class Tee
    # (no changes here)
  end

  attr_writer :source

  def initialize
    @commands = []
    @source = nil
  end

  def add(command, opts = {})
    @commands << [command, opts]
    self
  end

  def run
    enum = @source

    @commands.each do |command, options|
      enum = send(command, enum, options)
    end

    enum.force
    enum
  end

  def cat(enum, options)
    Enumerator.new do |yielder|
      # each, not map: map on a lazy enumerator would never run the block
      enum.each { |line| yielder << line } if enum

      options[:input].each { |line| yielder << line }
    end.lazy
  end

  def out(enum, options)
    enum.lazy.map do |line|
      puts line
      line
    end
  end

  def cut(enum, options)
    enum.lazy.map do |line|
      fields = line.chomp.split(/#{options[:delimiter]}/)
      fields[ options[:field] ]
    end
  end

  def upcase(enum, options)
    enum.lazy.map(&:upcase)
  end

  def tee(enum, options)
    teed = Tee.new(options.fetch(:other))

    enum_lazy(enum) do |line, yielder|
      yielder << line
      teed << line
    end
  end

  def grep(enum, options)
    enum.lazy.select {|line| line =~ options[:pattern] }
  end

  private
  def enum_lazy(enum, &block)
    Enumerator.new do |yielder|
      enum.each do |line|
        yield(line, yielder)
      end
    end.lazy
  end
end
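
To see the refactored stages compose, here is a trimmed-down sketch. MiniPipe is a hypothetical reduction of the class above: it leaves out Tee and out, reads its source from the first cat stage, and returns the forced array from run instead of the enumerator. The sample input is made up.

```ruby
require "stringio"

# Hypothetical reduction of Pipe for demonstration purposes.
class MiniPipe
  def initialize
    @commands = []
  end

  # Returning self lets calls be chained.
  def add(command, opts = {})
    @commands << [command, opts]
    self
  end

  def run
    enum = nil
    @commands.each { |command, options| enum = send(command, enum, options) }
    enum.force
  end

  def cat(_enum, options)
    options[:input].lazy
  end

  def cut(enum, options)
    enum.lazy.map { |line| line.chomp.split(options[:delimiter])[options[:field]] }
  end

  def grep(enum, options)
    enum.lazy.select { |line| line =~ options[:pattern] }
  end

  def upcase(enum, _options)
    enum.lazy.map(&:upcase)
  end
end

input = StringIO.new("alice,admin\nbob,user\ncarol,admin\n")

result = MiniPipe.new
  .add(:cat, input: input)
  .add(:grep, pattern: /admin/)
  .add(:cut, delimiter: ",", field: 0)
  .add(:upcase)
  .run
```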

Context

StackExchange Code Review Q#53928, answer score: 6
