HiveBrain v1.2.0
Get Started
← Back to all entries
patternrubyModerate

Ruby script on-all-nodes-run not only for teaching

Submitted by: @import:stackexchange-codereview··
0
Viewed 0 times
scriptnodesallteachingforrubynotonlyrun

Problem

I wrote the following Ruby script several years ago and have been using it often ever since on a Bioinformatics computer cluster.

It pulls out a list of hosts from the Torque queuing system qnodes. It ssh'es and runs a command on all the nodes. Then it prints the output and/or errors in a defined order (alphabetical sort of the hostnames).

A nice feature: results are printed immediately for the host that is next in the order.

I would like to use it as an example for a Ruby workshop. Could you please suggest best-practice and design pattern improvements?

```
#!/usr/bin/ruby

EXCLUDE = [/girkelab/, /biocluster/, /parrot/, /owl/]

require "open3"

# Non-interactive, no password asking, and seasonable timeouts
SSH_OPTIONS = ["-o PreferredAuthentications=publickey,hostbased,gssapi,gssapi-with-mic",
"-o ForwardX11=no",
"-o BatchMode=yes",
"-o SetupTimeOut=5",
"-o ServerAliveInterval=5",
"-o ServerAliveCountMax=2"
].join(" ")

SSH = "/usr/bin/ssh #{SSH_OPTIONS}"
MKDIR = "/bin/mkdir"

raise "Pleae give this command at least one argument" if ARGV.size < 1
COMMAND = ARGV[0..-1].join(' ')

output_o = {}
output_e = {}

IO_CONNECTIONS_TO_REMOTE_PROCESSES = {}

def on_all_nodes(&block)
nodes = []
Kernel.open('|qnodes | grep -v "^ " | grep -v "^$"') do |f|
while line = f.gets
i = line.split(' ').first
nodes.push(i) if EXCLUDE.select{|x| i =~ x}.empty?
end
end
nodes.sort.each {|n| block.call(n)}
end

# Create processes
on_all_nodes do |node|
stdin, stdout, stderr = Open3.popen3("#{SSH} #{node} \"#{COMMAND}\"")
IO_CONNECTIONS_TO_REMOTE_PROCESSES[node] = [stdin, stdout, stderr]
end

has_remote_errors = false

# Collect results
on_all_nodes do |node|
stdin, stdout, stderr = IO_CONNECTIONS_TO_REMOTE_PROCESSES[node]

stdin.close

e_thread = Thread.new do
while line = stderr.gets
line.chomp!
STDERR.puts "#{node} ERROR: #{line}"

Solution

First of all, take a look at the Net::SSH library. I don't have much experience with it, so I don't know whether it supports all the options you need. But if it does, using it might be more robust than using the command line utility (you wouldn't have to worry about whether ssh is installed in the place you expect (or at all) and you wouldn't have to worry about escaping the arguments).

Assuming you can't (or won't) use Net::SSH, you should at least replace /usr/bin/ssh with just ssh, so at least it still works if ssh is installed in another location in the PATH.

nodes = []
Kernel.open('|qnodes | grep -v "^ " | grep -v "^$"') do |f|
  while line = f.gets
    i = line.split(' ').first
    nodes.push(i) if EXCLUDE.select{|x| i =~ x}.empty?
  end
end


  • When you initialize an empty array and then append to it in a loop, that is often a good sign you want to use map and/or select instead.



  • line = f.gets is a bit of an anti-pattern in ruby. The IO class already has methods to iterate a file line-wise.



-
To find out whether none of the elements in an array meet a condition, negating any? seems more idiomatic than building an array with select and check whether it's empty.

nodes = Kernel.open('|qnodes | grep -v "^ " | grep -v "^$"') do |f|
  f.lines.map do |line|
    line.split(' ').first
  end.reject do |i|
    EXCLUDE.any? {|x| i =~ x}
  end
end


nodes.sort.each {|n| block.call(n)}


I would recommend that instead of taking a block and yielding each element, you just return nodes.sort and rename the function to all_nodes. This way you can use all_nodes.each to execute code on all nodes, but you could also use all_nodes.map or all_nodes.select when it makes sense.

Open3.popen3("#{SSH} #{node} \"#{COMMAND}\"")


Note that this will break if COMMAND contains double quotes itself. Generally trying to escape command line arguments by surrounding them with quotes is a bad idea. system and open3 accept multiple arguments exactly to avoid this.

If you make SSH an array (with one element per argument) instead of a string, you can use the multiple-arguments version of popen3 and can thus avoid the fickle solution of adding quote around COMMAND to escape spaces, i.e.:

Open3.popen3(*(SSH + [node, COMMAND]))


IO_CONNECTIONS_TO_REMOTE_PROCESSES = {}
# ...
on_all_nodes do |node|
  stdin, stdout, stderr = Open3.popen3("#{SSH} #{node} \"#{COMMAND}\"")
  IO_CONNECTIONS_TO_REMOTE_PROCESSES[node] = [stdin, stdout, stderr]
end


If you heeded my above advice about all_nodes you can simplify this using map. I also would suggest not using a Hash here. If you use an array instead, the nodes will stay in the order in which you inserted them, which will mean that you can iterate over that array instead of invoking all_nodes again.

has_remote_errors = false

all_nodes.map do |node|
  [node, Open3.popen3(*(SSH + [node, COMMAND]))]
end.each do |node, (stdin, stdout, stderr)|
  stdin.close

  ethread = # ...
  # ...
end


This way you removed the complexity of first putting everything into a hash and then getting it out again.

while line = stderr.gets


Again, this can be written more idiomatically as stderr.each_line do |line|. Same with stdout.

first = true


This is never used. I can only assume that it's a left over of previous iterations of the code, which is no longer necessary. Obviously it should be removed.

# Let the threads finish
t1 = nil
t2 = nil
while [t1, t2].include? nil
  if t1.nil?
    t1 = e_thread.join(0.1) # Gives 1/10 of a second to STDERR
  end
  if t2.nil?
    t2 = o_thread.join(0.1) # Give 1/10 of a second to STDOUT
  end
end


I don't see any benefit of doing it this way. Just do:

e_thread.join
o_thread.join


Note that joining on one thread does not mean that the other threads stop running - only the main thread does, but that's perfectly okay as you want that anyway.

Code Snippets

nodes = []
Kernel.open('|qnodes | grep -v "^ " | grep -v "^$"') do |f|
  while line = f.gets
    i = line.split(' ').first
    nodes.push(i) if EXCLUDE.select{|x| i =~ x}.empty?
  end
end
nodes = Kernel.open('|qnodes | grep -v "^ " | grep -v "^$"') do |f|
  f.lines.map do |line|
    line.split(' ').first
  end.reject do |i|
    EXCLUDE.any? {|x| i =~ x}
  end
end
nodes.sort.each {|n| block.call(n)}
Open3.popen3("#{SSH} #{node} \"#{COMMAND}\"")
Open3.popen3(*(SSH + [node, COMMAND]))

Context

StackExchange Code Review Q#1180, answer score: 10

Revisions (0)

No revisions yet.