patternpythonMinor
Designing unit tests for paramiko wrapper
Viewed 0 times
designingtestsparamikowrapperforunit
Problem
Before I get to the question, let me note that I'm a trained scientist, not a programmer; I've done my best to self-teach what I've needed to know so far, but in the interest of making my code usable for other people in my research group, I'd like to improve my code. I spent a weekend writing actual documentation for stuff, and the next thing on the list of making this more stable is unit testing, which I think I have a basic understanding of, but have never implemented in anything I've written.
I've developed a wrapper for paramiko that allows me load scientific data from a remote host. I was curious about how to best design a unit test for other scripts that rely on this function, or if one is even necessary? If there are any security holes or performance improvements, suggestions would be very appreciated!
```
def get_remote_data(filepath, time=False, info=False, copy_to_local=False):
"""
A paramiko wrapper that gets file from a remote computer. Parses
hostname from filepath. Works only for netcdf files!
Keyword Arguments:
filepath -- the path of the file with hostname.
time -- print time required for loading(default False)
info -- print some information about the file after loading
(default False)
copy_to_local -- copies file to local /tmp/remote_data/ and checks
if this file already exists there (default False)
Example:
>>> get_remote_data('pgierz@rayo3:/csys/paleo2/pgierz/GR30s.nc')
# TODO: Find out if this is the right way to do this, and make a unit test.
Paul J. Gierz, Sat Feb 14 14:20:43 2015
"""
# Import stuff from your own library:
from UI_Stuff import print_colors
if time:
import time
import paramiko
import os
from scipy.io import netcdf
if time:
now = time.time()
print "Trying to load ".ljust(40) \
+print_colors.WARNING("{f}").format(f=os.path.basename(file
I've developed a wrapper for paramiko that allows me load scientific data from a remote host. I was curious about how to best design a unit test for other scripts that rely on this function, or if one is even necessary? If there are any security holes or performance improvements, suggestions would be very appreciated!
```
def get_remote_data(filepath, time=False, info=False, copy_to_local=False):
"""
A paramiko wrapper that gets file from a remote computer. Parses
hostname from filepath. Works only for netcdf files!
Keyword Arguments:
filepath -- the path of the file with hostname.
time -- print time required for loading(default False)
info -- print some information about the file after loading
(default False)
copy_to_local -- copies file to local /tmp/remote_data/ and checks
if this file already exists there (default False)
Example:
>>> get_remote_data('pgierz@rayo3:/csys/paleo2/pgierz/GR30s.nc')
# TODO: Find out if this is the right way to do this, and make a unit test.
Paul J. Gierz, Sat Feb 14 14:20:43 2015
"""
# Import stuff from your own library:
from UI_Stuff import print_colors
if time:
import time
import paramiko
import os
from scipy.io import netcdf
if time:
now = time.time()
print "Trying to load ".ljust(40) \
+print_colors.WARNING("{f}").format(f=os.path.basename(file
Solution
Single responsibility principle
This function does too many things:
The function may also print timing stats, which probably doesn't make much sense when not copying the remote file.
Unit testing is most effective when testing small independent units that have a single responsibility, well-defined inputs and outputs, and ideally no side effects.
As a first step, it would be good to split this function to multiple smaller functions:
Each of these functions have well-defined, simpler responsibilities.
The
but it's ok, as it only orchestrates those things,
it's more of a controller and has trivially simple implementation.
Now that you have smaller units,
you can think about testing them individually, if necessary.
Actually I don't see much to test here,
because of the code is either printing,
or making paramiko API calls.
Testing that the paramiko calls work and you can fetch files will not be unit testing anymore,
because it will require external factors correctly setup,
such as a remote server with a file at a specific location,
which would violate the principle of unit testing.
That is not to say it's not worth testing,
maybe it is, to test your setup,
but it's not "unit testing", it's integration testing.
Code review
There is just one part that might be worth testing:
I suggest to put this in a function of its own:
And eliminate the repeated calls to
This function does too many things:
- maybe it copies a remote file
- maybe it prints info
The function may also print timing stats, which probably doesn't make much sense when not copying the remote file.
Unit testing is most effective when testing small independent units that have a single responsibility, well-defined inputs and outputs, and ideally no side effects.
As a first step, it would be good to split this function to multiple smaller functions:
def get_remote_file(filepath, time=False):
# ...
def get_local_file(filepath):
# ...
def get_file(filepath, time=False, copy_to_local=False):
if copy_to_local:
return get_remote_file(filepath, time)
return get_local_file(filepath)
def print_file_info(filepath, file):
# ...Each of these functions have well-defined, simpler responsibilities.
The
get_file function may seem to do multiple things,but it's ok, as it only orchestrates those things,
it's more of a controller and has trivially simple implementation.
Now that you have smaller units,
you can think about testing them individually, if necessary.
Actually I don't see much to test here,
because of the code is either printing,
or making paramiko API calls.
Testing that the paramiko calls work and you can fetch files will not be unit testing anymore,
because it will require external factors correctly setup,
such as a remote server with a file at a specific location,
which would violate the principle of unit testing.
That is not to say it's not worth testing,
maybe it is, to test your setup,
but it's not "unit testing", it's integration testing.
Code review
There is just one part that might be worth testing:
user = filepath.split(':')[0].split('@')[0]
host = filepath.split(':')[0].split('@')[1]
rfile = filepath.split(':')[1]I suggest to put this in a function of its own:
def split_ssh_filepath(filepath):
user = filepath.split(':')[0].split('@')[0]
host = filepath.split(':')[0].split('@')[1]
rfile = filepath.split(':')[1]
return user, host, rfileAnd eliminate the repeated calls to
filepath.split:def split_ssh_filepath(ssh_filepath):
user_host_rfile = ssh_filepath.split(':')[:2]
user, host = user_host_rfile[0].split('@')[:2]
rfile = user_host_rfile[1]
return user, host, rfileCode Snippets
def get_remote_file(filepath, time=False):
# ...
def get_local_file(filepath):
# ...
def get_file(filepath, time=False, copy_to_local=False):
if copy_to_local:
return get_remote_file(filepath, time)
return get_local_file(filepath)
def print_file_info(filepath, file):
# ...user = filepath.split(':')[0].split('@')[0]
host = filepath.split(':')[0].split('@')[1]
rfile = filepath.split(':')[1]def split_ssh_filepath(filepath):
user = filepath.split(':')[0].split('@')[0]
host = filepath.split(':')[0].split('@')[1]
rfile = filepath.split(':')[1]
return user, host, rfiledef split_ssh_filepath(ssh_filepath):
user_host_rfile = ssh_filepath.split(':')[:2]
user, host = user_host_rfile[0].split('@')[:2]
rfile = user_host_rfile[1]
return user, host, rfileContext
StackExchange Code Review Q#82097, answer score: 3
Revisions (0)
No revisions yet.