HiveBrain v1.2.0
Get Started
← Back to all entries
patternpythonMinor

File validation and unit testing

Submitted by: @import:stackexchange-codereview··
0
Viewed 0 times
filevalidationtestingandunit

Problem

I just started a new project which reads in a video file and splits it up into frames. I started by just throwing everything in the GUI, but decided I wanted to make this a more well designed program, backed with unit tests. So I started to split out functionality into its own class and have a few questions on best practice.

from os import path

VALID_EXTENTIONS = ["avi", "mp4"]

class InvalidVideoSourceException(Exception):
    pass

def isVideoSrcValid(src):
    if (not path.isfile(src)):
        return False

    filenameTokens = path.splitext(src)

    if (len(filenameTokens) < 1):
        return False

    if (not filenameTokens[1][1:].lower() in VALID_EXTENTIONS):
        return False

    return True

class VideoCapture(object):
    def __init__(self, videoSrc):
        if (not isVideoSrcValid(videoSrc)):
            raise InvalidVideoSourceException("Invalid video file.")
        self.videoSrc = videoSrc


I'm setting up my capture class which will end up handling all the interaction with the video-- skipping to a frame, getting meta data, etc. Pretty simple.

My concern is the best way to handle validation of the path. The class is essentially useless until it has a valid video file, so you must provide that to __init__. If the path is invalid for whatever reason, then the class should throw an exception.

Originally, I had the isVideoSrcValid function within the class and it returned nothing, it would go through its validation and throw an exception as it got to it. This was nice because VALID_EXTENTIONS belonged to the class which I like since it doesn't feel good having a global variable hanging around like that. I could pass it into the function, but I like that even less, since it is constant data. The other advantage of this approach was having precise error messages. If the file was invalid because the extension isn't supported, it would report that.

The downside, and why I ultimately choose to return a Boolean from the function rathe

Solution

As a general statement, Python convention uses underscores_in_names instead of camelCase.

Class Structure

This implementation is really up to you. Without a little more detail on your program structure its hard to suggest how the classes should be arranged. However, it feels like the isVideoSrcValid function as well as VALID_EXTENSIONS should be in some video class.

isVideoSrcValid

This code can be simplified. What does it matter if the file had no extension? If thats the case, then the next check not filenameTokens[1][1:].lower() in VALID_EXTENTIONS would make the function return false. Because of this, we can remove the 2nd if-statement then just return the value from the last if-statement.

Instead of slicing our filenameTokens extension (to remove the .) simply add a period to the VALID_EXTENSIONS. Also, unpack the tuple returned from splitext() for readability.

Here is the revamped code:

VALID_EXTENSIONS = ['.avi', '.mp4']
def isVideoSrcValid(src):
    if (not path.isfile(src)):
        return False

    root, ext = path.splitext(src)

    return ext.lower() in VALID_EXTENTIONS


Testing

Your tests look pretty comprehensive. I have two main points:

-
Off-base tests

Why are you testing os.path.isfile()? I understand its to make sure that your code in setUp functioned properly and open().close() created valid files. However, the only way your open().close() calls would not create valid files is if they threw an error. If thats the case, your tests would not run.

Moreso the point I want to make is this: testing os.path.isfile() is not the point of your tests and this test class. This test class should only test whether or not isVideoSrcValid works correctly. These tests are independent and thus assume that the functions it uses act politely.

-
Name your test functions to indicate what they test.

Test functions should be named to indicate what they test, not what they test with. For example, take your first test function: test_empty(). This is saying "I'm testing my function with an empty string". However, how much information does this give us if we needed to debug should it fail? Did the first if-statement fail (yes)? Or was it the second check?

Names like that can be ambiguous in that sense. I would recommend changing that function to:

def test_is_file_failure(self):
    self.assertFalse(vc.isVideoSrcValid("")) 

    # Test with an os resource that is not a file.
    self.assertFalse(vc.isVideoSrcValid(os.path.expanduser('~')))

Code Snippets

VALID_EXTENSIONS = ['.avi', '.mp4']
def isVideoSrcValid(src):
    if (not path.isfile(src)):
        return False

    root, ext = path.splitext(src)

    return ext.lower() in VALID_EXTENTIONS
def test_is_file_failure(self):
    self.assertFalse(vc.isVideoSrcValid("")) 

    # Test with an os resource that is not a file.
    self.assertFalse(vc.isVideoSrcValid(os.path.expanduser('~')))

Context

StackExchange Code Review Q#51479, answer score: 2

Revisions (0)

No revisions yet.