
Open a text file and remove any blank lines

Submitted by: @import:stackexchange-codereview

Problem

I have the following functions which help me to open a text file and remove any blank (empty) lines:

def fileExists(file):
    try:
        f = open(file,'r')
        f.close()
    except FileNotFoundError:
        return False
    return True
def isLineEmpty(line):
    return len(line.strip()) < 1 
def removeEmptyLines(file):
    lines = []
    if not fileExists(file):
        print ("{} does not exist ".format(file))
        return
    out = open(file,'r')
    lines = out.readlines()
    out.close()
    out = open(file,'w')
    t=[]
    for line in lines:
        if not isLineEmpty(line):
            t.append(line)
    out.writelines(t)   
    out.close()


As you can see, I open the file twice: once for reading and once for writing. Is there anything I can do to improve this code?

Solution

I see two good strategies to accomplish the task.

One solution is to read all of the text into a list, then rewind to the start of the file and write out the desired lines.

def remove_empty_lines(filename):
    """Overwrite the file, removing empty lines and lines that contain only whitespace."""
    with open(filename, 'r+') as f:
        lines = f.readlines()
        f.seek(0)
        f.writelines(line for line in lines if line.strip())
        f.truncate()


The disadvantage of that approach is that it scales poorly if the file is huge, because the entire contents have to be read into memory first. The other approach is to overwrite the file while reading from it. This uses two file objects, each keeping track of its own position within the file. Because lines are only ever removed, the write position never gets ahead of the read position, so no unread data is overwritten.

def remove_empty_lines(filename):
    """Overwrite the file, removing empty lines and lines that contain only whitespace."""
    with open(filename) as in_file, open(filename, 'r+') as out_file:
        out_file.writelines(line for line in in_file if line.strip())
        out_file.truncate()
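
As a quick sanity check, the sketch below runs the two-handle version against a throwaway file. The sample contents, the demo function, and the use of tempfile are assumptions made for illustration; they are not part of the original question.

```python
import os
import tempfile


def remove_empty_lines(filename):
    """Overwrite the file, removing empty lines and lines that contain only whitespace."""
    with open(filename) as in_file, open(filename, 'r+') as out_file:
        out_file.writelines(line for line in in_file if line.strip())
        out_file.truncate()


def demo():
    """Run remove_empty_lines on a temporary file and return the resulting text."""
    # Hypothetical sample data: text lines mixed with empty and whitespace-only lines.
    sample = "first\n\n   \nsecond\n\t\nthird\n"
    fd, path = tempfile.mkstemp(suffix=".txt")
    try:
        with os.fdopen(fd, "w") as f:
            f.write(sample)
        remove_empty_lines(path)
        with open(path) as f:
            return f.read()
    finally:
        # Clean up the temporary file whether or not the check succeeded.
        os.remove(path)
```

Calling demo() should return only the three non-blank lines, which also confirms that truncate() cut off the leftover tail of the original file.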


Note some other issues with your code.

First, PEP 8, the official Python style guide, recommends lower_case_with_underscores for function names, and two blank lines between top-level functions.

The fileExists function does not just test for the existence of the file; it actually checks whether you can open it for reading, which is a more stringent condition, since it also involves file permissions. But I don't see any reason to check specifically for file existence. All kinds of I/O errors are possible, such as filesystem permission denial, read-only filesystem, disk quota, or hardware failure. Furthermore, even if the file exists when you check, it could disappear during the split second between if not fileExists(…) and the real open(…) call. On top of that, your removeEmptyLines function has no way of reporting failure to its caller. (Printing an error message doesn't count!) Therefore, the only reasonable approach is to Just Do It, and handle any exception that might occur.
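
One way to sketch the "just do it" advice: the function makes no existence check and lets any OSError propagate, and the caller decides how to report it. The clean_file wrapper and its boolean return value are hypothetical, chosen only for illustration.

```python
import sys


def remove_empty_lines(filename):
    """Overwrite the file, removing empty lines and lines that contain only whitespace."""
    with open(filename, 'r+') as f:
        lines = f.readlines()
        f.seek(0)
        f.writelines(line for line in lines if line.strip())
        f.truncate()


def clean_file(filename):
    """Hypothetical caller: return True on success, False if an I/O error occurred."""
    try:
        remove_empty_lines(filename)
    except OSError as e:
        # FileNotFoundError and PermissionError are subclasses of OSError,
        # so a missing file and an unreadable file are both caught here.
        print("Could not clean {}: {}".format(filename, e), file=sys.stderr)
        return False
    return True
```

Here the decision to print lives in the caller, while remove_empty_lines itself reports failure through the exception.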

Any open() call should be written using a with block, which will automatically close the file handle when exiting the block.
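
To illustrate, the sketch below raises an exception inside a with block and then checks that the file object was closed anyway. The function names and the temporary file are assumptions made for the example.

```python
import os
import tempfile


def handle_closed_after_error(path):
    """Open path in a with block, raise inside it, and report whether the handle was closed."""
    handle = None
    try:
        with open(path) as f:
            handle = f
            raise ValueError("simulated failure inside the block")
    except ValueError:
        pass
    # The with statement closed the file while the exception was propagating.
    return handle.closed


def demo():
    """Create an empty temporary file, run the check, and clean up afterwards."""
    fd, path = tempfile.mkstemp()
    os.close(fd)
    try:
        return handle_closed_after_error(path)
    finally:
        os.remove(path)
```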

Context

StackExchange Code Review Q#145126, answer score: 9
