
Merge all text files in a directory and save a temp file

Submitted by: @import:stackexchange-codereview

Problem

I've written this function, which reads all the text files in a directory and saves all of their values in a temporary file. The text files are in x, y, z format. The function returns

  • the name of the temporary file
  • the bounding box
  • the origin (top-left corner)
  • and the bottom (bottom-right corner).

I would welcome comments or suggestions on how to improve my working code.

import os
import tempfile
import glob

class LaserException(Exception):
    """Laser exception, indicates a laser-related error."""
    pass

sepType = {
        "space": ' ',
        "tab": '\t',
        "comma": ',',
        "colon": ':',
        "semicolon": ';',
        "hyphen": '-',
        "dot": '.'
        }

def tempfile_merge(path, separator, wildcard='*.txt'):
    # text mode ("w") so that str lines can also be written under Python 3
    file_temp = tempfile.NamedTemporaryFile(mode="w", delete=False, dir=path)
    name = file_temp.name
    minx = float('+inf')
    maxx = float('-inf')
    miny = float('+inf')
    maxy = float('-inf')
    for filename in glob.glob(os.path.join(path, wildcard)):
        # 'with' closes each input file, even if an exception is raised
        with open(filename, "r") as infile:
            for line in infile:
                # drop the trailing newline so it is not written twice below
                element = line.rstrip("\n").split(sepType[separator])
                if len(element) < 3:
                    raise TypeError("not enough arguments: %s has only %s columns" % (filename, len(element)))
                try:
                    maxx = max(maxx, float(element[0]))
                    minx = min(minx, float(element[0]))
                    maxy = max(maxy, float(element[1]))
                    miny = min(miny, float(element[1]))
                except ValueError:
                    raise LaserException("x,y,z are not float-values")
                newelement = " ".join(element) + "\n"
                file_temp.write(newelement)
    file_temp.close()
    return name, ((minx, maxy), (maxx, maxy), (maxx, miny), (minx, miny)), (minx, maxy), (maxx, miny)

Solution

Without going into implementation details, I would suggest looking into the following performance optimisations.

  • Use buffered reads. Reading one line at a time can be quite time-consuming on large inputs.
  • Use buffered writes. Instead of writing each new line separately, collect the lines in a buffer and write them in chunks.
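
As a rough sketch of the two buffering suggestions, assuming Python 3: the helper name merge_buffered, the chunk_lines parameter and the 1 MiB read buffer are illustrative choices, not part of the original code. Python's open() already buffers reads and writes by default, so any gain from this is worth measuring before adopting it.

```python
import glob
import os
import tempfile


def merge_buffered(path, wildcard="*.txt", chunk_lines=10000,
                   buffer_size=1024 * 1024):
    """Concatenate matching files into a temp file, writing in chunks.

    Hypothetical helper: the name and the default sizes are assumptions.
    """
    out = tempfile.NamedTemporaryFile(mode="w", delete=False, dir=path)
    pending = []  # lines collected but not yet written
    with out:
        for filename in sorted(glob.glob(os.path.join(path, wildcard))):
            # buffering=buffer_size requests a larger read buffer than the default
            with open(filename, "r", buffering=buffer_size) as infile:
                for line in infile:
                    # make sure every merged line ends with a newline
                    pending.append(line if line.endswith("\n") else line + "\n")
                    if len(pending) >= chunk_lines:
                        out.writelines(pending)  # one call per chunk
                        pending = []
        out.writelines(pending)  # write whatever is left over
    return out.name
```

Validation and the bounding-box bookkeeping are left out here so that only the read and write pattern is visible.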



For code review comments, these might be applicable.

  • The method does three different things: merging, syntax checking and computing the bounding rectangle. The code might be simpler to maintain and extend if it were refactored into smaller helper methods.
  • Based on the method name alone, I would not be able to guess what it does.
  • The initial comments say the method calculates the bounding box, but as the files contain 3D data, would it not be more correct to include the z value?
  • If the method only needs to calculate the bounding rectangle (x, y), is there a need for the merged file to contain the third-dimension data?
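
One way the split into helpers might look, as a sketch only: the names BoundingBox, parse_point and merge_points are hypothetical, and tracking z in the box is an assumption about what the caller wants rather than something the question states.

```python
import glob
import os
import tempfile


class BoundingBox(object):
    """Running min/max over x, y and z (hypothetical helper)."""

    def __init__(self):
        self.mins = [float("inf")] * 3
        self.maxs = [float("-inf")] * 3

    def update(self, point):
        for i, value in enumerate(point):
            self.mins[i] = min(self.mins[i], value)
            self.maxs[i] = max(self.maxs[i], value)


def parse_point(line, sep):
    """Syntax check only: turn one text line into an (x, y, z) tuple of floats."""
    fields = line.rstrip("\r\n").split(sep)
    if len(fields) < 3:
        raise ValueError("expected 3 columns, got %d: %r" % (len(fields), line))
    return tuple(float(f) for f in fields[:3])  # float() raises ValueError on bad input


def merge_points(path, sep=" ", wildcard="*.txt"):
    """Merging only: parsing and bounds tracking are delegated to the helpers."""
    box = BoundingBox()
    with tempfile.NamedTemporaryFile(mode="w", delete=False, dir=path) as out:
        for filename in sorted(glob.glob(os.path.join(path, wildcard))):
            with open(filename) as infile:
                for line in infile:
                    point = parse_point(line, sep)
                    box.update(point)
                    out.write("%r %r %r\n" % point)
    return out.name, box
```

Each piece can then be tested on its own, and dropping z from the box or from the merged file becomes a change to a single helper.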

Context

StackExchange Code Review Q#23318, answer score: 2
