HiveBrain v1.2.0
Get Started
← Back to all entries
patternpythonMinor

Parsing oscilloscope binary data

Submitted by: @import:stackexchange-codereview··
0
Viewed 0 times
binaryoscilloscopedataparsing

Problem

I wrote a simply parser for Tektronix's internal binary file format .isf. The code is supposed to read in the file and return the header data as well as the actual oszi-data.

Since I mostly code by myself, I would like your opinion on whether my code follows common practice and whether it can be improved.

```
import numpy as np
import os.path

def read_curve(binaryfile):
"""
Reads one tektronix .isf file and returns a dictionary containing
all tags as keys. The actual data is stored in the key "data".
"""
postfixs = [".isf", ".ISF"]
if os.path.splitext(binaryfile)[-1] not in postfixs:
raise ValueError("File type unkown.")

with open(binaryfile, 'rb') as bfile:

# read header
header = {}
current = 0

while True:

current, name = _read_chunk(bfile, " ")

if name != ":CURVE":
current, value = _read_chunk(bfile, ";")

assert name not in header
header[name] = value
else:
# ":CURVE" is the last tag of the header, followed by
# a '#' and a 7 digit number.
header[name] = bfile.read(8)
current = bfile.tell()
break

assert header["ENCDG"] == "BINARY"

# read data as numpy array
header["data"] = _read_data(bfile, current, header)

return header

def _read_chunk(headerfile, delimiter):
"""
Reads one chunk of header data. Based on delimiter, this may be a tag
(ended by " ") or the value of a tag (ended by ";").
"""
chunk = []
while True:
c = headerfile.read(1)
if c != delimiter:
chunk.append(c)
else:
return headerfile.tell(), "".join(chunk)

def _read_data(bfile, position, header):
"""
Reads in the binary data as numpy array.
Apparently, there are only 1d-signals stored in .isf files, so a 1d-array
is read.
Returns a 2d-array wit

Solution

This seems pretty nice. I have just some nitpicks about this part:

postfixs = [".isf", ".ISF"]
if os.path.splitext(binaryfile)[-1] not in postfixs:
    raise ValueError("File type unkown.")


Nitpicks:

-
I generally don't like repeating the same thing in lowercase and uppercase versions. I prefer to use lowercase, and convert untrusted input to lowercase. However, that will make this check less strict: iSF and ISf also become valid extensions, which you might not want.

-
I'd spell out "postfixs" as "postfixes", or better yet, use the more common term "extensions"

-
Instead of a list of allowed extensions, I'd use a set. Although in this example you surely won't be hitting performance issues, it's a matter of principle and habit

So I'd write this as:

extensions = set([".isf"])
if os.path.splitext(binaryfile)[-1].lower() not in extensions:
    raise ValueError("File type unknown.")


As for the rest, this seems pretty clean to me. I especially like the assert statements documenting your assumptions.

Code Snippets

postfixs = [".isf", ".ISF"]
if os.path.splitext(binaryfile)[-1] not in postfixs:
    raise ValueError("File type unkown.")
extensions = set([".isf"])
if os.path.splitext(binaryfile)[-1].lower() not in extensions:
    raise ValueError("File type unknown.")

Context

StackExchange Code Review Q#90898, answer score: 4

Revisions (0)

No revisions yet.