patternpythonMinor
Parsing oscilloscope binary data
Viewed 0 times
binaryoscilloscopedataparsing
Problem
I wrote a simply parser for Tektronix's internal binary file format
Since I mostly code by myself, I would like your opinion on whether my code follows common practice and whether it can be improved.
```
import numpy as np
import os.path
def read_curve(binaryfile):
"""
Reads one tektronix .isf file and returns a dictionary containing
all tags as keys. The actual data is stored in the key "data".
"""
postfixs = [".isf", ".ISF"]
if os.path.splitext(binaryfile)[-1] not in postfixs:
raise ValueError("File type unkown.")
with open(binaryfile, 'rb') as bfile:
# read header
header = {}
current = 0
while True:
current, name = _read_chunk(bfile, " ")
if name != ":CURVE":
current, value = _read_chunk(bfile, ";")
assert name not in header
header[name] = value
else:
# ":CURVE" is the last tag of the header, followed by
# a '#' and a 7 digit number.
header[name] = bfile.read(8)
current = bfile.tell()
break
assert header["ENCDG"] == "BINARY"
# read data as numpy array
header["data"] = _read_data(bfile, current, header)
return header
def _read_chunk(headerfile, delimiter):
"""
Reads one chunk of header data. Based on delimiter, this may be a tag
(ended by " ") or the value of a tag (ended by ";").
"""
chunk = []
while True:
c = headerfile.read(1)
if c != delimiter:
chunk.append(c)
else:
return headerfile.tell(), "".join(chunk)
def _read_data(bfile, position, header):
"""
Reads in the binary data as numpy array.
Apparently, there are only 1d-signals stored in .isf files, so a 1d-array
is read.
Returns a 2d-array wit
.isf. The code is supposed to read in the file and return the header data as well as the actual oszi-data.Since I mostly code by myself, I would like your opinion on whether my code follows common practice and whether it can be improved.
```
import numpy as np
import os.path
def read_curve(binaryfile):
"""
Reads one tektronix .isf file and returns a dictionary containing
all tags as keys. The actual data is stored in the key "data".
"""
postfixs = [".isf", ".ISF"]
if os.path.splitext(binaryfile)[-1] not in postfixs:
raise ValueError("File type unkown.")
with open(binaryfile, 'rb') as bfile:
# read header
header = {}
current = 0
while True:
current, name = _read_chunk(bfile, " ")
if name != ":CURVE":
current, value = _read_chunk(bfile, ";")
assert name not in header
header[name] = value
else:
# ":CURVE" is the last tag of the header, followed by
# a '#' and a 7 digit number.
header[name] = bfile.read(8)
current = bfile.tell()
break
assert header["ENCDG"] == "BINARY"
# read data as numpy array
header["data"] = _read_data(bfile, current, header)
return header
def _read_chunk(headerfile, delimiter):
"""
Reads one chunk of header data. Based on delimiter, this may be a tag
(ended by " ") or the value of a tag (ended by ";").
"""
chunk = []
while True:
c = headerfile.read(1)
if c != delimiter:
chunk.append(c)
else:
return headerfile.tell(), "".join(chunk)
def _read_data(bfile, position, header):
"""
Reads in the binary data as numpy array.
Apparently, there are only 1d-signals stored in .isf files, so a 1d-array
is read.
Returns a 2d-array wit
Solution
This seems pretty nice. I have just some nitpicks about this part:
Nitpicks:
-
I generally don't like repeating the same thing in lowercase and uppercase versions. I prefer to use lowercase, and convert untrusted input to lowercase. However, that will make this check less strict:
-
I'd spell out "postfixs" as "postfixes", or better yet, use the more common term "extensions"
-
Instead of a
So I'd write this as:
As for the rest, this seems pretty clean to me. I especially like the
postfixs = [".isf", ".ISF"]
if os.path.splitext(binaryfile)[-1] not in postfixs:
raise ValueError("File type unkown.")Nitpicks:
-
I generally don't like repeating the same thing in lowercase and uppercase versions. I prefer to use lowercase, and convert untrusted input to lowercase. However, that will make this check less strict:
iSF and ISf also become valid extensions, which you might not want.-
I'd spell out "postfixs" as "postfixes", or better yet, use the more common term "extensions"
-
Instead of a
list of allowed extensions, I'd use a set. Although in this example you surely won't be hitting performance issues, it's a matter of principle and habitSo I'd write this as:
extensions = set([".isf"])
if os.path.splitext(binaryfile)[-1].lower() not in extensions:
raise ValueError("File type unknown.")As for the rest, this seems pretty clean to me. I especially like the
assert statements documenting your assumptions.Code Snippets
postfixs = [".isf", ".ISF"]
if os.path.splitext(binaryfile)[-1] not in postfixs:
raise ValueError("File type unkown.")extensions = set([".isf"])
if os.path.splitext(binaryfile)[-1].lower() not in extensions:
raise ValueError("File type unknown.")Context
StackExchange Code Review Q#90898, answer score: 4
Revisions (0)
No revisions yet.