Parse a text file containing settings for neuroimaging software
Problem
I would like to refactor a large Python method I wrote so that it follows better practices.
I wrote a method that parses a design file from the FSL neuroimaging library. Design files are text files with settings for the neuroimaging software. I initially wrote the code to do all the processing in a single loop through the file.
I'm looking for all manner of suggestions, but mostly tips on design and best practices. I'd love input on the project as a whole, but I know that might be out of scope for this site.
The code structure:
1. Define variables based on the type of design file
2. Parse the design file based on the type of design file
3. Return the now populated variables set up in step 1
You can also view the code on GitHub along with the larger project.
```
def parse_design_file(self,fsf_lines, type):
    """
    Parses design file information and return information in parsed variables that can be used by the csv methods
    """
    analysis_name=''
    output_path=''
    zvalue=''
    pvalue=''
    if type == self.FIRST_TYPE or self.PRE_TYPE:
        in_file=''
    if type == self.FIRST_TYPE:
        ev_convolves=dict()
        ev_paths=dict()
        ev_deriv=dict()
        ev_temp=dict()
    if type == self.ME_TYPE or type == self.FE_TYPE:
        feat_paths=dict()
        count=''
    if type == self.FIRST_TYPE or type == self.FE_TYPE:
        ev_names=dict()
        evg_lines=list()
        cope_names=dict()
        cope_def_lines=list()
    if type == self.PRE_TYPE:
        tr=''
        total_volumes=''
        brain_thresh=''
        motion_correction=''
        smoothing=''
        deleted=''
    if type == self.FE_TYPE:
        first_example_dir=''
    if type == self.ME_TYPE:
        FE_example_dir=''
    for line in fsf_lines:
        #regex matching
        #all
        output_match=re.search("set fmri\(outputdir\)",line)
        feat_file_match=re.search("feat_files\(\d+\)",line)
        total_vols_match=re.search("fmri\(npts\)", line)
        z_match=re.search("set
```
Solution
Your function behaves wildly differently depending on the value of the `type` parameter. In fact, it basically acts like different functions depending on that parameter: it returns completely different sets of values depending on the value of `type`. The function would be better off split into several functions, one for each type. Then each function would be simpler and easier to follow.
I assume that you have done it this way to try to share code between the different file types. This is the wrong way to do it. You should share code by calling common functions or using common base classes. You should not share code by having functions exhibit wildly different behavior.
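As a rough sketch of that split (not a drop-in replacement for the original method), each type could get its own small function that calls a shared helper. The function names here are hypothetical, and `settings` is assumed to be a plain dict mapping each setting name to its value, however it was produced. Only keys that appear in the question's regular expressions are used.

```python
def common_fields(settings):
    """Fields every design-file type needs (here just the output directory)."""
    return {"output_path": settings.get("fmri(outputdir)", "")}


def parse_pre_design(settings):
    """Fields for a PRE_TYPE design file (hypothetical function name)."""
    fields = common_fields(settings)
    fields["total_volumes"] = settings.get("fmri(npts)", "")
    return fields


def parse_me_design(settings):
    """Fields for an ME_TYPE design file (hypothetical function name)."""
    fields = common_fields(settings)
    # Collect every feat_files(N) entry into its own dictionary.
    fields["feat_paths"] = {
        key: value for key, value in settings.items()
        if key.startswith("feat_files(")
    }
    return fields
```

Each function returns only the fields that make sense for its type, and the shared part lives in one place instead of being guarded by `if type == ...` checks.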
For parsing, I'd suggest you actually parse the file, not just run some regular expressions over it. From what I gather, your file basically consists of lines of the form:
```
set something somevalue
```
I'd write a `parse()` function that converts the file into a dictionary. So this:
```
# Threshold IC maps
set fmri(thresh_yn) 1
set fmri(mmthresh) 0.5
set fmri(ostats) 0
```
Would become
```
{'fmri(thresh_yn)' : 1, 'fmri(mmthresh)' : 0.5, 'fmri(ostats)' : 0}
```
Then the function for a particular type can just look up entries in the dictionary.
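A minimal sketch of such a `parse()` function might look like the following. It assumes every setting sits on a single `set key value` line, that anything else (comments, blank lines) can be ignored, and that values may be wrapped in double quotes. The `_convert` helper is a hypothetical name introduced only for this example.

```python
def _convert(text):
    """Turn '1' into 1 and '0.5' into 0.5; leave anything else as a string."""
    text = text.strip().strip('"')
    for cast in (int, float):
        try:
            return cast(text)
        except ValueError:
            pass
    return text


def parse(fsf_lines):
    """Convert `set key value` lines into a dictionary, skipping other lines."""
    settings = {}
    for line in fsf_lines:
        # Split into at most three parts so values containing spaces stay whole.
        parts = line.strip().split(None, 2)
        if len(parts) >= 2 and parts[0] == "set":
            value = parts[2] if len(parts) == 3 else ""
            settings[parts[1]] = _convert(value)
    return settings
```

With the dictionary in hand, a lookup such as `settings.get("fmri(outputdir)")` replaces a dedicated regular expression for each setting.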
In the end, you return a long tuple. Why? Long tuples are difficult to work with. Perhaps you should really be storing them on an object with many attributes?
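One possible shape for such an object, sketched with a dataclass (this assumes Python 3.7 or later). The class name `FirstLevelDesign` is hypothetical; the field names are taken from the variables the question initializes for `FIRST_TYPE`, and only a few of them are shown.

```python
from dataclasses import dataclass, field


@dataclass
class FirstLevelDesign:
    """Named container for a few of the values parsed from a design file."""
    analysis_name: str = ""
    output_path: str = ""
    in_file: str = ""
    # default_factory gives every instance its own dictionary.
    ev_names: dict = field(default_factory=dict)
    cope_names: dict = field(default_factory=dict)


design = FirstLevelDesign(output_path="/tmp/out")
design.ev_names[1] = "faces"
```

Callers then read `design.output_path` or `design.ev_names` by name instead of unpacking a tuple in exactly the right order.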
Context
StackExchange Code Review Q#11961, answer score: 3