Python script for making a table

Submitted by: @import:stackexchange-codereview
Problem

I have written a Python 2.6.6 script that builds tables by taking the third column of every text file in a particular subgroup (signal and background in this case; the context is particle physics). However, I used exec too often, and I think this code can be written more elegantly.

```
#!/usr/bin/python
import sys
stringpu = ["50PU", "140PU"]
stringsig = ["NM1", "NM2", "NM3", "Scenario4", "Scenario6"]
stringbkg = ["TTbar", "DiBoson", "TopJets", "BosonJets"]

for a in range(len(stringsig)):
    exec("FILESIG"+str(a)+" = open('SummerStd_'+stringpu[1]+'_'+stringsig[a]+'.txt', 'r')")
    exec("LINESIG"+str(a)+" = FILESIG"+str(a)+".readlines()")

for b in range(len(stringbkg)):
    exec("FILEBKG"+str(b)+" = open('SummerStd_'+stringpu[1]+'_'+stringbkg[b]+'.txt', 'r')")
    exec("LINEBKG"+str(b)+" = FILEBKG"+str(b)+".readlines()")

table = open('table.txt', 'w')
table.write("\\begin{table}[h] \n")
table.write("\\centering \n")
table.write("\\begin{tabular}{|c|c|c|c|c|c|} \n")
table.write("\\hline & \\textbf{NM1} & \\textbf{NM2} & \\textbf{NM3} & \\textbf{STOC} & \\textbf{STC} \\\\ \n")

n = 0
for line in range(len(LINESIG0)):
    n += 1
    for i in range(len(stringsig)):
        exec("wordsig"+str(i+1)+" = LINESIG"+str(i)+"[line].split()")
    if n > 2:
        table.write("\hline \n")
        table.write(wordsig2[2]+" & "+wordsig1[3]+" & "+wordsig2[3]+" & "+wordsig3[3]+" & "+wordsig4[3]+" & "+wordsig5[3]+" \\\\ \n")

table.write("\hline \n")
table.write("\end{tabular} \n")
table.write("\end{table} \n")

table.write("\\begin{table}[h] \n")
table.write("\\centering \n")
table.write("\\begin{tabular}{|c|c|c|c|c|c|} \n")
table.write("\\hline & \\textbf{TTbar} & \\textbf{DiBoson} & \\textbf{TopJets} & \\textbf{BosonJets} \\\\ \n")

n = 0
for line in range(len(LINEBKG0)):
    n += 1
    for i in range(len(stringbkg)):
        exec("wordbkg"+str(i+1)+" = LINEBKG"+str(i)+"[line].split()")
    if n > 2:
        table.write("\hline \n")
        table.write(wor
```

Solution

Before rewriting the program, I'll start with some smaller issues.

Minor simplifications

  • Use print and '''long strings''' instead of multiple file.write() calls to get newlines. Use r'' (raw strings) to avoid doubling backslashes.



  • I would just write the tables to standard output, and redirect the output to a file using the shell. That avoids hard-coding 'table.txt' in your program.



  • Use for line in range(1, len(LINESIG0)) to skip the first line.



  • Ideally, you should rename your files to coincide with their column headings, so that you don't have to map 'STOC' to 'SummerStd_140PU_Scenario4.txt'.
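
The first three suggestions can be sketched as follows. This is only an illustration: the `lines` list is a made-up stand-in for one input file, and `PREAMBLE` is a hypothetical name. It uses single-argument print() so that it runs under both Python 2.6 and Python 3.

```python
# Raw triple-quoted string: backslashes stay literal and line breaks are kept,
# so there is no "\\begin" doubling and no explicit "\n".
PREAMBLE = r'''\begin{table}[h]
\centering
\begin{tabular}{|c|c|c|}
\hline'''

# Hypothetical stand-in for the lines read from one input file.
lines = ['header to skip', 'x y cutA 10', 'x y cutB 20']

print(PREAMBLE)                  # goes to standard output
for i in range(1, len(lines)):   # start at 1 to skip the first line
    print(lines[i].split()[2])   # index 2 is the third column
```

Saved under a hypothetical name such as make_table.py, the script would be run as `python make_table.py > table.txt`, which keeps the output filename out of the program.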



Bugs

  • The third column of a line would be line[2], not line[3].



  • Calling open() without calling close() results in file descriptor leaks. Better yet, use a with block to open files, so that they will be closed automatically. (For a small number of files like this, in a "throwaway" script, you won't run out of file descriptors. It's a poor habit, though.)
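
A minimal sketch of both points, assuming whitespace-separated columns. The helper name `read_column` and the demo file contents are invented for illustration and are not part of the original script.

```python
import os
import tempfile

def read_column(path, index):
    """Return the whitespace-separated field at `index` from every line."""
    with open(path) as f:  # the file is closed when the block exits
        return [line.split()[index] for line in f]

# Demo with a throwaway file (contents are made up).
fd, demo_path = tempfile.mkstemp(suffix='.txt')
with os.fdopen(fd, 'w') as f:
    f.write('a b c d\ne f g h\n')

print(read_column(demo_path, 2))  # third column is index 2 -> ['c', 'g']
os.remove(demo_path)
```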



Rewriting

exec() is a bad idea. It's hard to interpret your code because I have to imagine what the generated code will look like before mentally executing it. It's complicated further by the fact that some of the string handling happens when preparing the statement ("FILESIG"+str(a) in the line below) and some of it happens inside the evaluation (for example, +stringsig[a]+).

```
exec("FILESIG"+str(a)+" = open('SummerStd_'+stringpu[1]+'_'+stringsig[a]+'.txt', 'r')")
```


As you suspected, many similar values should be stored in a data structure such as a list (or multi-dimensional list) or a dictionary, not in multiple numbered variables.
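
As a rough sketch of that idea, the numbered FILESIG0, FILESIG1, ... variables could become a single list or dictionary built with ordinary string formatting. The names `signal_paths` and `signal_by_name` are hypothetical; `dict()` over a generator is used instead of a dict comprehension so that the code still runs on Python 2.6.

```python
stringpu = ["50PU", "140PU"]
stringsig = ["NM1", "NM2", "NM3", "Scenario4", "Scenario6"]

# A list keeps the files in column order ...
signal_paths = ['SummerStd_%s_%s.txt' % (stringpu[1], name) for name in stringsig]

# ... and a dictionary allows lookup by sample name instead of by number.
signal_by_name = dict(zip(stringsig, signal_paths))

print(signal_paths[0])              # SummerStd_140PU_NM1.txt
print(signal_by_name['Scenario4'])  # SummerStd_140PU_Scenario4.txt
```

Nothing here needs exec: the index or key does the job that the generated variable name did.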

To generate two tables, you have duplicated the code. A function should be defined instead.

```
def table(filenames, headings):
    # Parse every file into a list of whitespace-split lines.
    data = []
    for fn in filenames:
        with open(fn) as f:
            data.append([line.split() for line in f])

    print r'''\begin{table}[h]
\centering
\begin{tabular}{|c|%s|}
\hline''' % ('|'.join('c' for _ in headings))
    # Each tabular row must end with \\ or LaTeX rejects the following \hline.
    print ' & '.join([''] + headings) + r' \\'

    # Start at 1 to skip the first line of each file.
    for line in range(1, len(data[0])):
        print r'\hline'
        # Row label: field 2 of one file; values: field 3 of every file.
        print ' & '.join([data[2][line][2]] + [f[line][3] for f in data]) + r' \\'

    print r'''\hline
\end{tabular}
\end{table}'''

signal_filenames = ['SummerStd_140PU_%s.txt' % (s) for s in
     ['NM1', 'NM2', 'NM3', 'Scenario4', 'Scenario6']
]
table(signal_filenames, ['NM1', 'NM2', 'NM3', 'STOC', 'STC'])

background_filenames = ['SummerStd_140PU_%s.txt' % (s) for s in
     ['TTbar', 'DiBoson', 'TopJets', 'BosonJets']
]
table(background_filenames, ["TTbar", "DiBoson", "TopJets", "BosonJets"])
```
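
To try the same layout without the SummerStd_*.txt files, one option is a variant that takes already-parsed data and returns the table as a string. This is a sketch under assumptions, not the answer's code: `format_table`, its `label_file` parameter, and the demo rows are all invented for illustration. Each row is terminated with `\\`, as in the original script.

```python
def format_table(data, headings, label_file=0):
    """Build the LaTeX table as one string.

    data[i][row] is the list of fields on line `row` of file i.
    label_file (hypothetical parameter) selects which file supplies row labels.
    """
    lines = [
        r'\begin{table}[h]',
        r'\centering',
        r'\begin{tabular}{|c|%s|}' % '|'.join('c' for _ in headings),
        r'\hline',
        ' & '.join([''] + headings) + r' \\',
    ]
    for row in range(1, len(data[0])):  # skip the first line of each file
        cells = [data[label_file][row][2]] + [f[row][3] for f in data]
        lines.append(r'\hline')
        lines.append(' & '.join(cells) + r' \\')
    lines.extend([r'\hline', r'\end{tabular}', r'\end{table}'])
    return '\n'.join(lines)

# Made-up data standing in for two parsed files.
demo = [
    [['#', 'x', 'name', 'value'], ['0', '0', 'cutA', '10'], ['0', '0', 'cutB', '20']],
    [['#', 'x', 'name', 'value'], ['0', '0', 'cutA', '1'], ['0', '0', 'cutB', '2']],
]
print(format_table(demo, ['S1', 'S2']))
```

Keeping the formatting separate from the file reading makes it possible to check the output with a couple of assertions before running it on real data.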

Context

StackExchange Code Review Q#61335, answer score: 3
