Reading an input file with 6 columns

Submitted by: @import:stackexchange-codereview

Problem

Consider this code where I read an input file with 6 columns (0-5):

  • Initialize a variable history_ends to 5000.



  • When the column0 value (i.e. job[0]



  • For every entry in historyjobs whose item3, item4 and item5 are equal to item3, item4 and item5 of the first list in targetjobs, add item1 of that entry to the list listsub.



  • Find the running mean of the items in listsub and reverse the list, store it in list



  • Check which items in listsub satisfy the condition item > a*0.9, and store those items in the list condsub.



  • Reopen the input file and check whether column0 is equal to any item in condsub. If it is, add column1 to the list condrun.



  • Open the output file and write: in column0, the second item of the first list in targetjobs, i.e. j; in column1, the average of the list condrun; in column2, (j-avg)/j; in column3, the maximum item in condrun; in column4, the minimum item in condrun; in column5, the length of condrun. The last four columns are based on the condition.



  • I repeat the whole procedure in a while loop, assigning the variable history_ends to the next item, int(targetjobs[1][0]).
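The output row described in the list above can be sketched roughly as follows. `summarize` is a hypothetical helper, not part of the posted script, and the sketch assumes `condrun` is non-empty and `j` is non-zero. The "last four columns" are left out because the description does not pin them down.

```python
from __future__ import division  # true division on Python 2 as well


def summarize(j, condrun):
    """Return (j, avg, (j-avg)/j, max, min, count) for one target job.

    Hypothetical helper: `j` and `condrun` follow the names used in the
    description; a non-empty `condrun` and a non-zero `j` are assumed.
    """
    avg = sum(condrun) / len(condrun)
    return (j, avg, (j - avg) / j, max(condrun), min(condrun), len(condrun))


row = summarize(100.0, [90.0, 110.0, 70.0])
print("\t".join(str(value) for value in row))
# prints the tab-separated values: 100.0 90.0 0.1 110.0 70.0 3
```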



```
from __future__ import division
import itertools

history_begins = 1; history_ends = 5000; n = 0; total = 0
historyjobs = []; targetjobs = []
listsub = []; listrun = []; listavg = [] ; F = [] ; condsub = [] ;condrun = [] ;mlistsub = []; a = []

def check(inputfile):

    f = open(inputfile,'r') #reads the inputfile
    lines = f.readlines()
    for line in lines:
        job = line.split()
        if( int(job[0]) history_ends
    k = 0
    for i, element in enumerate(historyjobs):
        if( (int(historyjobs[i][3]) == int(targetjobs[k][3])) and (int(historyjobs[i][4]) == int(targetjobs[k][4])) and (int(historyjobs[i][5]) == int(targetjobs[k][5])) ): #historyjobs list all contents in column3,col
```

Solution

Comments about coding style:

  • inconsistent indentation makes the code harder to read



  • the preferred indentation width in Python is 4 spaces



Why is the question tagged python3? This is Python 2 code.

history_begins = 1; history_ends = 5000; n = 0; total = 0
historyjobs = []; targetjobs = []
listsub = []; listrun = []; listavg = [] ; F = [] ; condsub = [] ;condrun = [] ;mlistsub = []; a = []


There are variables defined here that aren't actually used in the script, and global variables shouldn't have one-letter names.

f = open(inputfile,'r') #reads the inputfile


No, it doesn't; it just creates a file handle.

lines = f.readlines()
for line in lines:


This, however, does read the file, and it loads all of it into RAM at once, which is a waste because you don't actually need to. Iterate over the file object instead:

for line in f:
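A minimal sketch of that line-by-line iteration, assuming whitespace-separated columns as in the question. `read_jobs` is a hypothetical name, and the temporary file only stands in for the real input file.

```python
import os
import tempfile


def read_jobs(path):
    """Yield each non-empty line of `path` as a list of column strings."""
    with open(path) as handle:      # the file is closed when the block exits
        for line in handle:         # only one line is held in memory at a time
            columns = line.split()
            if columns:             # skip blank lines
                yield columns


# Tiny demonstration with a throwaway two-line file.
with tempfile.NamedTemporaryFile("w", suffix=".txt", delete=False) as tmp:
    tmp.write("1 10 0 2 3 4\n2 20 0 2 3 4\n")

jobs = list(read_jobs(tmp.name))
os.remove(tmp.name)
print(jobs)
```

The `with` block also takes care of closing the handle, which the bare `open()` call leaves to the caller.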


f1.write(str(j))
f1.write('\t')
f1.write('\t')
...
if (float(er1) < 0.50):
    f1.write("good")
    f1.write("\n")
else:
    f1.write("bad")
    f1.write("\n")


That's redundant; factor it out:

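from __future__ import print_function  # Python 2 only: goes at the top of the script

print(j, round(a,2), round(er1,3), c, d, g, h, sep='\t\t', end='\t\t', file=out)
w = ('bad', 'good')  # w[False] is 'bad', w[True] is 'good'
print(w[er1 < .2], w[er1 < .3], w[er1 < .4], w[er1 < .5], sep='\t', file=out)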
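To try the idea in isolation, here is a self-contained sketch written for Python 3. `write_row` and `LABELS` are hypothetical names, the sample values are made up, and an in-memory `io.StringIO` stands in for the output file.

```python
import io

LABELS = ('bad', 'good')  # indexed by a bool: False -> 0, True -> 1


def write_row(out, values, er1, thresholds=(.2, .3, .4, .5)):
    """Write the numeric columns, then one good/bad label per threshold."""
    print(*values, sep='\t\t', end='\t\t', file=out)
    print(*(LABELS[er1 < t] for t in thresholds), sep='\t', file=out)


out = io.StringIO()
write_row(out, [5001, 12.35, 0.25], er1=0.25)
print(out.getvalue(), end='')
```

With `er1 = 0.25` the first label is `bad` (0.25 is not below 0.2) and the remaining three are `good`.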


f = open('newfileinput','r') #again read the same inputfile


Why read the same file multiple times? That's inefficient…
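One way to avoid the second pass, as a minimal sketch: keep the rows parsed during the first pass and filter them in memory. `runtimes_for` is a hypothetical helper, and the sketch assumes column0 holds integer job ids and column1 a number.

```python
def runtimes_for(jobs, condsub):
    """Collect column1 of every job whose column0 appears in condsub."""
    wanted = set(condsub)           # set membership tests are cheap
    return [float(job[1]) for job in jobs if int(job[0]) in wanted]


# `jobs` stands for the rows already parsed during the first pass.
jobs = [["1", "10"], ["2", "20"], ["3", "30"]]
condrun = runtimes_for(jobs, condsub=[1, 3])
print(condrun)  # [10.0, 30.0]
```

Since the script already stores the parsed lines in historyjobs and targetjobs, reusing those lists should cost no extra memory.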

Context

StackExchange Code Review Q#30565, answer score: 2