patternpythonMinor
Calculate query coverage from BLAST output
Viewed 0 times
queryoutputblastcalculatefromcoverage
Problem
I have a BLAST output file and want to calculate query coverage, appending the query lengths as an additional column to the output. Let's say I have
2 7 15f=open('file.txt', 'r')
lines=f.readlines()
import re
for line in lines:
new_list=re.split(r'\t+',line.strip())
q_start=new_list[0]
q_end=new_list[1]
q_len=new_list[3]
q_cov=((float(q_end)-float(q_start))/float(q_len))*100
q_cov=round(q_cov,1)
q_cov=str(q_cov)
new_list.append(q_cov)
r=open('results.txt', 'a')
x='\t'.join(new_list)
x=x+'\n'
r.writelines(x)
f.close()
r.close()Solution
One serious bug is that you open
Since you want to treat your
Put your
results.txt for each line of input. It's almost always better to open files in a with block. Then, you won't have to worry about closing your filehandles, even if the code exits abnormally. The with block would have made your results.txt mistake obvious as well.Since you want to treat your
q_start, q_end, and q_len as numbers, I wouldn't even bother to assign their string representations to a variable. Just convert them to a float as soon as possible. Similarly, q_cov should be a float; I would just stringify it at the last moment. I would also postpone rounding just for the purposes of formatting the output, preferring to preserve precision in q_cov itself.Put your
import statements at the beginning of the program.import re
with open('file.txt') as input, open('results.txt', 'a') as output:
for line in input.readlines():
fields = re.split(r'\t+', line.strip())
q_start, q_end, q_len = map(float, (fields[0], fields[1], fields[3]))
q_cov = 100 * (q_end - q_start) / q_len
print >>output, '\t'.join(fields + [str(round(q_cov, 1))])Code Snippets
import re
with open('file.txt') as input, open('results.txt', 'a') as output:
for line in input.readlines():
fields = re.split(r'\t+', line.strip())
q_start, q_end, q_len = map(float, (fields[0], fields[1], fields[3]))
q_cov = 100 * (q_end - q_start) / q_len
print >>output, '\t'.join(fields + [str(round(q_cov, 1))])Context
StackExchange Code Review Q#39879, answer score: 4
Revisions (0)
No revisions yet.