Saving a column from a file to a list

Submitted by: @import:stackexchange-codereview

Problem

I want to read only one column from a file and save it in a list named column. I have this code, but I don't know how to optimize it. It is slow, and I need to call it many times.

column = (sum(1 for line in open(file) if line.strip("\n")))*[0]
counter_row = 0
counter_column = 0
with open(file, 'r') as f:
    for row in f:
        row = row.strip("\n")
        if row:
            for el_column in row.split(","):
                if counter_column==n_column:
                     column[counter_row]=el_column
                counter_column = counter_column + 1
        counter_row = counter_row + 1
        counter_column = 0

Solution

Your biggest problem is the first line:

column = (sum(1 for line in open(file) if line.strip("\n")))*[0]


Reading the file twice is bad for performance, and it probably costs much more than not knowing in advance how large a list to allocate. Furthermore, that statement allocates only one element per non-empty line, whereas counter_row counts every line in the file. So the code can crash with an IndexError whenever an empty line is followed by a non-empty one.
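
To make the mismatch concrete, here is a small sketch that compares the two counts. The sample text is made up for illustration, and io.StringIO stands in for the real file so the snippet runs on its own:

```python
import io

# Hypothetical sample: a blank line sits between two data rows.
sample = "a,b\n\nc,d\n"

# What the first statement allocates: one slot per non-empty line.
allocated = sum(1 for line in io.StringIO(sample) if line.strip("\n"))

# What counter_row reaches on the last line: one index per line, blank or not.
last_row_index = sum(1 for _ in io.StringIO(sample)) - 1

# The last data row would be written to an index that does not exist.
print(allocated, last_row_index)
```

With this sample the list has two slots (indices 0 and 1), but the last row is stored at index 2.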

The inner for loop is just a complicated way to do indexing.
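
As a rough illustration of that point, the sketch below runs the same search as the inner loop on one made-up row and compares it with plain indexing. The sample values are assumptions, and enumerate replaces the manual counter from the question:

```python
row = "10,20,30"  # made-up sample row
n_column = 1      # assumed column index

# Loop in the style of the question: visit every field, keep the matching one.
picked = None
for counter_column, el_column in enumerate(row.split(",")):
    if counter_column == n_column:
        picked = el_column

# Direct indexing selects the same field without visiting the others.
indexed = row.split(",")[n_column]

print(picked, indexed)
```

One difference to keep in mind: if a row has too few fields, the loop silently assigns nothing, whereas indexing raises an IndexError.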

Assuming that you want empty lines to be represented by 0, you could write:

column = []
with open(file) as f:
    for line in f:
        row = line.strip("\n")
        column.append(row.split(",")[n_column] if row else 0)


If you don't need to test for empty lines, then you could just write:

with open(file) as f:
    column = [line.strip("\n").split(",")[n_column] for line in f]


You should consider using the csv module, though, which can handle CSV quoting properly.
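
A minimal sketch of that approach follows. The function name read_column is hypothetical, and skipping blank lines (rather than storing 0 for them) is an assumption; csv.reader yields an empty list for a blank line, which is what the filter relies on:

```python
import csv

def read_column(path, n_column):
    """Return column n_column of a CSV file as a list of strings.

    Blank lines are skipped (an assumption, not the only reasonable choice).
    """
    # newline="" is what the csv documentation recommends when opening files.
    with open(path, newline="") as f:
        return [row[n_column] for row in csv.reader(f) if row]
```

Unlike split(","), this keeps a quoted field such as "x,y" together as a single value.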

Context

StackExchange Code Review Q#142736, answer score: 11
