Extracting specific rows and columns from a CSV file

Submitted by: @import:stackexchange-codereview

Problem

I have written a function to selectively extract data from a file.
I want to extract only a given range of lines and, within each line, only a given span of columns.

Would converting this function into a generator reduce the overhead when I need to process large files?

import itertools
import csv

def data_extraction(filename,start_line,lenght,span_start,span_end):
    with open(filename, "r") as myfile:
        file_= csv.reader(myfile, delimiter=' ')  #extracts data from .txt as lines
        return (x for x in [filter(lambda a: a != '', row[span_start:span_end]) \
        for row in itertools.islice(file_, start_line, lenght)])

Solution

Use round parentheses to build a generator expression instead of a list.

The x for x in wrapper was also unnecessary:

return (filter(lambda a: a != '', row[span_start:span_end]) \
    for row in itertools.islice(file_, start_line, lenght))
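
One caveat with returning the generator expression from inside the with block: the file is closed as soon as the function returns, so consuming the generator afterwards raises ValueError (I/O operation on closed file). A minimal sketch of one way around this, assuming Python 3, is to make the function itself a generator so the file stays open while rows are being consumed. The parameter name stop_line is a hypothetical renaming of lenght, chosen because itertools.islice takes a stop index rather than a length, and a list comprehension stands in for filter so that each row is a plain list.

```python
import csv
import itertools


def data_extraction(filename, start_line, stop_line, span_start, span_end):
    """Lazily yield the non-empty fields of the selected columns, line by line."""
    with open(filename, "r", newline="") as myfile:
        reader = csv.reader(myfile, delimiter=" ")
        # islice(reader, start, stop): stop is an index, not a number of lines.
        for row in itertools.islice(reader, start_line, stop_line):
            # Slice first, then drop the empty strings that consecutive
            # spaces produce, matching the order used in the original code.
            yield [field for field in row[span_start:span_end] if field != ""]
```

Because the with block is only left once the generator is exhausted or closed, only one line needs to be held in memory at a time.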


If you use Python 2, you should use itertools.ifilter, because it returns a lazy iterator, whereas filter builds a full list. In Python 3, filter is already lazy and itertools.ifilter no longer exists.
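
To illustrate the lazy behaviour under Python 3, here is a small sketch with a made-up row; the variable names are only for illustration. The filter object produces values on demand and can be consumed only once.

```python
# A hypothetical row, as csv.reader would return it for "1  2 " with delimiter=' '.
row = ["1", "", "2", ""]

lazy_fields = filter(lambda a: a != "", row)
type_name = type(lazy_fields).__name__  # "filter": an iterator, not a list

first_pass = list(lazy_fields)   # materialises the values
second_pass = list(lazy_fields)  # the iterator is now exhausted

print(type_name)
print(first_pass)
print(second_pass)
```

If callers need to index into a row or iterate over it more than once, wrapping the result in list() (or using a list comprehension) is the safer choice.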

The function is pretty clear overall. I suggest spacing the argument list according to PEP 8 conventions. Also look into argument formats that are easier to remember, such as f(file, line_range, inline_range), where two tuples replace four arguments.
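
The two-tuple signature could look roughly like the sketch below. The function name extract_fields is hypothetical, and for brevity it takes an iterable of already parsed rows rather than a file name; each tuple is assumed to be a (start, stop) pair with the usual half-open slice semantics.

```python
import csv
import io
import itertools


def extract_fields(rows, line_range, inline_range):
    """Yield the non-empty fields within inline_range for rows in line_range."""
    line_start, line_stop = line_range
    columns = slice(*inline_range)  # (start, stop) tuple -> slice object
    for row in itertools.islice(rows, line_start, line_stop):
        yield [field for field in row[columns] if field != ""]


if __name__ == "__main__":
    # In-memory sample standing in for a real file.
    sample = io.StringIO("a b c\n1  2 3\n4 5 6\n")
    reader = csv.reader(sample, delimiter=" ")
    print(list(extract_fields(reader, (1, 3), (0, 3))))
    # → [['1', '2'], ['4', '5', '6']]
```

Grouping the bounds this way keeps related values together at the call site, so it is harder to pass them in the wrong order.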

Context

StackExchange Code Review Q#142142, answer score: 5
