Getting columns from lines of uneven length

Submitted by: @import:stackexchange-codereview

Problem

I am given data pasted from a table, so spaces act as delimiters but also appear inside some of the fields I don't care about. I want the first field and the last three fields, and I got them using this code:

testdata = """8/5/15 stuffidontneed custid locid 55.00
8/9/15 stuff i really dont need with extra spaces custid otherlocid 79.00"""
rows = testdata.split('\n')
tupls = [row.split(' ') for row in rows]
dates = [tupl[0] for tupl in tupls]
custids, locids, amounts = ([tupl[i] for tupl in tupls] for i in range (-3, 0))
print(dates, custids, locids, amounts)
# ['8/5/15', '8/9/15'] ['custid', 'custid'] ['locid', 'otherlocid'] ['55.00', '79.00']


I thought there might be a more elegant way to do this, perhaps by capturing the data in the middle as a single field.

Edit: I have attempted to add delimiters using re.finditer, but I can't replace the matches easily.

Solution

After you have tupls:

data = [(t[0], t[-3], t[-2], t[-1]) for t in tupls]  # Or use range...
print(list(zip(*data)))


Gives:

[('8/5/15', '8/9/15'), ('custid', 'custid'), ('locid', 'otherlocid'), ('55.00', '79.00')]
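
Because zip(*data) yields one tuple per column, the result can also be unpacked straight into the four names used in the question. A minimal, self-contained sketch follows; note that splitlines() and split() with no argument are substitutions here for split('\n') and split(' '), chosen on the assumption that runs of repeated whitespace should count as a single delimiter:

```python
testdata = """8/5/15 stuffidontneed custid locid 55.00
8/9/15 stuff i really dont need with extra spaces custid otherlocid 79.00"""

# split() with no argument splits on any run of whitespace (an assumption
# about the pasted data; the question used split(' '))
tupls = [row.split() for row in testdata.splitlines()]

# Keep the first field and the last three fields of every row
data = [(t[0], t[-3], t[-2], t[-1]) for t in tupls]

# zip(*data) transposes rows into columns; each column comes back as a tuple
dates, custids, locids, amounts = zip(*data)
print(dates, custids, locids, amounts)
# ('8/5/15', '8/9/15') ('custid', 'custid') ('locid', 'otherlocid') ('55.00', '79.00')
```

The columns are tuples rather than lists; wrap each in list() if mutable lists are needed.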


So this:

testdata = """8/5/15 stuffidontneed custid locid 55.00
8/9/15 stuff i really dont need with extra spaces custid otherlocid 79.00"""
rows = testdata.split('\n')
tupls = [row.split(' ') for row in rows]
dates = [tupl[0] for tupl in tupls]
custids, locids, amounts = ([tupl[i] for tupl in tupls] for i in range (-3, 0))
print(dates, custids, locids, amounts)


Becomes:

testdata = """8/5/15 stuffidontneed custid locid 55.00
8/9/15 stuff i really dont need with extra spaces custid otherlocid 79.00"""
rows = testdata.split('\n')
tupls = [row.split(' ') for row in rows]
data = [(t[0], t[-3], t[-2], t[-1]) for t in tupls]  # Or use range...
print(list(zip(*data)))
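
The question also asks about capturing the data in the middle as a single field. One possible way to do that, not part of the answer above, is Python 3's extended iterable unpacking: a starred name soaks up however many fields sit between the first one and the last three. The sketch below assumes every row has at least four whitespace-separated fields (a shorter row raises ValueError); the names records and middle are illustrative only:

```python
testdata = """8/5/15 stuffidontneed custid locid 55.00
8/9/15 stuff i really dont need with extra spaces custid otherlocid 79.00"""

records = []
for row in testdata.splitlines():
    # *middle collects every field between the first and the last three
    date, *middle, custid, locid, amount = row.split()
    # Rejoin the middle fields so they travel as a single string
    records.append((date, " ".join(middle), custid, locid, amount))

for record in records:
    print(record)
# ('8/5/15', 'stuffidontneed', 'custid', 'locid', '55.00')
# ('8/9/15', 'stuff i really dont need with extra spaces', 'custid', 'otherlocid', '79.00')
```

Rejoining with a single space does not preserve the original spacing inside the middle field, which should not matter if that field is being discarded anyway.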

Context

StackExchange Code Review Q#140410, answer score: 2
