patternpythonMinor
Filling in gym membership prices by joining PANDAS dataframes
Viewed 0 times
pandasmembershipgymfillingpricesdataframesjoining
Problem
I have two dataframes,
and
```
Date Description Amount
0 2010-01-31 1 Month Membership 54.036316
1 2010-01-31 1 Year Membership 325.000000
2 2010-01-31 4 Month Membership 147.642353
3 2010-01-31 7 Month Membership 227.890000
5 2010-02-28 1 Month Membership 55.283846
6 2010-02-28 1 Year Membership 333.250000
7 2010-02-28 4 Month Membership 146.257358
8 2010-02-28 6 Month Membership 165.000000
9 2010-02-28 7 Month Membership 223.905714
10 2010-02-28 Pool Only 250.000000
...
462 2015-12-31 4 Month Membership 146.390000
463 2015-12-31
df which contains a list of members in addition to the type of contract that they purchased on a given date, df is about 10 000 entries. I have another df_prices which contains the average price of a given contract for a given year and month. df resembles:Member Nbr Date-Joined Contract Type
1 1 2010-03-31 1 Year Membership
2 1 2011-04-16 1 Year Membership
3 1 2012-08-06 1 Year Membership
4 1 2013-08-21 1 Year Membership
5 1 2014-08-31 1 Year Membership
6 2 2015-09-03 1 Year Membership
7 2 2012-12-10 4 Month Membership
8 2 2013-03-13 1 Year Membership
9 3 2014-03-15 1 Year Membership
10 3 2010-02-09 1 Year Membership
...
10095 7374 2016-02-29 1 Month Membership
10096 7375 2016-03-01 1 Year Membership
10097 7376 2016-03-01 1 Month Membership
10098 7378 2016-03-04 1 Month Membership
10099 7379 2016-03-06 1 Month Membership
10100 7380 2016-03-05 1 Year Membership
10101 7387 2016-03-10 3 Month Membershipand
df_prices```
Date Description Amount
0 2010-01-31 1 Month Membership 54.036316
1 2010-01-31 1 Year Membership 325.000000
2 2010-01-31 4 Month Membership 147.642353
3 2010-01-31 7 Month Membership 227.890000
5 2010-02-28 1 Month Membership 55.283846
6 2010-02-28 1 Year Membership 333.250000
7 2010-02-28 4 Month Membership 146.257358
8 2010-02-28 6 Month Membership 165.000000
9 2010-02-28 7 Month Membership 223.905714
10 2010-02-28 Pool Only 250.000000
...
462 2015-12-31 4 Month Membership 146.390000
463 2015-12-31
Solution
Consider a left side merge to match member data with corresponding prices by contract type and date. No
for loops or external list needed for this approach:df = pd.read_excel(location)
df_prices = pd.read_excel(location2)
df_prices.fillna(method='ffill',inplace=True)
finaldf = pd.merge(df, df_prices, left_on=['Date-Joined','Contract Type'],
right_on=['Date','Description'],
how='left')[['Member Nbr','Date-Joined','Contract Type','Amount']]
finaldf.to_excel(location, index=False)Code Snippets
df = pd.read_excel(location)
df_prices = pd.read_excel(location2)
df_prices.fillna(method='ffill',inplace=True)
finaldf = pd.merge(df, df_prices, left_on=['Date-Joined','Contract Type'],
right_on=['Date','Description'],
how='left')[['Member Nbr','Date-Joined','Contract Type','Amount']]
finaldf.to_excel(location, index=False)Context
StackExchange Code Review Q#135012, answer score: 3
Revisions (0)
No revisions yet.