HiveBrain v1.2.0
Get Started
← Back to all entries
patternpythonMinor

Filling in gym membership prices by joining PANDAS dataframes

Submitted by: @import:stackexchange-codereview··
0
Viewed 0 times
pandasmembershipgymfillingpricesdataframesjoining

Problem

I have two dataframes, df which contains a list of members in addition to the type of contract that they purchased on a given date, df is about 10 000 entries. I have another df_prices which contains the average price of a given contract for a given year and month.

df resembles:

Member Nbr      Date-Joined         Contract Type
1           1        2010-03-31     1 Year Membership
2           1        2011-04-16     1 Year Membership
3           1        2012-08-06     1 Year Membership
4           1        2013-08-21     1 Year Membership
5           1        2014-08-31     1 Year Membership
6           2        2015-09-03     1 Year Membership
7           2        2012-12-10     4 Month Membership
8           2        2013-03-13     1 Year Membership
9           3        2014-03-15     1 Year Membership
10          3        2010-02-09     1 Year Membership
...
10095    7374        2016-02-29     1 Month Membership
10096    7375        2016-03-01     1 Year Membership
10097    7376        2016-03-01     1 Month Membership
10098    7378        2016-03-04     1 Month Membership
10099    7379        2016-03-06     1 Month Membership
10100    7380        2016-03-05     1 Year Membership
10101    7387        2016-03-10     3 Month Membership


and df_prices

```
Date Description Amount
0 2010-01-31 1 Month Membership 54.036316
1 2010-01-31 1 Year Membership 325.000000
2 2010-01-31 4 Month Membership 147.642353
3 2010-01-31 7 Month Membership 227.890000
5 2010-02-28 1 Month Membership 55.283846
6 2010-02-28 1 Year Membership 333.250000
7 2010-02-28 4 Month Membership 146.257358
8 2010-02-28 6 Month Membership 165.000000
9 2010-02-28 7 Month Membership 223.905714
10 2010-02-28 Pool Only 250.000000
...
462 2015-12-31 4 Month Membership 146.390000
463 2015-12-31

Solution

Consider a left side merge to match member data with corresponding prices by contract type and date. No for loops or external list needed for this approach:

df = pd.read_excel(location)
df_prices = pd.read_excel(location2)
df_prices.fillna(method='ffill',inplace=True)

finaldf = pd.merge(df, df_prices, left_on=['Date-Joined','Contract Type'],
                   right_on=['Date','Description'],
                   how='left')[['Member Nbr','Date-Joined','Contract Type','Amount']]

finaldf.to_excel(location, index=False)

Code Snippets

df = pd.read_excel(location)
df_prices = pd.read_excel(location2)
df_prices.fillna(method='ffill',inplace=True)

finaldf = pd.merge(df, df_prices, left_on=['Date-Joined','Contract Type'],
                   right_on=['Date','Description'],
                   how='left')[['Member Nbr','Date-Joined','Contract Type','Amount']]

finaldf.to_excel(location, index=False)

Context

StackExchange Code Review Q#135012, answer score: 3

Revisions (0)

No revisions yet.