Working with pandas dataframes for stock backtesting exercise
Problem
I'm attempting to apply a long set of conditions and operations to a pandas dataframe (see the dataframe below with VTI, upper, lower, etc.). I attempted to use apply, but I had a lot of trouble doing so. My current solution (which works perfectly) relies on a for loop iterating through the dataframe. But my sense is that this is an inefficient way to complete my simulation. I'd appreciate help on the design of my code.
    VTI    upper  lower  sell   buy    AU     BU     BL     date        Tok  order
    44.58  NaN    NaN    False  False  False  False  False  2001-06-15  5    0
    44.29  NaN    NaN    False  False  False  False  False  2001-06-18  5    1
    44.42  NaN    NaN    False  False  False  False  False  2001-06-19  5    2
    44.88  NaN    NaN    False  False  False  False  False  2001-06-20  5    3
    45.24  NaN    NaN    False  False  False  False  False  2001-06-21  5    4
If I wanted to run a bunch of conditions and for loops like the below, and run the function below (the get row data function) only if the row meets the conditions provided, how would I do so?
My intuition says to use .apply(), but I'm not clear how to do it in this scenario. With all the ifs and for loops combined, it's a lot of rows. The below actually outputs an entirely new dataframe. I'm wondering if there are more efficient or better ways to think about the design of this simulation/stock backtesting process.
Get row data simply obtains key data from the dataframe, computes certain information based on globals (like how much capital I already have and how many stocks I already hold), and spits out a list. I append all these lists into a dataframe that I call the portfolio.
I've given a snippet of the code that I've already made using a for loop.
    ## This is only for the sell portion of the algorithm
    if val['sell'] == True and tokens == maxtokens:
        print('nothing to sell')
    if val['sell'] == True and tokens = sellbuybuffer:
        status = 'sold'
    #This
Solution
Pandas allows you to filter dataframes efficiently using boolean formulas.
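As a small illustration of what such a boolean formula looks like, here is a minimal sketch. The column names follow the question, but the values and the maxtokens cap are made up for the example.

```python
import pandas as pd

# Toy data: column names mirror the question, the values are invented.
portfolio = pd.DataFrame({
    "VTI":  [44.58, 44.29, 44.42, 44.88],
    "sell": [False, True, True, False],
    "Tok":  [5, 3, 5, 2],
})
maxtokens = 5  # assumed cap on tokens

# Each comparison yields a boolean Series aligned with the rows;
# & combines them element-wise (note the parentheses).
mask = portfolio["sell"] & (portfolio["Tok"] < maxtokens)

# Indexing with the mask keeps only the rows where it is True.
sellable = portfolio[mask]
print(sellable)
```

Only the second row has sell set and fewer than maxtokens tokens, so it is the only row kept.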
Instead of using a for loop and conditional branching, use the following syntax:
    df = portfolio[(portfolio['sell'] == True) & (portfolio['Tok'] < maxtokens)]
To sort a dataframe, you can also simply write:
    portfolio = portfolio.sort_values('VTI', ascending=False)
    sold_positions = portfolio[portfolio['BL'] == True].sort_values('upperlower', ascending=True)
Code Snippets
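For the question's case of running a per-row function only on rows that meet the conditions, one possible approach is to filter with a boolean mask first and then call apply with axis=1 on what remains. The sketch below is only an assumption about how that could look: get_row_data here is a hypothetical stand-in for the question's helper, and the data and maxtokens value are invented.

```python
import pandas as pd

prices = pd.DataFrame({
    "date": pd.to_datetime(["2001-06-15", "2001-06-18", "2001-06-19"]),
    "VTI":  [44.58, 44.29, 44.42],
    "sell": [False, True, True],
    "Tok":  [5, 3, 5],
})
maxtokens = 5  # assumed cap, as in the question


def get_row_data(row):
    # Hypothetical stand-in for the question's helper. It only uses values
    # from the row itself, not running totals such as capital.
    return pd.Series({
        "date": row["date"],
        "status": "sold",
        "value": row["VTI"] * row["Tok"],
    })


# Filter first, then apply the function row by row (axis=1) to the survivors.
to_sell = prices[prices["sell"] & (prices["Tok"] < maxtokens)]
portfolio = to_sell.apply(get_row_data, axis=1)
print(portfolio)
```

This works when each row can be processed on its own. If the function depends on state that changes from row to row (capital on hand, tokens already held), the rows are no longer independent, and an explicit loop may still be the simpler design.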
    df = portfolio[(portfolio['sell'] == True) & (portfolio['Tok'] < maxtokens)]
    portfolio = portfolio.sort_values('VTI', ascending=False)
    sold_positions = portfolio[portfolio['BL'] == True].sort_values('upperlower', ascending=True)
Context
StackExchange Code Review Q#43517, answer score: 3