Creating an empty Pandas DataFrame, and then filling it
Problem
I'm starting from the pandas DataFrame documentation here: Introduction to data structures
I'd like to iteratively fill the DataFrame with values in a time series kind of calculation. I'd like to initialize the DataFrame with columns A, B, and timestamp rows, all 0 or all NaN.
I'd then add initial values and go over this data calculating the new row from the row before, say row[A][t] = row[A][t-1]+1 or so.
I'm currently using the code as below, but I feel it's kind of ugly and there must be a way to do this with a DataFrame directly or just a better way in general.
import pandas as pd
import datetime as dt
import numpy as np  # np.zeros replaces the deprecated scipy.zeros alias

base = dt.datetime.today().date()
dates = [base - dt.timedelta(days=x) for x in range(9, -1, -1)]

valdict = {}
symbols = ['A', 'B', 'C']
for symb in symbols:
    valdict[symb] = pd.Series(np.zeros(len(dates)), dates)

# Each value is the previous day's value plus one
for thedate in dates:
    if thedate > dates[0]:
        for symb in valdict:
            valdict[symb][thedate] = 1 + valdict[symb][thedate - dt.timedelta(days=1)]
Solution
Here's a couple of suggestions:
Use date_range for the index:

import datetime
import pandas as pd
import numpy as np

todays_date = datetime.datetime.now().date()
index = pd.date_range(todays_date-datetime.timedelta(10), periods=10, freq='D')
columns = ['A','B', 'C']

Note: we could create an empty DataFrame (with NaNs) simply by writing:

df_ = pd.DataFrame(index=index, columns=columns)
df_ = df_.fillna(0) # With 0s rather than NaNs

To do this type of calculation for the data, use a NumPy array:

data = np.array([np.arange(10)]*3).T

Hence we can create the DataFrame:

In [10]: df = pd.DataFrame(data, index=index, columns=columns)

In [11]: df
Out[11]:
            A  B  C
2012-11-29  0  0  0
2012-11-30  1  1  1
2012-12-01  2  2  2
2012-12-02  3  3  3
2012-12-03  4  4  4
2012-12-04  5  5  5
2012-12-05  6  6  6
2012-12-06  7  7  7
2012-12-07  8  8  8
2012-12-08  9  9  9
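
The recurrence from the question (each row derived from the row before it) can be sketched along the same lines: do the row-by-row work in a preallocated NumPy array, and wrap the result in a DataFrame only once at the end. This is a minimal sketch, not a definitive implementation. The fixed start date and the name `values` are assumptions chosen so the example is reproducible, and the "+ 1" update stands in for whatever the real calculation is.

```python
import numpy as np
import pandas as pd

# Assumed fixed start date (hypothetical) so the example is reproducible.
index = pd.date_range("2012-11-29", periods=10, freq="D")
columns = ["A", "B", "C"]

# Preallocate a plain NumPy array: one row per timestamp, one column per symbol.
values = np.zeros((len(index), len(columns)))

# Row 0 holds the initial values (left at 0 here).
# Every later row is computed from the row before it.
for t in range(1, len(index)):
    values[t] = values[t - 1] + 1

# Build the DataFrame once, from the finished array.
df = pd.DataFrame(values, index=index, columns=columns)
print(df.tail(3))
```

For this particular "+ 1" rule the loop could be replaced by np.arange or np.cumsum; the explicit loop is kept because it carries over to recurrences that have no closed form. If all that is needed is a zero-filled frame, passing a scalar, as in pd.DataFrame(0.0, index=index, columns=columns), should build it directly with a float dtype.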
Code Snippets
import datetime
import pandas as pd
import numpy as np
todays_date = datetime.datetime.now().date()
index = pd.date_range(todays_date-datetime.timedelta(10), periods=10, freq='D')
columns = ['A','B', 'C']

df_ = pd.DataFrame(index=index, columns=columns)
df_ = df_.fillna(0) # With 0s rather than NaNs

data = np.array([np.arange(10)]*3).T
Context
Stack Overflow Q#13784192, score: 427