Calculating speed from a Pandas Dataframe with Time, X, and Y columns
Problem
I'm trying to calculate speed between consecutive timepoints given data in a '.csv' file with the following columns: "Time Elapsed", "x", and "y". The ultimate goal is to get the data into a format where I can plot "Time Elapsed" vs. "speed".
I'm fairly sure that my implementation is doing what I want, but it's certainly possible (and likely) that I overlooked something. I'm also wondering whether there are faster/more efficient ways (in Python) to perform these calculations?
import pandas as pd
import numpy as np

def calculate_speeds(path_to_csv):
    data_df = pd.read_csv(path_to_csv)
    xy = data_df[['x', 'y']]
    b = np.roll(xy, -1, axis=0)[:-1]
    a = xy[:-1]
    dxy = np.linalg.norm(a - b, axis=1)
    dt = (np.roll(data_df['Time Elapsed'], -1) - data_df['Time Elapsed'])[:-1]
    speeds = np.divide(dxy, dt)
    speed_df = pd.DataFrame(data={'Time Elapsed': data_df['Time Elapsed'][:-1], 'Speed': speeds})
    return speed_df

Solution
The code reads well enough and is pretty straightforward. You should consider improving its quality by adding documentation, in the form of a docstring, explaining what kind of data it expects. On that note, you should extract the I/O part out of the function and feed it the required data instead, for better reusability and testing.
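A minimal sketch of those two suggestions, assuming the input is already sorted by "Time Elapsed" and all three columns are numeric. It swaps the roll-and-slice step for `np.diff`, which is a substitution made here for brevity and not part of the original code; the `sample` data is invented purely for illustration.

```python
import numpy as np
import pandas as pd


def calculate_speeds(positions_over_time):
    """Return the speed between consecutive rows of a trajectory.

    Expects a DataFrame with numeric columns 'Time Elapsed', 'x' and 'y',
    sorted by 'Time Elapsed'. The result has one row fewer than the input:
    each speed is reported at the start time of its interval.
    """
    xy = positions_over_time[['x', 'y']].to_numpy()
    elapsed = positions_over_time['Time Elapsed'].to_numpy()

    # Straight-line distance and time difference between consecutive rows.
    distances = np.linalg.norm(np.diff(xy, axis=0), axis=1)
    durations = np.diff(elapsed)

    return pd.DataFrame({
        'Time Elapsed': elapsed[:-1],
        'Speed': distances / durations,
    })


# Hypothetical sample data: because the function takes a DataFrame,
# it can be exercised without touching the file system.
sample = pd.DataFrame({
    'Time Elapsed': [0.0, 1.0, 3.0],
    'x': [0.0, 3.0, 3.0],
    'y': [0.0, 4.0, 0.0],
})
speeds = calculate_speeds(sample)
print(speeds)
```

Keeping `pd.read_csv` at the call site means the same function works on data from any source, and a test only needs a small hand-built DataFrame.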
One other thing to note is that you perform the same roll twice on different columns of the same dataframe. You can combine these operations and write the equation in a more ... equationy-ish way:
import pandas as pd
import numpy as np

def calculate_speeds(positions_over_time):
    time = 'Time Elapsed'
    movements_over_timesteps = (
        np.roll(positions_over_time, -1, axis=0)
        - positions_over_time)[:-1]
    speeds = np.sqrt(
        movements_over_timesteps.x ** 2 +
        movements_over_timesteps.y ** 2
    ) / movements_over_timesteps[time]
    return pd.DataFrame({
        time: positions_over_time[time][:-1],
        'Speed': speeds,
    })

if __name__ == '__main__':
    data_df = pd.read_csv(path_to_csv)
    calculate_speeds(data_df)

Code Snippets
import pandas as pd
import numpy as np

def calculate_speeds(positions_over_time):
    time = 'Time Elapsed'
    movements_over_timesteps = (
        np.roll(positions_over_time, -1, axis=0)
        - positions_over_time)[:-1]
    speeds = np.sqrt(
        movements_over_timesteps.x ** 2 +
        movements_over_timesteps.y ** 2
    ) / movements_over_timesteps[time]
    return pd.DataFrame({
        time: positions_over_time[time][:-1],
        'Speed': speeds,
    })

if __name__ == '__main__':
    data_df = pd.read_csv(path_to_csv)
    calculate_speeds(data_df)

Context
StackExchange Code Review Q#158688, answer score: 3