Calculating speed from a Pandas Dataframe with Time, X, and Y columns

Problem

I'm trying to calculate the speed between consecutive timepoints, given data in a '.csv' file with the following columns: "Time Elapsed", "x", and "y". The ultimate goal is to get the data into a format where I can plot "Time Elapsed" vs. "speed".

I'm fairly sure that my implementation is doing what I want, but it's certainly possible (and likely) that I overlooked something. Are there also faster or more efficient ways (in Python) to perform these calculations?

import pandas as pd
import numpy as np

def calculate_speeds(path_to_csv):
    data_df = pd.read_csv(path_to_csv)

    xy = data_df[['x', 'y']]

    b = np.roll(xy, -1, axis=0)[:-1]
    a = xy[:-1]
    dxy = np.linalg.norm(a - b, axis=1)   

    dt = (np.roll(data_df['Time Elapsed'], -1) - data_df['Time Elapsed'])[:-1]

    speeds = np.divide(dxy, dt)

    speed_df = pd.DataFrame(data={'Time Elapsed':data_df['Time Elapsed'][:-1],'Speed':speeds})

    return speed_df

Solution

The code reads well enough and is pretty straightforward. You could improve it by adding a docstring that explains what kind of data the function expects. On that note, you should move the I/O out of the function and pass it the data it needs, which makes it easier to reuse and to test.

One other thing to note is that you perform the same roll twice, on different columns of the same dataframe. You can combine these operations and write the computation in a form that reads more like the equation:

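import sys

import numpy as np
import pandas as pd


def calculate_speeds(positions_over_time):
    """Compute the speed between consecutive rows.

    Expects a DataFrame whose columns are all numeric and include
    'Time Elapsed', 'x' and 'y'. Returns a DataFrame with one row
    fewer than the input.
    """
    time = 'Time Elapsed'

    # Next row minus current row, for every column at once; the last row
    # wraps around to the first, so it is dropped.
    movements_over_timesteps = (
        np.roll(positions_over_time, -1, axis=0)
        - positions_over_time)[:-1]

    speeds = np.sqrt(
        movements_over_timesteps.x ** 2 +
        movements_over_timesteps.y ** 2
    ) / movements_over_timesteps[time]

    return pd.DataFrame({
        time: positions_over_time[time][:-1],
        'Speed': speeds,
    })


if __name__ == '__main__':
    # Path to the CSV file, taken from the first command-line argument.
    data_df = pd.read_csv(sys.argv[1])
    print(calculate_speeds(data_df))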
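
As a possible alternative to rolling the array, pandas' own `DataFrame.diff()` computes row-to-row differences directly. The sketch below is not part of the original answer: the function name `calculate_speeds_with_diff`, the `time` keyword argument, and the sample data are made up for illustration, and it assumes the three columns are numeric. It keeps the convention of labelling each speed with the time at the start of its interval.

```python
import numpy as np
import pandas as pd


def calculate_speeds_with_diff(positions_over_time, time='Time Elapsed'):
    """Return the speed between consecutive rows.

    Expects a DataFrame with numeric columns `time`, 'x' and 'y'.
    """
    # diff() subtracts the previous row from each row, so the first
    # row is all NaN and is dropped.
    steps = positions_over_time[[time, 'x', 'y']].diff().iloc[1:]

    # Euclidean distance covered in each step, divided by its duration.
    speeds = np.hypot(steps['x'], steps['y']) / steps[time]

    # Label each speed with the time at the start of its interval.
    # Plain arrays are used so that index alignment cannot interfere.
    return pd.DataFrame({
        time: positions_over_time[time].to_numpy()[:-1],
        'Speed': speeds.to_numpy(),
    })


# Hypothetical sample data: a 3-4-5 step in one second, then no
# movement for two seconds.
sample = pd.DataFrame({
    'Time Elapsed': [0.0, 1.0, 3.0],
    'x': [0.0, 3.0, 3.0],
    'y': [0.0, 4.0, 4.0],
})
result = calculate_speeds_with_diff(sample)
print(result)
```

Because the function takes a DataFrame rather than a file path, it can be exercised with a small in-memory table like `sample`, which is the reusability and testing benefit mentioned above.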

Context

StackExchange Code Review Q#158688, answer score: 3
