patternpythonMinor

Calculate middle values faster

Submitted by: @import:stackexchange-codereview·Mar 10, 2026·

Viewed 0 times

valuesfastercalculatemiddle

Problem

I want to improve the performance of this code:

```
#!/usr/bin/env python
# -- coding: utf-8 --

import numpy
import random
import math

random.seed(42)

class TrainModel:
def __init__(cls, scheduling_interval, trains):
cls.scheduling_interval = scheduling_interval
cls.trains = trains

class Train:
def __init__( cls, legs):
cls.legs = legs

class Leg: # Teilfahrt
def __init__(cls, travel_time, earliest_departure_time, latest_departure_time,
power_profile):
cls.travel_time = travel_time # Fahrzeit
cls.earliest_departure_time = earliest_departure_time # Früheste Abfraht
cls.latest_departure_time = latest_departure_time # Späteste Abfahrt
cls.power_profile = power_profile # Leistungsprofil

# Test model
scheduling_interval = 60*24 # one day in minutes
train_number = 10
trains = []
for i in xrange(train_number):
legs = []
leg_number = random.randint(5,10)
for j in xrange(leg_number):
travel_time = random.randint(5,40)
earliest_departure_time = random.randint(0,scheduling_interval-100)
latest_departure_time = earliest_departure_time+7
power_profile = numpy.random.rand(travel_time*60+1)
legs.append(Leg(travel_time, earliest_departure_time, latest_departure_time, power_profile))
train = Train(legs)
trains.append(train)

train_model = TrainModel(scheduling_interval, trains)

def calc_middle(t, profile):
x_min = int(min((901-1)*(t+1), len(profile)-1))
sum_m = sum(a for a in profile[(901-1)*t+1:x_min] if a > 0)
sum_m += max(0.5 profile[(901-1)t], 0)
sum_m += max(0.5 * profile[x_min], 0)
return sum_m

storage = {}

# Slow loops profiling shows that most of the time is spend in the sum function of calc_middle
number_of_middle_intervals = int(math.ceil((train_model.scheduling_interval*60+1)/901))
for t in xrange(len(train_model.trains)):
for r in xrange(len(train_model.trains[t].legs)):
current_t

Solution

Running a profiler over your code shows that most of the time is spent in this line in calc_middle():

sum_m = sum(a for a in profile[(901-1)*t+1:x_min] if a > 0)

So if we want to make this go faster, we need to speed up this line. Some suggestions:

-
Pre-filter the negative elements with numpy.clip().

The same array is coming through this function many times – wouldn’t it be cheaper to filter out negative elements just once?

We always get full_profile being passed in here, so add

full_profile = full_profile.clip(0)

before entering the for i in xrange(number_of_middle_intervals) loop.

This lets you reduce calc_middle to:

def calc_middle(t, profile):
    x_min = int(min((901-1)*(t+1), len(profile)-1))
    profile = profile.clip(0)
    return sum(profile[(901-1)*t:x_min+1])

There's a nice little saving.

-
Use numpy.sum().

Since the slice is now a pure numpy array, rather than a Python generator expression, we can call numpy.sum() here for extra speed:

def calc_middle(t, profile):
    x_min = int(min((901-1)*(t+1), len(profile)-1))
    profile = profile.clip(0)
    return numpy.sum(profile[(901-1)*t:x_min+1])

Reprofiling the new code highlights another line which is chewing up cycles: computing x_min, e.g.:

x_min = int(min((901-1)*(t+1), len(profile)-1))

You can make a few tweaks for speed here:

Reuse values where possible

Don't unnecessarily cast to int()

Only compute this value if you actually need to

Knowing that this line is quite a sink, here’s a rewrite of the innermost for loop that tries to speed it up:

x_start = 900 * i
if x_start > dim_end:
    tmp_sum = 0
else:
    x_min = min(901 * i, len(full_profile)-1)
    if x_min < dim_start:
        tmp_sum = 0
    else:
        tmp_sum = numpy.sum(full_profile[x_start:x_min+1])

There are probably more optimisations to be made; this is just based on an hour or so poking around before bed. In informal testing, it takes the time to run with train_number=24 from ~2.4s to ~0.6s. Better, but probably not yet suitable for 24000 trains.

I thought about it a bit more this morning. Here are a few more things I found. None of these are as significant, but they’re useful savings.

-
If you compute x_start as:

x_start = 0
for i in xrange(number_of_middle_intervals):
    # do stuff
    x_start += 900

rather than

for i in xrange(number_of_middle_intervals):
    x_start = 900 * i

I get a small saving. Saves about 0.2s when running with train_number=200 (from 4.69s to 4.5s).

-
Since x_start is increasing with every iteration of the loop, when you get x_start > dim_end, you know that this will always be true for the current value of the j-loop. So you can fill in the rest of power_value_per_middle_interval with zeroes:

if x_start > dim_end:
    remaining_length = len(power_value_per_middle_intervall) - number_of_middle_intervals
    power_value_per_middle_intervall += [0] * remaining_length
    break

4.5s ~> 4.12s.

Code Snippets

sum_m = sum(a for a in profile[(901-1)*t+1:x_min] if a > 0)

full_profile = full_profile.clip(0)

def calc_middle(t, profile):
    x_min = int(min((901-1)*(t+1), len(profile)-1))
    profile = profile.clip(0)
    return sum(profile[(901-1)*t:x_min+1])

def calc_middle(t, profile):
    x_min = int(min((901-1)*(t+1), len(profile)-1))
    profile = profile.clip(0)
    return numpy.sum(profile[(901-1)*t:x_min+1])

x_min = int(min((901-1)*(t+1), len(profile)-1))

Context

StackExchange Code Review Q#114853, answer score: 7

Revisions (0)

No revisions yet.