Does this code actually give a valid representation of how quick a system is?
Problem
So I wrote this code a while ago as a way of seeing how many times it could compute a simple sum in a second (one tick). I was wondering if this gives me valid results or is in fact giving me bogus ones.
from datetime import datetime
from datetime import timedelta

start_time = datetime.now()

def millis():
    dt = datetime.now() - start_time
    ms = (dt.days * 24 * 60 * 60 + dt.seconds) * 1000 + dt.microseconds / 1000.0
    return ms

def tickscheck(start):
    x = 0
    count = 0
    while millis() - start < 1000:
        x = 4+5
        #counting up
        count = count + 1
    print("It Took " + str(count) + " Counts\nOver " + str(millis() - start) + "ticks")

running = True
while(running == True):
    tickscheck(millis())
    restart = input("Do You Want Restart?")
    restart = restart.lower()
    if restart in ("yes", "y", "ok", "sure", ""):
        print("Restarting")
    else:
        print("closing Down")
        running = False

Solution
No, your benchmark is completely bogus. Let's look at the main loop:
while millis() - start < 1000:
    x = 4+5
    #counting up
    count = count + 1

The addition 4+5 can be constant-folded when the code is compiled. If you are lucky, the assignment remains although a constant assignment might be lifted out of the loop.

But aren't we then still benchmarking the counter increment? Likely, no. The function call millis() in the loop condition is probably at least an order of magnitude more expensive. So essentially, you're benchmarking how fast you can check whether you're done with benchmarking.
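You can check the constant folding for yourself; a quick sketch, assuming CPython (whose compiler performs this peephole optimization):

import dis

# CPython folds 4+5 into the constant 9 at compile time, so the loop body
# never performs an addition at runtime: the disassembly shows a LOAD_CONST
# of 9 and no binary-add instruction.
dis.dis(compile("x = 4+5", "<benchmark>", "exec"))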
The next problem is which time you are measuring. You are currently measuring the elapsed wall clock time (sometimes: “real” time), i.e. the time that you observed passing. However, this is not the same as the time your loop spent executing, because the operating system is free to interrupt execution to schedule other processes in between. The correct time to measure is the “user” time to see how much your program itself worked, and the “system” time to see how much the OS kernel worked on your program's behalf via system calls. The used CPU time is the sum of user and system time. This paragraph uses Unix terminology, but the problem is the same on Windows.
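To make the wall clock vs. CPU time distinction concrete, here is a small illustration (my sketch, not part of the original answer; requires Python 3.3+ for time.process_time):

import time

wall_start = time.perf_counter()   # elapsed real ("wall clock") time
cpu_start = time.process_time()    # user + system CPU time of this process

time.sleep(1)   # the process does no work, but a second of real time passes

print("wall clock elapsed: %.3f s" % (time.perf_counter() - wall_start))  # roughly 1.0
print("CPU time used:      %.3f s" % (time.process_time() - cpu_start))   # close to 0.0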
All of this is complicated, so don't do it yourself. Python has a timeit module where you can measure how long a function took to execute a specific number of times. The number of iterations per second is then the inverse of the time spent executing. To prevent the overhead of the function call from affecting the result, the timed function should include a loop itself, e.g. with a million iterations. To execute the body of the loop 10M times, you would then time the function ten times etc.
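A sketch of that scheme (my example; the function name and counts are made up): put the million-iteration loop inside a function and time that function ten times:

import timeit

def work(x=4, y=5):
    # the loop lives inside the timed function, so the timing overhead per
    # call is spread over a million additions
    for _ in range(1000000):
        z = x + y

# timing the function ten times executes the loop body 10M times in total
total = timeit.timeit(work, number=10)
print("additions per second: %.2E" % (10000000 / total))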
Edit: code example

Assuming Python 3.3 or later, we can use the time.process_time function which measures the used CPU time of our process.

import time
import timeit

n = 10000000  # 10M
timings = timeit.repeat(
    setup='x, y = 4, 5',
    stmt='z = x + y',
    timer=time.process_time,  # comment this line for pre-3.3 pythons
    number=n
)
best_time = max(timings)
repetitions_per_second = n / best_time
print("repetitions per second: %.2E" % repetitions_per_second)

Unfortunately I can't test it right now because I don't have Python 3.3 or later installed. So we might have to use a timer that uses wallclock time instead – this isn't so catastrophic because repeat will take three measurements, and we'll use the best. This does not eliminate the effect of scheduling, but it does minimize it.
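For that wall-clock fallback, a sketch: drop the timer argument so timeit falls back to its default wall-clock timer, and ask for three repetitions explicitly (recent Pythons default to five):

import timeit

n = 10000000  # 10M
timings = timeit.repeat(
    setup='x, y = 4, 5',
    stmt='z = x + y',
    repeat=3,     # three measurements, as assumed in the text
    number=n
)
print("repetitions per second: %.2E" % (n / max(timings)))  # same calculation as above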
Example ballpark results (repetitions per second):

            Python 3.2    Python 2.7
Machine 1   <= 6E6        = 1E7
Machine 2   <= 2E7        > 2E7

So with a back-of-the-envelope calculation, we can say that machine 2 is roughly 2–3 times as fast as machine 1. This is consistent with their specs (desktop vs. low-end server).
Context
StackExchange Code Review Q#48415, answer score: 5