
Benchmarking things in C#

Submitted by: @import:stackexchange-codereview

Problem

I needed a better way to benchmark code, because, well, rewriting the same benchmarking code every time I need it is just...well...unpleasant.

So, here's a class which does just that: it runs an Action over a specified number of rounds and calculates some statistics on the run times.

Another nice feature is that it doesn't store the run times as it calculates the stats. So you can literally supply any value for rounds and it should work. (Not tested for rounds values greater than 10,000,000.)
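The question does not show how the stats are accumulated without keeping every run time. One common way to do it is Welford's online algorithm. The sketch below assumes that technique and uses a hypothetical `RunningStats` class, neither of which is taken from the original code.

```csharp
using System;

// Hypothetical helper (not from the original class): accumulates timing
// statistics one sample at a time using Welford's online algorithm,
// so memory use stays constant no matter how many rounds are run.
public sealed class RunningStats
{
    private double _mean; // running mean of the samples, in ticks
    private double _m2;   // running sum of squared deviations from the mean

    public long Count { get; private set; }
    public long MinTicks { get; private set; } = long.MaxValue;
    public long MaxTicks { get; private set; } = long.MinValue;

    public double MeanTicks => _mean;

    // Population variance, in ticks squared (0 until a sample is added).
    public double VarianceTicks => Count > 0 ? _m2 / Count : 0.0;

    // Standard deviation, in ticks: the square root of the variance.
    public double StandardDeviationTicks => Math.Sqrt(VarianceTicks);

    public void Add(long elapsedTicks)
    {
        Count++;
        double delta = elapsedTicks - _mean;
        _mean += delta / Count;
        _m2 += delta * (elapsedTicks - _mean); // uses the updated mean

        if (elapsedTicks < MinTicks) MinTicks = elapsedTicks;
        if (elapsedTicks > MaxTicks) MaxTicks = elapsedTicks;
    }
}
```

Each round would call `Add` with the elapsed ticks of that round. Note that variance and standard deviation are different quantities, so it helps to expose them under separate names.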

The latest version is on GitHub.

It's pretty simple. Two static methods on a simple class that run the benchmark.

The nice thing about this class is it includes a version for Func as well, which will also verify the output of the function. This means you can benchmark and verify your code at the same time, to make sure that nothing weird happens.
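The benchmarking methods themselves are cut off in the listing below, so the following is only a sketch of how a Func-based round loop with output verification might look. The class name `VerifyingBenchmark`, the method `Run`, and its parameters are hypothetical. The stopwatch is paused during verification so that, as the doc comments describe, verification time is not counted.

```csharp
using System;
using System.Collections.Generic;
using System.Diagnostics;

// Hypothetical sketch, not the original implementation.
public static class VerifyingBenchmark
{
    // Runs `func` for `rounds` rounds. Returns the number of rounds whose
    // output equalled `expected`; `totalTime` receives the time spent inside
    // `func` only (the comparison happens while the stopwatch is stopped).
    public static long Run<T>(long rounds, Func<T> func, T expected, out TimeSpan totalTime)
    {
        if (func == null) throw new ArgumentNullException(nameof(func));
        if (rounds < 0) throw new ArgumentOutOfRangeException(nameof(rounds));

        EqualityComparer<T> comparer = EqualityComparer<T>.Default;
        var stopwatch = new Stopwatch();
        long passed = 0;

        for (long i = 0; i < rounds; i++)
        {
            stopwatch.Start();
            T output = func();
            stopwatch.Stop();

            if (comparer.Equals(output, expected)) passed++;
        }

        totalTime = stopwatch.Elapsed;
        return passed;
    }
}
```

A caller could compare the returned count against `rounds` to see whether every round produced the expected value.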

```
/// <summary>
/// Represents the result of a benchmarking session.
/// </summary>
public class BenchmarkResult
{
    /// <summary>
    /// The total number of rounds ran.
    /// </summary>
    public ulong RoundsRun { get; set; }

    /// <summary>
    /// The average time for all the rounds.
    /// </summary>
    public TimeSpan AverageTime { get; set; }

    /// <summary>
    /// The maximum time taken for a single round.
    /// </summary>
    public TimeSpan MaxTime { get; set; }

    /// <summary>
    /// The minimum time taken for a single round.
    /// </summary>
    public TimeSpan MinTime { get; set; }

    /// <summary>
    /// The variance (standard deviation) of all the rounds.
    /// </summary>
    public TimeSpan Variance { get; set; }

    /// <summary>
    /// The number of rounds that passed testing. (Always equivalent to for .)
    /// </summary>
    public ulong RoundsPassed { get; set; }

    /// <summary>
    /// The total amount of time taken for all the benchmarks. (Does not include statistic calculation time, or result verification time.)
    /// </summary>
    /// <remarks>
    /// Depending on the number of rounds and time taken for each, this value may not be entirely representful of the actual result, and may have rounded over. It should be used with cau
```

Solution

Small issues before I get into the big one:

- Please make those setters private. The caller of this code has no business changing any of those values.

- Don't use ulong unless you are interoperating with unmanaged code. long has plenty of range. .NET uses signed quantities even for quantities that are logically always positive. It makes it easier to do things like take the difference of two quantities.
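As a rough illustration of both points, here is a trimmed-down version of the result class with private setters and `long` counters. The constructor and the `RoundsFailed` property are assumptions added for the example; they are not part of the code under review.

```csharp
using System;

// Sketch only: a reduced BenchmarkResult whose values are fixed at construction.
public sealed class BenchmarkResult
{
    public BenchmarkResult(long roundsRun, long roundsPassed, TimeSpan averageTime,
                           TimeSpan minTime, TimeSpan maxTime)
    {
        if (roundsRun < 0)
            throw new ArgumentOutOfRangeException(nameof(roundsRun));
        if (roundsPassed < 0 || roundsPassed > roundsRun)
            throw new ArgumentOutOfRangeException(nameof(roundsPassed));

        RoundsRun = roundsRun;
        RoundsPassed = roundsPassed;
        AverageTime = averageTime;
        MinTime = minTime;
        MaxTime = maxTime;
    }

    // Private setters: only the class itself can assign these values.
    public long RoundsRun { get; private set; }
    public long RoundsPassed { get; private set; }
    public TimeSpan AverageTime { get; private set; }
    public TimeSpan MinTime { get; private set; }
    public TimeSpan MaxTime { get; private set; }

    // Hypothetical property: with signed counters a difference is plain
    // subtraction, with no unsigned wrap-around to worry about.
    public long RoundsFailed => RoundsRun - RoundsPassed;
}
```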

Indeed, benchmarking is hard. I wrote a long series of articles on this for a developer site which unfortunately I think went out of business. I'll have to see if I can find those and repost them.

So today just a quick note.

Suppose the code under test allocates a lot of objects, but not so many that the GC gets invoked. You then run a second test in the same process which allocates a small number of objects, just enough to push the collection pressure up higher, and boom, the GC runs. Who gets charged the cost of that GC? The second test. Who was to blame for most of the cost of that GC? The first test. So you can easily charge GC costs to the wrong thing.

The net effect of this is not just that the cost gets charged to the wrong code, but also that your results may vary widely from run to run, depending on the state of the heap at any particular time.

Now, one might say, OK, we'll just do a forced collection (don't forget to wait for pending finalizers!) after every test, and charge that cost to the code under test. This will certainly remove variability from the test, which is good. But now we are benchmarking code in an unrealistic environment: an environment where the heap has just been fully collected. Real users running your code will not be running in that world, so now we've made the test less variable by making it less realistic. That variability is part of the user's experience of the code, and you're removing that for your testing purposes.
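For reference, the forced-collection policy described above might be sketched like this. `GcIsolatedTimer` and `Measure` are hypothetical names, and the extra collection before the timed region is an assumption so that the first measurement also starts from a collected heap. As noted, this trades realism for repeatability.

```csharp
using System;
using System.Diagnostics;

// Hypothetical helper illustrating one possible policy, not a recommendation.
public static class GcIsolatedTimer
{
    public static TimeSpan Measure(Action action)
    {
        if (action == null) throw new ArgumentNullException(nameof(action));

        // Untimed: clear out garbage left behind by earlier code.
        ForceFullCollection();

        Stopwatch stopwatch = Stopwatch.StartNew();
        action();
        // Timed: the cost of collecting what the action allocated
        // is charged to the action itself.
        ForceFullCollection();
        stopwatch.Stop();

        return stopwatch.Elapsed;
    }

    private static void ForceFullCollection()
    {
        GC.Collect();
        GC.WaitForPendingFinalizers();
        GC.Collect(); // reclaim objects that the finalizers just released
    }
}
```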

I don't have a good one-size-fits-all solution. What I have is the knowledge that when I'm designing benchmarks, I have to think about deferred costs like GCs and make a policy for how I'm going to measure them.
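One possible policy, offered only as a sketch, is to leave the GC alone but record how many collections happened during the timed region, so that rounds affected by a collection can be identified afterwards. `GcAwareTimer` and `GcAwareMeasurement` are hypothetical names; `GC.CollectionCount` is the standard .NET call for reading the per-generation collection counters.

```csharp
using System;
using System.Diagnostics;

// Hypothetical result type: elapsed time plus collections seen while timing.
public sealed class GcAwareMeasurement
{
    public GcAwareMeasurement(TimeSpan elapsed, int gen0, int gen1, int gen2)
    {
        Elapsed = elapsed;
        Gen0Collections = gen0;
        Gen1Collections = gen1;
        Gen2Collections = gen2;
    }

    public TimeSpan Elapsed { get; }
    public int Gen0Collections { get; }
    public int Gen1Collections { get; }
    public int Gen2Collections { get; }

    // True if any generation was collected during the timed region.
    public bool GcRanDuringMeasurement =>
        Gen0Collections > 0 || Gen1Collections > 0 || Gen2Collections > 0;
}

public static class GcAwareTimer
{
    public static GcAwareMeasurement Measure(Action action)
    {
        if (action == null) throw new ArgumentNullException(nameof(action));

        // Snapshot the collection counters before the timed region.
        int gen0 = GC.CollectionCount(0);
        int gen1 = GC.CollectionCount(1);
        int gen2 = GC.CollectionCount(2);

        Stopwatch stopwatch = Stopwatch.StartNew();
        action();
        stopwatch.Stop();

        return new GcAwareMeasurement(
            stopwatch.Elapsed,
            GC.CollectionCount(0) - gen0,
            GC.CollectionCount(1) - gen1,
            GC.CollectionCount(2) - gen2);
    }
}
```

This does not say which code caused the collection, only that one happened while the clock was running, which is enough to flag a measurement as suspect.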

Context

StackExchange Code Review Q#125539, answer score: 40
