
YAuB - Micro Benchmark Follow-on


Problem

Following some great advice here from Simon, I realized that I had over-engineered things, and that the Task builder methods were a horrible Java8 abstraction. In Simon's words: "From a usability perspective, this is a bit weird...".

After messing with things a bit more, I found that the original use-case, which looked like:

uBench.addTask(Task.buildCheckedIntTask("Legato Java7", () -> getMaximumBeauty(line), expect));
uBench.addTask(Task.buildCheckedIntTask("Legato Java8", () -> getMaximumBeauty8(line), expect));
uBench.addTask(Task.buildCheckedIntTask("Janos Java7", () -> computeMaxBeauty(line), expect));
uBench.addTask(Task.buildCheckedIntTask("Rolfl Java7", () -> beautyMax7(line), expect));
uBench.addTask(Task.buildCheckedIntTask("Rolfl Java8Regex", () -> beautyMaxF(line), expect));
uBench.addTask(Task.buildCheckedIntTask("Rolfl Java8Filter", () -> beautyMax8(line), expect));


can be drastically simplified if the addTask method takes a Supplier directly (instead of a Task), plus a separate Predicate to check the results. The same code as above can then be expressed as:

uBench.addTask("Legato Java7", () -> getMaximumBeauty(line), g -> g == 1574);
uBench.addTask("Legato Java8", () -> getMaximumBeauty8(line), g -> g == 1574);
uBench.addTask("Janos Java7", () -> computeMaxBeauty(line), g -> g == 1574);
uBench.addTask("Rolfl Java7", () -> beautyMax7(line), g -> g == 1574);
uBench.addTask("Rolfl Java8Regex", () -> beautyMaxF(line), g -> g == 1574);
uBench.addTask("Rolfl Java8Filter", () -> beautyMax8(line), g -> g == 1574);


(where g is a mnemonic for got). That can in turn be simplified to a single predicate:

IntPredicate check = g -> g == 1574;


and code like:

uBench.addTask("Legato Java7", () -> getMaximumBeauty(line), check);
uBench.addTask("Legato Java8", () -> getMaximumBeauty8(line), check);
uBench.addTask("Janos Java7", () -> computeMaxBeauty(line), check);
uBench.addTask("Rolfl Java7", () -> beautyMax7(line), check);
uBench.addTask("Rolfl Java8Regex", () -> beautyMaxF(line), check);
uBench.addTask("Rolfl Java8Filter", () -> beautyMax8(line), check);
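
A minimal sketch of how such an addTask(String, IntSupplier, IntPredicate) overload might work internally. MiniBench, its Task record, and the stand-in task bodies are illustrative assumptions, not the real UBench API:

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.IntPredicate;
import java.util.function.IntSupplier;

// Illustrative stand-in (NOT the real UBench): addTask wraps the supplier
// and the checking predicate into one internal task, so callers never touch
// a Task builder.
public class MiniBench {

    // A task runs the supplier once and verifies the result with the predicate.
    record Task(String name, IntSupplier supplier, IntPredicate check) {
        void runOnce() {
            int got = supplier.getAsInt();   // g is a mnemonic for "got"
            if (!check.test(got)) {
                throw new IllegalStateException(name + ": unexpected result " + got);
            }
        }
    }

    private final List<Task> tasks = new ArrayList<>();

    // The proposed simpler signature: no Task builder needed at the call site.
    public void addTask(String name, IntSupplier supplier, IntPredicate check) {
        tasks.add(new Task(name, supplier, check));
    }

    public void runAll() {
        tasks.forEach(Task::runOnce);
    }

    public static void main(String[] args) {
        MiniBench bench = new MiniBench();
        IntPredicate check = g -> g == 1574;
        bench.addTask("constant", () -> 1574, check); // stand-in for getMaximumBeauty(line)
        bench.runAll();
        System.out.println("all tasks verified");
    }
}
```

The point is that the wrapping into a task still happens, but inside the framework rather than at every call site.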

Solution

For now I'll mostly point out usability issues.

Duplicated logic

The input and the expected value variables are repeated in every task:

uBench.addIntTask("Legato Java7", () -> getMaximumBeauty(line), expect);
uBench.addIntTask("Legato Java8", () -> getMaximumBeauty8(line), expect);
uBench.addIntTask("Janos Java7", () -> computeMaxBeauty(line), expect);
uBench.addIntTask("Rolfl Java7", () -> beautyMax7(line), expect);
uBench.addIntTask("Rolfl Java8Regex", () -> beautyMaxF(line), expect);
uBench.addIntTask("Rolfl Java8Filter", () -> beautyMax8(line), expect);


It doesn't really make sense to have to repeat this: when comparing a number of alternative implementations, you will normally use the same input for all of them. Of course, you'll probably want to re-run the same methods with different input/output pairs, but you would do so one pair at a time. To clarify further: I don't see a use case for comparing the result of methodA on inputA with the result of methodB on inputB. Maybe such a use case exists, but it wouldn't be the typical one. Normally you would run methodA, methodB, methodC, ... on inputA, then run the same methods again on inputB, then on inputC, and so on.

One way to avoid repeatedly specifying the same inputs and outputs to each of the tasks could be to store them inside the benchmark instance, by adding .setInput and .setExpectedOutput methods, and let the tasks share that data.
For running the same tasks against several input/output pairs,
these methods could take varargs.
The run method could validate that the input/output pairs are sane.
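
A rough sketch of that idea follows; SharedDataBench, setInput, setExpectedOutput, and run are all assumed names for illustration, not the real API. The shared pairs live on the instance and are sanity-checked before any task runs:

```java
import java.util.function.IntUnaryOperator;

// Illustrative sketch: inputs and expected outputs are stored once on the
// benchmark instance (varargs for several pairs), and every task shares them.
public class SharedDataBench {
    private int[] inputs;
    private int[] expected;

    // varargs: one call can register several input/output pairs
    public void setInput(int... inputs) { this.inputs = inputs; }
    public void setExpectedOutput(int... expected) { this.expected = expected; }

    // run() first checks that the input/output pairs are sane, then runs the
    // method under test against each pair in turn.
    public void run(String name, IntUnaryOperator method) {
        if (inputs == null || expected == null || inputs.length != expected.length) {
            throw new IllegalStateException("input/output pairs are not sane");
        }
        for (int i = 0; i < inputs.length; i++) {
            int got = method.applyAsInt(inputs[i]);
            if (got != expected[i]) {
                throw new IllegalStateException(name + " failed on input " + inputs[i]);
            }
        }
    }
}
```

The call sites then mention each input and expectation exactly once, no matter how many tasks share them.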

Task types

It's tedious to have separate .add*Task methods for different return types.
It fills your implementation with duplicated logic,
and it forces users of the framework into methods returning specific types.
How would I benchmark different sorting algorithms that sort collections in-place, with no return value?

I'd recommend taking an approach similar to the one I took in my framework:

  • Use instance variables to store the initial data and the computation result.
  • Use a thin wrapper around each method under test. The wrapper passes the input data to the real method, knows how to get the output from it (the real method can return any type), and stores the computation result in an instance field.
  • The validator verifies the result that was written to the instance field.

The bottom line: find a solution that doesn't require prescribed return types.
The framework will be easier to use, and the implementation will have less duplicated code (no more addIntTask, addLongTask, ...).
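
The three points above could be sketched like this. WrapperSketch and its method names are invented for illustration, with Arrays.sort and a stream sum standing in for the real methods under test:

```java
import java.util.Arrays;
import java.util.function.Predicate;
import java.util.stream.IntStream;

// Illustrative sketch of the wrapper approach: input and result live in
// instance fields, so methods under test may return any type, or nothing
// at all for in-place algorithms.
public class WrapperSketch {
    private int[] input;    // shared initial data for every method under test
    private Object result;  // whatever the wrapped method produced

    public void setInput(int[] data) { this.input = data; }

    // Thin wrapper around an in-place algorithm: there is no return value,
    // so the wrapper stores the mutated copy in the result field itself.
    public void runSortWrapper() {
        int[] copy = input.clone();
        Arrays.sort(copy);
        result = copy;
    }

    // Thin wrapper around an int-returning method: the primitive is boxed
    // into the shared result field.
    public void runSumWrapper() {
        result = IntStream.of(input).sum();
    }

    // The validator only ever looks at the instance field, so the wrapped
    // methods are free to return any type.
    public boolean validate(Predicate<Object> check) {
        return check.test(result);
    }
}
```

One validate method then covers int, long, collection, and void cases alike.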

Too much boilerplate to remember

There's quite a bit of boilerplate to remember to use this framework,
especially this part:

System.out.println("Warming up");
uBench.benchMark(5000).stream().forEach(System.out::println);
System.out.println("\n\nReal runs\n\n");
uBench.benchMark(10000).stream().sorted(Comparator.comparing(UBench.Stats::get95thPercentile))
        .forEach(System.out::println);


Something like this would be nice to achieve the same result:

uBench.benchMark(5000, "Warming up");
uBench.benchMark(10000, "Real runs", UBench.Stats::get95thPercentile);


You could still keep the uBench.benchMark(int) version for "power users".
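
A sketch of what those convenience overloads might look like; BenchFacadeSketch, Stats, and p95 are assumed stand-ins for UBench and UBench.Stats::get95thPercentile, and the returned stats are hard-coded dummies:

```java
import java.util.Comparator;
import java.util.List;
import java.util.function.ToDoubleFunction;

// Illustrative facade: the overloads print a label, run the benchmark, and
// optionally sort the per-task stats, hiding the stream boilerplate.
public class BenchFacadeSketch {

    public record Stats(String task, double p95) { }

    // stand-in for uBench.benchMark(int): returns one Stats per task
    public List<Stats> benchMark(int rounds) {
        return List.of(new Stats("a", 2.0), new Stats("b", 1.0));
    }

    // label + run: covers the "Warming up" case
    public void benchMark(int rounds, String title) {
        System.out.println(title);
        benchMark(rounds).forEach(System.out::println);
    }

    // label + run + sort key: covers the "Real runs" case
    public void benchMark(int rounds, String title, ToDoubleFunction<Stats> sortKey) {
        System.out.println(title);
        benchMark(rounds).stream()
                .sorted(Comparator.comparingDouble(sortKey))
                .forEach(System.out::println);
    }
}
```

A caller would then write something like bench.benchMark(10000, "Real runs", BenchFacadeSketch.Stats::p95), while the plain benchMark(int) stays available for power users.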

Looks promising!

A very important usability feature I see here is that all the functionality is easily accessible and intuitive from a UBench instance.
One can easily explore the available features in an IDE using auto-completion on method names and hints on parameter types.
This is in contrast with an annotation-driven approach that forces users to remember multiple things: the annotation names, and how to trigger the annotation processor that will run the benchmarks.

The reporting features are also great, and something I will definitely shamelessly borrow to improve my alternative framework.


Context

StackExchange Code Review Q#82439, answer score: 4
