HiveBrain v1.2.0
Get Started
← Back to all entries
principlecppMinor

Benchmarking inline vs normal functions in C++

Submitted by: @import:stackexchange-codereview··
0
Viewed 0 times
inlinefunctionsbenchmarkingnormal

Problem

I am currently learning C++ and have learnt about inline functions and how they can give a performance benefit.

How could this be improved?

#include 
#include 
#include 

long fibonacci(unsigned n) {
    if (n  start, end;

    // Start time.
    start = std::chrono::system_clock::now();

    //std::cout  functionalTimeElapsed = end - start;

    // Start time.
    start = std::chrono::system_clock::now();

    //std::cout  inlineTimeElapsed = end - start;

    if (functionalTimeElapsed > inlineTimeElapsed) {
        std::cout << "n = " << n << "\tFunctional method was faster by " << ((functionalTimeElapsed - inlineTimeElapsed) / inlineTimeElapsed) * 100 << "%\n";
    } else if (functionalTimeElapsed < inlineTimeElapsed) {
        std::cout << "n = " << n << "\tInline method was faster by " << ((inlineTimeElapsed - functionalTimeElapsed) / functionalTimeElapsed) * 100 << "%\n";       
    } else {
        std::cout << "n = " << n << "\tFunctional method and inline method took the same time.\n";
    }
}

int main() {

    for (int i = 5; i <=40 ; i++) {
        timedFibonacci(i);
    }

    cin.get();
}

Solution

Inlining

First of all, your inline function calls the non-inline function, so if the compiler is really paying attention to the inline specification, it's only affecting a single "layer" of invocation.

inline long fib(unsigned n) {
    if (n < 2) return n;
    return fibonacci(n - 1) + fibonacci(n - 2);
}


To stand a chance of getting some good out of the inline specification, you probably want to have this call itself instead of the other function:

inline long fib(unsigned n) {
    if (n < 2) return n;
    return fib(n - 1) + fib(n - 2);
}


In general, recursive functions are a problem for inlining in any case, so many compilers completely disable inline code generation for recursive functions.

Add in the fact that most reasonably current compilers basically ignore the inline specifier anyway, and you get a high likelihood that you won't see any difference between these functions at all.

Timing

I'd do the timing somewhat differently. The single responsibility principle applies here, just like everything else. That means the timer should deal only with timing. I usually use something on this general order:

template 
auto timer(F f, std::string const &label, Args && ...args) {
    using namespace std::chrono;

    auto start = high_resolution_clock::now();
    auto holder = f(std::forward(args)...);
    auto stop = high_resolution_clock::now();
    std::cout (stop - start).count() << "\n";

    return holder;
}


Arguably this already does more than it should (printing out results in addition to the actual timing) but at least it's quite a bit closer to a single responsibility.

With that, we invoke each Fibonacci generator separately, something on this order:

auto foo = timer(fib, "inline", max);


The result will then look something like this:

inline time: 1062


Just a minor note: this is a somewhat more general timing function that most people usually need. In particular, it doesn't require the function being timed to take a specific number of arguments--the number and types of arguments you pass after the label have to match those needed by the function you're timing, but as far as the timer itself cares, it's essentially wide open.

Efficiency

Ignoring, for the moment, the issue of inlining (which is probably pretty much a red herring anyway), I'd consider other ways of computing Fibonacci numbers. In the absence of memoization, a recursive function is horribly inefficient. An iterative function is many times faster. That can be written with a for loop, something on this general order:

unsigned long long fib(unsigned limit) { 
    unsigned long long a = 1;
    unsigned long long b = 1;
    unsigned long long c;

    if (limit < 2)
        return 1;

    for (auto i=1ULL; i<limit; i++) {
        c = a + b;
        a = b;
        b = c;
    }
    return c;   
}


This is somewhat longer, but the difference in speed is...fairly substantial. Computing fib(43) with each, I get timings like this:

out of line time: 2807407170
     inline time: 2794785991
  iterative time: 321
Sum (ignore): 1568397607


I had to switch to showing the time in nanoseconds to get a non-zero time for the iterative solution. Even so, for fib(43) and highest resolution-timing I can get, it still shows a time of 0 about one run out of every 3 or 4. The recursive versions are approximately 7 orders of magnitude slower (and although inline code generation did seem to help a tiny bit, it's still minuscule compared to the improvement from a better algorithm. Oh, and for what it's worth, it looks like the inline code generation really did make a difference. It is pretty small, but at least in my testing, the inline version does finish just a tiny bit faster every time.

Final Code

For what it's worth, here's the code I ran to get the timing comparison:

#include 
#include 
#include 

template 
auto timer(F f, std::string const &label, Args && ...args) {
    using namespace std::chrono;

    auto start = high_resolution_clock::now();
    auto holder = f(std::forward(args)...);
    auto stop = high_resolution_clock::now();
    std::cout (stop - start).count() << "\n";

    return holder;
}

long fibonacci(unsigned n) {
    if (n < 2) return n;
    return fibonacci(n - 1) + fibonacci(n - 2);
}

inline long fib(unsigned n) {
    if (n < 2) return n;
    return fib(n - 1) + fib(n - 2);
}

unsigned long long iter_fib(unsigned limit) {
    unsigned long long a = 1;
    unsigned long long b = 1;
    unsigned long long c;

    if (limit < 2)
        return 1;

    for (auto i = 1ULL; i < limit; i++) {
        c = a + b;
        a = b;
        b = c;
    }
    return c;
}

int main() {
    static const int max = 43;
    auto a = timer(fibonacci, "out of line", max);
    auto b = timer(fib,       "     inline", max);
    auto c = timer(iter_fib,  "  iterative", max);
    std::cout << "Sum (ignore): " << a + b + c;
}

Code Snippets

inline long fib(unsigned n) {
    if (n < 2) return n;
    return fibonacci(n - 1) + fibonacci(n - 2);
}
inline long fib(unsigned n) {
    if (n < 2) return n;
    return fib(n - 1) + fib(n - 2);
}
template <typename F, typename ...Args>
auto timer(F f, std::string const &label, Args && ...args) {
    using namespace std::chrono;

    auto start = high_resolution_clock::now();
    auto holder = f(std::forward<Args>(args)...);
    auto stop = high_resolution_clock::now();
    std::cout << label << " time: " << duration_cast<microseconds>(stop - start).count() << "\n";

    return holder;
}
auto foo = timer(fib, "inline", max);
inline time: 1062

Context

StackExchange Code Review Q#143586, answer score: 7

Revisions (0)

No revisions yet.