HiveBrain v1.2.0
Get Started
← Back to all entries
principlecppMinor

C++ vs D - Algorithm Optimization/Conversion (Using vectors/arrays)

Submitted by: @import:stackexchange-codereview··
0
Viewed 0 times
conversionvectorsarraysalgorithmoptimizationusing

Problem

OK, I've been playing with D for a while (and been in love with its expressive power and simplicity, to be honest). However, since I'm still new to D, I'm facing a few issues.

Let's take the following example. All the following program does is to take numbers in the 0..10000000 (fairly big one, for benchmarking purposes) and for each one of them return a vector/array with the positions of bits set (in binary).

E.g.

4 = 100(2) => [ 2 ]
5 = 101(2) => [ 0, 2 ]
6 = 110(2) => [ 1, 2 ]
7 = 111(2) => [ 0, 1, 2]

And so on...


Now here's my C++ code (no vector reserve etc being used) :

// bits.cpp

#include 
#include 
#include "math.h"

using namespace std;

vector bitsSet(unsigned long long bitboard)
{
    unsigned int n;
    vector res;

    for (n = 0; bitboard != 0; n++, bitboard &= (bitboard - 1))
    {
        res.push_back(log2(bitboard & ~(bitboard-1)));
    }

    return res;
}

int main()
{
    for (int k=0; k res = bitsSet(k);
    }

    return 0;
}


And here's my D code :

// bits.d

import std.stdio;
import std.math;

int[] bitsSet(ulong bitboard)
{
    int[] res;

    for (int n=0; bitboard!=0; n++, bitboard&=(bitboard-1))
        res ~= cast(int)log2(bitboard & ~(bitboard-1));

    return res;
}

void main(string[] args)
{
    for (ulong k=0; k<10000000; k++)
    {
        int[] bits = bitsSet(k);
    }
}


Now, given that the 2 pieces of code are compiled with g++ bits.cpp -o cbits (or clang++ bits.cpp -o cbits) and dmd bits.d -ofdbits, respectively, these are my benchmark results, using time (on Mac OS X 10.8.2) :

For C++ :

time ./cbits

real    0m19.742s
user    0m19.722s
sys 0m0.012s


For D :

time ./dbits

real    0m14.914s
user    0m14.891s
sys 0m0.017s


This looks OK. (with D being - for me - noticeably faster).

NOTE :

Now, if I try to use something like res.reserve(64); in my C++ bitsSet function, though, time drops to around 7s.... which IS significantly faster. Tried something like `res.l

Solution

How about: using res.length = 64; and then in your loop replace

res ~= cast(int)log2(bitboard & ~(bitboard-1));


by:

res[n] = cast(int)log2(bitboard & ~(bitboard-1));


But to be honest this type of code will be completely dominated by allocations, which is not entirely interesting as a language benchmark. Try to reuse the same array instead.

And there is much faster ways of locating bits (see core.bitop).

tl;dr you're optimizing a slow program.

Code Snippets

res ~= cast(int)log2(bitboard & ~(bitboard-1));
res[n] = cast(int)log2(bitboard & ~(bitboard-1));

Context

StackExchange Code Review Q#21371, answer score: 8

Revisions (0)

No revisions yet.