HiveBrain v1.2.0
Get Started
← Back to all entries
patterncppMinor

Speeding up the generation all bit strings with Hamming distance one

Submitted by: @import:stackexchange-codereview··
0
Viewed 0 times
thebitallwithdistanceonegenerationspeedingstringshamming

Problem

I'm playing around with hypercubes. For this, I need to create its edge set. Specifically, the edge set is the set of all bit strings with Hamming distance 1. An important property is that I want to generate every such pair exactly once. I do this with the following piece of code (I'm also providing a small benchmark for testing its performance):

#include 
#include 
#include 
#include 

int main() 
{
    // With 2^15, takes about 0.8 seconds on my machine.
    const int n = std::pow(2, 15);

    // Storing edges not strictly necessary, but let's do it so that the compiler
    // doesn't optimize something away.
    std::vector edges;
    auto start = std::chrono::high_resolution_clock::now();

    for (int i = 0; i > 1;
            k |= k >> 2;
            k |= k >> 4;
            k |= k >> 8;
            k |= k >> 16;

            if (k + 1 == u)
            {
                edges.emplace_back(j);
                edges.emplace_back(i);
            }
        }
    }

    auto duration = std::chrono::duration_cast(std::chrono::high_resolution_clock::now() - start);
    std::cout << (duration.count() / 1000.0) << "\n";

    // 2^(n-1)*n
    std::cout << (edges.size() / 2) << "\n";
}


This works fine, but can we make the above even more efficient?

Solution

To start with, here is a trivial optimization you can perform: in the most inner loop, you store u, then have k = u - 1, then compare k + 1 to u, which corresponds to comparing k to u - 1. Basically your inner loop performs unneeded work; just set directly u to (i ^ j) - 1 and you will save some work. Here is the most inner loop once changed:

const int u = (i ^ j) - 1;

// https://graphics.stanford.edu/~seander/bithacks.html#RoundUpPowerOf2
int k = u;
k |= k >> 1;
k |= k >> 2;
k |= k >> 4;
k |= k >> 8;
k |= k >> 16;

if (k == u)
{
    edges.emplace_back(j);
    edges.emplace_back(i);
}


It timed it on my computer and it was consistently a bit faster, which means that the compiler wasn't able to make this simple transformation.

On the other hand, we can notice an interesting pattern with your original code: k + 1 can only be equal to u when u is a power of \$2\$. Checking whether a number is a power of \$2\$ or (or \$0\$, but the formula you use already has a special case of \$0\$) is easily done: a power of \$2\$ has exactly one bit set, so you only have to check whether removing a set bit from an integer returns \$0\$ to check whether an integer is \$0\$ or a power of \$2\$. This comparison is so cheap that you can perform it first and perform your current check only when it is true:

const int u = (i ^ j);

if (not (u & (u - 1))) // check whether u is 0 or a power of 2
{
    int k = u - 1;
    k |= k >> 1;
    k |= k >> 2;
    k |= k >> 4;
    k |= k >> 8;
    k |= k >> 16;

    if (k + 1 == u)
    {
        edges.emplace_back(j);
        edges.emplace_back(i);
    }
}


Now, correct me if I'm wrong, but if u is a power of two, you don't actually need to check anything else after (you don't need to round up to a power of \$2\$ since you know it's already one). You can simply remove all the k stuff. I mean, it was a wild guess, but the result were exactly the same on my computer. You can just check that u is a power of \$2\$ then emplace_back your values directly.

Code Snippets

const int u = (i ^ j) - 1;

// https://graphics.stanford.edu/~seander/bithacks.html#RoundUpPowerOf2
int k = u;
k |= k >> 1;
k |= k >> 2;
k |= k >> 4;
k |= k >> 8;
k |= k >> 16;

if (k == u)
{
    edges.emplace_back(j);
    edges.emplace_back(i);
}
const int u = (i ^ j);

if (not (u & (u - 1))) // check whether u is 0 or a power of 2
{
    int k = u - 1;
    k |= k >> 1;
    k |= k >> 2;
    k |= k >> 4;
    k |= k >> 8;
    k |= k >> 16;

    if (k + 1 == u)
    {
        edges.emplace_back(j);
        edges.emplace_back(i);
    }
}

Context

StackExchange Code Review Q#116341, answer score: 5

Revisions (0)

No revisions yet.