patterncppMinor
Gray codes addition — Take 2
Viewed 0 times
takeadditioncodesgray
Problem
For a while, I have tried to find ways (known or new) to do stuff with Gray codes. As some of you probably know, I already tried to implement an algorithm to add two Gray codes without having to convert them to a regular binary representation, perform a regular addition and convert the result back to a Gray code.
The first time, I tried to implement a bitwise algorithm described by Harold Lucal and then to optimize it. Since even the most optimized code I could come with was still slower than the naive double-conversion solution, I decided to start over from scratch, namely, to observe patterns in Gray codes and find a new way to perform an addition. Before going any further, let me describe the notation used in the rest of the question and how some mathematical symbols are (ab)used to represent common bitwise operations:
Here are the most interesting things I observed and used to devise the new addition algorithm:
-
A Gray power of \$2\$ corresponds to two bits set followed by clear bits, except for \$1\$. That one is easy to observe: with the usual integer representation, a power of \$2\$ corresponds to a single bit set and the formula to convert an integer \$n\$ to its corresponding Gray code is \$(n \gg 1) \oplus n\$.
-
For any Gray code \$n\$, \$2n = (n \ll 1) \oplus parity(n)\$.
-
For two Gray codes \$a\$ and \$b\$, if \$a\$ is a power of \$2\$ and \$a > b\$, then \$a \oplus b = a + b\$.
-
For two Gray codes \$a = 2^n\$ and \$b\$, if \$a \leq b
Additionally, I will frequently use what I call the « base » or « base2 » of a Gray code (I don't have a better name - if you have one, please l
The first time, I tried to implement a bitwise algorithm described by Harold Lucal and then to optimize it. Since even the most optimized code I could come with was still slower than the naive double-conversion solution, I decided to start over from scratch, namely, to observe patterns in Gray codes and find a new way to perform an addition. Before going any further, let me describe the notation used in the rest of the question and how some mathematical symbols are (ab)used to represent common bitwise operations:
- \$\oplus\$ is used to represent a bitwise XOR operation.
- \$\ll\$ is used to represent a left shift.
- \$\gg\$ is used to represent a right shift.
- \$parity(n)\$ is used to compute the parity of a Gray code \$n\$, which happends to correspond to the parity of the number of set bits in its representation (you can find an implementation in the older question).
Here are the most interesting things I observed and used to devise the new addition algorithm:
-
A Gray power of \$2\$ corresponds to two bits set followed by clear bits, except for \$1\$. That one is easy to observe: with the usual integer representation, a power of \$2\$ corresponds to a single bit set and the formula to convert an integer \$n\$ to its corresponding Gray code is \$(n \gg 1) \oplus n\$.
-
For any Gray code \$n\$, \$2n = (n \ll 1) \oplus parity(n)\$.
-
For two Gray codes \$a\$ and \$b\$, if \$a\$ is a power of \$2\$ and \$a > b\$, then \$a \oplus b = a + b\$.
-
For two Gray codes \$a = 2^n\$ and \$b\$, if \$a \leq b
Additionally, I will frequently use what I call the « base » or « base2 » of a Gray code (I don't have a better name - if you have one, please l
Solution
Performance Issues
You have at worst 4 branches in your addition algorithm. The pattern of the bits has to be considered random for the sense of branch prediction. Thus you are a victim of branch prediction failure.
In addition to these 4 branches, you have at most 3 for loops (which are easier to predict).
Alternative implementation
A quick google later landed me on wikipedia with the following code:
which to my eye seems much faster due to the lack of branches other than the one for loop which is easily predictable for the cpu and may possibly even be unrolled.
So:
is:
Unless you can figure out a way to do the addition with 1 for loop and 0 branches, you're better of just converting back and forth.
Edit: I found a way of doing it without any branches, constant time:
It's using the trick for calculating a number xored with all its right shifts which converts both inputs to binary, does the addition and back to graycode. All in roughly 30 ALU instructions.
You have at worst 4 branches in your addition algorithm. The pattern of the bits has to be considered random for the sense of branch prediction. Thus you are a victim of branch prediction failure.
In addition to these 4 branches, you have at most 3 for loops (which are easier to predict).
Alternative implementation
A quick google later landed me on wikipedia with the following code:
/*
The purpose of this function is to convert an unsigned
binary number to reflected binary Gray code.
The operator >> is shift right. The operator ^ is exclusive or.
*/
unsigned int binaryToGray(unsigned int num)
{
return (num >> 1) ^ num;
}
/*
The purpose of this function is to convert a reflected binary
Gray code number to a binary number.
*/
unsigned int grayToBinary(unsigned int num)
{
unsigned int mask;
for (mask = num >> 1; mask != 0; mask = mask >> 1)
{
num = num ^ mask;
}
return num;
}which to my eye seems much faster due to the lack of branches other than the one for loop which is easily predictable for the cpu and may possibly even be unrolled.
So:
binaryToGray(grayToBinary(x) + grayToBinary(y));is:
- Much less code than your algorithm.
- Only 2 loops compared to 3 loops and 4 unpredictable branches.
Unless you can figure out a way to do the addition with 1 for loop and 0 branches, you're better of just converting back and forth.
Edit: I found a way of doing it without any branches, constant time:
unsigned int addGray(unsigned int a, unsigned int b){
assert(sizeof(unsigned int) == 4);
a = a ^ (a >> 16);
b = b ^ (b >> 16);
a = a ^ (a >> 8);
b = b ^ (b >> 8);
a = a ^ (a >> 4);
b = b ^ (b >> 4);
a = a ^ (a >> 2);
b = b ^ (b >> 2);
unsigned int x = (a ^ (a >> 1)) + (b ^ (b >> 1));
return (x >> 1) ^ x;
}It's using the trick for calculating a number xored with all its right shifts which converts both inputs to binary, does the addition and back to graycode. All in roughly 30 ALU instructions.
Code Snippets
/*
The purpose of this function is to convert an unsigned
binary number to reflected binary Gray code.
The operator >> is shift right. The operator ^ is exclusive or.
*/
unsigned int binaryToGray(unsigned int num)
{
return (num >> 1) ^ num;
}
/*
The purpose of this function is to convert a reflected binary
Gray code number to a binary number.
*/
unsigned int grayToBinary(unsigned int num)
{
unsigned int mask;
for (mask = num >> 1; mask != 0; mask = mask >> 1)
{
num = num ^ mask;
}
return num;
}binaryToGray(grayToBinary(x) + grayToBinary(y));unsigned int addGray(unsigned int a, unsigned int b){
assert(sizeof(unsigned int) == 4);
a = a ^ (a >> 16);
b = b ^ (b >> 16);
a = a ^ (a >> 8);
b = b ^ (b >> 8);
a = a ^ (a >> 4);
b = b ^ (b >> 4);
a = a ^ (a >> 2);
b = b ^ (b >> 2);
unsigned int x = (a ^ (a >> 1)) + (b ^ (b >> 1));
return (x >> 1) ^ x;
}Context
StackExchange Code Review Q#86376, answer score: 3
Revisions (0)
No revisions yet.