HiveBrain v1.2.0
Get Started
← Back to all entries
patterncppMinor

Optimizing multiplication of two polynomials in GF(2^32)

Submitted by: @import:stackexchange-codereview··
0
Viewed 0 times
twopolynomialsoptimizingmultiplication

Problem

I have the following method for multiplying two polynomials in GF(232):

#include 
#include 
#include 
#include 
#include 

typedef unsigned short uint16_t;
typedef unsigned int uint32_t;

unsigned get_high16(unsigned x)
{
    return (x >> 16) & 0xffff;
}

unsigned get_low16(unsigned x)
{
    return x & 0xffff;
}

uint32_t mul_reduce_32(uint32_t x, uint32_t y)
{
    const __m128i a = _mm_set_epi32(0, 0, 0, x);
    const __m128i b = _mm_set_epi32(0, 0, 0, y);
    const __m128i c =_mm_clmulepi64_si128(a, b, 0);

    uint16_t D = _mm_extract_epi16(c, 2);

    D ^= _mm_extract_epi32(c, 1) >> 31;
    D ^= _mm_extract_epi32(c, 1) >> 30;
    D ^= _mm_extract_epi32(c, 1) >> 29;
    D ^= _mm_extract_epi32(c, 1) >> 27;
    D ^= _mm_extract_epi32(c, 1) >> 25;

    const uint16_t X3 = _mm_extract_epi16(c, 3);
    const __m128i X3D = _mm_set_epi16(0, 0, 0, 0, 0, 0, X3, D);

    const uint32_t E = _mm_extract_epi32(X3D, 0) << 7;
    const uint32_t F = _mm_extract_epi32(X3D, 0) << 5;
    const uint32_t G = _mm_extract_epi32(X3D, 0) << 3;
    const uint32_t I = _mm_extract_epi32(X3D, 0) << 2;
    const uint32_t J = _mm_extract_epi32(X3D, 0) << 1;

    const uint16_t H1 = X3 ^ get_high16(E) ^ get_high16(F) ^ get_high16(G) ^ get_high16(I) ^ get_high16(J);
    const uint16_t H0 = D ^ get_low16(E) ^ get_low16(F) ^ get_low16(G) ^ get_low16(I) ^ get_low16(J);

    const uint16_t R1 = _mm_extract_epi16(c, 1) ^ H1;
    const uint16_t R0 = _mm_extract_epi16(c, 0) ^ H0;

    return _mm_extract_epi32(_mm_set_epi16(0, 0, 0, 0, 0, 0, R1, R0), 0);
}


The following is a small test program and benchmark for the function.

```
int main()
{
assert(mul_reduce_32(1, 1) == 1);
assert(mul_reduce_32(23523, 34651) == 744709325);
assert(mul_reduce_32(34652346, 5534651) == 1329462203);

// A little benchmark
std::mt19937 rng(std::time(0));
std::uniform_int_distribution uint_dist10(0, std::numeric_limits::max());

const int N = 100000000;
std::vector vec1(N, 0);

Solution

For the time output in main():

  • Use std::clock_t as opposed to clock_t in C++.



  • You could have another variable, endTime, for the ending time.



  • The computed time could also have its own variable, elapsedTime, to be printed.



  • Only the clock time needs to be cast to a double, not the macro as well. You should also use static_cast<>() as this is C++.



Example with changes:

std::clock_t startTime = std::clock();

// ...

std::clock_t endTime = std::clock();

double elapsedTime = static_cast(endTime - startTime) / CLOCKS_PER_SEC;

std::cout << elapsedTime;


But, since you're using C++11, consider utilizing the ` library instead. Unlike , it is more C++-oriented but still contains some aspects.

Side-notes:

  • For std::time, 0 can be replaced with nullptr in C++11.



  • Consider renaming N to something more descriptive. Using a constant in place of a magic number is nice, but you can increase readability by using a better name.



  • Since only your test program uses those first four libraries, they should just be in that file.



  • You don't need those typedefs in your first file; they are already defined in `.



Specific questions:


Currently, I am using many temporary variables. I could try to help the compiler by removing many of them, but this way the code should be more readable and easier for someone to modify.

I would rely on the compiler to handle this, as it should be able to optimize them out. But you're right; readability and maintainability are also important. They can always be adjusted accordingly.

You can also maintain these two things by:

  • avoiding single-character variable names



  • keeping variables as close in scope as possible



You seem to do well with the second point, but not so much the first. I've already addressed this in the side-notes, but this is a problem throughout the first file. By using more descriptive names, it'll be much easier for others to understand your code and make any needed modifications.

Code Snippets

std::clock_t startTime = std::clock();

// ...

std::clock_t endTime = std::clock();

double elapsedTime = static_cast<double>(endTime - startTime) / CLOCKS_PER_SEC;

std::cout << elapsedTime;

Context

StackExchange Code Review Q#48857, answer score: 5

Revisions (0)

No revisions yet.