HiveBrain v1.2.0
Get Started
← Back to all entries
patterncModerate

Checking if CPU supports rdrand

Submitted by: @import:stackexchange-codereview··
0
Viewed 0 times
cpucheckingrdrandsupports

Problem

My goal with this bit of code is to check if my processor supports rdrand and, if not, execute some other random number generating function. To check if rdrand is supported, the 30th bit in the ecx register should be set.

I guess my one of my dilemmas is whether or not I should explicitly check inside level 1 but I think eax implicitly sets the level there. Additionally, I wonder if I should set initialize the values of the registers to 0. I'd also like to know of any 'gotchas' and possible improvements.

#include 
#include 

#define bit_RDRND   (1 << 30)

int main(int argc, char **argv)
{

    unsigned int eax;
    unsigned int ebx;
    unsigned int ecx;
    unsigned int edx;

    eax = 0x01;

    __asm__ __volatile__(
                         "cpuid;"
                         : "=a"(eax), "=b"(ebx), "=c"(ecx), "=d"(edx)
                         : "a"(eax)
                         );

    printf("The value of the ecx register is %08x.\n", ecx);

    if(ecx & bit_RDRND){
        //use rdrand
        printf("use rdrand\n");
    }
    else{
        //use mt19937
        printf("use mt19937");
    }

    return 0;
}

Solution

Let's look at the inline assembly code that you have:

  • You were correct to use the __volatile__ specifier here. That's a common mistake when writing inline asm, so that's a good start. The CPUID instruction has very important side-effects, and you don't want the compiler deciding to elide it in the name of "optimization".


(Note: I may be wrong here, and __volatile__ may not actually be necessary for CPUID. The official documentation, as I read it, is rather unclear, and I confess to not fully understanding the nuances. See also discussion in comments below.)

  • You don't need a semicolon after the cpuid instruction in the inline assembly block. For only a single instruction, no terminator is needed. If you have multiple instructions, standard practice is simply to separate them with \n\t so that they will be nicely formatted in the compiler's intermediate assembly output. A semicolon won't affect the correctness of the code, of course, but they might confuse a human reader of the code who isn't expecting to see one there. They also just clutter up the compiler's intermediate asm output, should you ever need to analyze it.



  • This is purely stylistic, but I like to align the colons in the inline asm syntax in columns with the opening and closing parentheses. The way you have it now is perfectly readable, though. This starts to matter more when you have multiple instructions in a single asm block.



-
You also expressed concern in the question about whether you should initialize the values of the registers to 0. In fact, you seem to be slightly confused about exactly how to call CPUID. The authoritative reference is always the Intel manuals, but transcriptions have been made and placed online by multiple folks. Searching Google for the instruction name + x86 makes them very easy to find. Here is the documentation for CPUID. It tells you exactly how it needs to be called.

The summary is, EAX needs to be set to the "function"/"level", which indicates the type of information to retrieve. In your case, to get the feature bits, EAX should contain 1. The other registers do not need to be pre-initialized; they will be clobbered and filled by the CPU during the execution of CPUID. The code you have is therefore correct as-written; there are no hidden "gotchas" to be aware of (at least, as long as you are using CPUID as shown here; it is a very "overloaded" instruction!).

-
Finally, I know this is probably just demo code (are you expected to post real code here?), but you really should wrap ugliness like this up in a function. Not only because it looks cleaner, but it also helps to consolidate the scope when the need arises to maintain it. Here's how I'd write it (using my preferred naming conventions; use whatever matches your project):

void InvokeCPUID(unsigned int  function,
                 unsigned int  subfunction,
                 unsigned int* pEAX,
                 unsigned int* pEBX,
                 unsigned int* pECX,
                 unsigned int* pEDX)
{
    assert(pEAX != NULL);
    assert(pEBX != NULL);
    assert(pECX != NULL);
    assert(pEDX != NULL);

    __asm__ __volatile__("cpuid"
                        : "=a" (*pEAX),
                          "=b" (*pEBX),
                          "=c" (*pECX),
                          "=d" (*pEDX)
                        :  "a" (function),
                           "c" (subfunction)
                        );
}

_Bool SupportsRDRAND()
{
    const unsigned int flag_RDRAND = (1 << 30);

    unsigned int eax, ebx, ecx, edx;
    InvokeCPUID(1, 0, &eax, &ebx, &ecx, &edx);

    return ((ecx & flag_RDRAND) == flag_RDRAND);
}


The object code emitted by GCC and Clang for SupportsRDRAND (with the call to CPUID inlined) is beautifully simple:

SupportsRDRAND:
    push    ebx        ; preserve caller-save EBX (required by calling convention)
    mov     eax, 1     ; set EAX to request function 1
    xor     ecx, ecx   ; set ECX to request subfunction 0
    cpuid
    shr     ecx, 30    ; the result we want is in ECX...
    and     ecx, 1     ; ...test for the flag of interest...
    mov     eax, ecx   ; ...and put the result in EAX so it can be returned
    pop     ebx        ; restore previously-saved value of EBX
    ret


That said…it is my recommendation that you not use inline assembly at all! I know its raw power is quite seductive, and sometimes you truly have no choice, but you really should try to avoid inline assembly if at all possible. Architectural portability obviously isn't a compelling argument here, but the fact that inline assembly disrupts the optimizer is still important, and even more important is that it vastly increases the development time and maintenance costs. You really do need to be an expert to get the inline assembly code right. And even if you are such an expert (and we still get it wrong regularly), the maintenance programmer who comes along later is invariably not.

Intrinsics are virtu

Code Snippets

void InvokeCPUID(unsigned int  function,
                 unsigned int  subfunction,
                 unsigned int* pEAX,
                 unsigned int* pEBX,
                 unsigned int* pECX,
                 unsigned int* pEDX)
{
    assert(pEAX != NULL);
    assert(pEBX != NULL);
    assert(pECX != NULL);
    assert(pEDX != NULL);

    __asm__ __volatile__("cpuid"
                        : "=a" (*pEAX),
                          "=b" (*pEBX),
                          "=c" (*pECX),
                          "=d" (*pEDX)
                        :  "a" (function),
                           "c" (subfunction)
                        );
}

_Bool SupportsRDRAND()
{
    const unsigned int flag_RDRAND = (1 << 30);

    unsigned int eax, ebx, ecx, edx;
    InvokeCPUID(1, 0, &eax, &ebx, &ecx, &edx);

    return ((ecx & flag_RDRAND) == flag_RDRAND);
}
SupportsRDRAND:
    push    ebx        ; preserve caller-save EBX (required by calling convention)
    mov     eax, 1     ; set EAX to request function 1
    xor     ecx, ecx   ; set ECX to request subfunction 0
    cpuid
    shr     ecx, 30    ; the result we want is in ECX...
    and     ecx, 1     ; ...test for the flag of interest...
    mov     eax, ecx   ; ...and put the result in EAX so it can be returned
    pop     ebx        ; restore previously-saved value of EBX
    ret
#include <cpuid.h>   // for __get_cpuid intrinsic
_Bool SupportsRDRAND()
{
    const unsigned int flag_RDRAND = (1 << 30);

    unsigned int eax, ebx, ecx, edx;
    __get_cpuid(1, &eax, &ebx, &ecx, &edx);

    return ((ecx & flag_RDRAND) == flag_RDRAND);
}
#include <cpuid.h>     // for __cpuid

Context

StackExchange Code Review Q#147656, answer score: 12

Revisions (0)

No revisions yet.