
Fastest way to fill memory with a specified 64-bit value


Problem

I need to fill a memory block quickly from C#, so I wrote the following routine (compiled with the Free Pascal Compiler and Lazarus in 64-bit mode):

//  RCX = Ptr64
//  RDX = Count
//  R8  = Value

PROCEDURE Fill64 (VAR Ptr64; Count: QWord; Value: QWord);
BEGIN
  ASM
    PUSH    RDI
    MOV     RDI, RCX    // Destination Index = Ptr64
    MOV     RAX, R8     // Accumulator = Value
    MOV     RCX, RDX    // Counter Register = Count
    TEST    RCX, RCX    // If RCX is 0, set ZF (zero flag)
    JZ      @Exit       // Exit if ZF is set
    REP     STOSQ       // Fill memory using 64 bit value from RAX register
  @Exit:
    POP     RDI
  END;
END;


And I use it from C#:

[DllImport ("MemUtil64.dll", EntryPoint = "Fill64", CallingConvention = CallingConvention.Cdecl)]
private static extern unsafe void Fill64 (void* ptr, ulong count, ulong value);


Is this code okay, and is it the fastest possible?

Solution

Unfortunately, the answer is CPU-dependent. For example, Android's memset implementation uses rep stosq on some CPUs and a different, more complicated implementation (avoiding rep stosq) on others.
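Since the result depends on the CPU, it may help to benchmark the native routine against a purely managed baseline on the target machines. The following is a minimal sketch, not part of the original post: `ManagedFill` and its `Fill64` method are hypothetical names, and the code relies only on `Span<T>.Fill`, whose speed depends on the .NET runtime version in use.

```csharp
using System;

public static class ManagedFill
{
    // Hypothetical helper (not from the original post): writes `value`
    // into every 64-bit slot of `destination` using only managed code.
    public static void Fill64(Span<ulong> destination, ulong value)
    {
        // Span<T>.Fill assigns the same value to every element of the span.
        destination.Fill(value);
    }
}
```

An array converts implicitly to a span, so `ManagedFill.Fill64(new ulong[1024], 0xFFFFFFFFFFFFFFFFUL)` works as written. In an unsafe context the same method can cover unmanaged memory through `new Span<ulong>(ptr, count)`, with the caveat that the length there is an `int`, unlike the `ulong` count of the native routine.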

For further details on Intel CPUs, refer to the "Enhanced REP MOVSB and STOSB operation (ERMSB)" section of the Intel® 64 and IA-32 Architectures Optimization Reference Manual.
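Whether a CPU advertises ERMSB can be checked at run time through CPUID (leaf 7, sub-leaf 0, EBX bit 9). The sketch below assumes .NET 5 or later, where `X86Base.CpuId` is available. `CpuFeatures` and `HasErmsb` are hypothetical names introduced here for illustration. The flag only reports the feature; it does not by itself show that rep stosq is the fastest option for a given buffer size.

```csharp
using System;
using System.Runtime.Intrinsics.X86;

public static class CpuFeatures
{
    // Hypothetical helper: returns true when CPUID reports
    // Enhanced REP MOVSB/STOSB (leaf 7, sub-leaf 0, EBX bit 9).
    // Returns false on hardware where the CPUID intrinsic is unavailable.
    public static bool HasErmsb()
    {
        if (!X86Base.IsSupported)
        {
            return false;
        }

        // Leaf 0 returns the highest supported standard leaf in EAX.
        var (maxLeaf, _, _, _) = X86Base.CpuId(0, 0);
        if (maxLeaf < 7)
        {
            return false;
        }

        // Leaf 7, sub-leaf 0: structured extended feature flags.
        var (_, ebx, _, _) = X86Base.CpuId(7, 0);
        return (ebx & (1 << 9)) != 0;
    }
}
```

A caller could use `CpuFeatures.HasErmsb()` to choose between the rep stosq routine and an alternative fill, ideally after measuring both on the hardware in question.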

Context

StackExchange Code Review Q#25393, answer score: 7
