Fastest way to fill memory with a specified 64-bit value
Problem
I need to fill a memory block quickly from C#, so I wrote the following routine (Free Pascal Compiler + Lazarus, 64-bit mode):
// RCX = Ptr64
// RDX = Count
// R8  = Value
PROCEDURE Fill64 (VAR Ptr64; Count: QWord; Value: QWord);
BEGIN
  ASM
    PUSH RDI
    MOV RDI, RCX    // Destination Index = Ptr64
    MOV RAX, R8     // Accumulator = Value
    MOV RCX, RDX    // Counter Register = Count
    TEST RCX, RCX   // If RCX is 0, set ZF (zero flag)
    JZ @Exit        // Exit if ZF is set
    REP STOSQ       // Fill memory using 64 bit value from RAX register
  @Exit:
    POP RDI
  END;
END;

And I use it from C#:
[DllImport ("MemUtil64.dll", EntryPoint = "Fill64", CallingConvention = CallingConvention.Cdecl)]
private static extern unsafe void Fill64 (void* ptr, ulong count, ulong value);

Is this code okay, and is it the fastest possible?
Solution
Unfortunately, the answer is CPU-dependent. For example, here is Android's memset implementation: it uses rep stosq for some CPUs and a different, more complicated implementation (avoiding rep stosq) for others.

For further details on Intel CPUs, refer to the "Enhanced REP MOVSB and STOSB operation (ERMSB)" section of the Intel® 64 and IA-32 Architectures Optimization Reference Manual.
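Since the result depends on the CPU, it may help to check what the machine reports and to have a managed baseline to benchmark the native routine against. The following is only a sketch under stated assumptions: it targets .NET 5 or later (for X86Base.CpuId), must be compiled with unsafe code enabled, and the names FillSketch, HasErmsb and Fill64Managed are hypothetical, not part of the original code. Whether the ERMSB bit being set means that REP STOSQ wins on a given machine is not something this code decides; that has to be measured.

```csharp
using System;
using System.Runtime.Intrinsics.X86;

static class FillSketch
{
    // Hypothetical helper: reports the ERMSB feature bit,
    // CPUID leaf 7, sub-leaf 0, EBX bit 9.
    public static bool HasErmsb()
    {
        if (!X86Base.IsSupported)
            return false;                      // not running on x86/x64

        var leaf0 = X86Base.CpuId(0, 0);       // EAX = highest basic leaf
        if (leaf0.Eax < 7)
            return false;                      // leaf 7 not available

        var leaf7 = X86Base.CpuId(7, 0);
        return (leaf7.Ebx & (1 << 9)) != 0;
    }

    // Hypothetical managed baseline with the same shape as the native Fill64:
    // count is the number of 64-bit elements, not bytes.
    public static unsafe void Fill64Managed(void* ptr, ulong count, ulong value)
    {
        ulong* p = (ulong*)ptr;
        while (count > 0)
        {
            // Span lengths are limited to int, so fill in chunks.
            int n = (int)Math.Min(count, (ulong)int.MaxValue);
            new Span<ulong>(p, n).Fill(value);
            p += n;
            count -= (ulong)n;
        }
    }
}
```

Timing Fill64Managed against the DllImport version on the actual target hardware, for the block sizes that matter, is likely the only reliable way to tell which one is faster there.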
Context
StackExchange Code Review Q#25393, answer score: 7