Speeding up create CRC function

Problem

I have a function like this:

BOOL CGameData::GetCheckSum(BYTE o_byObjCheckSum[32], int *o_pnFileSize, char* pFilePath)
{
    memset(o_byObjCheckSum, 0x00, 32);
    *o_pnFileSize = 0;

    UINT uiCheckSum = 0;
    if(strlen(pFilePath) <= 0)
    {
        return FALSE;
    }
    FILE *fp;
    fp = fopen(pFilePath, "rb");
    if(NULL == fp)
    {
        return FALSE;
    }
    fseek(fp, 0L, SEEK_END);
    long lFileSize = ftell(fp);
    *o_pnFileSize = lFileSize;
    fseek(fp, 0L, SEEK_SET);

    BYTE *pFileData = new BYTE [lFileSize];
    fread(pFileData, lFileSize, 1, fp);
    sha256_encode(pFileData, lFileSize, o_byObjCheckSum);

    fclose(fp);
    delete [] pFileData;

    return TRUE;
}


How can I make this faster?

If it will help, this function is called here:

STRNCPY_MEMSET(szResDirectoryPath, RESOBJ_DIRECTORY_PATH, MAX_PATH);
    if(FALSE == GetAllFileNameList(&vectFileNameList, szResDirectoryPath))
    {
        g_pFieldGlobal->WriteSystemLogEX(TRUE, "[ERROR] LoadResObjCheckList_ error !!, Directory(%s)\r\n", szResDirectoryPath);
        return FALSE;
    }

    nCnt = vectFileNameList.size(); 
    for(i=0; i (resObjCheckSum.szResObjFileName, resObjCheckSum));
        }
    }

Solution

In general, file input takes most of the processing time.

Your checksum algorithm (SHA-256 in the code shown, not a CRC) may also be consuming a lot of the processing time.
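
Before changing anything, it may help to measure which of the two phases actually dominates. The following is a minimal sketch of a timer that sums the time spent in one phase across all files; PhaseTimer and its members are hypothetical names, not part of the original code.

```cpp
#include <cassert>
#include <chrono>

// Accumulates wall-clock time over several start()/stop() pairs, so the
// time spent in one phase (e.g. reading) can be summed across all files.
// PhaseTimer is a hypothetical helper, not part of the original code.
class PhaseTimer
{
    using Clock = std::chrono::steady_clock;

public:
    void start()
    {
        assert(!running_);          // start() must not be nested
        begin_ = Clock::now();
        running_ = true;
    }

    void stop()
    {
        assert(running_);           // stop() needs a matching start()
        total_ += Clock::now() - begin_;
        running_ = false;
    }

    // Total accumulated time in milliseconds.
    double total_ms() const
    {
        return std::chrono::duration<double, std::milli>(total_).count();
    }

private:
    Clock::time_point begin_{};
    Clock::duration total_{};
    bool running_ = false;
};
```

One timer could wrap the fread call and another the sha256_encode call in GetCheckSum; comparing the two totals after the whole directory has been processed shows whether reading or hashing is the better target.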

Here are some suggested optimizations:

  • Don't clear the checksum. The CRC will write to it anyway. This is
    a wasted function call.

  • The strlen function searches the string to determine length. Use
    std::string instead, because it maintains the length of the string
    and can return the string length faster.

  • Don't read the entire file into memory. Pick a chunk size and read
    the file in chunks of that size. A seek to the end of the file may
    consume a lot of time. If you must get the file size, use an OS API
    that returns it; hopefully it will be faster than seeking to the
    end of the file.

  • If your OS supports memory-mapped files, you may want to use them.
    Memory-mapped files allow the OS to handle reading the file into
    memory.

  • Don't keep allocating from the heap. Create a large array once and
    use that. Memory allocation may become a bottleneck when the memory
    becomes fragmented.

  • Turn compiler optimizations up to the highest level.

  • Try creating multiple threads for processing. For example, you could
    spawn two threads and have the main program pass a new filename to
    each thread as it completes its previous file.

  • Reduce the number of branches; branches may force instruction
    pipeline reloading, which wastes time that could be spent processing
    data. Research "loop unrolling".

  • Optimize data to fit into the cache. Reloading of the cache for
    items outside the cache wastes time.

  • If you can, do whatever it takes to keep the hard drive spinning.
    There is an overhead associated with starting up the hard drive
    motors. Keeping the drive spinning reduces the need to restart the
    motors. See also "double buffering".

  • Optimize the CRC by performing it in parallel, if possible.
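
Several of the points above (chunked reads, a buffer allocated once, no seek for the file size, no strlen) can be sketched together as follows. This is only an illustration under assumptions: for_each_chunk and its parameters are hypothetical names, and it presumes the hash library offers an incremental "update" call, which the one-shot sha256_encode in the question may not.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdio>
#include <vector>

// Reads the file at `path` in buffer-sized chunks and passes each chunk to
// `update(data, length)`. The caller owns `buffer`, so it can be allocated
// once and reused for every file. On success `fileSize` holds the number of
// bytes read, so no seek to the end of the file is needed.
// for_each_chunk is a hypothetical helper, not part of the original code.
template <typename Update>
bool for_each_chunk(const char* path,
                    std::vector<unsigned char>& buffer,
                    unsigned long long& fileSize,
                    Update update)
{
    assert(!buffer.empty());
    fileSize = 0;

    // Empty-path check without scanning the whole string.
    if (path == nullptr || path[0] == '\0')
        return false;

    std::FILE* fp = std::fopen(path, "rb");
    if (fp == nullptr)
        return false;

    std::size_t got = 0;
    while ((got = std::fread(buffer.data(), 1, buffer.size(), fp)) > 0)
    {
        update(buffer.data(), got);   // feed this chunk to the hash
        fileSize += got;
    }

    const bool ok = (std::ferror(fp) == 0);   // distinguish EOF from error
    std::fclose(fp);
    return ok;
}
```

The caller would create one buffer (for example 64 KiB, a size worth tuning by measurement) before the loop over vectFileNameList and pass it to every call, with the callback forwarding each chunk to the hash's incremental update function.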

Context

StackExchange Code Review Q#82387, answer score: 3
