Fast line-by-line file reader

Submitted by: @import:stackexchange-codereview

Problem

This is designed for a high-performance log analyzer. The idea is very simple: read a file line by line as fast as possible.

I would appreciate any hints about what should or could be improved in this code.


FastLineReader.h

/* Copyright (c) 2015 Simon Toth kontakt@simontoth.cz
 * Licensed under the MIT license: http://opensource.org/licenses/MIT
 */

#ifndef FASTLINEREADER_H
#define FASTLINEREADER_H

// STD C++
#include <iostream>

/** Quick line-by-line parser of text files for POSIX/Linux
 *
 *  This function provides a fast line parser with a callback model.
 *
 * @param filename file to be parsed
 * @param callback function that will be called for each line
 * @returns 0 on success, -1 if file could not be opened
 **/
int fastLineParser(const char * const filename, void (*callback)(const char * const, const char * const));

#endif // FASTLINEREADER_H


FastLineReader.cpp

#include "FastLineReader.h"

// POSIX
#include <sys/types.h>
#include <sys/stat.h>
#include <sys/mman.h>
#include <fcntl.h>
#include <unistd.h>
#include <errno.h>

// C++ STD
#include <iostream>
#include <cstring>
using namespace std;

int fastLineParser(const char * const filename, void (*callback)(const char * const, const char * const))
{
    int fd = open(filename, O_RDONLY); // open file
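    if (fd == -1)
    {
        cerr << "Could not open \"" << filename << "\" for reading (" << strerror(errno) << ")." << endl;
        return -1;
    }

    struct stat fs;
    if (fstat(fd, &fs) == -1) // determine the file size
    {
        cerr << "Could not stat \"" << filename << "\" (" << strerror(errno) << ")." << endl;
        close(fd);
        return -1;
    }

    posix_fadvise(fd, 0, 0, 1); // FDADVICE_SEQUENTIAL
    // silent error handling - weak error

    char *buf = static_cast<char*>(mmap(0, static_cast<size_t>(fs.st_size), PROT_READ, MAP_SHARED, fd, 0));
    if (buf == MAP_FAILED)
    {
        cerr << "Could not memory map \"" << filename << "\" (" << strerror(errno) << ")." << endl;
        close(fd);
        return -1;
    }

    char *buff_end = buf + fs.st_size;
    char *begin = buf, *end = NULL;

    // find each newline and hand the line [begin, end) to the callback
    while ((end = static_cast<char*>(memchr(begin,'\n',static_cast<size_t>(buff_end-begin)))) != NULL)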
    {
        callback(begin,end);

        if (end != buff_end)
            begin = end+1;
        else
            break;
    }

    // enable if you are working with malformed text files, proper text file needs to end with a newline
#ifdef MALFORMED_TEXFILE
    callback(begin,buff_end);
#endif

    munmap(buf, static_cast(fs.st_size));
    // silent error handling - weak error

    close(fd);
    return 0;
}

Solution

- posix_fadvise comes with a POSIX_FADV_SEQUENTIAL macro. Use it instead of a magic 1 and a comment. Note that on Linux the value 1 is POSIX_FADV_RANDOM; POSIX_FADV_SEQUENTIAL is 2.

- The client doesn't know in advance whether the file is malformed or not. It is better to detect a malformed text file at run time:

if (begin != buf_end)


- The BUGS section of the posix_fadvise man page says:


In kernels before 2.6.6, if len was specified as 0, then this was interpreted literally as "zero bytes", rather than as meaning "all bytes through to the end of the file".

Since you already know the file size, play it safe and pass fs.st_size instead of 0.

- Move the #include from FastLineReader.h to the cpp file. The client code doesn't need it.

- fastLineParser can be used in C code; just declare it as extern "C".

- I see no reason to use C++ here at all. However, if you do, do not write using namespace std.

- Finally, do you have any evidence that this is faster than fgets?
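
One way to gather that evidence is to time the mmap reader against a plain fgets loop on the same file. The following is only a sketch of such a baseline: countLinesFgets, timeFgetsMs and the 64 KiB buffer are illustrative choices and do not appear in the original post.

```cpp
#include <chrono>
#include <cstddef>
#include <cstdio>
#include <cstring>

// Baseline: count lines with fgets. Returns -1 if the file cannot be opened.
// A chunk that does not end in '\n' is either part of a line longer than
// the buffer or a final line without a trailing newline.
long countLinesFgets(const char *filename)
{
    std::FILE *f = std::fopen(filename, "r");
    if (f == nullptr)
        return -1;

    static char line[65536];   // assumed buffer size, not tuned
    long count = 0;
    bool partial = false;      // true while inside an unterminated line

    while (std::fgets(line, sizeof line, f) != nullptr) {
        const std::size_t len = std::strlen(line);
        if (len > 0 && line[len - 1] == '\n') {
            ++count;
            partial = false;
        } else {
            partial = true;
        }
    }
    if (partial)
        ++count;               // last line had no trailing newline

    std::fclose(f);
    return count;
}

// Times a single run in milliseconds using a monotonic clock.
double timeFgetsMs(const char *filename, long *linesOut)
{
    const auto t0 = std::chrono::steady_clock::now();
    *linesOut = countLinesFgets(filename);
    const auto t1 = std::chrono::steady_clock::now();
    return std::chrono::duration<double, std::milli>(t1 - t0).count();
}
```

For a fair comparison, run both readers several times on the same large file, give the mmap version a callback that only counts lines, and compare the timings with a warm and a cold page cache separately.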

Code Snippets

if (begin != buf_end)
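
To show where that check fits, here is a minimal sketch of the reader with the review points applied: POSIX_FADV_SEQUENTIAL instead of the magic number, fs.st_size as the advice length, no using namespace std, and the trailing-line check done at run time. It assumes Linux; parseLines, countLine, g_lines and g_bytes are hypothetical names chosen for illustration, and error reporting is reduced to the return value.

```cpp
#include <fcntl.h>
#include <sys/mman.h>
#include <sys/stat.h>
#include <unistd.h>

#include <cstddef>
#include <cstring>

// Calls callback(begin, end) once per line. `end` points at the '\n', or
// one past the last byte for a final line without a trailing newline.
// Returns 0 on success, -1 on any error.
int parseLines(const char *filename,
               void (*callback)(const char *, const char *))
{
    const int fd = open(filename, O_RDONLY);
    if (fd == -1)
        return -1;

    struct stat fs;
    if (fstat(fd, &fs) == -1) {
        close(fd);
        return -1;
    }
    if (fs.st_size == 0) {     // mmap rejects a zero length; nothing to parse
        close(fd);
        return 0;
    }

    // Advisory only, so the result is deliberately ignored. Passing the
    // real size sidesteps the pre-2.6.6 interpretation of len == 0.
    (void)posix_fadvise(fd, 0, fs.st_size, POSIX_FADV_SEQUENTIAL);

    const std::size_t size = static_cast<std::size_t>(fs.st_size);
    void *map = mmap(nullptr, size, PROT_READ, MAP_SHARED, fd, 0);
    if (map == MAP_FAILED) {
        close(fd);
        return -1;
    }

    const char *begin = static_cast<const char *>(map);
    const char *const buf_end = begin + size;

    while (begin != buf_end) {
        const void *nl = std::memchr(
            begin, '\n', static_cast<std::size_t>(buf_end - begin));
        if (nl == nullptr)
            break;
        const char *end = static_cast<const char *>(nl);
        callback(begin, end);
        begin = end + 1;
    }

    // Run-time replacement for the MALFORMED_TEXFILE macro: anything left
    // over is a last line that lacks its newline.
    if (begin != buf_end)
        callback(begin, buf_end);

    munmap(map, size);
    close(fd);
    return 0;
}

// Example callback (hypothetical): tallies lines and payload bytes.
std::size_t g_lines = 0;
std::size_t g_bytes = 0;

void countLine(const char *begin, const char *end)
{
    ++g_lines;
    g_bytes += static_cast<std::size_t>(end - begin);
}
```

Calling parseLines(path, countLine) leaves the totals in g_lines and g_bytes. Because the leftover check happens at run time, the same binary handles files with and without a final newline.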

Context

StackExchange Code Review Q#85497, answer score: 4
