Parsing HTTP Headers in C++

Submitted by: @import:stackexchange-codereview
Tags: http, headers, parsing

Problem

I am building a web server from scratch and trying to speed up certain tasks by mixing in C-style code. Specifically, I'm worried about how the std::string class, with .find() and its other member functions, compares to straight pointer arithmetic.

```
#include <cstring>
#include <map>
#include <string>

std::map<std::string, std::string> http_request;

void parse_header( void * );

int main()
{

char * msg= "GET / HTTP/1.1\r\n"
"Host: 192.241.213.46:6880\r\n"
"Upgrade-Insecure-Requests: 1\r\n"
"Accept: text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8\r\n"
"User-Agent: Mozilla/5.0 (Macintosh; Intel Mac OS X 10_12_3) AppleWebKit/602.4.8 (KHTML, like Gecko) Version/10.0.3 Safari/602.4.8\r\n"
"Accept-Language: en-us\r\n"
"Accept-Encoding: gzip, deflate\r\n"
"Connection: keep-alive\r\n\r\n";

parse_header( msg );

}

void parse_header( void *msg )
{
char *head = (char *) msg;
char *mid;
char *tail = head;

if( sizeof( msg ) == 0 )
{
return;
}

// Find request type
while( *head++ != ' ');
http_request[ "Type" ] = std::string( ( char * ) msg ).substr( 0 , ( head - 1) - tail );

// Find path
tail = head;
while( *head++ != ' ');
http_request[ "Path" ] = std::string( ( char * ) msg ).substr( tail - ( char * )msg , ( head - 1) - tail );

// Find HTTP version
tail = head;
while( *head++ != '\r');
http_request[ "Version" ] = std::string( ( char * ) msg ).substr( tail - ( char * )msg , ( head - 1) - tail );

// Map all headers from a key to a value
while( true )
{
tail = head + 1;
while( *head++ != '\r' );
mid = strstr( tail, ":" );

// Look for the failed strstr
if( tail > mid )
break;

http_request[ std::string( ( char * ) msg ).substr( tail - ( char * )msg , ( mid ) - tail ) ] = std::string( ( char * ) msg ).substr( mid +
```

Solution

First of all, HTTP headers are sent line by line, each line terminated by \r\n. In your sample, I see you created a test buffer like this:

char * msg= "GET / HTTP/1.1\r\n"
            "Host: 192.241.213.46:6880\r\n"
            "Upgrade-Insecure-Requests: 1\r\n"
            "Accept: text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8\r\n"
            "User-Agent: Mozilla/5.0 (Macintosh; Intel Mac OS X 10_12_3) AppleWebKit/602.4.8 (KHTML, like Gecko) Version/10.0.3 Safari/602.4.8\r\n"
            "Accept-Language: en-us\r\n"
            "Accept-Encoding: gzip, deflate\r\n"
            "Connection: keep-alive\r\n\r\n";


I assume you first receive the whole header and only then analyze it with your function void parse_header( void *msg ). That is why extracting each key-value pair is so hard, and why you end up with expressions like this:

http_request[ std::string( ( char * ) msg ).substr( tail - ( char *)msg , ( mid ) - tail  ) ] = std::string( ( char * ) msg ).substr( mid + 2 - ( char *) msg , ( head - 3 ) - mid );
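Part of the cost of that expression is that every `std::string( ( char * ) msg )` copies the entire request buffer before `substr` extracts a few bytes from it. As a minimal sketch, assuming two pointers already delimit the token, the string can be built directly from the pointer range instead. The helper names `token` and `requestMethod` are hypothetical and not part of the original code:

```cpp
#include <cstring>
#include <string>

// Hypothetical helper: copy only the bytes in [first, last) into a string.
// This uses the iterator-range constructor of std::string, so the rest of
// the buffer is never copied.
std::string token(const char *first, const char *last)
{
    return std::string(first, last);
}

// Extract the request method (the text before the first space).
// Returns an empty string if the input contains no space.
std::string requestMethod(const char *msg)
{
    const char *space = std::strchr(msg, ' ');
    if (space == nullptr) return std::string();
    return token(msg, space);
}
```

With the sample buffer, `requestMethod(msg)` would yield "GET" while copying only three bytes.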


Here is what I would do to improve your code:

- Keep the map to store all the headers.
- Read the header one line at a time, and analyze each line before moving on to the next.
- I don't see any issue with the code that parses the first line, but I would change it to work on a single line, which simplifies the parsing.
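If the whole header has already been received, the "one line at a time" step could be sketched as follows. The function name `splitHeaderLines` is hypothetical, and the sketch assumes every line ends with "\r\n" and that an empty line marks the end of the headers:

```cpp
#include <string>
#include <vector>

// Hypothetical helper: split a raw request buffer into header lines.
// Lines are terminated by "\r\n"; the empty line that ends the header
// block stops the scan. The terminators are not included in the result.
std::vector<std::string> splitHeaderLines(const std::string &buffer)
{
    std::vector<std::string> lines;
    std::size_t start = 0;
    while (start < buffer.size()) {
        std::size_t end = buffer.find("\r\n", start);
        if (end == std::string::npos) break;   // unterminated last line: ignore it
        if (end == start) break;               // empty line: end of headers
        lines.push_back(buffer.substr(start, end - start));
        start = end + 2;                       // skip the "\r\n"
    }
    return lines;
}
```

The first element would then go to the request-line parser and the remaining elements to the header parser.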

If you read and analyze the input one line at a time, the code that maps each key to its value could look like this:

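void parseFirstLine(const std::string &line)
{
    // Find request type
    std::size_t position = line.find(' ');
    if (position == std::string::npos) return; // Malformed request line
    http_request[ "Type" ] = line.substr(0, position);
    position++; //Skip character ' '

    // Find path
    std::size_t lpost = position; // Start of the path
    position = line.find(' ', lpost);
    if (position == std::string::npos) return; // No version field
    http_request[ "Path" ] = line.substr(lpost, (position-lpost));
    position++; //Skip character ' '

    // Find HTTP version
    http_request[ "Version" ] = line.substr(position);
}

void parseHeader(const std::string &line)
{
    if(line.size() == 0) return;

    std::size_t posFirst = line.find(':'); //Look for separator ':'
    if (posFirst == std::string::npos) return; // Not a "Key: Value" line

    std::string key = line.substr(0, posFirst);

    // Skip the ':' and any spaces that follow it
    std::size_t valueStart = line.find_first_not_of(' ', posFirst + 1);
    std::string value = (valueStart == std::string::npos) ? std::string() : line.substr(valueStart);

    http_request[key] = value;
}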


(*) This code may contain bugs, because I wrote it without a development environment to test it in.

It is your decision when to analyze each line: as soon as you read it from the socket, or after you have finished reading the whole header.
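For the first option, a socket read may deliver only part of a line, so the parser has to hold on to incomplete data until the terminating "\r\n" arrives. The following is a rough sketch of that idea, not a definitive implementation. The class name `HeaderParser`, its members, and the use of a member map instead of the global `http_request` are all assumptions made for illustration:

```cpp
#include <map>
#include <string>

// Hypothetical incremental parser: feed() accepts chunks exactly as a
// socket read might deliver them, and each complete line is parsed as
// soon as its terminating "\r\n" has arrived.
class HeaderParser
{
public:
    std::map<std::string, std::string> fields;
    bool done = false;   // true once the empty line ending the headers is seen

    void feed(const std::string &chunk)
    {
        pending_ += chunk;
        std::size_t end;
        while (!done && (end = pending_.find("\r\n")) != std::string::npos) {
            std::string line = pending_.substr(0, end);
            pending_.erase(0, end + 2);          // drop the line and its "\r\n"
            if (line.empty()) { done = true; }   // blank line: headers finished
            else if (!sawRequestLine_) { parseRequestLine(line); }
            else { parseField(line); }
        }
    }

private:
    std::string pending_;          // bytes received but not yet terminated
    bool sawRequestLine_ = false;

    void parseRequestLine(const std::string &line)
    {
        sawRequestLine_ = true;
        std::size_t a = line.find(' ');
        if (a == std::string::npos) return;      // malformed: no separator
        std::size_t b = line.find(' ', a + 1);
        if (b == std::string::npos) return;      // malformed: no version field
        fields["Type"] = line.substr(0, a);
        fields["Path"] = line.substr(a + 1, b - a - 1);
        fields["Version"] = line.substr(b + 1);
    }

    void parseField(const std::string &line)
    {
        std::size_t colon = line.find(':');
        if (colon == std::string::npos) return;  // not a "Key: Value" line
        std::size_t v = line.find_first_not_of(' ', colon + 1);
        fields[line.substr(0, colon)] =
            (v == std::string::npos) ? std::string() : line.substr(v);
    }
};
```

Each chunk returned by the socket read would be passed to `feed()`, and reading can stop once `done` becomes true. The trade-off is one extra copy into the pending buffer in exchange for never scanning an incomplete line twice with the parsing logic.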

Context

StackExchange Code Review Q#157024, answer score: 2
