Parsing HTTP Headers in C++
http, headers, parsing
Problem
I am building a web server from scratch and trying to make certain tasks faster by embedding C code into the mix. Specifically, I'm worried about how the std::string class, with .find() and its other member functions, compares to straight pointer arithmetic.
```
#include <cstring>
#include <map>
#include <string>

std::map<std::string, std::string> http_request;
void parse_header( void * );

int main()
{
    char * msg= "GET / HTTP/1.1\r\n"
                "Host: 192.241.213.46:6880\r\n"
                "Upgrade-Insecure-Requests: 1\r\n"
                "Accept: text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8\r\n"
                "User-Agent: Mozilla/5.0 (Macintosh; Intel Mac OS X 10_12_3) AppleWebKit/602.4.8 (KHTML, like Gecko) Version/10.0.3 Safari/602.4.8\r\n"
                "Accept-Language: en-us\r\n"
                "Accept-Encoding: gzip, deflate\r\n"
                "Connection: keep-alive\r\n\r\n";
    parse_header( msg );
}

void parse_header( void *msg )
{
    char *head = (char *) msg;
    char *mid;
    char *tail = head;
    if( sizeof( msg ) == 0 )
    {
        return;
    }
    // Find request type
    while( *head++ != ' ');
    http_request[ "Type" ] = std::string( ( char * ) msg ).substr( 0 , ( head - 1) - tail );
    // Find path
    tail = head;
    while( *head++ != ' ');
    http_request[ "Path" ] = std::string( ( char * ) msg ).substr( tail - ( char *)msg , ( head - 1) - tail );
    // Find HTTP version
    tail = head;
    while( *head++ != '\r');
    http_request[ "Version" ] = std::string( ( char * ) msg ).substr( tail - ( char *)msg , ( head - 1) - tail );
    // Map all headers from a key to a value
    while( true )
    {
        tail = head + 1;
        while( *head++ != '\r' );
        mid = strstr( tail, ":" );
        // Look for the failed strstr
        if( tail > mid )
            break;
        http_request[ std::string( ( char * ) msg ).substr( tail - ( char *)msg , ( mid ) - tail ) ] = std::string( ( char * ) msg ).substr( mid + 2 - ( char *) msg , ( head - 3 ) - mid );
    }
}
```
Solution
First of all, HTTP headers are sent line by line. In your sample, I see you created a dummy request with the following buffer:
```
char * msg= "GET / HTTP/1.1\r\n"
            "Host: 192.241.213.46:6880\r\n"
            "Upgrade-Insecure-Requests: 1\r\n"
            "Accept: text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8\r\n"
            "User-Agent: Mozilla/5.0 (Macintosh; Intel Mac OS X 10_12_3) AppleWebKit/602.4.8 (KHTML, like Gecko) Version/10.0.3 Safari/602.4.8\r\n"
            "Accept-Language: en-us\r\n"
            "Accept-Encoding: gzip, deflate\r\n"
            "Connection: keep-alive\r\n\r\n";
```
I suppose that you first receive the whole header before analyzing it with your method parse_header( void *msg ). That is why extracting each key-value pair is so hard: you end up with expressions like this:
```
http_request[ std::string( ( char * ) msg ).substr( tail - ( char *)msg , ( mid ) - tail ) ] = std::string( ( char * ) msg ).substr( mid + 2 - ( char *) msg , ( head - 3 ) - mid );
```
Here is what I would do to improve your code:
- I would keep the map to store all the headers.
- Read the header one line at a time, and analyze each line before moving on to the next one.
- I don't see any issue with the code that parses the first line, but I would change it to work on a single line and simplify the parsing.
If you read and analyze the header line by line, the code that maps the keys to their values could look like this:
```
void parseFirstLine(const std::string &line)
{
    // Find request type
    std::size_t position = line.find(' ');
    if (position == std::string::npos) return; // Malformed request line
    http_request[ "Type" ] = line.substr(0, position);
    position++; // Skip character ' '
    // Find path
    std::size_t lpost = position; // Remember where the path starts
    position = line.find(' ', lpost);
    if (position == std::string::npos) return; // Malformed request line
    http_request[ "Path" ] = line.substr(lpost, (position - lpost));
    position++; // Skip character ' '
    // Find HTTP version
    http_request[ "Version" ] = line.substr(position);
}

void parseHeader(const std::string &line)
{
    if (line.size() == 0) return;
    std::size_t posFirst = line.find(':'); // Look for separator ':'
    if (posFirst == std::string::npos) return; // Not a "Key: Value" line
    std::string key = line.substr(0, posFirst);
    // Skip the separator and any whitespace that follows it
    std::size_t posValue = line.find_first_not_of(" \t", posFirst + 1);
    std::string value = (posValue == std::string::npos) ? "" : line.substr(posValue);
    http_request[key] = value;
}
```
(*) This code may contain bugs, because I wrote it without a development environment to test it.
It is your decision where each line gets analyzed: as soon as you read it from the socket, or after you have finished reading the whole header.
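The second option (analyzing after the whole header has been read) could be sketched roughly as follows. This is only an illustration under stated assumptions, not the code from the question or a tested server component: the names HeaderMap, parse_request_line, parse_header_line and parse_request are hypothetical, the whole header block is assumed to already sit in one std::string, and lines are assumed to end in "\r\n".

```cpp
#include <cstddef>
#include <map>
#include <string>

// Hypothetical alias; it mirrors the http_request map used above.
using HeaderMap = std::map<std::string, std::string>;

// Parse "METHOD PATH VERSION"; returns false if the line is malformed.
bool parse_request_line(const std::string &line, HeaderMap &out)
{
    const std::size_t first = line.find(' ');
    if (first == std::string::npos) return false;
    const std::size_t second = line.find(' ', first + 1);
    if (second == std::string::npos) return false;
    out["Type"] = line.substr(0, first);
    out["Path"] = line.substr(first + 1, second - first - 1);
    out["Version"] = line.substr(second + 1);
    return true;
}

// Parse "Key: Value"; lines without a ':' are skipped.
void parse_header_line(const std::string &line, HeaderMap &out)
{
    const std::size_t colon = line.find(':');
    if (colon == std::string::npos) return;
    // Skip the whitespace that may follow the separator.
    const std::size_t value = line.find_first_not_of(" \t", colon + 1);
    out[line.substr(0, colon)] =
        (value == std::string::npos) ? std::string() : line.substr(value);
}

// Split the buffer on "\r\n" and stop at the empty line that ends the headers.
HeaderMap parse_request(const std::string &buffer)
{
    HeaderMap request;
    std::size_t start = 0;
    bool first_line = true;
    while (start < buffer.size()) {
        std::size_t end = buffer.find("\r\n", start);
        if (end == std::string::npos) end = buffer.size(); // Tolerate a missing final CRLF
        const std::string line = buffer.substr(start, end - start);
        if (line.empty()) break; // Blank line: end of the header block
        if (first_line) {
            if (!parse_request_line(line, request)) break;
            first_line = false;
        } else {
            parse_header_line(line, request);
        }
        start = end + 2; // Skip the "\r\n"
    }
    return request;
}
```

Calling parse_request(msg) on the sample buffer would fill the map with "Type", "Path", "Version" and one entry per header. The same two per-line functions could also be called directly from the socket-reading loop if you prefer to analyze each line as it arrives.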
Context
StackExchange Code Review Q#157024, answer score: 2