HTTP request reader
Problem
I'm working on a proxy service that will run on top of a Java application in order to manipulate some headers. I've just completed the first step, which is intercepting the request messages and storing them.
This code features only the storage part, and not the edit and send part.
Some information about HTTP requests:
- A request consists of header lines, separated by a newline.
- The header is terminated by an extra newline.
- After that there may be additional content, the message payload.
Some assumptions I make in this code:
- It assumes the data is input over a Socket.
- It works for both `\n` and `\r\n` newlines.
- It assumes the US-ASCII character set for the header.
- It can only deal with either no Content-Length or a normal content length; it does not know about chunked data (yet).
Please pay particular attention to whether the trade-off is worth it: more bloated code for better low-level performance, versus concise code.
```
public final class Converter {
    private Converter() {
        throw new UnsupportedOperationException();
    }

    private static final int LINE_FEED_BYTE = 10; //'\n'
    private static final int CARRIAGE_RETURN_BYTE = 13; //'\r'

    public static void socketToMessages(final Socket socket, final BlockingQueue messages) {
        listeningLoop:
        while (true) {
            List<String> headers = new ArrayList<>();
            int newlineCount = 0;
            StringBuilder headerStringBuilder = new StringBuilder();
            try {
                byte[] byteArray = new byte[1];
                InputStream inputStream = socket.getInputStream();
                boolean readingHeader = true;
                while (readingHeader) {
                    int b = inputStream.read();
                    if (b == -1) {
                        break listeningLoop;
                    }
                    else if (b == LINE_FEED_BYTE) {
                        newlineCount++;
                        String header = headerStrin
```
Solution
socketToMessages()
There are a number of things in here to report:
- There is no need for the labelled loop listeningLoop. In the places you use the label, you have break listeningLoop;, and those can simply be replaced with return.
- You read a single byte at a time from the InputStream when you are processing the headers. This is really slow (especially since the input source is the network, where the network stack is often not very fast for small requests). You need to create a buffer, and read a chunk of data at a time. When the data is in the buffer, you need to be smart about how you loop through it, looking for the newlines and header-end sections.
- This code is probably quite accurate, but could be simplified:
```
byteArray[0] = (byte)b;
headerStringBuilder.append(new String(byteArray, StandardCharsets.US_ASCII));
```
as
```
headerStringBuilder.append((char)b);
```
That converts the int val to a char, and it's done. You are already assuming US_ASCII so there is no loss in here.
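To make the buffering point more concrete, here is a minimal sketch of one way it could look. Everything in it is an assumption for illustration rather than part of the original code: the class name BufferedHeaderReader, the methods nextByte() and readHeaders(), and the 8192-byte chunk size are all hypothetical. It pulls data from the stream in chunks and scans the buffer for line ends, accepting both `\n` and `\r\n`.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;

/** Hypothetical helper (not from the original post): reads header lines through a chunk buffer. */
final class BufferedHeaderReader {
    private static final int LINE_FEED_BYTE = 10;       // '\n'
    private static final int CARRIAGE_RETURN_BYTE = 13; // '\r'

    private final InputStream in;
    private final byte[] buffer = new byte[8192]; // chunk size is an arbitrary assumption
    private int pos = 0;   // index of the next unread byte in buffer
    private int limit = 0; // number of valid bytes currently in buffer

    BufferedHeaderReader(InputStream in) {
        this.in = in;
    }

    /** Returns the next byte, refilling the buffer with one bulk read when it is empty; -1 at end of stream. */
    private int nextByte() throws IOException {
        if (pos == limit) {
            limit = in.read(buffer, 0, buffer.length);
            pos = 0;
            if (limit <= 0) {
                limit = 0;
                return -1;
            }
        }
        return buffer[pos++] & 0xFF;
    }

    /**
     * Reads header lines up to the blank line that terminates the header.
     * Returns null if the stream ends before the header is complete.
     */
    List<String> readHeaders() throws IOException {
        List<String> headers = new ArrayList<>();
        StringBuilder line = new StringBuilder();
        while (true) {
            int b = nextByte();
            if (b == -1) {
                return null;
            }
            if (b == LINE_FEED_BYTE) {
                if (line.length() == 0) {
                    return headers; // blank line: end of the header
                }
                headers.add(line.toString());
                line.setLength(0);
            } else if (b != CARRIAGE_RETURN_BYTE) {
                line.append((char) b); // fine under the US-ASCII assumption
            }
        }
    }

    public static void main(String[] args) throws IOException {
        byte[] raw = "GET / HTTP/1.1\r\nHost: example.org\r\n\r\nbody"
                .getBytes(StandardCharsets.US_ASCII);
        BufferedHeaderReader reader = new BufferedHeaderReader(new ByteArrayInputStream(raw));
        System.out.println(reader.readHeaders()); // prints [GET / HTTP/1.1, Host: example.org]
    }
}
```

Note that a chunked read will usually pull in some of the payload along with the header. In this sketch those bytes stay in the buffer, so any code that reads the payload afterwards would have to go through the same reader rather than the raw stream.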
getHeaderValue()
Why is this static? It should be an instance method, and there should be no need for the List input.
```
public Optional<String> getHeaderValue(final String key) {
    Objects.requireNonNull(key, "key");
    for (String header : headers) {
        // .....
    }
    return Optional.empty();
}
```
Also, there are a few bugs in here.....
- The HTTP specification says:
  > Each header field consists of a name followed by a colon (":") and the field value. Field names are case-insensitive. The field value MAY be preceded by any amount of LWS, though a single SP is preferred. Header fields can be extended over multiple lines by preceding each extra line with at least one SP or HT.

  Your code is using a case-sensitive match, and also, it is not very efficient.
- HTTP allows some headers to be present multiple times:
  > Multiple message-header fields with the same field-name MAY be present in a message if and only if the entire field-value for that header field is defined as a comma-separated list [i.e., #(values)]. It MUST be possible to combine the multiple header fields into one "field-name: field-value" pair, without changing the semantics of the message, by appending each subsequent field-value to the first, each separated by a comma.

  What this means is that you may have to scan for multiple instances of the same field if you don't know whether the field is specified as #(values).
- You expect a space between the colon in the header and the field value. This space is not required. The header Content-length:1029 is fine.... though a single space is 'preferred'.

I would recommend a helper function that does:
```
private static final String extractValue(final String key, final String field) {
    if (key.length() >= field.length()) {
        return null;
    }
    int pos = 0;
    // compare the field name case-insensitively, one character at a time
    while (pos < key.length()) {
        if (Character.toLowerCase(key.charAt(pos)) != Character.toLowerCase(field.charAt(pos))) {
            return null;
        }
        pos++;
    }
    // the name must be followed directly by the colon
    if (field.charAt(pos++) != ':') {
        return null;
    }
    // the value may be preceded by optional whitespace, so trim it
    return field.substring(pos).trim();
}
```
Then, in your code, you can loop through your headers like:
```
public Optional<String> getHeaderValue(final String key) {
    Objects.requireNonNull(key, "key");
    for (String header : headers) {
        String val = extractValue(key, header);
        if (val != null) {
            return Optional.of(val);
        }
    }
    return Optional.empty();
}
```
The code is much simpler to read, and it is only doing string manipulation on a single field. It is not breaking and reconstructing the data in spaces, and it is just plain simpler.
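The point about repeated headers is not covered by a lookup that returns on the first match, so here is a rough sketch of one way to handle it. The names HeaderLookupSketch, getHeaderValues() and getCombinedHeaderValue() are hypothetical and not part of the original code, and the matching uses String.regionMatches instead of the manual loop, purely as an alternative. Joining with a comma is only valid for fields defined as a comma-separated list.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Optional;

/** Hypothetical sketch (not from the original answer): collects repeated header fields. */
final class HeaderLookupSketch {
    private final List<String> headers;

    HeaderLookupSketch(List<String> headers) {
        this.headers = headers;
    }

    /** Returns the trimmed value if field is "key:" (name compared ignoring case), otherwise null. */
    private static String extractValue(String key, String field) {
        if (field.length() <= key.length()
                || field.charAt(key.length()) != ':'
                || !field.regionMatches(true, 0, key, 0, key.length())) {
            return null;
        }
        return field.substring(key.length() + 1).trim();
    }

    /** Collects the value of every occurrence of the header, in order of appearance. */
    List<String> getHeaderValues(String key) {
        List<String> values = new ArrayList<>();
        for (String header : headers) {
            String val = extractValue(key, header);
            if (val != null) {
                values.add(val);
            }
        }
        return values;
    }

    /** Combines repeated fields into one comma-separated value; empty if the header is absent. */
    Optional<String> getCombinedHeaderValue(String key) {
        List<String> values = getHeaderValues(key);
        return values.isEmpty() ? Optional.empty() : Optional.of(String.join(", ", values));
    }

    public static void main(String[] args) {
        HeaderLookupSketch message = new HeaderLookupSketch(Arrays.asList(
                "Accept: text/html", "Content-length:1029", "accept: application/json"));
        // prints text/html, application/json
        System.out.println(message.getCombinedHeaderValue("ACCEPT").orElse("<none>"));
        // prints 1029
        System.out.println(message.getCombinedHeaderValue("Content-Length").orElse("<none>"));
    }
}
```

This sketch does not deal with header fields that continue over multiple lines; those would need to be joined before the lookup.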
Context
StackExchange Code Review Q#56487, answer score: 7