ChunkerTransformStream, a transform stream to take arbitrary chunk sizes and make them consistent
Problem
I have some code to interface with a wire protocol that requires data to be inserted into a stream at regular byte intervals. Every 8KB (or at some other definable interval), a small chunk will be inserted. To make this easy, I decided to create a transform stream that takes a flowing stream and writes fixed-size chunks. That is, this stream can be written to in any size (2KB here, 500KB there, 5 bytes next, etc.) and it will output 8KB chunks every time.
var stream = require('stream');
function ChunkerTransformStream (chunkSize) {
    chunkSize = chunkSize || 8192;
    var buffer = new Buffer(0);
    var chunker = new stream.Transform({objectMode: true});
    chunker._transform = function (chunk, encoding, done) {
        buffer = Buffer.concat([buffer, chunk]);
        while (buffer.length >= chunkSize) {
            this.push(buffer.slice(0, chunkSize));
            buffer = buffer.slice(chunkSize);
        }
        done();
    }
    chunker._flush = function (done) {
        if (buffer.length) {
            this.push(buffer);
            done();
        }
    }
    return chunker;
}
module.exports = ChunkerTransformStream;
This transform stream will be used heavily in my code, having several megabits pushed through it per second. Is this the most efficient way to achieve what I want? I am most concerned about my buffer operations. It's my understanding that Buffer.concat() is very expensive, as it allocates an entirely new buffer and copies the first two to it.
Any feedback, on performance or otherwise, is welcomed.
Solution
First of all, your code looks great. It prompted me to study the internal workings of Node's stream API.
The piece that you're concerned with, and rightfully so, is:
buffer = Buffer.concat([buffer, chunk]);
That's the way to do it. There are no other ways to do what you want to do without using bufferjs, which likely handles things the same way, or less efficiently.
Now, there is one optimization you can make here. By passing the totalLength parameter of Buffer.concat(list, [totalLength]), you can improve efficiency by avoiding an additional loop in the function, which would otherwise have to work out the length of the new buffer.
Name         Description                                      Required?  Type
list         List of Buffer objects to concat                 Required   array
totalLength  Total length of the buffers when concatenated.   Optional   number
If totalLength is not provided, it is read from the buffers in the list. However, this adds an additional loop to the function, so it is faster to provide the length explicitly.
See more at: http://www.w3resource.com/node.js/nodejs-buffer.php#sthash.qISHPlCO.dpuf
Not sure if this will help or not, but other than that, I don't see any possible optimizations.
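As a rough sketch of the totalLength suggestion, the transform could pass the combined length itself. This assumes a Node version where Buffer.alloc is available. appendChunk is a hypothetical helper name used only for illustration, and the variable is renamed to pending for readability. Unlike the code in the question, this sketch also calls done() in _flush when nothing is left over, since the stream would otherwise never finish in that case. Whether the explicit length gives a measurable gain is something to benchmark rather than assume.

```javascript
const stream = require('stream');

// Hypothetical helper (not from the original post): joins the pending bytes
// and the incoming chunk, passing the combined length explicitly so that
// Buffer.concat does not have to add up the lengths itself.
function appendChunk(pending, chunk) {
  return Buffer.concat([pending, chunk], pending.length + chunk.length);
}

function ChunkerTransformStream(chunkSize) {
  chunkSize = chunkSize || 8192;
  let pending = Buffer.alloc(0);
  const chunker = new stream.Transform({ objectMode: true });

  chunker._transform = function (chunk, encoding, done) {
    pending = appendChunk(pending, chunk);
    // Emit as many full-size chunks as are currently available.
    while (pending.length >= chunkSize) {
      this.push(pending.slice(0, chunkSize));
      pending = pending.slice(chunkSize);
    }
    done();
  };

  chunker._flush = function (done) {
    // Emit the final, possibly shorter, remainder.
    if (pending.length) {
      this.push(pending);
    }
    done(); // always signal completion, even with no leftover bytes
  };

  return chunker;
}

module.exports = ChunkerTransformStream;
```

The only change aimed at performance is the second argument to Buffer.concat. Each call still allocates a new buffer and copies both inputs into it.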
Code Snippets
buffer = Buffer.concat([buffer, chunk]);
Context
StackExchange Code Review Q#57492, answer score: 6