Base64 encoder/decoder optimizations
Problem
I've written a Base64 encoder/decoder, which works great. Now I want to see if I can get it working better. I've optimized as much as I can think of, but it may be missing some things. The encoder can encode a 160 MB file in 30 seconds, but the decoder takes nearly 60.
So far the optimizations I've done are:
- Pre-allocate the file size using the formula on Wikipedia for encoding.
- Pre-allocate the file size using the reciprocal of the encoding formula for decoding.
- Use bitwise operations for byte and symbol manipulation.
- Use an in-memory array for encoding.

One possible optimization that I don't know how to make better is the use of a `std::map` for decoding: O(log n) for searching, and O(log n) per insertion when building the map (albeit only once).

Encoder:
```
#include "Base64Encoder.h"
#include
#include
const char Base64Encoder::EncodingTable[64] = {'A','B','C','D','E','F','G','H','I','J','K','L','M','N','O','P','Q','R','S','T','U','V','W','X','Y','Z', //0-25
'a','b','c','d','e','f','g','h','i','j','k','l','m','n','o','p','q','r','s','t','u','v','w','x','y','z', //26-51
'0','1','2','3','4','5','6','7','8','9', //52-61
'+','/'}; //62-63
const char Base64Encoder::PADDING_CHAR = '=';
Base64Encoder::Base64Encoder() { /* DO NOTHING */ }
Base64Encoder::~Base64Encoder() { /* DO NOTHING */ }
int Base64Encoder::GetFirstSymbolIndex(char* encoding_buffer) {
    return ((encoding_buffer[0] & 0xFC) >> 2);
}
int Base64Encoder::GetSecondSymbolIndex(char* encoding_buffer) {
    return (((encoding_buffer[0] & 0x03) << 4) | ((encoding_buffer[1] & 0xF0) >> 4));
}
int Base64Encoder::GetThirdSymbolIndex(char* encoding_buffer) {
    return (((encoding_buffer[1] & 0x0F) << 2) | ((encoding_buffer[2] & 0xC0) >> 6));
}
int Base64Encoder::GetFour
```
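For comparison with the `std::map` lookup mentioned above, one common alternative is a 256-entry array indexed by the byte value, which turns each symbol lookup into a single array access. The following is only a sketch under assumptions: it assumes 8-bit bytes, and `BuildDecodingTable` and `DecodeSymbol` are hypothetical names that do not appear in the original code.

```cpp
#include <array>
#include <cstdint>

// Hypothetical helper, not part of the original encoder/decoder classes.
// Builds a 256-entry table mapping each byte value to its 6-bit Base64 index,
// or -1 for bytes outside the alphabet (including the '=' padding character).
// Assumes 8-bit bytes, so every unsigned char value fits in the table.
std::array<std::int8_t, 256> BuildDecodingTable() {
    static const char alphabet[] =
        "ABCDEFGHIJKLMNOPQRSTUVWXYZ"
        "abcdefghijklmnopqrstuvwxyz"
        "0123456789+/";
    std::array<std::int8_t, 256> table;
    table.fill(-1);
    for (int i = 0; i < 64; ++i) {
        // Index through unsigned char so the subscript is never negative.
        table[static_cast<unsigned char>(alphabet[i])] = static_cast<std::int8_t>(i);
    }
    return table;
}

// Constant-time lookup: one array access instead of a tree search.
int DecodeSymbol(const std::array<std::int8_t, 256>& table, char symbol) {
    return table[static_cast<unsigned char>(symbol)];
}
```

The table would be built once and reused for every symbol; whether this accounts for the decoder's slowdown would still need to be measured.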
Solution
I realise this is an old post but just came across it and couldn't help but notice the following pattern in the original post which was not addressed in the review:
`(encoding_buffer[1] & 0xF0) >> 4`
Constructs like that are dangerous as the data type of `encoding_buffer` is `char` instead of `unsigned char` and it's up to the compiler to decide whether to use arithmetic (repeat the leftmost bit on the left) or logical (fill 0s to the left) right shift. Far safer would be to rewrite the expression as:
`(encoding_buffer[1] >> 4) & 0x0F`
As a general rule it's better to do the shift first and apply the mask later.
The original code may or may not work on the designated platform but is certainly not portable.
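One way to sidestep the signedness question altogether is to convert each byte to `unsigned char` before any bit manipulation, so that the value promoted to `int` is always non-negative. A minimal sketch, assuming 8-bit bytes; `SymbolIndices` is a hypothetical free function, not a member of the original `Base64Encoder`:

```cpp
#include <array>

// Hypothetical helper (not from the original post): splits three input bytes
// into the four 6-bit Base64 symbol indices.
// Each byte is converted to unsigned char first, so the value promoted to int
// lies in 0..255 and every shift acts on a non-negative operand, whether or
// not plain char is signed on the target platform.
std::array<int, 4> SymbolIndices(const char* encoding_buffer) {
    const unsigned char b0 = static_cast<unsigned char>(encoding_buffer[0]);
    const unsigned char b1 = static_cast<unsigned char>(encoding_buffer[1]);
    const unsigned char b2 = static_cast<unsigned char>(encoding_buffer[2]);
    return {{
        (b0 >> 2) & 0x3F,                          // top 6 bits of byte 0
        ((b0 & 0x03) << 4) | ((b1 >> 4) & 0x0F),   // low 2 of byte 0, top 4 of byte 1
        ((b1 & 0x0F) << 2) | ((b2 >> 6) & 0x03),   // low 4 of byte 1, top 2 of byte 2
        b2 & 0x3F                                  // low 6 bits of byte 2
    }};
}
```

The right shifts here follow the shift-then-mask order suggested above; with `unsigned char` operands the trailing masks are redundant but harmless.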
Code Snippets
`(encoding_buffer[1] & 0xF0) >> 4`

`(encoding_buffer[1] >> 4) & 0x0F`

Context
StackExchange Code Review Q#15780, answer score: 7