patterncppMinor
Optimize Huffman Compressor in C++
Viewed 0 times
compressorhuffmanoptimize
Problem
I'm currently writing a Huffman compressor for educational purposes.
I want it to compress/decompress files less then 5mb. The time limit is 5 seconds.
Now it compresses a 5mb file in almost 8 seconds. The decompression speed is really awful. It requires almost 6 seconds on a file of the size approx 0.1mb.
The problem is in reading/writing. I can get all code tables in less than 1 second, but then writing subroutine works about 6 seconds on 5mb file.
How can I improve speed of compression/decompression?
main.cpp
archiver.h
```
#include "huffman.h"
#include "bitstring.h"
#ifndef HUFFMAN_ARCHIVER_H
#define HUFFMAN_ARCHIVER_H
class Archiver{
private:
std::map, char> codes;
std::map > lookup;
public:
Archiver(){};
void compress(std::ifstream&, std::ofstream&, HuffmanTree*);
void decompress(std::ifstream&, std::ofstream&, HuffmanTree*);
void encodeTree(BitStringWrite&, TreeNode*);
TreeNode* decodeTree(BitStringRead&);
void buildCodes(TreeNode*, std::vector);
std::map, char>& getCodes(){
return codes;
};
std::map& createFreqTable(const std::string&, std::map&);
void buildTa
I want it to compress/decompress files less then 5mb. The time limit is 5 seconds.
Now it compresses a 5mb file in almost 8 seconds. The decompression speed is really awful. It requires almost 6 seconds on a file of the size approx 0.1mb.
The problem is in reading/writing. I can get all code tables in less than 1 second, but then writing subroutine works about 6 seconds on 5mb file.
How can I improve speed of compression/decompression?
main.cpp
#include
#include
#include "huffman.h"
#include "archiver.h"
using namespace std::chrono;
int main() {
high_resolution_clock::time_point t1 =
high_resolution_clock::now();
Archiver ar;
//encoding
/*
std::map m;
ar.createFreqTable("test.pdf", m);
HuffmanTree t(m);
std::ifstream ifs("test.pdf");
std::ofstream ofs("test.out");
ar.compress(ifs, ofs, &t);
*/
//decoding
// /*
//test.out - 123 930 bytes
HuffmanTree nt;
std::ifstream ifs2("test.out");
std::ofstream ofs2("result.pdf");
ar.decompress(ifs2, ofs2, &nt);
//*/
high_resolution_clock::time_point t2 = high_resolution_clock::now();
auto duration = duration_cast( t2 - t1 ).count();
std::cout << duration; // 5349
return 0;
}archiver.h
```
#include "huffman.h"
#include "bitstring.h"
#ifndef HUFFMAN_ARCHIVER_H
#define HUFFMAN_ARCHIVER_H
class Archiver{
private:
std::map, char> codes;
std::map > lookup;
public:
Archiver(){};
void compress(std::ifstream&, std::ofstream&, HuffmanTree*);
void decompress(std::ifstream&, std::ofstream&, HuffmanTree*);
void encodeTree(BitStringWrite&, TreeNode*);
TreeNode* decodeTree(BitStringRead&);
void buildCodes(TreeNode*, std::vector);
std::map, char>& getCodes(){
return codes;
};
std::map& createFreqTable(const std::string&, std::map&);
void buildTa
Solution
I don't have time to do a full write-up, but I can take an educated guess at why your file I/O is the bottleneck. You're literally reading and writing a single byte at a time:
and
You should buffer much more before writing to disk, and for reading you should read in more than you need, and pull from the buffer until it's empty. I recommend starting out with a 4096 byte buffer and profiling it to see if it's improved. You can try other sizes to see how it affects the performance.
_out_f.write(&_byte, sizeof(char));and
_in_f.read(&_byte, sizeof(char));You should buffer much more before writing to disk, and for reading you should read in more than you need, and pull from the buffer until it's empty. I recommend starting out with a 4096 byte buffer and profiling it to see if it's improved. You can try other sizes to see how it affects the performance.
Code Snippets
_out_f.write(&_byte, sizeof(char));_in_f.read(&_byte, sizeof(char));Context
StackExchange Code Review Q#162077, answer score: 3
Revisions (0)
No revisions yet.