patterncMinor
LZW encoder and decoder
Viewed 0 times
andlzwdecoderencoder
Problem
Recently, I reviewed this question about an LZW decoder. In order to properly review that question, I needed to write my own encoder and decoder to be able to test their program. Someone thought it would be interesting to submit these programs for review, so here they are:
encode.c
```
/*
* LZW encoder
*
* - Uses fixed length 12-bit encodings.
* - Outputs in MSB format.
* - When encoding table fills up, then table is reset back to the initial
* 256 entries.
* - Written in C89 style.
*/
#include
#include
#include
#include
static void encode(FILE in, FILE out);
int main(int argc, char *argv[])
{
FILE *in = stdin;
FILE *out = stdout;
if (argc > 1) {
in = fopen(argv[1], "rb");
if (in == NULL) {
fprintf(stderr, "Can't open %s.\n", argv[1]);
return 1;
}
}
if (argc > 2) {
out = fopen(argv[2], "wb");
if (out == NULL) {
fprintf(stderr, "Can't open %s.\n", argv[2]);
return 1;
}
}
encode(in, out);
if (argc > 1)
fclose(in);
if (argc > 2)
fclose(out);
return 0;
}
/* The LZW encoder builds a trie out of the input file, but only adds one
* new trie node per sequence that it outputs. There will be a maximum
* of DICT_MAX sequences, so the child trie pointers can be uint16_t values
* which are the node indices of the child nodes. An index of 0 is like
a NULL pointer, because no node can point to node 0. /
typedef struct DictNode {
uint16_t child[256];
} DictNode;
#define DICT_BITS 12
#define DICT_MAX (1 > 4, out);
fputc(curNode > 8), out);
fputc(curNode, out);
}
break;
}
// Follow the new byte down the trie.
uint16_t nextNode = dictionary[curNode].child[curByte];
if (nextNode != 0) {
// The sequence exists, keep searching down the trie.
curNode = nextNod
encode.c
```
/*
* LZW encoder
*
* - Uses fixed length 12-bit encodings.
* - Outputs in MSB format.
* - When encoding table fills up, then table is reset back to the initial
* 256 entries.
* - Written in C89 style.
*/
#include
#include
#include
#include
static void encode(FILE in, FILE out);
int main(int argc, char *argv[])
{
FILE *in = stdin;
FILE *out = stdout;
if (argc > 1) {
in = fopen(argv[1], "rb");
if (in == NULL) {
fprintf(stderr, "Can't open %s.\n", argv[1]);
return 1;
}
}
if (argc > 2) {
out = fopen(argv[2], "wb");
if (out == NULL) {
fprintf(stderr, "Can't open %s.\n", argv[2]);
return 1;
}
}
encode(in, out);
if (argc > 1)
fclose(in);
if (argc > 2)
fclose(out);
return 0;
}
/* The LZW encoder builds a trie out of the input file, but only adds one
* new trie node per sequence that it outputs. There will be a maximum
* of DICT_MAX sequences, so the child trie pointers can be uint16_t values
* which are the node indices of the child nodes. An index of 0 is like
a NULL pointer, because no node can point to node 0. /
typedef struct DictNode {
uint16_t child[256];
} DictNode;
#define DICT_BITS 12
#define DICT_MAX (1 > 4, out);
fputc(curNode > 8), out);
fputc(curNode, out);
}
break;
}
// Follow the new byte down the trie.
uint16_t nextNode = dictionary[curNode].child[curByte];
if (nextNode != 0) {
// The sequence exists, keep searching down the trie.
curNode = nextNod
Solution
The code is very well written in my opinion. I have only very marginal questions:
-
in the main function you have a small repetition when checking argv to determine if the FILE should be closed or not. I would prefer checking
-
i would prefer
-
i would expect less invasive error checking. Since your
-
i would always enclose with braces the blocks of an
-
in the main function you have a small repetition when checking argv to determine if the FILE should be closed or not. I would prefer checking
(in != stdin) which is telling the real reason for the check. Alternatively I would simply neglect closing the files... since you are in the main function, at the end all file descriptors are guaranteed to be closed anyway.-
i would prefer
for(;;){...} instead of do {...} while(1);. I see three reasons for this: it is shorter, you don't introduce the arbitrary constant 1, it is clear from the beginning that the loop has no termination condition.-
i would expect less invasive error checking. Since your
encode and decode functions are perfect candidates to be reused in a library, you should not write errors to stderr and should not terminate the execution with exit. A better approach would be to use return codes as signals for errors and let the main function write the messages and terminate.-
i would always enclose with braces the blocks of an
if statement even when there is a single line. This is to avoid the risk to forget the braces when a line is later added. As an alternative I would put the single line on the same line as the if statement.Context
StackExchange Code Review Q#98498, answer score: 5
Revisions (0)
No revisions yet.