HiveBrain v1.2.0
Get Started
← Back to all entries
patterncModerate

Decompression implementation

Submitted by: @import:stackexchange-codereview··
0
Viewed 0 times
implementationdecompressionstackoverflow

Problem

Here is a simple decompression implementation. I'd very much appreciate any advice or comments about the structure/logic of the program, style of the code, or reusability of the features. I'm a real novice and really eager to learn from people here, so fire away!

:

/**
 *  lzw.h
 *  Function declarations for lempel ziv-welch decompression algorithm.
 */

#ifndef LZW_H
#define LZW_H

#include 
#include 

/**
 *  Frees all blocks between 'from' and 'to' (inclusive) in the dict.
 */
void free_dict (unsigned char* dict[], int from, int to);

/**
 *  Enters the first 256 ASCII values into dict.
 */
void load_dict_defaults (unsigned char* dict[]);

/**
 *  Reads 12 bits from source and returns them as an int.
 */
int fread_twelve(FILE* source);

/**
 *  LZW decompression of source to dest.
 *  Returns 0 if successful, a positive integer otherwise.
 */
int decompress (FILE* source, FILE* dest);

#endif // LZW_H


/**
 *  decompress.c
 *  Implementation of lempel ziv-welch decompression algorithm using 12-bit fixed-width compression format.
 *
 *  Usage: decompress
 */

#include "lzw.h"
#include 
#include 

int main (int argc, char* argv[])
{   
    // check for correct number of args
    if (argc != 3)
    {
        printf("Usage: decompress source destination\n");
        return 1;
    }

    // open compressed file
    FILE* source = fopen(argv[1], "r");
    if (source == NULL)
    {
        printf("Error: cannot open %s\n", argv[1]);
        return 1;
    }

    // open destination file
    FILE* dest = fopen(argv[2], "w");
    if (dest == NULL)
    {
        printf("Error: cannot open %s\n", argv[2]);
        return 1;
    }

    // decompress
    if (decompress(source, dest) == 1)
    {
        printf("Decompression failed\n");
        fclose(source);
        fclose(dest);
        return 1;
    }
    else
    {
        fclose(source);
        fclose(dest);
        return 0;
    }
}


```
/**
* lzw.c
* Implementation of lempel ziv-welch decompressio

Solution

Easy to read

Before I start getting critical of the code, I first want to say that it was well commented and the spacing/style of it was good. You used variable names that were descriptive and made sense. So the code was easy to read and understand.

Bugs: Binary file handling

Your program doesn't handle binary files correctly:

  • The first minor problem is that you don't open the source and dest file with "rb" and "wb". This doesn't make any difference on a Unix system, but on a Windows system it might, because it may accidentally convert newline characters when you don't want that to happen.



-
The bigger problem is that you don't output to your destination file correctly. Your current output method:

fprintf(dest, "%s", currString);


will mistakenly stop at any null characters in currString. That means that you can't decompress any file with a null character in it (such as a binary file).

  • The same problem exists in your dictionary maintenance. The way you extend your dictionary string is with strncat(). But with strncat() you can't have any null characters in your dictionary. You should be using memcpy() instead. Similarly, you need to avoid using strlen() and instead keep track of the dictionary entry sizes yourself (perhaps by using a separate array called dictionaryLengths).



Bugs: Incorrect decoding

During encoding, the LZW algorithm outputs a 12 bit code for the longest dictionary match it can find, and then it creates a new dictionary entry for the previous match plus the next character. On the next iteration, it can immediately use the new entry it just added. Your decoder has a problem using the just created entry. For example, encoding a file containing aaa:

  • Add all single characters to dictionary (slots 0-255).



  • Encode a as 97 (12 bits).



  • Add aa into the dictionary as entry #256.



  • Encode aa as 256 (12 bits).



The output file would consist of the two 12-bit entries 97 and 256. Your decoder:

  • Decodes 97 as 'a'. This is outside the main while loop.



  • Starts the main loop:



while ( (currCode = fread_twelve(source)) && currCode )
{       
    if (dictionary[currCode] == NULL)
    {
        return 1;
    }


Here, currCode will be 256, but dictionary[256] is NULL because you haven't created it yet. The case of currCode == dictCode needs to be handled specially, where you add the dictionary entry first (using the code you already have under if (currCode > dictPos)), and then you process the entry.

NULL check problems

Your current check to see if currCode is valid uses a NULL check:

if (dictionary[currCode] == NULL)


This doesn't work after you reset the dictionary because in free_dict(), you free the dictionary entries but you don't set the entries back to NULL. So on the second time through the dictionary, you might end up using an entry that has already been freed.

A better check than a NULL check would be to just check if currCode is in the valid range of [0..dictCode].

Extraneous line causes memory leak

In your main loop, you have these two lines where you call calloc() twice in a row. It was probably a copy and paste error:

currString = calloc(size + 2, sizeof(char)); //+2 for upcoming char and null
    currString = calloc((strlen((char*)dictionary[currCode]) + 2), sizeof(char));


Avoid backward fseek

There's no need to seek backwards in your file. What you could do instead is remember the previous byte using a static variable. Then you use that byte in the "right" case. Something like this:

static bool left = true;
static unsigned char prevByte = 0;

if (left)
{
    // ...
    prevByte = buf[1];
}
else
{
    if (fread(buf, 1, 1, source) != 1) {return 0;}
    value = ((prevByte & 0x0F) << 8) | buf[0];
    left = true;
}

Code Snippets

fprintf(dest, "%s", currString);
while ( (currCode = fread_twelve(source)) && currCode )
{       
    if (dictionary[currCode] == NULL)
    {
        return 1;
    }
if (dictionary[currCode] == NULL)
currString = calloc(size + 2, sizeof(char)); //+2 for upcoming char and null
    currString = calloc((strlen((char*)dictionary[currCode]) + 2), sizeof(char));
static bool left = true;
static unsigned char prevByte = 0;

if (left)
{
    // ...
    prevByte = buf[1];
}
else
{
    if (fread(buf, 1, 1, source) != 1) {return 0;}
    value = ((prevByte & 0x0F) << 8) | buf[0];
    left = true;
}

Context

StackExchange Code Review Q#98270, answer score: 12

Revisions (0)

No revisions yet.