HiveBrain v1.2.0
Get Started
← Back to all entries
patterncMinor

Simple compression algorithm

Submitted by: @import:stackexchange-codereview··
0
Viewed 0 times
compressionsimplealgorithm

Problem

An implementation of a simple compression algorithm that's featured in a programming practice book.

My goals:

  • Robust: All error conditions must be handled properly. The specification indicated in the self-documenting comment must be met exactly.



  • Secure: No opportunities for buffer overflow in the program.



  • Efficient: The implementation must be at most O(n) in time, be as efficient as possible in time, and result in using as little memory as possible.



  • Compatible: The implementation should be able to compile and run on any desktop, workstation, or server with a C99 compiler.



  • Well documented: The implementation should be clearly documented and re-usable.



Yes, this algorithm would probably never be used production. It was a test of my ability to write production quality C code. Is this efficient and secure C? Please provide any and all criticisms, including the way in which I documented the code.

a3compress.h:

#ifndef _A3COMPRESS_H_
#define _A3COMPRESS_H_
/* Compress an input string in the following manner:

   For each contiguous range of repeated chars, print the char followed by a
   decimal number indicating the number of repetitions.

   NOTE: 
       This function calls malloc, it returns NULL if malloc failed.
       In any other case the return value must be freed.

   Example:
       input: aaabbbbbccccaaa
       output: a3b5c4a3

   If the string contains a digit (0 - 9), the string will not be compressed,
   because the resulting compressed string would be ambiguous. A copy of the
   string is returned that must be freed.

   If the compressed string would not be of shorter length than the original
   string, a copy of the string is returned that must be freed. */
char * const a3compress (const char * const str);
#endif /* _A3COMPRESS_H_ */


a3compress.c:

```
#include
#include
#include
#include "a3compress.h"

inline unsigned int count_digits (unsigned int n) {
if (n == 0) {return 1;}

unsigned int test = 1;
unsigned

Solution

Comments

The count_digits function could need an explanation (what is its purpose?).

A lot of your comments describe what the code does. This is redundant, I can read the code to see what your code is doing. Instead, a comment should describe why something is done or why something is done in this specific way.

For example, the comment "If this is the last char in the input string, increase the count" is not very helpful. I can see that the code is doing that, but the question is: why is count increased only for the last character?

malloc

Don't cast the malloc return type..

Also, why malloc((idx + 1) + 1) ? Why not +2, and what's the reason for adding 2 in the first place? It needs a comment.

Logic

You're not handling an empty string correctly. You'll allocate 3 bytes instead of 1 byte and won't assign the 0 termination, which might lead to crashes or at least funny garbage when printing.

Return value

In case something is wrong (like invalid input string 123) you return a copy of the original string. But this is likely to be invalid or nonsense input for your decompression. So you can't distinguish between a valid and invalid return value. I would expect to have NULL returned on error and errno set to a reasonable value like EINVAL.

Standard conformance

strlcpy is a BSD function and probably not available everywhere. Since you've explicitly listed "compability" and only mentioned C99, you might want to use strncpy instead. But be careful to correctly append the 0 termination (see also the problem with the empty string).

Context

StackExchange Code Review Q#61928, answer score: 6

Revisions (0)

No revisions yet.