patterncMinor
Analyzer for compression algorithms (or tools)
Viewed 0 times
compressionalgorithmsfortoolsanalyzer
Problem
This file contains the main functions of my tool, rac. In particular, I care a lot about the Shannon function.
Would you please give some hints to make the file cleaner? I have a perversion for concise and good indented code.
Would you please give some hints to make the file cleaner? I have a perversion for concise and good indented code.
#include
#include
#include "headers/utils.h"
double ccount(char * foo)
{
FILE * file = fopen(foo, "r");
int a = 0, b;
while ((b = getc(file)) != EOF)
a++;
fclose(file);
return a;
}
int cmp(double foo, double bar)
{
if (foo >= bar)
return 1;
else
return 0;
}
double ratio(double foo, double bar)
{
return (100*(foo - bar)/ foo);
}
void stat(char * file, double vector[256])
{
int c;
FILE * ftopen = fopen(file, "r");
while ((c = getc(ftopen)) != EOF)
{
if (c >= 0 && c <= 256)
vector[c]++;
}
fclose(ftopen);
}
void freq(char * foo, double vector[256])
{
int i;
for (i = 0; i < 256; i++)
{
if (vector[i] == 0)
continue;
else
{
vector[i] = (vector[i] / ccount(foo));
}
}
}
double shannon(double vector[256])
{
int i;
double shan = 0.0;
for (i = 0; i < 256; i++)
{
if (vector[i] == 0.0)
continue;
else
{
shan = shan + (log2(pow(vector[i], vector[i])));
}
}
return (shan * -1.0); //entropy of the file
}Solution
MAGIC NUMBERS
The number 256 is used in multiple places as an array size. Rather than using the raw number
256, it would be better to use a symbolic constant. Raw numbers in code are sometimes referred to as magic numbers, because it isn't clear what they mean.
There are at least two reasons for this,
An Example in C:
An Example in C++:
Use Meaningful Variable and Function Names
The code contains
The names should indicate what the variables are, so that if you or someone else needs
to change the code 2 years in the future it is an easy task rather than having to go
through all the code to see what needs to be changed.
Another exmaple of this is the function
A third example is the array
this array is used for.
Inconsistent Coding Style
In a professional setting where multiple people may have to maintain the code it is best
to be as consistent as possible.
The else clauses almost always have
I do this even for single statements because if the code needs to be updated by adding a statement
to a block it is easier for anyone to see where the line of code should be inserted. This was
also required as part of a coding standard at a company where I worked.
Empty Clauses in IF/ELSE
In the function shannon() the code contains:
This would be clearer if it was written
An else clause can be added at a later time if it is needed. The if portion of the current
implementation is meaningless and would be optimized out by the compiler.
The number 256 is used in multiple places as an array size. Rather than using the raw number
256, it would be better to use a symbolic constant. Raw numbers in code are sometimes referred to as magic numbers, because it isn't clear what they mean.
There are at least two reasons for this,
- It identifies what the number means.
- If at some point in the future the number has to be changed for all uses, the code only needs to be changed in one place.
An Example in C:
#define ARRAY_SIZE 256
void stat(char * file, double vector[ARRAY_SIZE])
{
int c;
FILE * ftopen = fopen(file, "r");
while ((c = getc(ftopen)) != EOF)
{
if (c >= 0 && c <= ARRAY_SIZE)
vector[c]++;
}
fclose(ftopen);
}An Example in C++:
constexpr int ARRAY_SIZE = 256;Use Meaningful Variable and Function Names
The code contains
foo and bar in a number of places. These are not meaningful names.The names should indicate what the variables are, so that if you or someone else needs
to change the code 2 years in the future it is an easy task rather than having to go
through all the code to see what needs to be changed.
Another exmaple of this is the function
ccount(), which could be renamed file_size().A third example is the array
vector. Other than array if doubles it is not clear whatthis array is used for.
Inconsistent Coding Style
In a professional setting where multiple people may have to maintain the code it is best
to be as consistent as possible.
The else clauses almost always have
{ block} but the if statements do not.
This is true whether or not the block contains multiple statement or not.
You should choose one way or the other.
In the for loops the { is immediately under the for, but in the else clauses it
is indented. Choose one or the other. The most common usage is how you do the for loops.
This is partially a matter of choice, I use { block }` on all control statements:if ()
{
}
else
{
}
for ()
{
}
while ()
{
}
do
{
} while ();I do this even for single statements because if the code needs to be updated by adding a statement
to a block it is easier for anyone to see where the line of code should be inserted. This was
also required as part of a coding standard at a company where I worked.
Empty Clauses in IF/ELSE
In the function shannon() the code contains:
if (vector[i] == 0.0)
continue;
else
{
shan = shan + (log2(pow(vector[i], vector[i])));
}This would be clearer if it was written
if (vector[i] != 0.0)
{
shan = shan + (log2(pow(vector[i], vector[i])));
}An else clause can be added at a later time if it is needed. The if portion of the current
implementation is meaningless and would be optimized out by the compiler.
Code Snippets
#define ARRAY_SIZE 256
void stat(char * file, double vector[ARRAY_SIZE])
{
int c;
FILE * ftopen = fopen(file, "r");
while ((c = getc(ftopen)) != EOF)
{
if (c >= 0 && c <= ARRAY_SIZE)
vector[c]++;
}
fclose(ftopen);
}constexpr int ARRAY_SIZE = 256;if ()
{
}
else
{
}
for ()
{
}
while ()
{
}
do
{
} while ();if (vector[i] == 0.0)
continue;
else
{
shan = shan + (log2(pow(vector[i], vector[i])));
}if (vector[i] != 0.0)
{
shan = shan + (log2(pow(vector[i], vector[i])));
}Context
StackExchange Code Review Q#140121, answer score: 2
Revisions (0)
No revisions yet.