patterncMinor
Averaging numbers from a file, obeying Single Responsibility Principle
Viewed 0 times
principlefileaveragingnumberssingleresponsibilityobeyingfrom
Problem
Task:
Given
Calculate the average value.
For example: 1,2,3 should be equal 2 (float) as result.
Here is my solution in C:
This is my very first time coding in C. I used to code in Ruby, Java and PHP, so maybe this code would be a little bit "high leveled".
Given
comma_separated.txt file with string:1,2,3,4,5,6,7,8,9,10,11,12...,nCalculate the average value.
For example: 1,2,3 should be equal 2 (float) as result.
Here is my solution in C:
#include
#include
#include
int element_counter;
float* all_elements;
void mark_element_as_read() {
element_counter += 1;
}
int get_number_of_read_elements() {
return element_counter;
}
float* expand_array_stack_for(float* elements) {
return realloc(elements, get_number_of_read_elements() * sizeof(elements));
}
void append_float(float element) {
all_elements = expand_array_stack_for(all_elements);
all_elements[get_number_of_read_elements()] = element;
}
float* get_collection() {
return all_elements;
}
float* read_from_file(char* filename) {
FILE *file_opened;
file_opened = fopen(filename, "r");
char elements[255];
char * element;
while (fgets(elements, sizeof(elements), file_opened) != NULL) {
element = strtok(elements, ",");
while (element != NULL) {
append_float(strtof(element, NULL));
mark_element_as_read();
element = strtok(NULL, ",");
}
}
free(element);
fclose(file_opened);
return get_collection();
}
int main() {
float sum = 0.0, average;
float *elements = read_from_file("comma_separated.txt");
int i = 0;
for (i = 0; i < get_number_of_read_elements(); i++) {
sum += elements[i];
}
average = sum / get_number_of_read_elements();
printf("average = %f\n", average);
return 0;
}This is my very first time coding in C. I used to code in Ruby, Java and PHP, so maybe this code would be a little bit "high leveled".
Solution
Things you can improve
Variables/Initialization
-
Floating-point math is challenging in surprising places. It's easy to write down a reasonable algorithm that introduces 0.01% error on every step, which over 1,000 iteration turns the results into complete slop. You can easily find volumes filled with advice about how to avoid such surprises. Much of it is valid today, but much of it is easy to handle quickly: use
The problem is how many bits each type uses to store significant digits.
-
Don't use global variables.
The problem with global variables is that since every function has access to these, it becomes increasingly hard to figure out which functions actually read and write these variables.
To understand how the application works, you pretty much have to take into account every function which modifies the global state. That can be done, but as the application grows it will get harder to the point of being virtually impossible (or at least a complete waste of time).
If you don't rely on global variables, you can pass state around between different functions as needed. That way you stand a much better chance of understanding what each function does, as you don't need to take the global state into account.
So instead of using global variables, initialize the variables in
Standards
-
Instead, use
-
The first time you call the function, send in the string to be parsed as the first argument.
-
On subsequent calls, send in
-
The last argument is the scratch string. You don't have to initialize it on first use; on subsequent uses it will hold the string as it is parsed so far.
To demonstrate its use, I've written a simple line counter (of only non-blank lines) using the POSIX standard one. I'll leave the choice of what version to use and implementation into your program up to you.
If you implement this, you have your question answered.
Also, in my use of
of the string after the delimiter? Is that hanging out in memory
somewhere?
It's not hanging out in memory, it's in the scratch string you provided.
-
You don't have to return
C99 & C11 §5.1.2.2(3)
...reaching the
value of
Syntax/Styling
-
Put the variable declarations to separate lines.
From Code Complete, 2d Edition, p. 759:
With statements on their own lines, the code reads from top to bottom,
instead of top to bottom and left to right. When you’re looking for a
specific line of code, your eye should be able to follow the left
margin of the code. It shouldn’t have to dip into each and every line
just because a single line might contain two statements.
-
Use
-
Declare your parameters as
-
You can simplify your
-
Initialize
Variables/Initialization
-
Floating-point math is challenging in surprising places. It's easy to write down a reasonable algorithm that introduces 0.01% error on every step, which over 1,000 iteration turns the results into complete slop. You can easily find volumes filled with advice about how to avoid such surprises. Much of it is valid today, but much of it is easy to handle quickly: use
double instead of float, and for intermediate values in calculations, it doesn't hurt to use long double.The problem is how many bits each type uses to store significant digits.
float uses 32 bits, and double uses 64 bits. 64 bits is enough to reliably store 15 significant digit, which is plenty in order to solve all of our floating-point problems for now. All we had to do was throw twice as much space at the problem, and suddenly all sorts of considerations are basically irrelevant. Even if there is a perceptible speed difference between a program written with all doubles and one written with all floats, it's worth extra microseconds to be able to ignore so many caveats.-
Don't use global variables.
int element_counter;
float* all_elements;The problem with global variables is that since every function has access to these, it becomes increasingly hard to figure out which functions actually read and write these variables.
To understand how the application works, you pretty much have to take into account every function which modifies the global state. That can be done, but as the application grows it will get harder to the point of being virtually impossible (or at least a complete waste of time).
If you don't rely on global variables, you can pass state around between different functions as needed. That way you stand a much better chance of understanding what each function does, as you don't need to take the global state into account.
So instead of using global variables, initialize the variables in
main(), and pass them as arguments to functions if necessary.Standards
-
strtok is limited to tokenizing only one string (with one set of delimiters) at a time, and it can't be used while threading. Therefore, strtok is considered deprecated. Instead, use
strtok_r or strtok_s which are threading-friendly versions of strtok. The POSIX standard provided strtok_r, and the C11 standard provides strtok_s. The use of either is a little awkward, because the first call is different from the subsequent calls.-
The first time you call the function, send in the string to be parsed as the first argument.
-
On subsequent calls, send in
NULL as the first argument.-
The last argument is the scratch string. You don't have to initialize it on first use; on subsequent uses it will hold the string as it is parsed so far.
To demonstrate its use, I've written a simple line counter (of only non-blank lines) using the POSIX standard one. I'll leave the choice of what version to use and implementation into your program up to you.
#include // strtok_r
int countLines(char* instring)
{
int counter = 0;
char *scratch, *txt;
char *delimiter = "\n";
for (; (txt = strtok_r((!counter ? instring : NULL), delimiter, &scratch)); counter++);
return counter;
}If you implement this, you have your question answered.
Also, in my use of
strtok(), what should I be doing with the balanceof the string after the delimiter? Is that hanging out in memory
somewhere?
It's not hanging out in memory, it's in the scratch string you provided.
-
You don't have to return
0 at the end of main(), just like you wouldn't bother putting return; at the end of a void-returning function. The C standard knows how frequently this is used, and lets you not bother.C99 & C11 §5.1.2.2(3)
...reaching the
} that terminates the main() function returns avalue of
0.Syntax/Styling
-
Put the variable declarations to separate lines.
float sum = 0.0, average;From Code Complete, 2d Edition, p. 759:
With statements on their own lines, the code reads from top to bottom,
instead of top to bottom and left to right. When you’re looking for a
specific line of code, your eye should be able to follow the left
margin of the code. It shouldn’t have to dip into each and every line
just because a single line might contain two statements.
-
Use
strtod() instead of strtof(). (This goes with my point about double vs. float).-
Declare your parameters as
void if you don't take in any arguments in your methods.int main(void)-
You can simplify your
NULL checks.while(element) // same as while (element != NULL)-
Initialize
i inside of your for loops.(C99)for (int i = 0; i < get_number_of_read_elements(); i++)Code Snippets
int element_counter;
float* all_elements;#include <string.h> // strtok_r
int countLines(char* instring)
{
int counter = 0;
char *scratch, *txt;
char *delimiter = "\n";
for (; (txt = strtok_r((!counter ? instring : NULL), delimiter, &scratch)); counter++);
return counter;
}float sum = 0.0, average;int main(void)while(element) // same as while (element != NULL)Context
StackExchange Code Review Q#45779, answer score: 9
Revisions (0)
No revisions yet.