HiveBrain v1.2.0
Get Started
← Back to all entries
patternMinor

Determine column averages

Submitted by: @import:stackexchange-codereview··
0
Viewed 0 times
averagesdeterminecolumn

Problem

I'm trying to write terse Perl code to calculate the average of each column in a file. The file can have >= 1 columns.

use strict;
use warnings;
use List::Util qw/sum/;

my @aoc = ();
while () {
  my $c = 0;
  push @{ $aoc[$c++] }, $_ foreach (split);
}
print join (" ", map { sum(@$_)/@$_ } @aoc), $/;

#output: 2.66666666666667 26.6666666666667 266.666666666667

__DATA__
1 10 100
3 30 300
4 40 400


When the code is run, this is the output:

2.66666666666667 26.6666666666667 266.666666666667


Can this code be any terser? I tried to remove while () by replacing split with map { $c = 0; split } but this does not reset $c to 0 for each loop.

Any ideas on how to reduce this code further?

Solution

It is great that you:

  • Used strict and warnings



  • Leveraged other people's code by using the List::Util module



However, there are some aspects of the code I find hard to understand.
I realize you are striving for as little code as possible, but sometimes
this sacrifices readability.

If you were to keep the code as-is, I recommend adding comments
to summarize what it is doing. For example:

# Calculate the average of each column in a file.
# The file can have one or more columns.
# Each column is separated by a single space.


In this expression, sum(@$_)/@$_, I like the sum call, but the rest is
meant more for code golf.

The variable names are not meaningful. If $c is for columns, $col
would be better. @aoc might be better as @matrix.

Two-space indentation can be hard to read. I recommend 4 spaces per indent level.

There is no need to initialize arrays to an empty list:

my @aoc = ();


This is simpler:

my @aoc;


foreach is identical to for. I recommend for: less to type, less to read.

The usage of $/ for the print statement is strange. The special variable
is the "input record separator", but you are using it for output.
Again, this code is great for golf,
but I think the code would be more straightforward just using "\n".

Unless the input is known to be good, you could add checking to
make sure the input lines only have numbers and that all lines have
the same number of columns.

If you plan to use the @aoc array-of-arrays elsewhere in the code, I agree
with the other answer that you should consider a CPAN module.

However, if you don't need the variable elsewhere, consider a different
approach. You could create a simple array of sums of the columns.

I tossed in a call to sprintf just in case you don't really need
all those digits of precision in your output.

Here is code for this new approach, with some the suggestions above:

use strict;
use warnings;

my @sums;
my $values_in_col;
while () {
    my $col = 0;
    for (split) {
        $sums[$col] += $_;
        $col++;
    }
    $values_in_col++;
}
print join(' ', map { sprintf '%.2f', $_/$values_in_col } @sums), "\n";

__DATA__
1 10 100
3 30 300
4 40 400


Output:

2.67 26.67 266.67

Code Snippets

# Calculate the average of each column in a file.
# The file can have one or more columns.
# Each column is separated by a single space.
my @aoc = ();
use strict;
use warnings;

my @sums;
my $values_in_col;
while (<DATA>) {
    my $col = 0;
    for (split) {
        $sums[$col] += $_;
        $col++;
    }
    $values_in_col++;
}
print join(' ', map { sprintf '%.2f', $_/$values_in_col } @sums), "\n";

__DATA__
1 10 100
3 30 300
4 40 400
2.67 26.67 266.67

Context

StackExchange Code Review Q#107883, answer score: 2

Revisions (0)

No revisions yet.