Calculating percentages in an arbitrary number of columns

Submitted by: @import:stackexchange-codereview

Problem

Given this sample input:

ID Sample1 Sample2 Sample3
One 10 0 5
Two 3 6 8
Three 3 4 7


I needed to produce this output using GNU AWK (awk in Linux, gawk in BSD):

ID Sample1 Sample2 Sample3
One 62.50 0.00 25.00
Two 18.75 60.00 40.00
Three 18.75 40.00 35.00


This is how I solved it:

function percent(value, total) {
    return sprintf("%.2f", 100 * value / total)
}
{
    label[NR] = $1
    for (i = 2; i

What do you think about this implementation? How would you improve it?

Solution

It's quite good. I can only slightly simplify it:

NR==1 {print; next} 
{
    label[NR] = $1
    for (i=2; i<=NF; i++) { sum[i] += $i; val[NR,i] = $i }
}
END {
    OFS = "\t"
    for (nr=2; nr<=NR; nr++) {
        $1 = label[nr]
        for (i=2; i<=NF; i++) $i = sprintf("%.2f", 100*val[nr,i]/sum[i])
        print
    }
}
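
To check the script against the sample input from the question, it can be run from a shell roughly as follows. This is a minimal sketch: the temporary file created with `mktemp` and the variable names `input` and `result` are illustrative, and it assumes `awk` on the PATH behaves like GNU AWK in keeping `NF` and the fields of the last record available in `END`.

```shell
# Write the sample input from the question to a temporary file (illustrative path).
input=$(mktemp)
cat > "$input" <<'EOF'
ID Sample1 Sample2 Sample3
One 10 0 5
Two 3 6 8
Three 3 4 7
EOF

# Run the simplified script: the header is printed unchanged,
# the data rows are rebuilt with tab separators in END.
result=$(awk '
NR==1 {print; next}
{
    label[NR] = $1
    for (i=2; i<=NF; i++) { sum[i] += $i; val[NR,i] = $i }
}
END {
    OFS = "\t"
    for (nr=2; nr<=NR; nr++) {
        $1 = label[nr]
        for (i=2; i<=NF; i++) $i = sprintf("%.2f", 100*val[nr,i]/sum[i])
        print
    }
}' "$input")

printf '%s\n' "$result"
rm -f "$input"
```

The percentages match the expected output in the question. The header row keeps its original single spaces while the data rows are tab-separated, because `OFS` only takes effect when a record is rebuilt. A column whose values sum to zero would trigger a division by zero, which this sketch does not guard against.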


  • There's no need to store the header line: just print it and move on.
  • I find sum[i] += col[i][NR] = $i needlessly complicated; two separate statements are easier to read.
  • Set OFS and assign to the fields, so awk rebuilds the output line itself.
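
The last point relies on a standard awk behaviour: assigning to any field makes awk rebuild the record, joining the fields with `OFS`. A small sketch of that effect on one row of the sample data (the variable names `rebuilt` and `unchanged` are only for illustration):

```shell
# Assigning to a field makes awk rebuild $0, joining all fields with OFS.
rebuilt=$(printf 'Two 3 6 8\n' | awk '{ OFS = "\t"; $2 = sprintf("%.2f", $2); print }')
printf '%s\n' "$rebuilt"

# Without a field assignment, print emits the record exactly as it was read,
# so setting OFS alone changes nothing.
unchanged=$(printf 'Two 3 6 8\n' | awk '{ OFS = "\t"; print }')
printf '%s\n' "$unchanged"
```

The first command prints the fields separated by tabs, with the second field formatted as 3.00. The second prints the line with its original spaces.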

Context

StackExchange Code Review Q#74630, answer score: 8
