ECG Bash selection tool

Submitted by: @import:stackexchange-codereview·Mar 10, 2026·

selection, ecg, bash, tool

Problem

I wrote the following bash script to extract a group of ECG signals from ECG files. I would like to know whether it has any mistakes or weaknesses. I have had difficulty turning it into a function that takes bash parameters because of the AWK part.

I think it would be better not to combine so many separate tools, given such problems, but I am not sure how to replace, for instance, the AWK part with something that works more reliably together with bash.

Each ECG file contains two columns where the first column is the original signal and the second column is the improved ECG signal.

The database is AAMI MIT-BIH Arrhythmia. The script must be stable and valid, so I have not used wildcard characters. Users specify the IDs they want and which ECG signal they want (1 or 2).

Currently, the ECG signal type has to be changed manually in the script because I cannot integrate $ecg into the awk one-liner.

Logic of the script:

Get a list of the wanted ECG columns into ECGs; the ID 118 is repeated because repetition should be allowed and duplicate IDs should not be removed

Create and/or empty temporary files; keep the individual ECG of each iteration in /tmp/test.csv and the combined result in result.csv

Loop through ECGs to have them in result.csv

Add a header row of IDs to the beginning of the file

getEcgs.bash

```
#!/bin/bash
ids=(101 118 201 103 118)
dir="/home/masi/Documents/CSV/"
#Ecgs=()
index=0
ecg=2 # ecg=1 ecg; ecg=2 improved ecg # change AWK line $2/$1 to corresponding number manually for change; buggy AWK with bash params

#printf '%s\n' "${#ids[@]}"
#printf '%s\n' "${ids[0]}"
#printf '%s\n' "${ids[1]}"

for id in "${ids[@]}";
do
input=$(echo "${dir}P${id}C1.csv")
# take second column of the file here
file=$(awk -F "\",\"" '{print $2}' $input) # http://stackoverflow.com/a/19602188/54964 # http://stackoverflow.com/a/19075707/54964

# printf '%s\n' "${id}"
# printf '%s\n' "$index"

Ecgs[${index}]="${file}"

index=
```

Solution

Shell scripts that do complex line-oriented text processing using Awk and other tools are usually better done using Awk alone. Not only would the script be more efficient, it would be more coherent, and have fewer quoting issues. Consider the following script, which I'll call ecg:

```
#!/usr/bin/gawk -f

# https://www.gnu.org/software/gawk/manual/html_node/Join-Function.html
@include "join.awk"

BEGIN {
    FS = "\"*,\"*";
    last_row = 0;
}

BEGINFILE {
    rows[0][ARGIND] = gensub(".*P([0-9]*)C.*", "\\1", "g", FILENAME);
}

{
    rows[FNR][ARGIND] = $col;
    if (FNR > last_row) { last_row = FNR; }
}

END {
    for (r = 0; r <= last_row; r++) {
        print join(rows[r], 1, ARGC - 1, ",");
    }
}
```

Observe what happens when you run it:

```
$ ./ecg -v col=2 P{101,118,201,118}C1.csv
101,118,201,118
1.61,-1.84,-0.245,-1.84
0.67,-0.71,-0.22,-0.71
0.695,-0.49,-0.2,-0.49
0.38,-0.26,-0.2,-0.26
0.43,0.07,-0.195,0.07
```

Note that $col extracts the column specified by the parameter col.
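
The same -v mechanism is what lets a shell variable such as $ecg from the question reach Awk without any quoting tricks. The following is only a minimal sketch of that idea, not part of the script above: the function names extract_column and ecg_files are hypothetical, and the file naming P<id>C1.csv is taken from the question.

```bash
#!/bin/bash

# Hypothetical helper: print one column of one ECG file.
#   $1 = column number (1 = original signal, 2 = improved signal)
#   $2 = path to the CSV file
extract_column() {
    local col=$1 file=$2
    # -v copies the shell value into the awk variable "col";
    # inside awk, $col is the field whose number is stored in col.
    awk -F '"*,"*' -v col="$col" '{ print $col }' "$file"
}

# Hypothetical helper: turn a directory prefix and a list of IDs into
# file paths of the form <dir>P<id>C1.csv, one per line.
# Repeated IDs are kept, in the order given.
ecg_files() {
    local dir=$1 id
    shift
    for id in "$@"; do
        printf '%s\n' "${dir}P${id}C1.csv"
    done
}

# Example use with the variables from the question:
#   ecg=2
#   extract_column "$ecg" "${dir}P101C1.csv"
#   mapfile -t files < <(ecg_files "$dir" "${ids[@]}")
#   ./ecg -v col="$ecg" "${files[@]}"
```

Because the column number travels through -v, the Awk program itself can stay inside single quotes and never needs to be edited by hand.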

Since you are using GNU/Linux, I have taken advantage of some features specific to GNU Awk in the script above:

Multidimensional arrays (arrays of arrays). Traditional Awk has only one-dimensional arrays, which can be indexed with tuples such as a[i, j] to simulate extra dimensions.

The BEGINFILE special pattern and the ARGIND special variable.

The gensub() function to extract the ID from the filename.

The join() function.
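
If GNU Awk is not available, the features listed above can be approximated in POSIX Awk. The sketch below is an assumption-laden alternative rather than a drop-in equivalent: combine_ecgs is a hypothetical name, FNR == 1 stands in for BEGINFILE (so an empty input file is skipped instead of getting a header entry), a counter replaces ARGIND, two sub() calls replace gensub(), and a loop replaces join().

```bash
#!/bin/bash

# Hypothetical function: print column COL of every given file side by side,
# preceded by a header row of IDs parsed from names like P101C1.csv.
#   $1 = column number, remaining arguments = input files
combine_ecgs() {
    local col=$1
    shift
    awk -F '"*,"*' -v col="$col" '
        FNR == 1 {                     # first line of a file: start a new output column
            nfiles++
            id = FILENAME
            sub(/.*P/, "", id)         # drop everything up to the last "P"
            sub(/C.*/, "", id)         # drop everything from the first "C" onwards
            rows[0, nfiles] = id       # row 0 holds the header
        }
        {
            # The subscript "FNR, nfiles" is joined with SUBSEP into one
            # string key, which simulates a two-dimensional array.
            rows[FNR, nfiles] = $col
            if (FNR > last_row) last_row = FNR
        }
        END {
            for (r = 0; r <= last_row; r++) {
                line = rows[r, 1]
                for (c = 2; c <= nfiles; c++) line = line "," rows[r, c]
                print line
            }
        }
    ' "$@"
}

# Example use:
#   combine_ecgs 2 P101C1.csv P118C1.csv P201C1.csv P118C1.csv
```

Like the GNU Awk script, this keeps every value in memory until the END block, and a file that is listed twice appears as two separate columns.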

Context

StackExchange Code Review Q#146360, answer score: 3
