Filter shell script to find lines that contain all specified patterns

Submitted by: @import:stackexchange-codereview

Problem

I wrote a script that does the following:

  • Run another script on the system
  • Filter the output to find lines that contain ALL of the given patterns
  • Pipe the output to a second script on the system

I feel like the way I did it is a dirty hack: I join the arguments with a control character, then replace each control character with / && / to build the awk condition. Is there a better way? Also, this script is probably vulnerable to injection. I don't really need to worry about hostile attackers manipulating the input to awk, but I do need to worry about typos breaking the regular expression.

#!/bin/bash

function join {
local IFS=$'\x02'
echo "$*" | sed 's/'$'\x02''/\/ \&\& \//g'
}

/path/to/first/script |
awk "/$(join $@)/" |
/path/to/second/script

Solution

join is the name of a common Unix command, so using that name for your function could create confusion.

I don't recommend munging the patterns to dynamically write an awk program. As you pointed out, your technique is vulnerable to injection, resulting in execution of arbitrary awk code. Instead, I'd write a fixed awk script that does the job.

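#!/usr/bin/awk -f
# Note: "#!/usr/bin/env awk -f" fails on Linux, where the kernel passes
# "awk -f" to env as a single argument. Adjust the path if awk lives elsewhere.

BEGIN {
    # Treat command-line arguments as patterns rather than input filenames.
    for (i = 1; i < ARGC; i++) {
        patterns[i - 1] = ARGV[i];
    }
    # Truncate the argument list so that awk always reads from standard input.
    ARGC = 1;
}

# Print a line only if it matches every pattern.
{
    for (i in patterns) {
        if (!match($0, patterns[i])) {
            next;
        }
    }
    print;
}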
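
If you would rather keep everything in a single bash file, the same fixed awk program can be embedded in a shell function. This is only a sketch: filter_all is a hypothetical name, not something from the original script, and the printf line stands in for the real first script.

```shell
#!/bin/bash

# filter_all (hypothetical name): print only the stdin lines that match
# every regular expression given as an argument.
filter_all() {
    awk '
    BEGIN {
        # Copy the arguments into an array of patterns.
        for (i = 1; i < ARGC; i++) {
            patterns[i - 1] = ARGV[i]
        }
        # Drop the arguments so awk reads standard input, not files.
        ARGC = 1
    }
    {
        for (i in patterns) {
            if (!match($0, patterns[i])) {
                next
            }
        }
        print
    }' "$@"
}

# Demo input standing in for the output of the first script.
printf 'disk error on sda\ndisk ok\nnetwork error\n' | filter_all error disk
# prints: disk error on sda
```

In the original pipeline, this would replace the awk "/$(join $@)/" stage with filter_all "$@". Because the patterns reach awk as data in ARGV rather than as program text, a stray / or quote in a pattern cannot change the awk program itself.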


To search for literal strings instead of regular expressions, replace match($0, patterns[i]) with index($0, patterns[i]). Both return 0 when nothing is found, so the rest of the script stays the same.
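
A minimal sketch of the literal-string variant, again wrapped in a shell function for convenience. The name filter_all_literal is hypothetical, and the printf line is demo input only.

```shell
#!/bin/bash

# filter_all_literal (hypothetical name): print only the stdin lines that
# contain every argument as a literal substring (no regex interpretation).
filter_all_literal() {
    awk '
    BEGIN {
        for (i = 1; i < ARGC; i++) {
            needles[i - 1] = ARGV[i]
        }
        # Drop the arguments so awk reads standard input, not files.
        ARGC = 1
    }
    {
        for (i in needles) {
            # index() returns 0 when the substring is absent.
            if (index($0, needles[i]) == 0) {
                next
            }
        }
        print
    }' "$@"
}

# The dot is matched literally here, so "3x50" is rejected.
printf 'price: 3.50\nprice: 3x50\n' | filter_all_literal '3.50'
# prints: price: 3.50
```

This form sidesteps the typo concern from the question, since characters such as . or [ in a pattern have no special meaning to index().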

Context

StackExchange Code Review Q#53790, answer score: 3
