pattern · bash · Minor

Cleaning a WPA wordlist

Submitted by: @import:stackexchange-codereview
wordlist wpa cleaning

Problem

I have a short bash script that processes gigs and gigs of data. I am looking for any improvements to make it faster. This is my very first bash script so please be gentle. I am really only concerned about the while loop. The rest of it is fine I think. It's the while loop where the real work is done and could use the most enhancement.

```
#!/bin/bash

# This script will clean a WPA wordlist
# It will read every line of the given file
# Remove all whitespace except for newlines
# Delete the line if it is less than 8 chars or greater than 63
# It will then exit with time of execution

IFS=$'\n' # make newlines the only separator

startTime=$(date) # start time of execution
fileToClean=$1 # this is the file we will be sanitizing, 1st cmd line arg
deletedlines=0 # number of lines that did not meet WPA PSK criteria
validlines=0 # number of lines that were valid PSKs and added to file

if [ -z $fileToClean ]; then #No file specified
echo ""
echo "No file specified!"
echo ""
exit -1
fi

if [ ! -f $fileToClean ]; then #File does not exist
echo ""
echo "File not found!"
echo ""
exit -1
fi

#By this point I am assuming I entered a valid file and will begin cleaning
echo ""
echo 'Cleaning word list: ' $1
echo "Start Time: " $startTime
echo ""

while read line; do #read every line in file and save to var line

line="${line##*( )}" # trim leading whitespace
line="${line%%*( )}" # trim trailing whitespace

if [ ${#line} -ge 8 ] && [ ${#line} -le 63 ]; then # if trimmed line length >= 8 && <= 63
echo $line >> $outputfile
((validlines++))
continue
fi

((deletedlines++))

done < $fileToClean ) 2>&1 1>/dev/null )" # this stores the execution time in the var utime

echo "Processing completed, it took" $utime "and $deletedlines were deleted."
echo $validlines "were added to the output file "$outputfile" as th
```

Solution

The if statement can be reduced to just:

[[ ${#line} -ge 8 && ${#line} -le 63 && $((++validlines)) -gt 0 ]] \
    && echo "$line" >> "$outputfile" || ((deletedlines++))


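We use bash's [[ ... ]] shell keyword to collate our conditions together. Helpfully, [[ ... ]] lets us join the tests with && inside a single expression. Numeric comparisons there are written -ge and -le, because < and > compare strings and >= and <= are not accepted at all. [[ ... ]] && ... || ... is quite a useful (IMHO!) construct in bash for implementing if-then-else logic. The things to note are that you can only specify one command for each branch, and that the || branch also runs if the && command itself fails.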

As such, we can 'inline' our increment of validlines and do an always-true comparison to simplify our if branching, namely to either (&&) output the line to the output file, or (||) increment deletedlines. Otherwise, it gets slightly longer due to the required use of the { ...; } compound command:

[[ ${#line} -ge 8 && ${#line} -le 63 ]] \
    && { ((validlines++)); echo "$line" >> "$outputfile"; } || ((deletedlines++))
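To see the compound-command form in context, here is a minimal sketch of the whole loop. The function name clean_wordlist and the sample data are made up for illustration and are not part of the original script. It also assumes extglob is switched on, which the *( ) trimming patterns need in order to work.

```bash
#!/bin/bash
# Sketch only: clean_wordlist and the sample data are illustrative,
# not part of the original script.
shopt -s extglob                      # needed for the *( ) patterns below

clean_wordlist() {
    local src="$1" dst="$2" line
    validlines=0
    deletedlines=0
    while IFS= read -r line; do
        line="${line##*( )}"          # trim leading spaces
        line="${line%%*( )}"          # trim trailing spaces
        [[ ${#line} -ge 8 && ${#line} -le 63 ]] \
            && { ((validlines++)); printf '%s\n' "$line"; } \
            || ((deletedlines++))
    done < "$src" > "$dst"            # open the output file once, not per line
}

# Demo on a throwaway file: one short line, two valid ones.
src=$(mktemp)
dst=$(mktemp)
printf '%s\n' 'short' '  validpassword  ' 'anothergoodone' > "$src"
clean_wordlist "$src" "$dst"
cleaned=$(cat "$dst")
echo "valid=$validlines deleted=$deletedlines"
rm -f "$src" "$dst"
```

One design choice differs from the one-liners above: the sketch redirects the loop's output once instead of appending with >> on every line, so the output file is not reopened on each iteration. On gigabytes of input that is likely to matter, though it is worth measuring on your own data.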


Now onto the other parts...

  • fileToClean should be set as fileToClean="$1" in case the filename has spaces inside.



  • A multi-line echo can be done with a here-document:

[ ! -f "$fileToClean" ] && cat<<M && exit -1

Oh noes,
We have a multi-line error message
Telling user that the file is not a regular file.

M


  • Elapsed time is better measured outside your script, using the time command. time can generate more (and probably more accurate) statistics about the 'performance' of your script and is readily available, hence the suggestion as opposed to manually calculating the value(s) yourself. :)
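The quoting and here-document points can be combined into one small check. This is only a sketch: check_wordlist is a hypothetical helper name, and sending the message to stderr is my own choice rather than something the original script does.

```bash
#!/bin/bash
# Sketch only: check_wordlist is a hypothetical helper, not part of the
# original script.
check_wordlist() {
    local fileToClean="$1"            # quoted, so a name with spaces stays one word
    if [[ ! -f "$fileToClean" ]]; then
        # Multi-line error message via a here-document, sent to stderr.
        cat >&2 <<EOF

File not found or not a regular file:
  $fileToClean

EOF
        return 1
    fi
}

# Demo: a file name containing spaces passes, a missing file is rejected.
tmpdir=$(mktemp -d)
touch "$tmpdir/my word list.txt"
check_wordlist "$tmpdir/my word list.txt"; found=$?
check_wordlist "$tmpdir/missing.txt" 2>/dev/null; missing=$?
echo "found=$found missing=$missing"
rm -r "$tmpdir"
```

With the checks in place, timing moves to the calling shell, for example time ./clean.sh wordlist.txt, where clean.sh is just a placeholder for whatever the script is called.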

Now do you really need to do this in bash? Here's an awk one-liner (split into three for readability):

awk '{sub(/^[ ]+/,"");sub(/[ ]+$/,"")};
    length()>=8&&length()<=63{v++;print > "'"$outputfile"'" };
    END{print "Valid lines:",v"\nInvalid lines:",NR-v}' "$fileToClean"


  • Trim line.



  • If length matches, count the line and print to "$outputfile".



  • At the end, print the summary for valid/invalid lines.
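The three steps above can also be written with the output file name passed in through awk's -v option instead of being spliced into the program text. This is a variant sketch, not the one-liner itself. The temporary files and the variable out are placeholders, and note that -v interprets backslash escapes in the value, so a file name containing backslashes would need extra care.

```bash
#!/bin/bash
# Sketch only: the temporary files stand in for the real input and output.
fileToClean=$(mktemp)
outputfile=$(mktemp)
printf '%s\n' 'short' '  validpassword  ' 'anothergoodone' > "$fileToClean"

summary=$(awk -v out="$outputfile" '
    { sub(/^[ ]+/, ""); sub(/[ ]+$/, "") }                 # trim spaces
    length() >= 8 && length() <= 63 { v++; print > out }  # keep valid lines
    END { print "Valid lines:", v + 0; print "Invalid lines:", NR - v }
' "$fileToClean")

cleaned=$(cat "$outputfile")
echo "$summary"
rm -f "$fileToClean" "$outputfile"
```

Writing v + 0 in the END block makes the count print as 0 rather than an empty string when no line qualifies.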

Code Snippets

[[ ${#line} -ge 8 && ${#line} -le 63 && $((++validlines)) -gt 0 ]] \
    && echo "$line" >> "$outputfile" || ((deletedlines++))

[[ ${#line} -ge 8 && ${#line} -le 63 ]] \
    && { ((validlines++)); echo "$line" >> "$outputfile"; } || ((deletedlines++))

[ ! -f "$fileToClean" ] && cat<<M && exit -1

Oh noes,
We have a multi-line error message
Telling user that the file is not a regular file.

M

awk '{sub(/^[ ]+/,"");sub(/[ ]+$/,"")};
    length()>=8&&length()<=63{v++;print > "'"$outputfile"'" };
    END{print "Valid lines:",v"\nInvalid lines:",NR-v}' "$fileToClean"

Context

StackExchange Code Review Q#84619, answer score: 3
