Cleaning a WPA wordlist
Problem
I have a short bash script that processes gigs and gigs of data. I am looking for any improvements to make it faster. This is my very first bash script, so please be gentle. I am really only concerned about the `while` loop. The rest of it is fine, I think. It's the `while` loop where the real work is done and could use the most enhancement.
```
#!/bin/bash
# This script will clean a WPA wordlist
# It will read every line of the given file
# Remove all whitespace except for newlines
# Delete the line if it is less than 8 chars or greater than 63
# It will then exit with time of execution
IFS=$'\n' # make newlines the only separator
startTime=$(date) # start time of execution
fileToClean=$1 # this is the file we will be sanitizing, 1st cmd line arg
deletedlines=0 # number of lines that did not meet WPA PSK criteria
validlines=0 # number of lines that were valid PSKs and added to file
if [ -z $fileToClean ]; then #No file specified
    echo ""
    echo "No file specified!"
    echo ""
    exit -1
fi
if [ ! -f $fileToClean ]; then #File does not exist
    echo ""
    echo "File not found!"
    echo ""
    exit -1
fi
#By this point I am assuming I entered a valid file and will begin cleaning
echo ""
echo 'Cleaning word list: ' $1
echo "Start Time: " $startTime
echo ""
while read line; do #read every line in file and save to var line
    line="${line##*( )}" # trim leading whitespace
    line="${line%%*( )}" # trim trailing whitespace
    if [ ${#line} -ge 8 ] && [ ${#line} -le 63 ]; then # if trimmedline length >= 8 && <= 63
        echo $line >> $outputfile
        ((validlines++))
        continue
    fi
    ((deletedlines++))
done < $fileToClean
utime="$( ... 2>&1 1>/dev/null )" # this stores the execution time in the var utime
echo "Processing completed, it took" $utime "and $deletedlines were deleted."
echo $validlines "were added to the output file "$outputfile" as th
```
Solution
The `if` statement can be reduced to just:
```
(( ${#line} >= 8 && ${#line} <= 63 && ++validlines )) \
    && echo "$line" >> "$outputfile" || ((deletedlines++))
```
We use `bash`'s `(( ... ))` arithmetic command to collate our conditions together. Helpfully, `(( ... ))` supports the `>=` and `<=` operators; inside `[[ ... ]]` you would need `-ge` and `-le` instead, because `[[ ... ]]` has no `>=` or `<=` and treats `<` and `>` as string comparisons. `(( ... )) && <true> || <false>` is quite a useful (IMHO!) construct in `bash` for implementing if-then-else logic. Two things to note: you can only specify one command for each branch, and the `||` branch also runs if the `&&` command itself fails.

As such, we can 'inline' our increment of `validlines` as an always-true term (the pre-increment is never zero) to simplify our `if` branching, namely to either (`&&`) output the line to the output file, or (`||`) increment `deletedlines`. Otherwise, it gets slightly longer due to the required use of the `{ ...; }` compound command:
```
(( ${#line} >= 8 && ${#line} <= 63 )) \
    && { ((validlines++)); echo "$line" >> "$outputfile"; } || ((deletedlines++))
```
Now onto the other parts...
- `fileToClean` should be set as `fileToClean="$1"` in case the filename has spaces inside.
- Multi-line `echo` can be done as such:
```
[ ! -f "$fileToClean" ] && cat <<M && exit -1
Oh noes,
We have a multi-line error message
Telling user that the file is not a regular file.
M
```
- Your way of checking the elapsed time should be done outside your script, using the `time` command. `time` can generate more (and probably more accurate) statistics about the 'performance' of your script and is readily available, hence the suggestion as opposed to manually calculating the value(s) yourself. :)

Now do you really need to do this in `bash`? Here's an `awk` one-liner (split into three for readability):
```
awk '{sub(/^[ ]+/,"");sub(/[ ]+$/,"")};
    length()>=8&&length()<=63{v++;print > "'"$outputfile"'" };
    END{print "Valid lines:",v"\nInvalid lines:",NR-v}' "$fileToClean"
```
- Trim line.
- If length matches, count the line and print to `"$outputfile"`.
- At the end, print the summary for valid/invalid lines.
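
If you would rather stay in `bash`, the points above can be combined roughly as follows. This is only a sketch under a few assumptions, not the original poster's script: the function name `clean_wordlist` and its two arguments are made up for illustration. `shopt -s extglob` is added because the `+( )` trimming patterns are only recognized in parameter expansion when extended globbing is enabled, and the redirection on `done` opens the output file once for the whole loop rather than once per line, which is likely where much of the per-line cost goes.

```bash
#!/bin/bash
# Sketch only: clean_wordlist and its arguments are hypothetical names,
# not part of the original script.
shopt -s extglob    # needed for the +( ) patterns used for trimming

clean_wordlist() {
    local infile=$1 outfile=$2
    local line validlines=0 deletedlines=0

    # IFS= and -r keep each line as-is (no field splitting, no backslash
    # processing); the || test also picks up a last line without a newline.
    while IFS= read -r line || [[ -n $line ]]; do
        line=${line##+( )}    # trim leading spaces
        line=${line%%+( )}    # trim trailing spaces
        if (( ${#line} >= 8 && ${#line} <= 63 )); then
            printf '%s\n' "$line"
            validlines=$((validlines + 1))
        else
            deletedlines=$((deletedlines + 1))
        fi
    done < "$infile" > "$outfile"    # output file is opened once, not per line

    echo "Valid lines: $validlines"
    echo "Invalid lines: $deletedlines"
}
```

With the function defined, something like `time clean_wordlist wordlist.txt cleaned.txt` (both file names are placeholders) would give the timing without any bookkeeping inside the script.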
Code Snippets
```
(( ${#line} >= 8 && ${#line} <= 63 && ++validlines )) \
    && echo "$line" >> "$outputfile" || ((deletedlines++))
```
```
(( ${#line} >= 8 && ${#line} <= 63 )) \
    && { ((validlines++)); echo "$line" >> "$outputfile"; } || ((deletedlines++))
```
```
[ ! -f "$fileToClean" ] && cat <<M && exit -1
Oh noes,
We have a multi-line error message
Telling user that the file is not a regular file.
M
```
```
awk '{sub(/^[ ]+/,"");sub(/[ ]+$/,"")};
    length()>=8&&length()<=63{v++;print > "'"$outputfile"'" };
    END{print "Valid lines:",v"\nInvalid lines:",NR-v}' "$fileToClean"
```
Context
StackExchange Code Review Q#84619, answer score: 3