HiveBrain v1.2.0
Get Started
← Back to all entries
patternbashMinor

Filetype backup script

Submitted by: @import:stackexchange-codereview··
0
Viewed 0 times
scriptfiletypebackup

Problem

This was one of my assignments in which I had to create a backup script that will individually compress all files (of an arbitrary number) of file-types (indicated by their .extension).

Example: backup a b c will take all files ending in .a, .b and .c and compress them using tar, keeping the original file, and naming the new compressed file the same as before except with .tar.gz at the end.

The -t flag can be used to specify the target directory.

The -d flag can be used to specify the destination directory.

Example: backup -t ~/Desktop/stuff -d goesHere gif where goesHere is a directory in your current path and all files ending with .gif come from ~/Desktop/stuff.

$ ls
. .. backup cat.gif dog.gif funny_video.mov media
$ ./backup -d media gif mov
backup: compressed .gif files
backup: compressed .mov files
backup: files saved to "media"
$ cd media
$ ls
. .. cat.gif.tar.gz dog.gif.tar.gz funny_video.mov.tar.gz
$


Both flags are optional, allowing you to specify either one or both. If you specify both flags you must use the order -t -d .

Granted I really don't like the specifications of the assignment, but that is what they are. The fact we're asked to compress all files by their .extension is kind of anti-*nix anyways (everyone knows cat.png could be a plain text file).

```
#!/bin/bash

list_begin=1 # where the filetype arguments begins
list_end=$ # where the filetype arguments end

target_directory=$(pwd)
destination_directory=$(pwd)

if [[ $# < 1 ]]; then
echo "backup: not enough arguments"
exit
fi

# determine if any flags are used and if so update directories and $list_begin
if [[ $1 == "-t" ]]; then # -t
target_directory=$2
list_begin=3
if [[ $3 == "-d" ]]; then # -t, -td
destination_directory=$4
list_begin=5
fi
elif [[ $1 == "-d" ]]; then # -d
destination_directory=$2
list_begin=3
fi

# confirm directories are real
if [[ ! -d "${target_directory}" ]]; then # invalid -

Solution

As you say, the requirements are, let's say, unconventional. But in some ways, that's good preparation for crazy requirements you'll get in working life! Except that as a student, you might not get to ask the "five whys" to determine whether the requirements are actually grounded, and to get them changed.

There are still some clarifications of the requirements:

  • Should only regular files be archived, or should we include directories and devices, too?



  • Should the archive contain full pathnames of the contained files, or relative names, or just basenames? If only basenames, then how should we deal with conflicts?



  • Should the destination directory be created if it doesn't already exist?



As this is a homework assignment, you may not be in a position to ask these questions - in that case, you should assume a position and state it clearly in your answer. If possible, localise the effects of a wrong assumption, to make it easy to change when you get more information (again, that's a good reflection of working life).

Working through the code from the beginning:

list_begin=1 # where the filetype arguments begins
list_end=$ # where the filetype arguments end

target_directory=$(pwd)
destination_directory=$(pwd)


It looks like you can save two invocations of pwd by using . as the default directory name - you never need an absolute path.

A more common approach is to remove option arguments using shift as they are parsed; then all of $* are your inputs. I'm assuming $ there is a typo for $#?

Here's the outline:

while test -n "${1+:}"
do
    case "$1" in
        -t|--target)
            target="$2"
            shift 2
            ;;
        -d|--destination)
            destination="$2"
            shift 2
            ;;
        *)
            process_file "$1"
            shift
            ;;
    esac
done


I've included long-name alternatives just to show how it's done, not because that's a requirement. This approach allows the user to specify -t and -d more than once, to change those settings during a run. It's not clear whether you are required to enforce the ordering constraints you were given, or if that's just limiting the problem space to make it easier. If the former, you can just process them in turn:

if [ "-t" = "$1" ]
then
    target="$2"
    shift 2
fi
if [ "-d" = "$1" ]
then
    destination="$2"
    shift 2
fi


and so on.

I've also fixed the quoting above - you'll find that if you miss it out, then people will want at target or destination with whitespace in the name!

# confirm directories are real
if [[ ! -d "${target_directory}" ]]; then # invalid -t
    echo "backup: \"${target_directory}\" is not valid"
    exit
elif [[ ! -d "${destination_directory}" ]]; then # invalid -d
    echo "backup: \"${destination_directory}\" is not valid"
    exit
fi


You don't need to use Bash [[ here - it's more portable to stick to plain old [ (aka test). You'll want your error messages to go to the error stream (>&2), and you'll want to exit with a non-zero value, so that anything executing your program will know it didn't succeed. I like to define a small function to do that:

die() {
    echo "$@" >&2
    exit 1
}


Then I can write:

test -d "$target_directory" || die "$0: $target_directory is not a directory"
test -r "$target_directory" && test -x "$target_directory" || die "$0: $target_directory cannot be read"
test -d "$destination_directory" || die "$0: $destination_directory is not a directory"
test -w "$destination_directory" || die "$0: $destination_directory is not writable"


declare -a filetypes # array to store filetypes

for ((i=${list_begin}, j=0; i<=${list_end}; i++, j++)); do # fetch all filetype arguments and add to array
    eval dir=\${$i}
    filetypes[${j}]=${dir}
done


I don't think it's necessary to do all this copying. You can simply process each argument as you come to it.

for i in "${filetypes[@]}"; do # for each filetype
    f_list="${target_directory}/*.${i}" # directory to iterate through


You shouldn't be quoting the * there, because you want the shell to expand it. If you expand it later, you may well end up splitting $i.

My version

```
#!/bin/sh
set -e

die() {
echo "$@" >&2
exit 1
}

process_suffix() {
for file in "${source-.}"/*."$1"
do
# Warning: may overwrite existing archive
test -e "$file" \
&& tar cfz "${destination:-.}/${file##*/}.tar.gz" -C "${source-.}" -- "${file#${source-.}/}"
done
echo "$0: finished *.$1 files"
}

while test -n "${1+:}"
do
case "$1" in
-t|--source)
test -d "$2" || die "$0: $2 is not a directory"
test -r "$2" && test -x "$2" || die "$0: $2 cannot be read"
source="$2"
shift 2
;;
-d|--destination)
test -d "$2" || die "$0: $2 is not a directory"
test -w "$2" || die "$0: $2 is not writable"

Code Snippets

list_begin=1 # where the filetype arguments begins
list_end=$ # where the filetype arguments end

target_directory=$(pwd)
destination_directory=$(pwd)
while test -n "${1+:}"
do
    case "$1" in
        -t|--target)
            target="$2"
            shift 2
            ;;
        -d|--destination)
            destination="$2"
            shift 2
            ;;
        *)
            process_file "$1"
            shift
            ;;
    esac
done
if [ "-t" = "$1" ]
then
    target="$2"
    shift 2
fi
if [ "-d" = "$1" ]
then
    destination="$2"
    shift 2
fi
# confirm directories are real
if [[ ! -d "${target_directory}" ]]; then # invalid -t
    echo "backup: \"${target_directory}\" is not valid"
    exit
elif [[ ! -d "${destination_directory}" ]]; then # invalid -d
    echo "backup: \"${destination_directory}\" is not valid"
    exit
fi
die() {
    echo "$@" >&2
    exit 1
}

Context

StackExchange Code Review Q#143853, answer score: 2

Revisions (0)

No revisions yet.