Regular backup/snapshots

Submitted by: @import:stackexchange-codereview·Mar 10, 2026·

Problem

A (long) while ago I set up a file server in my basement running Linux. I am OCD when it comes to backups.

I set the server up with (remember, this was a while ago):

disk for OS

disk for 'valuable' things

disk for backups

The idea was that if any one disk failed, I could replace things with minimal loss. The OS was supposed to be stuff that was easy to reinstall. The 'valuable' things are data that is irreplaceable (e-mail, photos, documents, etc.). The backup disk contains a copy of the valuable data.

I also have learned (a long time ago) that keeping backups of your current data is not very useful if you corrupt (or delete) your current data, and then replace your backups with the corrupt version. As a result, I keep 'snapshots' of my valuable data at regular intervals, and I can go back to any snapshot to retrieve the data as it was at the time of the snapshot. If I delete a file now, I can go to a previous snapshot, and restore it.

I started using a script written in bash to do this for me.... Mike's handy backup script!

This worked for a while, but I discovered it had problems:

I wanted to make it configurable (specify the folders to back up outside the script)

sometimes the backup was long-running, and another backup could start before the previous one completed (these snapshots are taken at every hour).

My monthly backup routine copies the entire snapshot disk to an external drive, and this takes hours, and I want the data to be consistent

rm -rf on deeply-hard-linked files is a slow process.... so I had to remove it from the script....

I extended Mike's script, and then ended up rewriting it in perl.

Over time things have changed. I now have a RAID array mounted on /valuable which contains things that are supposedly 'valuable' to me. The regular filesystem is still mounted as /. The destination for these snapshots is the disk (also a RAID array) mounted at /snapshotng. There is about 2TB of valuable data (many large phot

Solution

Well, this is actually pretty decent code. There is still a lot that could be improved; I tried to focus on the more relevant points.

Subroutine Prototypes

In a declaration like sub foo ($@&*) { ... }, we call the weird thing in parens a prototype. The prototype primarily changes how a call to that sub is parsed, and can set properties like context for the arguments. Contrary to popular belief, prototypes are not required in order to call a subroutine without parens – predeclaration (e.g. via sub foo;) is sufficient.

Prototypes do not generally verify the type of the arguments. Instead, $ or @ or % impose a certain context on that argument. Given my @foo = (1, 2, 3);, the call bar @foo will pass 1, 2, 3 as arguments to bar if it has no prototype or some list prototype. With a $ prototype, the expression @foo is evaluated in scalar context instead, thus passing 3 – the length of the array.
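To make the context change concrete, here is a minimal sketch. The sub names show_plain and show_scalar are made up for illustration and do not come from the reviewed script.

```perl
use strict;
use warnings;

# No prototype: the array is flattened into the argument list.
sub show_plain { return join ',', @_ }

# ($) prototype: the single argument is evaluated in scalar context.
sub show_scalar ($) { return join ',', @_ }

my @foo = (10, 20, 30);

print show_plain(@foo), "\n";   # prints the three elements: 10,20,30
print show_scalar(@foo), "\n";  # prints the element count: 3
```

Both subs have the same body; only the prototype differs, which is exactly why the effect is easy to miss when reading a call site.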

Prototypes are useless, as they can be disabled by invoking a function via the & sigil: &foo(...). Prototypes are a hindrance, because it's often not obvious that they change the context of an expression. Prototypes should not be used, unless you're trying to write something like push or map, which actually need them.

Your mysys has the (;@) prototype, which is especially funny. ; starts optional arguments, and @ allows a list of arbitrary length (including zero arguments). So it's equivalent to (@) which is equivalent to no prototype at all.

Your lastX and cleanAndLatest subs use prototypes in a misguided attempt to limit the number of arguments. Instead, check the size of @_ if you have to do this, e.g. die "The sub lastHour takes exactly one argument" if @_ != 1.
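A sketch of such an explicit check might look like this. The sub last_hour and its hourly.0 return value are hypothetical stand-ins, since the original subs are not shown here.

```perl
use strict;
use warnings;

# Hypothetical stand-in for a sub like lastHour; the body is illustrative only.
sub last_hour {
    die "last_hour takes exactly one argument, got " . scalar(@_) . "\n"
        if @_ != 1;
    my ($dir) = @_;
    return "$dir/hourly.0";
}

print last_hour('/snapshotng'), "\n";

# A wrong argument count now fails loudly at run time.
my $ok = eval { last_hour('a', 'b'); 1 };
print "error: $@" unless $ok;
```

Unlike a prototype, this check cannot be bypassed with the & sigil, and the error message says what went wrong.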

say > print

The say builtin function is available since perl 5.10.0 (released 2007). It behaves exactly like print, except that it will append "\n" to the output instead of $\. This makes it insanely handy for text output. You can activate this function with

use feature 'say';
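As a small illustration (not taken from the reviewed script), the following sketch writes the same message once with print and once with say into in-memory filehandles and compares the results:

```perl
use strict;
use warnings;
use feature 'say';

# Write to in-memory filehandles so the two outputs can be compared.
my ($with_print, $with_say) = ('', '');
open my $print_fh, '>', \$with_print or die "Can't open in-memory handle: $!";
open my $say_fh,   '>', \$with_say   or die "Can't open in-memory handle: $!";

print {$print_fh} "Running ", "backup", "\n";  # newline added by hand
say   {$say_fh}   "Running ", "backup";        # newline added by say

close $print_fh;
close $say_fh;

say $with_print eq $with_say ? "identical output" : "different output";
```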

Shell commands vs. builtins

Perl's history as a Unix sysadmin language means that it has many builtin functions. Using these has the advantage that you can do better error handling, don't waste resources by starting an external process, and have a truly portable interface. The downside is that they are sometimes not quite as flexible, and might need a bit of boilerplate.

Instead of lockfile, use the flock builtin. E.g.:

use Fcntl qw/:flock/; # import the LOCK_* constants

open my $lock_fh, ">", $lockfile or die "Can't open $lockfile: $!";
flock $lock_fh, LOCK_EX or die "Can't obtain lock on $lockfile";

...

flock $lock_fh, LOCK_UN or die "Can't unlock $lockfile";
close $lock_fh;
unlink $lockfile or die "Can't remove lockfile $lockfile";

Oh, all those horrible or dies. Since perl 5.10.1 (released 2009), the autodie module is available which replaces most built-in functions by versions that die on failure rather than relying on you to handle their return code. I highly recommend you use it.
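A minimal sketch of what that looks like in practice. The path below is a made-up one that is assumed not to exist.

```perl
use strict;
use warnings;
use autodie;  # open, close, unlink, ... now throw on failure

# Hypothetical path, assumed not to exist.
my $missing = '/nonexistent/dir/backup.lock';

my $ok = eval {
    open my $fh, '<', $missing;  # no "or die" needed
    close $fh;
    1;
};
my $err = $@;  # an autodie::exception object when the open failed

print "Caught: $err\n" unless $ok;
```

The exception stringifies to a message that names the failing call, the file, and the system error, so the hand-written die messages become unnecessary.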

Back to flock: It places an advisory lock on the file, and is blocking by default – there is no way to wait for x seconds, then fail (unless you wish to set an alarm). You can however use the nonblocking version flock $fh, LOCK_EX | LOCK_NB which returns immediately, possibly failing.
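For a job that runs every hour, the nonblocking form is probably what you want: if the previous run still holds the lock, skip this run. A sketch under that assumption follows. The helper try_lock and the lock file location are hypothetical and not part of the reviewed script.

```perl
use strict;
use warnings;
use Fcntl qw/:flock/;          # LOCK_EX, LOCK_NB, LOCK_UN
use File::Temp qw/tempdir/;

# Try to take an exclusive lock without blocking.
# Returns the open filehandle on success, nothing if the lock is held elsewhere.
sub try_lock {
    my ($path) = @_;
    open my $fh, '>', $path or die "Can't open $path: $!";
    return $fh if flock $fh, LOCK_EX | LOCK_NB;
    close $fh;
    return;
}

# Hypothetical lock file location inside a throwaway directory.
my $lockfile = tempdir(CLEANUP => 1) . '/backup.lock';

my $lock_fh = try_lock($lockfile);
if ($lock_fh) {
    print "Got the lock, running the backup\n";
    flock $lock_fh, LOCK_UN or die "Can't unlock $lockfile: $!";
    close $lock_fh;
} else {
    print "Another backup is still running, giving up\n";
}
```

The lock disappears when the filehandle is closed or the process exits, so a crashed backup should not leave a stale lock behind the way a leftover lock file can.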

Instead of rm, you can use the unlink builtin.

Instead of mv, you can sometimes use rename. However, there are annoying portability issues and it won't work across file system boundaries (which is the case here). The File::Copy module (in Core since perl 5.002 (released 1996)) comes to the rescue, and offers the move and copy functions:

use File::Copy;

# move will either rename, or copy, then unlink
move $source => $destination or die "Couldn't move $source to $destination: $!";
copy $source => $destination or die "Couldn't copy $source to $destination: $!";

While the shell commands are certainly more handy, you should evaluate whether the alternative may be safer.

Assorted Spotlights

- This snippet here is hilarious:

my @cl;
push @cl, @_;
my $msg = '"' . join ('" "', @cl) . '"';
print "Running $msg\n";

I would write it as:

my ($program, @arguments) = @_;
say "Running ", join ' ' => $program, map qq("$_"), @arguments;

While equally obfuscated, the argument unpacking clearly shows what the parameters mean – @cl isn't the best variable name ever. Using map qq("$_") clearly shows the intent that you want to put quotes around each item, join ' ' shows that you want a space in between. Your solution is equivalent, but harder to grok if you aren't aware of that specific pattern.

The part where your solution is better than mine is using a variable for the combined command.

- In order to avoid accidentally global variables, I tend to put the main part of a script into a main subroutine, then invoke it as exit main(@ARGV) at the top, directly after initializing those variables t

Context

StackExchange Code Review Q#41928, answer score: 13
