Fast incremental backup algorithm using RSYNC
Problem
I'm running a simple Bash script that uses rsync to do an incremental backup of my web server every hour. What I'm looking for is an efficient algorithm to delete the proper backups so that in the end I keep:
- Hourly backups for 24 hours.
- Daily backups for 1 week.
- Weekly backups for 1 month.
- Monthly backups from that point on.
I'll figure out when to delete the monthly backups based upon when I run out of space, so we're not worried about that. I'm also familiar with the various wrappers for rsync like rdiff-backup and rsnapshot, but those aren't necessary. I prefer to write code myself whenever possible, even if it means reinventing the wheel sometimes. At least that way if I get a flat tire I know how to fix it :)
Here's the actual commented code that runs every hour:
    #if it's not Sunday and it's not midnight
    if [ $(date +%w) -ne 0 ] && [ $(date +%H) -ne 0 ]; then
        #remove the backup from one day ago and one week ago
        rm -rf $TRG1DAYAGO
        rm -rf $TRG1WEEKAGO
    fi
    #if it's Sunday
    if [ $(date +%w) -eq 0 ]; then
        #if it's midnight
        if [ $(date +%H) -eq 0 ]; then
            #if the day of the month is greater than 7
            # we know it's not the first Sunday of the month
            if [ $(date +%d) -gt 7 ]; then
                #delete the previous week's files
                rm -rf $TRG1WEEKAGO
            fi
        #if it's not midnight
        else
            #delete the previous day and week
            rm -rf $TRG1DAYAGO
            rm -rf $TRG1WEEKAGO
        fi
    fi
Basically:
If it's not Sunday and it's not midnight:
  - delete the backup from one day ago
  - delete the backup from one week ago
If it is Sunday:
  If it is midnight:
    If the day of the month is greater than 7:
      - delete the backup from one week ago
  Else (if it's not midnight):
    - delete the backup from one day ago
    - delete the backup from one week ago
That seems to work but I'm wondering if anyone can come up with a simpler, more efficient algorithm or add any ideas for a better way of accomplishing this.
Solution
That seems to work but I'm wondering if anyone can come up with a simpler, more efficient algorithm or add any ideas for a better way of accomplishing this.
I think so. Use a naming scheme with a common prefix and a variable suffix depending on the period, for example:
- Hourly backups for 24 hours: hourly-$(date +%H).gz, which results in:
  hourly-00.gz
  hourly-01.gz
  hourly-02.gz
  ... and so on until hourly-23.gz, after which it starts over from hourly-00.gz
- Daily backups for 1 week: daily-$(date +%a).gz, which results in:
  daily-Sun.gz
  daily-Mon.gz
  daily-Tue.gz
  ... and so on until daily-Sat.gz, after which it starts over
- Weekly backups for 1 month: weekly-$((10#$(date +%W) % 4)).gz, which results in:
  weekly-0.gz
  weekly-1.gz
  weekly-2.gz
  weekly-3.gz, after which it starts over from weekly-0.gz
  (The 10# prefix forces base 10. Without it, Bash rejects zero-padded week numbers such as 08 and 09 as invalid octal.)
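The naming scheme above can be sketched as a small function. This is only an illustration: the function name backup_names and its optional arguments are hypothetical, added here so the names can be checked for a fixed moment instead of the current time.

```shell
#!/usr/bin/env bash
# Hypothetical helper: prints the three rotating backup names for one moment.
# The arguments stand in for the output of `date +%H`, `date +%a` and
# `date +%W`; each defaults to the current time when omitted.
backup_names() {
    local hour=${1:-$(date +%H)}
    local day=${2:-$(date +%a)}
    local week=${3:-$(date +%W)}
    printf 'hourly-%s.gz\n' "$hour"
    printf 'daily-%s.gz\n' "$day"
    # 10# forces base 10 so that "08" and "09" are not parsed as octal.
    printf 'weekly-%d.gz\n' "$(( 10#$week % 4 ))"
}

# Example for hour 23 on a Saturday in week 09:
backup_names 23 Sat 09
# hourly-23.gz
# daily-Sat.gz
# weekly-1.gz
```

Because each name recurs after its period, writing to it simply overwrites the backup from one cycle earlier.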
You'll never have to delete anything. You will have the same set of files, and rsync (with appropriate parameters) will copy only the changed ones.
As for the posted code, the single biggest problem is the duplicate calls to $(date ...) with the same parameters:
- Inefficient: multiple unnecessary process executions
- Bad practice: duplicated logic
- Unnecessary and error-prone: the multiple calls to $(date +%w) (for example) presumably expect to get the same result, so there should be only one call, saved in a variable. If the day happens to change between two calls, you may get a nasty bug, and in any case it's a completely unintended situation.
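A minimal sketch of that fix, assuming Bash: date is called once and its output is split into variables, so every test sees the same instant. The function targets_to_remove is a hypothetical stand-in that only prints which backups the posted branches would remove; the real script would run its rm -rf commands at those points.

```shell
#!/usr/bin/env bash
# One date call; read splits the output into three variables.
# Fields: day of week (0 = Sunday), hour (00-23), day of month (01-31).
read -r dow hour dom <<< "$(date '+%w %H %d')"

# Hypothetical helper mirroring the posted branches: prints "day" and/or
# "week" for the backups that would be deleted, or nothing at all.
targets_to_remove() {
    local dow=$1 hour=$2 dom=$3
    if [ "$dow" -ne 0 ] && [ "$hour" -ne 0 ]; then
        # Not Sunday and not midnight
        echo "day week"
    elif [ "$dow" -eq 0 ]; then
        if [ "$hour" -ne 0 ]; then
            # Sunday, but not midnight
            echo "day week"
        elif [ "$dom" -gt 7 ]; then
            # Sunday midnight, and not the first Sunday of the month
            echo "week"
        fi
    fi
}

targets_to_remove "$dow" "$hour" "$dom"
```

Passing the captured values as arguments also makes the branching easy to check for any date without waiting for that date to arrive.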
Context
StackExchange Code Review Q#77112, answer score: 2