patternpythonMinor
Backup files script
Viewed 0 times
scriptfilesbackup
Problem
I'm creating own script for backup user's directories.
CentOS 6.5;
Python 2.6.
Directories looks like:
As script big enought, I'll post only three 'most important' functions. So please note, that they using some additional functions.
First - function to create backup directories:
Second - 'incremental' backup, for files changed last twenty-four hours:
```
def inc_backup(dir_to_backup, backupname):
now = time.time()
cutoff = 86400
print('Creating archive %s...' % backupname)
out = tarfile.open(backupname, mode='w:bz2')
# start walk via all files to find changed last 24 hours
for root, dirs, files in os.walk(dir_to_backup, followlinks=True):
for file in files:
file = os.path.join(root, file)
try:
filemodtime = os.stat(file).st_mtime
if now - filemodtime < cutoff:
if os.path.isfile(file):
print('Adding file: %s...' % file)
out.add(file)
CentOS 6.5;
Python 2.6.
Directories looks like:
# tree -L 2 -d /var/www/vhosts/
/var/www/vhosts/
├── freeproxy
│ └── freeproxy.in.ua
├── hudeem
│ └── hudeem.kiev.ua
├── profy
│ └── profy.kiev.ua
├── rtfm
│ └── rtfm.co.ua
├── setevoy
│ ├── forum.setevoy.kiev.ua
│ ├── postfixadmin.setevoy.org.ua
│ ├── setevoy.org.ua
│ └── webmail.setevoy.org.ua
├── worlddesign
│ └── worlddesign.org.ua
└── zabbix
└── zabbix.setevoy.org.uaAs script big enought, I'll post only three 'most important' functions. So please note, that they using some additional functions.
First - function to create backup directories:
# global const for user's directory
VHOSTSPATH = '/var/www/vhosts/'
def back_dir_create(user, day):
backdir = '/home/setevoy/backups/temp_new_test/'
# weekly are kept 4 last, daily - 7 copies
# thus - better have separate directories
if day == 'Sun':
type = '/weekly/'
else:
type = '/daily/'
dirname = ('%s%s%s%s-files/' % (backdir, user, type, time.strftime('%Y-%m-%d')))
if not os.path.exists(dirname):
print('Creating directory: %s' % dirname)
os.makedirs(dirname)
return(dirname)Second - 'incremental' backup, for files changed last twenty-four hours:
```
def inc_backup(dir_to_backup, backupname):
now = time.time()
cutoff = 86400
print('Creating archive %s...' % backupname)
out = tarfile.open(backupname, mode='w:bz2')
# start walk via all files to find changed last 24 hours
for root, dirs, files in os.walk(dir_to_backup, followlinks=True):
for file in files:
file = os.path.join(root, file)
try:
filemodtime = os.stat(file).st_mtime
if now - filemodtime < cutoff:
if os.path.isfile(file):
print('Adding file: %s...' % file)
out.add(file)
Solution
Just like you have defined the
it will make sense to do the same for this too:
So that if you move your script to another machine, all the local customizations will be in one very visible place.
For even more benefits, move these files to a dedicated
then it will be really perfectly clear where to customize the script parameters,
and it will be separated from the rest of the code, which will be easily redistributable.
This could be done better:
It's better to avoid variable names that overlap with keywords. How about
The way you assign
and just by reading that line it's not clear that this is intended as a nested directory layout. If I look at the
In this version, thanks to the
now it is explicit that we're talking about a nested hierarchy.
Also, you don't need to worry anymore if all components have a
Just make all components not have such prefix or suffix and let
Finally,
Sure, you may not care about that now, but you never know.
Don't reassign the value of a loop variable:
Use a different name for it, for example:
The reason is that this can be confusing in various ways.
The same is true for function parameters: don't reassign them,
use different names for local variables.
The
By virtue of the way
Except in some extreme conditions where the file might get removed in the middle of walking.
If you are in such environment, then it's fine.
I'm not familiar with
you can probably do this, which will be better:
If this indeed works, then you don't need to worry about closing the
it will be properly closed automtically by Python.
In the original code, if an error happens after
The
VHOSTSPATH global variable at the very top,it will make sense to do the same for this too:
backdir = '/home/setevoy/backups/temp_new_test/'So that if you move your script to another machine, all the local customizations will be in one very visible place.
For even more benefits, move these files to a dedicated
settings.py module and make your script import these values from it.then it will be really perfectly clear where to customize the script parameters,
and it will be separated from the rest of the code, which will be easily redistributable.
This could be done better:
if day == 'Sun':
type = '/weekly/'
else:
type = '/daily/'
dirname = ('%s%s%s%s-files/' % (backdir, user, type, time.strftime('%Y-%m-%d')))type is a keyword in Python. Intelligent code editors should highlight it for you.It's better to avoid variable names that overlap with keywords. How about
subdir?The way you assign
dirname is a bit hard to read,and just by reading that line it's not clear that this is intended as a nested directory layout. If I look at the
type variable I see that all possible values are prefixed and suffixed with the / directory separator, but the code doesn't make it obvious why it is this way. A better way would be like this:if day == 'Sun':
subdir = 'weekly'
else:
subdir = 'daily'
dirname = os.path.join(backdir, user, subdir, '%s-files' % time.strftime('%Y-%m-%d'))In this version, thanks to the
os.path.join,now it is explicit that we're talking about a nested hierarchy.
Also, you don't need to worry anymore if all components have a
/ prefix or suffix or not.Just make all components not have such prefix or suffix and let
os.path.join take care of it.Finally,
os.path.join is portable: it will work on any operating system.Sure, you may not care about that now, but you never know.
Don't reassign the value of a loop variable:
for file in files:
file = os.path.join(root, file)Use a different name for it, for example:
for file in files:
path = os.path.join(root, file)The reason is that this can be confusing in various ways.
The same is true for function parameters: don't reassign them,
use different names for local variables.
The
if condition here seems pointless:for file in files:
file = os.path.join(root, file)
# ...
if os.path.isfile(file):
# ...By virtue of the way
os.walk works, I think os.path.join(root, file) is always a valid file.Except in some extreme conditions where the file might get removed in the middle of walking.
If you are in such environment, then it's fine.
I'm not familiar with
tarfile.open, but instead of this:out = tarfile.open(backupname, mode='w:bz2')
# ... do work
out.close()you can probably do this, which will be better:
with tarfile.open(backupname, mode='w:bz2') as out:
# ... do workIf this indeed works, then you don't need to worry about closing the
out handle,it will be properly closed automtically by Python.
In the original code, if an error happens after
out =, the out.close() might never be reached.The
with open(...) as fh: idiom is the recommended way to work with resources that need to be closed when you're done with them.Code Snippets
backdir = '/home/setevoy/backups/temp_new_test/'if day == 'Sun':
type = '/weekly/'
else:
type = '/daily/'
dirname = ('%s%s%s%s-files/' % (backdir, user, type, time.strftime('%Y-%m-%d')))if day == 'Sun':
subdir = 'weekly'
else:
subdir = 'daily'
dirname = os.path.join(backdir, user, subdir, '%s-files' % time.strftime('%Y-%m-%d'))for file in files:
file = os.path.join(root, file)for file in files:
path = os.path.join(root, file)Context
StackExchange Code Review Q#66546, answer score: 2
Revisions (0)
No revisions yet.