HiveBrain v1.2.0
Get Started
← Back to all entries
patternpythongitMinor

Automated backup of remote git repos

Submitted by: @import:stackexchange-codereview··
0
Viewed 0 times
reposautomatedgitremotebackup

Problem

This is the first python script I've written. It's used to create a local backup of git repos from BitBucket. It works, but I'd like to make it more pythonic if possible.

```
"""
Script to check BitBucket for all repos accessible by the supplied user and clone,
tar and gzip each one (skipping unmodified and those explicitly told to).
"""

import sys
import argparse
import getpass
import glob
import os
import requests
import tarfile
from datetime import datetime
from os import path
from pprint import pprint
from shutil import rmtree
from subprocess import call
from time import sleep
from urllib import quote

# Required to write backups to the following directory:
backup_dir = '/back/git'

# Required to keep no more than this number of old versions:
version_count_limit = 5

# Get the options we were/should have been called with:
parser = argparse.ArgumentParser(description=__doc__)
parser.add_argument("user", help="your bitbucket username")
parser.add_argument("-p", "--password", help="your bitbucket password; you will be prompted if this is not provided")
parser.add_argument("-s", "--skipfile", type=argparse.FileType('r'), help="the location of a file containing a list of repo names to skip")
parser.add_argument("-v", "--verbose", action='store_true', help="increase output verbosity")
args = parser.parse_args()

# Required:
user = args.user

# Find out how much to display:
verbose = False
if args.verbose is not None:
verbose = args.verbose

# Get the list of repos not to back up, if any:
skip_repos = []
if args.skipfile is not None:
skip_repos = args.skipfile.read().splitlines()
args.skipfile.close

# If we haven't been given a password, request one now:
password = args.password
if password is None:
while True:
password = getpass.getpass()
if password:
break

# Try to get a list of repos from bitbucket:
r = requests.get('https://{user}:{password}@api.bitbucket.org/1.0/user/repositories'.format(user=user, password=passwor

Solution

Looking good, congratulations!

You can improve your error handling. Trying something and just failing if it does not work is not really helpful, and does not improve on the default behavior, which has the advantage of showing you a traceback. So maybe a first step is to just remove all those try/except? For small scripts, it boils down to personal preferences.

Don't use whitespace before colons, and ensure not to use more than 80 columns. See PEP 8, pycodestyle, flake8 and yapf to help you with this.

"""
Script to check BitBucket for all repos accessible by the supplied user and clone,
tar and gzip each one (skipping unmodified and those explicitly told to).
"""

import sys
import argparse
import getpass
import glob
import os
import requests
import tarfile
from datetime import datetime
from os import path
from pprint import pprint
from shutil import rmtree
from subprocess import call
from time import sleep
from urllib import quote


It's often less confusing to use the full path to the functions you're going to call. If see call() in your code, I don't know what it is. But if I see subprocess.call(), I know what it is. It's tempting to use from ... import ..., but don't do it, especially for the standard library.

# Required to write backups to the following directory:
backup_dir = '/back/git'

# Required to keep no more than this number of old versions:
version_count_limit = 5


Constants should be written in uppercase, eg. BACKUP_DIR.

# Get the options we were/should have been called with:
parser = argparse.ArgumentParser(description=__doc__)
parser.add_argument("user", help="your bitbucket username")
parser.add_argument("-p", "--password", help="your bitbucket password; you will be prompted if this is not provided")
parser.add_argument("-s", "--skipfile", type=argparse.FileType('r'), help="the location of a file containing a list of repo names to skip")
parser.add_argument("-v", "--verbose", action='store_true', help="increase output verbosity")
args = parser.parse_args()

# Required:
user = args.user

# Find out how much to display:
verbose = False
if args.verbose is not None:
    verbose = args.verbose


argparse supports default values, doesn't it?

# Get the list of repos not to back up, if any:
skip_repos = []
if args.skipfile is not None:
    skip_repos = args.skipfile.read().splitlines()
    args.skipfile.close


Is args.skipfile.close doing anything?

# If we haven't been given a password, request one now:
password = args.password
if password is None:
    while True:
        password = getpass.getpass()
        if password:
            break


Consider using click wich supports prompting.

# Try to get a list of repos from bitbucket:
r = requests.get('https://{user}:{password}@api.bitbucket.org/1.0/user/repositories'.format(user=user, password=password))


This is not Python related, but you should really use 2-factor authentication, which means you'd need to use OAuth here.

if r.status_code != 200:
    print "Failed trying to fetch repos list: {code} - {error}".format(code=r.status_code, error=r.reason)
    sys.exit(-1)

# Parse the response:
try:
    json = r.json()
except ValueError as e:
    print "Failed to decode JSON ({code}): {error}".format(code=e.errno, error=e.strerror)
    pprint(r.text)
    sys.exit(-1)


Raising exceptions with these message would be more Pythonic and more flexible if this becomes, say an module that can be imported. You should also print to standard error. Consider using Python 3 prints (and consider using Python 3 altogether).

# Backup each repo:
first = True
for repo in json:

    # Don't back up things we've been told to skip:
    name = "{owner}_{name}".format(owner=repo['owner'], name=repo['name'].replace(' ', '_'))
    if name in skip_repos:
        if verbose:
            print "{name} in skip list; skipping".format(name=name)
        continue


Use logging instead of this verbose check. This allows you to log unconditionally and logging will decide to print or not depending on the configuration. The logging module is a bit hard to set up but is worth it.

# We don't want to hammer bitbucket so, on every loop except the first, wait thirty seconds before starting
    if first:
        first = False


Using enumerate() was also an option which avoids you to maintain state in the first variable and instead use if i == 0. There are also smarter options like using futures in threads to ensure you're not operating on more than two repositories at once, say.

```
else:
if verbose:
print "...sleeping..."
sleep(30)

if verbose:
print "Pulling {name} and archiving to {archive}".format(name=name,archive=archive)

# Create a working directory
tmp_dir = "{dir}/{newdir}".format(dir=backup_dir, newdir=name)
try:
os.makedirs(tmp_dir)
except OSError as e:
if not path.isdir(tmp_dir):
print "Failed to create wor

Code Snippets

"""
Script to check BitBucket for all repos accessible by the supplied user and clone,
tar and gzip each one (skipping unmodified and those explicitly told to).
"""

import sys
import argparse
import getpass
import glob
import os
import requests
import tarfile
from datetime import datetime
from os import path
from pprint import pprint
from shutil import rmtree
from subprocess import call
from time import sleep
from urllib import quote
# Required to write backups to the following directory:
backup_dir = '/back/git'

# Required to keep no more than this number of old versions:
version_count_limit = 5
# Get the options we were/should have been called with:
parser = argparse.ArgumentParser(description=__doc__)
parser.add_argument("user", help="your bitbucket username")
parser.add_argument("-p", "--password", help="your bitbucket password; you will be prompted if this is not provided")
parser.add_argument("-s", "--skipfile", type=argparse.FileType('r'), help="the location of a file containing a list of repo names to skip")
parser.add_argument("-v", "--verbose", action='store_true', help="increase output verbosity")
args = parser.parse_args()

# Required:
user = args.user

# Find out how much to display:
verbose = False
if args.verbose is not None:
    verbose = args.verbose
# Get the list of repos not to back up, if any:
skip_repos = []
if args.skipfile is not None:
    skip_repos = args.skipfile.read().splitlines()
    args.skipfile.close
# If we haven't been given a password, request one now:
password = args.password
if password is None:
    while True:
        password = getpass.getpass()
        if password:
            break

Context

StackExchange Code Review Q#140846, answer score: 4

Revisions (0)

No revisions yet.