HiveBrain v1.2.0
Get Started
← Back to all entries
patternpythongitMinor

Base for iterating over git history

Submitted by: @import:stackexchange-codereview··
0
Viewed 0 times
iteratinghistorygitforoverbase

Problem

I wanted to do some statistics over the history of a Git repository. At first I tried using GitPython, but it wasn't as trivial as I imagined it to be. In the end, I just call the necessary git commands with the submodule module.

Can this code be considered clean and readable or does it have some style issues?

import argparse
import os
import subprocess

def main(args):
    if not os.path.isdir(args.path):
        print "not a directory"
        return
    if ".git" not in os.listdir(args.path):
        print "not a repo"
        return

    os.chdir(args.path)

    commitIDs = []
    start = 0       #load commits in batches, to avoid too long hangs
    max_count = 100
    while True:
        text = subprocess.check_output(["git", "rev-list", args.branch, "--max-count=%s" % max_count, "--skip=%s" % start])
        start += max_count
        for line in text.splitlines():
            commitIDs.append(line)
        #print "loaded", len(commits), "commits"
        if len(text.splitlines()) != max_count:
            break

    for commitID in commitIDs[::-args.skip]:    #start with the oldest commit
        print commitID
        #do something with the commit
        #for example:
        #devnull = open(os.devnull, 'w')
        #subprocess.call(["git", "checkout", commitID], stdout=devnull)

if __name__ == '__main__':
    parser = argparse.ArgumentParser()
    parser.add_argument('path', nargs="?", default=".")
    parser.add_argument('--branch', '-b', default="master")
    parser.add_argument('--skip', '-s', type=int, default=1, help='use every n-th commit')
    args = parser.parse_args()

    main(args)


Update:

```
import argparse
import os
import subprocess
import sys
import git

def main(args):
try:
repo = git.Repo(args.path)
except:
sys.exit("no such repo")

try:
text = repo.git.rev_list(args.branch).splitlines()
except:
sys.exit("no such branch")

commit_ids = text[::args.skip]

print "loaded %s

Solution

Great, glad you got rid of that loop: some more things I'd like to point out. As this is essentially another question now I'll add another answer

-
I don't think a function should call sys.exit(). I'd rather you raised an exception instead and call sys.exit() in __main__

I.e. instead of:

try:
    repo = git.Repo(args.path)
except:
    sys.exit("no such repo")


I would prefer:

from git import Repo, InvalidGitRepositoryError
#...

try:
    repo = git.Repo(args.path)
except InvalidGitRepositoryError, e:
    raise InvalidGitRepositoryError("%s is not a git repository" % (str(e)))


This enables other Python classes to use your main function (which is a terrible name) in a more predictable way (Python doesn't close calling your function).

-
I would also rather you check for the branch than assume if rev-list gives an error that the branch doesn't exist. There may be some other case that causes it to throw an exception.

class InvalidBranchError(Exception):
    pass

if args.branch not in repo.branches:
   raise InvalidBranchError("Branch does not exist: %s" % (args.branch))


-
Instead of main taking an arguments parser, I would prefer you used regular arguments (again so other python modules can use your code more readily):

def main(path=".", branch="master"):
   pass

main(path=args.path, branch=args.branch)

Code Snippets

try:
    repo = git.Repo(args.path)
except:
    sys.exit("no such repo")
from git import Repo, InvalidGitRepositoryError
#...

try:
    repo = git.Repo(args.path)
except InvalidGitRepositoryError, e:
    raise InvalidGitRepositoryError("%s is not a git repository" % (str(e)))
class InvalidBranchError(Exception):
    pass

if args.branch not in repo.branches:
   raise InvalidBranchError("Branch does not exist: %s" % (args.branch))
def main(path=".", branch="master"):
   pass

main(path=args.path, branch=args.branch)

Context

StackExchange Code Review Q#52198, answer score: 3

Revisions (0)

No revisions yet.