Team git commit cleaner
Problem
I cleaned up a large repository for my team with the Python code below. The goal was to check every commit for each developer's bad email addresses and replace the author and committer information with the correct values. I use git filter-branch and a for loop in Bash.
Because I can't make an array of arrays in Bash, I created a Python script to handle all the developers on my team.
Any idea how I can optimize this code? git filter-branch takes a long time.
# coding=utf-8
import subprocess
import os


def generate_command(dev):
    emails_string = ""
    for email in dev["emails"]:
        emails_string += '"%s" ' % email
    return """git filter-branch -f --env-filter 'OLD_EMAILS=(%s)
CORRECT_NAME="%s"
CORRECT_EMAIL="%s"
for email in ${OLD_EMAILS[@]};
do
    if [ "$GIT_COMMITTER_EMAIL" = "$email" ]
    then
        export GIT_COMMITTER_NAME="$CORRECT_NAME"
        export GIT_COMMITTER_EMAIL="$CORRECT_EMAIL"
    fi
    if [ "$GIT_AUTHOR_EMAIL" = "$email" ]
    then
        export GIT_AUTHOR_NAME="$CORRECT_NAME"
        export GIT_AUTHOR_EMAIL="$CORRECT_EMAIL"
    fi
done' --tag-name-filter cat -- --branches --tags""" % (emails_string.strip(),
                                                        dev["author_name"],
                                                        dev["author_email"])


developers = [
    {
        "emails": ["bad_email_author1@mycompany.com", "bad_email2_author1@mycompany.com"],
        "author_name": "first dev",
        "author_email": "good_email_author1@mycompany.com"
    },
    {
        "emails": ["bad_email_author2@mycompany.com", "bad_email2_author2@mycompany.com"],
        "author_name": "second dev",
        "author_email": "good_email_author2@mycompany.com"
    }
]

if __name__ == '__main__':
    for developer in developers:
        subprocess.call(generate_command(developer), shell=True)
Solution
First reaction: wow, this is scary: a Python script generating Bash, which in turn runs more Bash inside it. But I see the --env-filter technique comes straight from an example in the docs.
I would have written this in pure Bash, using a helper function that takes as parameters:
- author name
- author email
- one or more bad email addresses
And then for each bad email address, call git filter-branch like you did, but all in pure Bash.
As far as the Python part is concerned, this can be done better:
emails_string = ""
for email in dev["emails"]:
    emails_string += '"%s" ' % email
Using a list comprehension:
emails_string = " ".join(['"%s"' % email for email in dev["emails"]])
With this, you don't need to .strip() the emails_string when you generate the command string.
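Putting that suggestion together, here is a minimal sketch of how generate_command might look with the join-based quoting. The helper name quote_emails is hypothetical (it is not in the original script), and the shell text is the one from the question, kept as it was.

```python
def quote_emails(emails):
    """Return the addresses double-quoted and space-separated (Bash array body)."""
    # join() puts the separator only between items, so there is no trailing
    # space to remove and no .strip() is needed afterwards.
    return " ".join('"%s"' % email for email in emails)


def generate_command(dev):
    """Build the git filter-branch command line for one developer entry."""
    # The shell snippet below is unchanged from the question; only the way
    # the OLD_EMAILS list is assembled differs.
    return """git filter-branch -f --env-filter 'OLD_EMAILS=(%s)
CORRECT_NAME="%s"
CORRECT_EMAIL="%s"
for email in ${OLD_EMAILS[@]};
do
    if [ "$GIT_COMMITTER_EMAIL" = "$email" ]
    then
        export GIT_COMMITTER_NAME="$CORRECT_NAME"
        export GIT_COMMITTER_EMAIL="$CORRECT_EMAIL"
    fi
    if [ "$GIT_AUTHOR_EMAIL" = "$email" ]
    then
        export GIT_AUTHOR_NAME="$CORRECT_NAME"
        export GIT_AUTHOR_EMAIL="$CORRECT_EMAIL"
    fi
done' --tag-name-filter cat -- --branches --tags""" % (
        quote_emails(dev["emails"]),
        dev["author_name"],
        dev["author_email"],
    )
```

The main loop from the question can stay as it is, since generate_command still takes one developer dictionary and returns a single command string.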
Context
StackExchange Code Review Q#73346, answer score: 2