patternpythonMinor
Batch renames file's extension in a given directory with Python
Viewed 0 times
renamesdirectoryfilewithbatchextensionpythongiven
Problem
Arguments:
Example:
My Concerns.
- The directory in which you would like to change file extensions
- Old file extension
- And what the new extension should be
Example:
$ touch /tmp/spam.eggs
$ python change-ext.py /tmp .eggs .spam
$ ls /tmp
$ spam.spamMy Concerns.
- Is the code easy to read and comprehend?
- Is the code Pythonic?
- Does the code have any BUGS?
- Should i have used
argparseinstead ofsys.argv
'''
Batch renames file's extension in a given directory
'''
import os
import sys
from os.path import join
from os.path import splitext
def main():
try:
work_dir, old_ext, new_ext = sys.argv[1:]
except ValueError:
sys.exit("Usage: {} directory old-ext new-ext".format(__file__))
for filename in os.listdir(work_dir):
if old_ext == splitext(filename)[1]:
newfile = filename.replace(old_ext, new_ext)
os.rename(join(work_dir, filename), join(work_dir, newfile))
if __name__ == '__main__':
main()Solution
Questions
Is the code easy to read and comprehend?
Import both
Is the code Pythonic?
Yes... But is it good?
It's doesn't follow *nix too much. And
When using
The status should be a number from 0-127, where 0 is successful.
And all the other numbers are an error of sorts.
Also you can set up your terminal to change colour or add words if you get an error response. So when
Does the code have any BUGS?
Yes:
Should i have used
I'd say no, as this is a very small script.
Otherwise, I'd say yes.
If you need flags and such you should use
Code review
As noted above, your usage for
To amend this, I would change your
And then catch it, print the message, and exit with status 1.
To amend the bug, I would use a completely different approach.
A more *nix and Python approach.
*nix encourages regex's, and Python has
They are both micro-languages but regex's are for pattern matching where Python's
I would argue having two dedicated micro-languages, is far better than anything we manually program.
This is as complex as your previous code. However you can now do things like:
As I was asked
Wouldn't the fnmatch module be better suited for matching a pattern in a filename then a regular expression.
I thought I should make a pro's and con's for each method:
Regex
Pros
-
Well know micro-language
-
Allows an extremely large amount of customisation to the query
-
Easy to implement
Cons
glob /
Pros
-
Simple input
-
Easy to implement
Cons
-
No capture groups
-
Good luck with substitution
-
Some inputs are impossible to have reliable output.
glob based regex
So as the main fallback is that you can't have capture-groups in glob,
you can add them.
E.g:
Pros
-
Easy for people who don't know regex
-
Smaller input, but not as small is glob
Cons
-
Hard to implement well, without becoming regex where
However if we were to use
And then there is the edge-case for
As the docs say:
For a literal match, wrap the meta-characters in brackets. For example,
But that make
Conclusion
I chose
You can extract information out of it with ease.
And is super easy to implement.
If you wanted to use glob & regex, you might as well use regex.
And glob is kinda broken:
```
>>> '{0[:-4]}.boo'.format('hello.txt')
TypeError: string indices must be integers
>>> '{0[-4]}.boo'.format('hello.txt')
TypeError: string indices must be integers
Is the code easy to read and comprehend?
Import both
join and splitext on the same line, and it'll be perfect.Is the code Pythonic?
Yes... But is it good?
It's doesn't follow *nix too much. And
sys.exit is just wrong.When using
sys.exit you pass an error status, not a message.The status should be a number from 0-127, where 0 is successful.
And all the other numbers are an error of sorts.
Also you can set up your terminal to change colour or add words if you get an error response. So when
sys.exit(1) is called my prompt goes from green to red.Does the code have any BUGS?
Yes:
>>> '.txt_file.txt'.replace('.txt', '.spam')
'.spam_file.spam'Should i have used
argparse instead of sys.argvI'd say no, as this is a very small script.
Otherwise, I'd say yes.
If you need flags and such you should use
argparse, even if the program is very small. So if you think about adding a --verbose flag, you should.Code review
As noted above, your usage for
sys.exit is wrong.To amend this, I would change your
sys.exit to a standard raise.And then catch it, print the message, and exit with status 1.
class InputError(Exception):pass
def main():
...
except ValueError:
raise InputError("Usage: {} directory old-ext new-ext".format(__file__))
...
if __name__ == '__main__':
try:
main()
except InputError as e:
print(str(e))
sys.exit(1)To amend the bug, I would use a completely different approach.
A more *nix and Python approach.
*nix encourages regex's, and Python has
str.format, but you can use the both of them!They are both micro-languages but regex's are for pattern matching where Python's
format is for output formatting.I would argue having two dedicated micro-languages, is far better than anything we manually program.
import sys
import re
import os
from os.path import join
class InputError(Exception):pass
def main():
try:
work_dir, regex, format_ = sys.argv[1:]
except ValueError:
raise InputError("Usage: {} directory regex format".format(__file__))
regex = re.compile(regex)
for filename in os.listdir(work_dir):
if regex.match(filename) is not None:
newfile = format_.format(*regex.findall(filename)[0])
os.rename(join(work_dir, filename), join(work_dir, newfile))
print('Replaced {} with {}'.format(filename, newfile))
if __name__ == '__main__':
try:
main()
except InputError as e:
print(str(e))
sys.exit(1)This is as complex as your previous code. However you can now do things like:
$ mkdir temp
$ touch temp/txt.boo
$ python change.py temp "(.*)\.(txt|boo)" ".{1}_file.{0}"
Replaced txt.boo with .boo_file.txt
$ ls temp
.boo_file.txt
$ python change.py temp "(.*)\.(txt|boo)" ".{1}_file.{0}"
Replaced .boo_file.txt with .txt_file..boo_file
$ ls temp
.txt_file..boo_file
$ python change.py temp
Usage: change.py directory regex format
err$As I was asked
Wouldn't the fnmatch module be better suited for matching a pattern in a filename then a regular expression.
I thought I should make a pro's and con's for each method:
Regex
Pros
-
Well know micro-language
-
Allows an extremely large amount of customisation to the query
-
Easy to implement
Cons
- Larger input
glob /
fnmatchPros
-
Simple input
-
Easy to implement
Cons
-
No capture groups
-
Good luck with substitution
'{0[:-4]}.boo'.format('hello.txt')-
Some inputs are impossible to have reliable output.
fileglob based regex
So as the main fallback is that you can't have capture-groups in glob,
you can add them.
E.g:
re.compile(fnmatch.translate().replace(r'\(', '(').replace(r'\)', ')'))Pros
-
Easy for people who don't know regex
-
Smaller input, but not as small is glob
Cons
-
Hard to implement well, without becoming regex where
. is substituted with .inp = '\(*\).txt' # Matches r'\\(.*\\)\.txt'.
fnmatch.translate(inp).replace(r'\(', '(').replace(r'\)', ')'))However if we were to use
re.sub.inp = '(*).txt' # Matches r'(.*).txtZ'
re.sub(r'\\(.)', lambda m: m.group(1), fnmatch.translate(inp))
inp = r'(*)\.txt' # Might as well be regex.
re.sub(r'\\(.)', lambda m: m.group(1), fnmatch.translate(inp + '\\'))And then there is the edge-case for
'[a-z]*'.As the docs say:
For a literal match, wrap the meta-characters in brackets. For example,
'[?]' matches the character '?'.But that make
'[a-z][]' turn into '[a-z][]' not '[a-z]*'.Conclusion
I chose
re as more people know it.You can extract information out of it with ease.
And is super easy to implement.
If you wanted to use glob & regex, you might as well use regex.
And glob is kinda broken:
```
>>> '{0[:-4]}.boo'.format('hello.txt')
TypeError: string indices must be integers
>>> '{0[-4]}.boo'.format('hello.txt')
TypeError: string indices must be integers
Code Snippets
>>> '.txt_file.txt'.replace('.txt', '.spam')
'.spam_file.spam'class InputError(Exception):pass
def main():
...
except ValueError:
raise InputError("Usage: {} directory old-ext new-ext".format(__file__))
...
if __name__ == '__main__':
try:
main()
except InputError as e:
print(str(e))
sys.exit(1)import sys
import re
import os
from os.path import join
class InputError(Exception):pass
def main():
try:
work_dir, regex, format_ = sys.argv[1:]
except ValueError:
raise InputError("Usage: {} directory regex format".format(__file__))
regex = re.compile(regex)
for filename in os.listdir(work_dir):
if regex.match(filename) is not None:
newfile = format_.format(*regex.findall(filename)[0])
os.rename(join(work_dir, filename), join(work_dir, newfile))
print('Replaced {} with {}'.format(filename, newfile))
if __name__ == '__main__':
try:
main()
except InputError as e:
print(str(e))
sys.exit(1)$ mkdir temp
$ touch temp/txt.boo
$ python change.py temp "(.*)\.(txt|boo)" ".{1}_file.{0}"
Replaced txt.boo with .boo_file.txt
$ ls temp
.boo_file.txt
$ python change.py temp "(.*)\.(txt|boo)" ".{1}_file.{0}"
Replaced .boo_file.txt with .txt_file..boo_file
$ ls temp
.txt_file..boo_file
$ python change.py temp
Usage: change.py directory regex format
err$re.compile(fnmatch.translate().replace(r'\(', '(').replace(r'\)', ')'))Context
StackExchange Code Review Q#111353, answer score: 4
Revisions (0)
No revisions yet.