Using Python to rename multiple csv files in Windows

Problem

I need to rename a large list of csv files that are generated by 3 different servers. The files are produced with a date stamp as the extension, and I need to move that date stamp in front of the csv extension in each file name so that it is retained.

The file name format before editing is:

billing.pps-svr01.csv.2015-09-01

billing.pps-svr02.csv.2015-09-01

billing.pps-svr03.csv.2015-09-01

The file name format after editing should be:

billing.pps-svr01.2015-09-01.csv

billing.pps-svr02.2015-09-01.csv

billing.pps-svr03.2015-09-01.csv

My question is in regard to code efficiency and best practice. The following code seems to work in testing. However, I'm very new to Python and programming in general, and I'd like to know if there are other, more efficient ways to solve this problem. For example, could this be accomplished in a one-liner or something else? I should also note that I intend to incorporate this into a much larger script that parses these files after they've been renamed. Any feedback and/or suggestions would be great.

import os

os.chdir(r'C:\Users\Extract')

for filename in os.listdir('.'):
    if filename.startswith('billing.pps-svr'):
        name = filename.split('.')
        name = name+[name.pop(0-2)]
        new_name = '.'.join(name)
        os.rename(filename, new_name)

Solution

Here are my thoughts:

  • You don't do any error handling if any of the file operations fail, which is not recommended.



  • Your code flips the last two parts every time. It does not check whether the filename has already been fixed!



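  • The code name + [name.pop(0-2)] troubles me. You are concatenating the name list with the popped value, but for this to work the pop has to mutate the list before the concatenation is carried out. It does, because the left operand is only a reference to the list, but relying on that is scary stuff. Note also that 0-2 is simply -2.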



  • Here are some Pythonic ways to work with lists:

      • name[:-2] – Get everything but the last two elements

      • name[-2:] – Get only the last two elements

      • name[::-1] – Reverse the list
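As a small sketch, the slices above can be tried on one of the file names from the question. The variable names here (parts, head, tail, swapped) are only illustrative. The last few lines also show the original pop-based expression on a copy of the list, to make the mutation visible.

```python
parts = 'billing.pps-svr01.csv.2015-09-01'.split('.')

head = parts[:-2]        # everything but the last two elements
tail = parts[-2:]        # only the last two elements
swapped = tail[::-1]     # the last two elements, reversed

# Slicing builds new lists, so `parts` itself is left untouched.
new_name = '.'.join(head + swapped)
print(new_name)

# The original expression gives the same result here, but only because
# pop() mutates the list before the concatenation is carried out.
mutated = list(parts)
via_pop = mutated + [mutated.pop(-2)]
print('.'.join(via_pop))
```

The slicing version never changes the list it reads from, which makes it easier to reason about than the version that pops and concatenates in one expression.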



Here is some code demonstrating the flaw in your original rename code, and two options for handling it correctly.

for filename in (
                 'billing.pps-svr01.2014-09-01.csv',
                 'billing.pps-svr02.2014-09-01.csv',
                 'billing.pps-svr01.csv.2015-09-01',
                 'billing.pps-svr02.csv.2015-09-01',
                ):

    print('\nTesting {}:'.format(filename))

    # Original approach: always swaps the last two parts
    name = filename.split('.')
    name = name + [name.pop(0-2)]
    new_name = '.'.join(name)

    print('    old rename: {} to {}'.format(filename, new_name))

    # Option 1: swap only when the name does not already end in 'csv'
    filename_parts = filename.split('.')
    first_part = filename_parts[:-2]
    last_part = filename_parts[-2:]

    if last_part[-1] != 'csv':
        new_name = '.'.join(first_part + last_part[::-1])
        print('    new rename: {} to {}'.format(filename, new_name))
    else:
        print('    no rename needed')

    # Option 2: swap only when the second-to-last part is 'csv'
    if filename_parts[-2] == 'csv':
        new_name = '.'.join(filename_parts[:-2] + filename_parts[-2:][::-1])
        print('    alt rename: {} to {}'.format(filename, new_name))
    else:
        print('    no alternate rename needed')


The output is as follows:

Testing billing.pps-svr01.2014-09-01.csv:
    old rename: billing.pps-svr01.2014-09-01.csv to billing.pps-svr01.csv.2014-09-01
    no rename needed
    no alternate rename needed

Testing billing.pps-svr02.2014-09-01.csv:
    old rename: billing.pps-svr02.2014-09-01.csv to billing.pps-svr02.csv.2014-09-01
    no rename needed
    no alternate rename needed

Testing billing.pps-svr01.csv.2015-09-01:
    old rename: billing.pps-svr01.csv.2015-09-01 to billing.pps-svr01.2015-09-01.csv
    new rename: billing.pps-svr01.csv.2015-09-01 to billing.pps-svr01.2015-09-01.csv
    alt rename: billing.pps-svr01.csv.2015-09-01 to billing.pps-svr01.2015-09-01.csv

Testing billing.pps-svr02.csv.2015-09-01:
    old rename: billing.pps-svr02.csv.2015-09-01 to billing.pps-svr02.2015-09-01.csv
    new rename: billing.pps-svr02.csv.2015-09-01 to billing.pps-svr02.2015-09-01.csv
    alt rename: billing.pps-svr02.csv.2015-09-01 to billing.pps-svr02.2015-09-01.csv


Notice how the first two files would have been wrongly renamed by your original code.
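On the one-liner part of the question: a regular expression substitution is one possible way to get a rename that leaves already-fixed names alone. This is only a sketch. The helper name fixed_name and the pattern are assumptions, and the pattern assumes the stamp is always written as YYYY-MM-DD directly after ".csv.".

```python
import re

# Assumed pattern: the name ends in ".csv." followed by a YYYY-MM-DD stamp.
_CSV_DATE = re.compile(r'\.csv\.(\d{4}-\d{2}-\d{2})$')


def fixed_name(filename):
    """Return filename with the date stamp moved in front of '.csv'.

    Names that do not end in '.csv.YYYY-MM-DD' are returned unchanged.
    """
    return _CSV_DATE.sub(r'.\1.csv', filename)


for filename in ('billing.pps-svr01.2014-09-01.csv',
                 'billing.pps-svr01.csv.2015-09-01'):
    print('{} -> {}'.format(filename, fixed_name(filename)))
```

Because a fixed name no longer matches the pattern, applying fixed_name twice gives the same result as applying it once.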

Code refactor (added)

To accommodate your question about building this into a larger script, and to give an example of error handling, I've refactored your code into the following (using the tip from Janne Karila on using rsplit):

import os

def rename_csv_files(directory, required_start):
    """Rename files in <directory> starting with <required_start> to csv files

    Go to <directory> and read through all files, and for those
    starting with <required_start> and ending with something like
    *.csv.YYYY-MM-DD, rename these to *.YYYY-MM-DD.csv.
    """
    try:
        os.chdir(directory)
    except OSError as exception:
        print('OSError when changing directory - {}'.format(exception))
        return

    try:
        for filename in os.listdir('.'):
            if filename.startswith(required_start):

                # Expect <base>.<ext>.<date>; skip names with fewer than two dots
                parts = filename.rsplit('.', 2)
                if len(parts) != 3:
                    print('Skipped: {}'.format(filename))
                    continue

                base, ext, date = parts
                new_filename = '.'.join((base, date, ext))
                if ext == 'csv' and not os.path.exists(new_filename):
                    try:
                        os.rename(filename, new_filename)
                        print('Renamed: {}'.format(new_filename))

                    except OSError as exception:
                        print('Failed renaming file - dir: {}, original file:  {}, new file: {} - {}'.format(
                              directory, filename, new_filename, exception))

                elif ext != 'csv':
                    print('Skipped: {}'.format(filename))

                else:
                    print('Skipped: {} - Renamed version already exists'.format(filename))

    except OSError as exception:
        print('Failed traversing directory - dir: {} - {}'.format(directory, exception))

def main():
    rename_csv_files('./test_data', 'billing.pps-svr')

if __name__ == '__main__':
    main()


Running this script against the following test-data:

$ ls -1d test_data/* | sort -n
test_data/billing.pps-svr01.2014-09-01.csv
test_data/billing.pps-svr01.csv.2015-09-01
test_data/billing.pps-svr02.2014-09-01.csv
test_data/billing.pps-svr02.2015-09-01.csv
test_data/billing.pps-svr02.csv.2015-09-01
test_data/ori
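Since the renaming is meant to become part of a larger script, a variant that avoids os.chdir (which changes the working directory for the whole process) and returns the renamed paths may be easier to reuse. The following is a sketch under those assumptions, not part of the refactor above: it requires Python 3 for pathlib, and the function name rename_csv_files_in is hypothetical.

```python
from pathlib import Path


def rename_csv_files_in(directory, required_start):
    """Rename <required_start>*.csv.<date> files in directory to *.<date>.csv.

    Returns the list of new paths so that a caller can go on to parse them.
    The working directory of the process is not changed.
    """
    renamed = []
    # sorted() reads the whole listing before any file is renamed.
    for path in sorted(Path(directory).iterdir()):
        parts = path.name.rsplit('.', 2)
        if not path.name.startswith(required_start) or len(parts) != 3:
            continue
        base, ext, date = parts
        target = path.with_name('.'.join((base, date, ext)))
        # Skip names that are already fixed or whose target already exists.
        if ext != 'csv' or target.exists():
            continue
        try:
            path.rename(target)
        except OSError as exception:
            print('Failed renaming {} - {}'.format(path, exception))
        else:
            renamed.append(target)
    return renamed
```

A call such as rename_csv_files_in('./test_data', 'billing.pps-svr') on a directory holding the five complete file names from the listing above should rename only billing.pps-svr01.csv.2015-09-01, because the renamed version of the svr02 file already exists.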

Context

StackExchange Code Review Q#106403, answer score: 4
