HiveBrain v1.2.0
Get Started
← Back to all entries
patternpythonMinor

File copy in Python for slow networks

Submitted by: @import:stackexchange-codereview··
0
Viewed 0 times
filenetworksslowforpythoncopy

Problem

Please note: I have a follow up question with updated code, here.

I have written a file-copy routine, as I experienced unreliable results using standard Python when copying file over very slow networks.

The requirements are as follows:

  • to be used on Windows only



  • not to be limited by the 256 character limit of Windows



  • configurable when to use hash checksum checking (source / target)



  • configurable "buffer size" or "chunk size" (it seems that smaller chunks are more reliable on unreliable/slow networks)



  • file only available on target file system AFTER all content has copied. If you copy files, the target filename already exists on the target filesystem, hence a remote application may "see" this file and try to open it while copy still in progress. This is implemented with a temp-target-filename in the code below.



  • configurable console output



So this is what I have, and it tested to be working OK.

I am looking for tips/tricks to improve this code

PS, I know I haven't done "docstrings" yet... (on my to do list)

Usage:

import ccopy
source = r'c:\test1\testfile.ext'
target = r'\\someserver\test2\testfile.ext'
ccopy.filecopy(source, target, True, 'md4', 1024, True, True)


ccopy.py:

```
import os
import uuid
import hashlib
import platform

def normalizefilepath(path):
#inserts '\\?\' at the start of a path to get rid of 256 character limit of Windows
if path[1:3] == ':\\':
return u'\\\\?\\' + os.path.normcase(path)
return os.path.normcase(path)

def hashfile(filepath, hashtype='md4', buffersize=1024):
ha = hashlib.new(hashtype)
f = open(filepath, 'rb')
while True:
chunk = f.read(buffersize)
if not chunk:
break
ha.update(chunk)
f.close()
return ha.digest()

def filecopy(source, target, hashcheck = False, hashtype = 'md4', buffersize = 1024, overwrite = True, consoleoutput = False):

if not platform.system() == 'Windows': raise Exception('Incorre

Solution

I don't think there is much more to do/check in terms of functionality, but I'd like to point out a few coding style issues (in my opinion).

-
Be careful when checking paths. If you are taking any kind of user input, keep in mind that .. is a valid path. You don't want people to use your module to copy files that they're not supposed to. If you're in a situation where you have to handle user-supplied paths, validate them AND use a whitelist to determine if they should be allowed to read the file. Have a look at abspath and/or realpath.

-
I'd move all the checks you have at the beginning of filecopy in a separate function. You may need those same checks at a later time, you never know.

-
You keep using if consoleoutput. This suggest that a function could be handy.

Maybe something like this:

def debug(messages):
    if (consoleoutput):
        for message in messages:
            print(message + '\n');


So the rest of the code could go from this:

if consoleoutput: print('Source File and Temp Target File Hashlib/' + hashtype + ' match') 
if consoleoutput: print()
if consoleoutput: print('Renaming File') 
if consoleoutput: print('Temp Target File: ' + tempfilepath) 
if consoleoutput: print('Target File     : ' + targetfilepath)


to this:

debug(['Source File and Temp Target File Hashlib/' + hashtype + ' match', 
        '', 
        'Renaming File', 
        'Temp Target File: ' + tempfilepath, 
        'Target File     : ' + targetfilepath])

Code Snippets

def debug(messages):
    if (consoleoutput):
        for message in messages:
            print(message + '\n');
if consoleoutput: print('Source File and Temp Target File Hashlib/' + hashtype + ' match') 
if consoleoutput: print()
if consoleoutput: print('Renaming File') 
if consoleoutput: print('Temp Target File: ' + tempfilepath) 
if consoleoutput: print('Target File     : ' + targetfilepath)
debug(['Source File and Temp Target File Hashlib/' + hashtype + ' match', 
        '', 
        'Renaming File', 
        'Temp Target File: ' + tempfilepath, 
        'Target File     : ' + targetfilepath])

Context

StackExchange Code Review Q#147290, answer score: 5

Revisions (0)

No revisions yet.