HiveBrain v1.2.0

Batch program to xcopy from host PC to remote destination with multi-processing


Problem

This is a batch program to xcopy from host PC to remote destination with multi-processing.

My environment is Kepler with Python 3.x.

```
#!/usr/bin/env python
# -*- coding: utf-8 -*-
# based on Carnival http://ask.python.kr/users/6970/carnival/

import os
import os.path
import csv
import re

from multiprocessing import Process, Lock


class Server:
    def __init__(self, addr, path):
        self.addr = addr
        self.path = path


def distribute_file(server_list, pathname):
    print("Read from {}".format(pathname))

    with open(pathname, 'rb') as inFile:
        buffer = inFile.read()

    for server in server_list:
        remotepath = "//%s/%s/%s" % (server.addr, server.path, pathname)
        print("Write to {}".format(remotepath))
        with open(remotepath, 'wb') as outFile:
            outFile.write(buffer)


def multi_distribute_file(lo, server, dirpath, filename, path):
    lo.acquire()

    pathname = os.path.join(dirpath, filename)
    print("Read from {}".format(pathname))

    with open(pathname, 'rb') as inFile:
        buffer = inFile.read()

    if path == '':
        remotepath = "//%s/%s/%s" % (server.addr, server.path, filename)
    else:
        remotepath = "//%s/%s/%s/%s" % (server.addr, server.path, path, filename)

    print("Write to {}".format(remotepath))
    with open(remotepath, 'wb') as outFile:
        outFile.write(buffer)

    lo.release()


def make_dir(server_list, path):
    for server in server_list:
        dirpath = "//%s/%s/%s" % (server.addr, server.path, path)
        d = os.path.dirname(dirpath)
        if not os.path.exists(d):
            os.makedirs(d)
            print("Make dir {}".format(d))
        else:
            print("dir {} already exists.".format(d))

        if not os.path.exists(dirpath):
            os.makedirs(dirpath)
            print("Make dir {}".format(dirpath))
        else:
            print("dir {} already exists.".format(dirpath))


def dist_files(server_
```

Solution

Technical issues:

  • You are reading whole files into memory at once. Consider what happens if the directory contains huge files.

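For instance, copying in fixed-size chunks keeps memory use bounded no matter how large the file is. A minimal sketch using `shutil.copyfileobj` (the file names here are made up for the demonstration):

```python
import os
import shutil
import tempfile

def copy_file(src, dst, chunk_size=1024 * 1024):
    # Stream src to dst one chunk at a time instead of read()-ing it whole.
    with open(src, 'rb') as in_file, open(dst, 'wb') as out_file:
        shutil.copyfileobj(in_file, out_file, chunk_size)

# Demonstration on a throwaway 3 MiB file.
workdir = tempfile.mkdtemp()
src = os.path.join(workdir, 'src.bin')
dst = os.path.join(workdir, 'dst.bin')
with open(src, 'wb') as f:
    f.write(b'x' * (3 * 1024 * 1024))
copy_file(src, dst)
print(os.path.getsize(dst))  # 3145728
```

Peak memory stays near `chunk_size` instead of the full file size.
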
  • You use multiprocessing, apparently in an attempt to parallelize things, and use locks to synchronize.

  • First, why do you need a lock at all? What data do you want to protect? The only shared data live in the filesystems, which other processes could change at any time anyway.

  • Next, each call to multi_distribute_file gets its own lock instance, which is used only once, by that call. So no synchronization is happening at all.

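To see why a fresh lock per call provides no exclusion, note that two distinct Lock objects can both be held at the same time (a quick sketch):

```python
from multiprocessing import Lock

# One lock per call, as in multi_distribute_file: two calls, two locks.
lock_a = Lock()
lock_b = Lock()

# Both non-blocking acquires succeed simultaneously, so the two
# "critical sections" could run at the same time: no synchronization.
both_held = lock_a.acquire(block=False) and lock_b.acquire(block=False)
print(both_held)  # True

# Only a second acquire on the *same* lock would be refused.
print(lock_a.acquire(block=False))  # False
```

Mutual exclusion only exists between callers that share one lock object.
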
  • On the other hand, if you used a single lock for all calls to multi_distribute_file, the processes would execute one after another, as each holds the lock for its entire runtime. Your parallelization effort would be in vain.

  • You are spawning a new process for each file being copied. This can create a huge number of processes very quickly. Consider what this will do to your system performance.

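A common way to bound the process count is a worker pool: multiprocessing.Pool keeps a fixed number of workers and feeds them the file list, instead of starting one Process per file. This is a sketch, not the original code; `transfer` is a hypothetical stand-in for the actual per-file copy:

```python
from multiprocessing import Pool

def transfer(filename):
    # Hypothetical stand-in for the real per-file copy work.
    return filename.upper()

def distribute(files, workers=4):
    # The pool caps the process count at `workers`, however long the file list is.
    with Pool(processes=workers) as pool:
        return pool.map(transfer, files)

if __name__ == '__main__':
    print(distribute(['a.txt', 'b.txt', 'c.txt']))
```

Copying 10,000 files still uses only four worker processes.
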
  • If something goes wrong in multi_distribute_file and raises an exception, the lock will never be released. And things will go wrong occasionally, as you are communicating over a network. Use a with statement instead.

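The fix is one line: a with statement releases the lock on every exit path, including an exception. A minimal sketch:

```python
from multiprocessing import Lock

lock = Lock()

# The with block releases the lock even when its body raises.
try:
    with lock:
        raise IOError("network went away mid-copy")
except IOError:
    pass

# The lock is free again; a bare acquire()/release() pair would have
# left it held forever after the exception.
free_again = lock.acquire(block=False)
print(free_again)  # True
```

The same pattern (`with lo:` around the body) would replace the acquire/release pair in multi_distribute_file.
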
  • Don't use string manipulation (format and re) to manipulate paths. Use os.path instead. It is safer and more reliable.

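For example, os.path.join can replace the "%s" plumbing in multi_distribute_file, and it also absorbs the empty-path special case. The server names below are made up, and the separators shown assume a POSIX system (on Windows, os.path.join emits backslashes):

```python
import os.path

def remote_path(addr, share, path, filename):
    # Build //addr/share[/path]/filename; join() skips the empty component.
    return os.path.join('//' + addr, share, path, filename)

# The if/else on path == '' from multi_distribute_file disappears:
print(remote_path('fileserver', 'backup', '', 'app.log'))      # //fileserver/backup/app.log
print(remote_path('fileserver', 'backup', 'logs', 'app.log'))  # //fileserver/backup/logs/app.log
```

One code path handles both cases, and separator handling is delegated to the standard library.
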
Style issues:

  • There are no docstrings and no comments (apart from the header).

  • Avoid single-letter and other abbreviated names like d, l, m, and lo.

  • mydictionary is a meaningless and misleading name. Meaningless, as it doesn't tell us anything about the purpose of the variable, and misleading as it is actually a list, not a dictionary. A better name might be server_list (like you do elsewhere).

Context

StackExchange Code Review Q#31678, answer score: 2
