Batch program to xcopy from host PC to remote destination with multi-processing
Problem
This is a batch program that xcopies files from a host PC to remote destinations using multiprocessing.
My environment is Kepler and Python 3.x.
```
#!/usr/bin/env python
# -*- coding: utf-8 -*-
# based on Carnival http://ask.python.kr/users/6970/carnival/
import os
import os.path
import csv
import re
from multiprocessing import Process, Lock


class Server:
    def __init__(self, addr, path):
        self.addr = addr
        self.path = path


def distribute_file(server_list, pathname):
    print("Read from {}".format(pathname))
    with open(pathname, 'rb') as inFile:
        buffer = inFile.read()
    for server in server_list:
        remotepath = "//%s/%s/%s" % (server.addr, server.path, pathname)
        print("Write to {}".format(remotepath))
        with open(remotepath, 'wb') as outFile:
            outFile.write(buffer)


def multi_distribute_file(lo, server, dirpath, filename, path):
    lo.acquire()
    pathname = os.path.join(dirpath, filename)
    print("Read from {}".format(pathname))
    with open(pathname, 'rb') as inFile:
        buffer = inFile.read()
    if path == '':
        remotepath = "//%s/%s/%s" % (server.addr, server.path, filename)
    else:
        remotepath = "//%s/%s/%s/%s" % (server.addr, server.path, path, filename)
    print("Write to {}".format(remotepath))
    with open(remotepath, 'wb') as outFile:
        outFile.write(buffer)
    lo.release()


def make_dir(server_list, path):
    for server in server_list:
        dirpath = "//%s/%s/%s" % (server.addr, server.path, path)
        d = os.path.dirname(dirpath)
        if not os.path.exists(d):
            os.makedirs(d)
            print("Make dir {}".format(d))
        else:
            print("dir {} already exists.".format(d))
        if not os.path.exists(dirpath):
            os.makedirs(dirpath)
            print("Make dir {}".format(dirpath))
        else:
            print("dir {} already exists.".format(dirpath))


def dist_files(server_
```
Solution
Technical issues:
- You are reading whole files into memory at once. Consider what happens if the directory contains huge files.
- You use `multiprocessing`, apparently in an attempt to parallelize things, and use locks to synchronize.
- First, why do you need a lock at all? Which data do you want to protect? The only shared data live in the filesystems, which could be changed by other processes at any time anyway.
- Next, each call to `multi_distribute_file` gets its own lock instance, which is only used once, by that call. So there is no synchronization happening at all.
- On the other hand, if you used a single lock for all calls to `multi_distribute_file`, the processes would execute one after another, as each acquires the lock for its entire runtime. Your parallelizing efforts would be in vain.
- You are spawning a new process for each file being copied. This can create a huge number of processes very quickly. Consider what this will do to your system performance.
- If something goes wrong in `multi_distribute_file` and raises an exception, the lock will never be released. And things will go wrong occasionally, as you are communicating over a network. Use a `with` statement instead.
- Don't use string manipulation (`format` and `re`) to manipulate paths. Use `os.path` instead. This is safer and more reliable.
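Several of these points can be addressed together. The sketch below is illustrative only (the `Server` objects and directory walk from the question are reduced to plain `(src, dst)` pairs): it copies in fixed-size chunks with `shutil.copyfileobj` instead of reading whole files into memory, bounds the worker count with a `multiprocessing.Pool` instead of one process per file, and needs no lock at all:

```python
import os
import shutil
from multiprocessing import Pool

CHUNK_SIZE = 1024 * 1024  # copy 1 MiB at a time instead of slurping the whole file


def copy_file(job):
    """Copy one local file to one remote path, chunk by chunk."""
    src, dst = job
    os.makedirs(os.path.dirname(dst), exist_ok=True)
    # Both files are closed even if the network write raises.
    with open(src, 'rb') as in_file, open(dst, 'wb') as out_file:
        shutil.copyfileobj(in_file, out_file, CHUNK_SIZE)
    return dst


def distribute(jobs, workers=4):
    """Copy (src, dst) pairs with a bounded pool instead of a process per file."""
    with Pool(processes=workers) as pool:
        for dst in pool.imap_unordered(copy_file, jobs):
            print("Wrote {}".format(dst))
```

With a bounded pool the degree of parallelism is explicit, and because each worker only touches its own source and destination files, there is nothing left for a lock to protect.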
Style issues:
- There are no docstrings and no comments (apart from the header).
- Avoid single-letter and other abbreviated names like `d`, `l`, `m`, and `lo`.
- `mydictionary` is a meaningless and misleading name. Meaningless, as it doesn't tell us anything about the purpose of the variable, and misleading, as it is actually a list, not a dictionary. A better name might be `server_list` (like you use elsewhere).
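On the earlier path-handling point: the `if`/`else` over an empty `path` disappears entirely if the components are joined in one place. A minimal sketch, assuming the forward-slash UNC form used in the question (the address and share names here are illustrative):

```python
import posixpath  # the remote paths in the question use forward slashes


def remote_path(addr, share, *parts):
    """Build //addr/share/part1/part2/..., skipping any empty components."""
    return "//" + posixpath.join(addr, share, *(p for p in parts if p))


# An empty subdirectory is simply dropped, so no special case is needed:
print(remote_path("10.0.0.5", "backup", "", "report.csv"))   # //10.0.0.5/backup/report.csv
print(remote_path("10.0.0.5", "backup", "logs", "app.log"))  # //10.0.0.5/backup/logs/app.log
```

Joining with a path function rather than `%`-formatting also avoids accidental double or missing separators when a component already ends in a slash.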
Context
StackExchange Code Review Q#31678, answer score: 2