Process zip files
Problem
I have a zip bundle, for example abcd.zip, that contains more zips such as 1.zip, 2.zip, etc. Inside each child zip there is a .jpg file such as 1.jpg, 2.jpg, etc. There are many other files too, but I need only the .jpg.
I need to extract the .jpg files and zip each one again under the same name as its parent, like 1.zip.
This works fine, but I wanted to know if I can make it faster. There are approximately 30,000 zips to process.
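import os
import zipfile as zp

# Assumption: `mode` is not defined in the posted snippet; it is presumably
# the compression setting chosen elsewhere in the original script.
mode = zp.ZIP_DEFLATED

def fjpeg(file):
    # Map a child zip name such as "1.zip" to the jpeg inside it, "1.jpg".
    base = os.path.basename(file)
    jp = base[:-4] + ".jpg"
    return jp

def process(bundle):
    z1 = zp.ZipFile(bundle, 'r')
    for z1file in z1.namelist():
        if z1file[-4:] == '.zip':
            # Extract the child zip to "tmp", then its jpeg to the working directory.
            z2 = zp.ZipFile(z1.extract(z1file, "tmp"), 'r')
            z3 = os.path.basename(z2.extract(fjpeg(z1file)))
            process_path = "processed" + os.path.sep + os.path.basename(z1file)
            with zp.ZipFile(process_path, 'w', mode) as final:
                final.write(z3)
            z2.close()
            # Clean up the temporary files.
            os.unlink(os.path.join("tmp", z1file))
            os.unlink(z3)
        else:
            continue
    z1.close()
Solution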
It is not necessary to create temporary files on disk, as zipfile.ZipFile can work in-memory.
- Use a cStringIO.StringIO instance to hold a zip file in memory.
- Use ZipFile.read to read a jpeg file into a str variable.
- Use ZipFile.writestr to write the jpeg back.
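The in-memory steps can be sketched roughly as follows. This is a minimal sketch, not a drop-in replacement: it assumes Python 3, where io.BytesIO takes the place of cStringIO.StringIO and ZipFile.read returns bytes rather than str. The function name repack_jpegs and its parameters are hypothetical, and it keeps the question's assumption that each child zip N.zip holds a jpeg named N.jpg.

```python
import io
import os
import zipfile


def repack_jpegs(bundle_path, out_dir, compression=zipfile.ZIP_DEFLATED):
    """Rewrite each child zip in the bundle so that it holds only its jpeg.

    Hypothetical helper: assumes child "N.zip" contains a member "N.jpg".
    """
    os.makedirs(out_dir, exist_ok=True)
    written = []
    with zipfile.ZipFile(bundle_path) as bundle:
        for name in bundle.namelist():
            if not name.lower().endswith(".zip"):
                continue
            base = os.path.basename(name)
            jpeg_name = base[:-4] + ".jpg"
            # Hold the child zip in memory instead of extracting it to "tmp".
            child_bytes = io.BytesIO(bundle.read(name))
            with zipfile.ZipFile(child_bytes) as child:
                # Read the jpeg into a bytes object; nothing is written to disk.
                jpeg_data = child.read(jpeg_name)
            out_path = os.path.join(out_dir, base)
            with zipfile.ZipFile(out_path, "w", compression) as final:
                # Write the jpeg back directly from memory.
                final.writestr(jpeg_name, jpeg_data)
            written.append(out_path)
    return written
```

Calling repack_jpegs("abcd.zip", "processed") would then write processed/1.zip, processed/2.zip, and so on, with the only disk writes being the final archives.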
This Stack Overflow question may be useful to you: Unzip nested zip files in python. As mentioned, decompressing zip files requires random access to the archive. If the "bundle" zip stores its contents uncompressed (which is an option), in theory it should be possible to have random access into the files.
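As a hedged illustration of the random-access point: on Python 3.7 or later, the object returned by ZipFile.open is seekable when the outer archive is, so a child zip can be opened in place without first reading it whole. The helper below is a sketch under that assumption; the name read_from_child is hypothetical. It also reports whether the child was stored uncompressed in the bundle, which is the case where seeking within it should be cheapest.

```python
import zipfile


def read_from_child(bundle_path, child_name, member_name):
    """Read one member of a nested zip without extracting the child first.

    Hypothetical helper; assumes Python 3.7+ so that the file object
    returned by ZipFile.open supports seek().
    Returns (stored, data), where `stored` is True when the child zip is
    kept uncompressed inside the bundle.
    """
    with zipfile.ZipFile(bundle_path) as bundle:
        info = bundle.getinfo(child_name)
        stored = info.compress_type == zipfile.ZIP_STORED
        # Open the child as a file-like object and hand it to ZipFile,
        # which seeks within it to locate the central directory.
        with bundle.open(info) as child_file:
            with zipfile.ZipFile(child_file) as child:
                return stored, child.read(member_name)
```

If the child entries are compressed, this should still work, but seeking backwards within a compressed entry may mean decompressing it again from the start, so holding the child in memory can be the faster choice.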
Context
StackExchange Code Review Q#73683, answer score: 2