patternpythonMinor
Python script for automatically extracting an archive file in a subfolder
Viewed 0 times
scriptfileautomaticallysubfolderforpythonextractingarchive
Problem
A Python script for automatically extracting an archive file in a subfolder:
```
#!/usr/bin/env python3
// sub7z.py
import subprocess
import re
import os, sys, pathlib
class Counter:
counter = 0
def getCount(self):
self.counter = self.counter + 1
return self.counter
class ZipFile:
pattern1 = re.compile(r" +Date +Time +Attr +Size +Compressed +Name\n-{4,}[- ]\n(.?)\n-{4,}[- ]*", re.DOTALL)
pattern2 = re.compile(r"\d{4}-\d{2}-\d{2} +\d{2}:\d{2}:\d{2} +\.+ +\d+ +\d+ +(.*?)\n")
def __init__(self, pFilePath):
self.filePath = os.path.abspath(pFilePath)
if not os.path.isfile(self.filePath):
raise Exception("You must give me a file as input!")
self.fileParent = os.path.dirname(self.filePath)
self.fileStem, self.fileExt = os.path.splitext(self.filePath)
self.zipItems = ""
self.withSubfolder = True
self.zipItemsFrom7z()
self.checkSubfolder()
def zipItemsFrom7z(self):
p = subprocess.Popen(["7z", "l", self.filePath], stdout=subprocess.PIPE, stderr=subprocess.PIPE)
self.zipItems, err = p.communicate()
if err:
raise Exception("Failed to fetch zip file info through 7z.")
if not self.zipItems:
raise Exception("Didn't get any output from 7z, check that 7z is working properly.")
self.zipItems = self.zipItems.decode()
self.zipItems = self.pattern1.findall(self.zipItems)[0]
self.zipItems = self.pattern2.findall(self.zipItems)
def checkSubfolder(self):
dirnames = [pathlib.Path(i).parts[0] for i in self.zipItems]
if len(set(dirnames)) > 1:
self.withSubfolder = False
else:
self.subFolder = os.path.join(self.fileParent, dirnames[0])
def stripPathSep(self, folder):
while folder.endswith(os.sep):
folder = folder[:-1]
return folder
def makeNumberedFolder(self, folder):
folder = self.stripPathSep
```
#!/usr/bin/env python3
// sub7z.py
import subprocess
import re
import os, sys, pathlib
class Counter:
counter = 0
def getCount(self):
self.counter = self.counter + 1
return self.counter
class ZipFile:
pattern1 = re.compile(r" +Date +Time +Attr +Size +Compressed +Name\n-{4,}[- ]\n(.?)\n-{4,}[- ]*", re.DOTALL)
pattern2 = re.compile(r"\d{4}-\d{2}-\d{2} +\d{2}:\d{2}:\d{2} +\.+ +\d+ +\d+ +(.*?)\n")
def __init__(self, pFilePath):
self.filePath = os.path.abspath(pFilePath)
if not os.path.isfile(self.filePath):
raise Exception("You must give me a file as input!")
self.fileParent = os.path.dirname(self.filePath)
self.fileStem, self.fileExt = os.path.splitext(self.filePath)
self.zipItems = ""
self.withSubfolder = True
self.zipItemsFrom7z()
self.checkSubfolder()
def zipItemsFrom7z(self):
p = subprocess.Popen(["7z", "l", self.filePath], stdout=subprocess.PIPE, stderr=subprocess.PIPE)
self.zipItems, err = p.communicate()
if err:
raise Exception("Failed to fetch zip file info through 7z.")
if not self.zipItems:
raise Exception("Didn't get any output from 7z, check that 7z is working properly.")
self.zipItems = self.zipItems.decode()
self.zipItems = self.pattern1.findall(self.zipItems)[0]
self.zipItems = self.pattern2.findall(self.zipItems)
def checkSubfolder(self):
dirnames = [pathlib.Path(i).parts[0] for i in self.zipItems]
if len(set(dirnames)) > 1:
self.withSubfolder = False
else:
self.subFolder = os.path.join(self.fileParent, dirnames[0])
def stripPathSep(self, folder):
while folder.endswith(os.sep):
folder = folder[:-1]
return folder
def makeNumberedFolder(self, folder):
folder = self.stripPathSep
Solution
You should follow PEP8, the official style guide of Python.
The
You declared class variable
but then you use the instance variable
While this seems to work,
it's confusing and can have unintended effects.
For example, after you do
all future counter instances you create will start counting from 3.
The purpose of class variables is to share between multiple instances.
Probably this is not your intention.
In which case you should initialize the instance variable in the constructor:
Also note that modern classes should extend
The
The "get" in the name suggests it's a getter,
so it's unexpected that it changes state (increments the count).
It would be better to rename to
Finally, when incrementing a count, it's better to use
The
Many methods rely heavily on side effects,
expecting some instance variables to be set by other methods.
This is fragile design,
making it difficult to maintain the class.
For example,
done by calling
Whenever possible, it's best to have the required input as method parameters.
Making methods share variables makes the design fragile,
for two reasons:
You can fix this by making
That way no other method can see or mess with
and the relationship between these methods becomes perfectly clear.
A similar issue is
The problem is that this instance variable is not mentioned in the constructor,
which makes the class harder to maintain.
It's best to see the initialization of all the fields in the constructor,
to know all the elements it requires for functioning correctly.
In fact you should never let other methods initialize instance variables.
These methods can be called from anywhere,
not only from the constructor,
and an accidental misuse can corrupt the object.
One way to fix it is to change these methods to return values instead of setting instance variables.
Another way is to make them inner functions inside the constructor itself.
I suggest to review the entire class,
while considering these concepts carefully.
The
Counter classYou declared class variable
counter,but then you use the instance variable
self.counter for increments.While this seems to work,
it's confusing and can have unintended effects.
For example, after you do
Counter.counter = 3,all future counter instances you create will start counting from 3.
The purpose of class variables is to share between multiple instances.
Probably this is not your intention.
In which case you should initialize the instance variable in the constructor:
class Counter(object):
def __init__(self):
self.counter = 0Also note that modern classes should extend
object, like I did above.The
getCount method is not great.The "get" in the name suggests it's a getter,
so it's unexpected that it changes state (increments the count).
It would be better to rename to
increment_and_get.Finally, when incrementing a count, it's better to use
x += 1 instead of x = x + 1.The
ZipFile classMany methods rely heavily on side effects,
expecting some instance variables to be set by other methods.
This is fragile design,
making it difficult to maintain the class.
For example,
checkSubfolder expects zipItems to be set already,done by calling
zipItemsFrom7z first.Whenever possible, it's best to have the required input as method parameters.
Making methods share variables makes the design fragile,
for two reasons:
- The dependence between the methods is not obvious
- The variable is exposed to other methods too, and might get modified by accident
You can fix this by making
zipItemsFrom7z return zipItems instead of storing in the instance, and pass this list to checkSubfolder.That way no other method can see or mess with
zipItems,and the relationship between these methods becomes perfectly clear.
A similar issue is
checkSubfolder setting self.subFolder.The problem is that this instance variable is not mentioned in the constructor,
which makes the class harder to maintain.
It's best to see the initialization of all the fields in the constructor,
to know all the elements it requires for functioning correctly.
In fact you should never let other methods initialize instance variables.
These methods can be called from anywhere,
not only from the constructor,
and an accidental misuse can corrupt the object.
One way to fix it is to change these methods to return values instead of setting instance variables.
Another way is to make them inner functions inside the constructor itself.
I suggest to review the entire class,
while considering these concepts carefully.
Code Snippets
class Counter(object):
def __init__(self):
self.counter = 0Context
StackExchange Code Review Q#69258, answer score: 4
Revisions (0)
No revisions yet.