Number of files with specific file size ranges
Problem
I am trying to write a script that traverses a directory and its subdirectories and counts the files that fall in each size range.
For example: 0 KB-1 KB: 3, 1 KB-4 KB: 4, 4 KB-16 KB: 4, 16 KB-64 KB: 11. The ranges go on in multiples of 4.
I am able to get the file counts, show the sizes in human-readable format, and find the number of files in a size group. But I feel my code is very messy and nowhere near the standard.
import os

suffixes = ['B', 'KB', 'MB', 'GB', 'TB', 'PB']
route = raw_input('Enter a location')

def human_Readable(nbytes):
    if nbytes == 0: return '0 B'
    i = 0
    while nbytes >= 1024 and i < len(suffixes)-1:
        nbytes /= 1024.
        i += 1
    f = ('%.2f' % nbytes).rstrip('0').rstrip('.')
    return '%s %s' % (f, suffixes[i])

def file_Dist(path, start, end):
    counter = 0
    for path, subdir, files in os.walk(path):
        for r in files:
            if os.path.getsize(os.path.join(path,r)) > start and os.path.getsize(os.path.join(path,r)) < end:
                counter += 1
    print "Number of files greater than %s less than %s:" %(human_Readable(start), human_Readable(end)), counter

file_Dist(route, 0, 1024)
file_Dist(route,1024,4095)
file_Dist(route, 4096, 16383)
file_Dist(route, 16384, 65535)
file_Dist(route, 65536, 262143)
file_Dist(route, 262144, 1048576)
file_Dist(route, 1048577, 4194304)
file_Dist(route, 4194305, 16777216)

Solution
- suffixes = ['B', 'KB', 'MB', 'GB', 'TB', 'PB']: you never change the value of suffixes, so it should be a constant. The standard naming for constants in Python is ALL_UPPERCASE, so rename it to SUFFIXES.
- You should name variables and functions with snake_case.
- You wrote for r in files: the details count. The standard one-letter identifier for a file is f, so use that.
- Look at f = ('%.2f' % nbytes).rstrip('0').rstrip('.'): here f is a meaningless one-letter name. You should use a longer and more meaningful name.
- You are checking and i < len(suffixes)-1: it is very unlikely that the user has a file bigger than one petabyte (10**15 bytes), but it is very good that you check anyway. Code that crashes when it meets an unexpected condition should be avoided with error checking like this one. (Thanks to @Veedrac for correcting me.)
f = ('%.2f' % nbytes).rstrip('0').rstrip('.')
return '%s %s' % (f, suffixes[i])
You are a C programmer, aren't you? :) Using "%s" string formatting in Python is considered old-style and is best avoided in new code; use str.format() instead.
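As a rough illustration of the naming and formatting points above, the helper could be rewritten along these lines. This is only a sketch: the names human_readable, index and number are my own choices, not something from the original code, and it is written for Python 3.

```python
# Constant, so ALL_UPPERCASE (name follows the suggestion above).
SUFFIXES = ['B', 'KB', 'MB', 'GB', 'TB', 'PB']


def human_readable(nbytes):
    """Return nbytes as a short string such as '1.5 KB'."""
    if nbytes == 0:
        return '0 B'
    index = 0
    # Keep the bounds check so sizes beyond the last suffix do not crash.
    while nbytes >= 1024 and index < len(SUFFIXES) - 1:
        nbytes /= 1024.0
        index += 1
    # Two decimals, then drop trailing zeros and a dangling dot.
    number = '{:.2f}'.format(nbytes).rstrip('0').rstrip('.')
    return '{} {}'.format(number, SUFFIXES[index])
```

Calling human_readable(1536) should give '1.5 KB', and anything past the petabyte range is simply reported in PB.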
print "Number of files greater than %s less than %s:" %(human_Readable(start), human_Readable(end)), counter
You are printing from inside a function, which is not nice: a function should return values, not print them.
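One way to separate the counting from the printing might look like the sketch below. The name count_files_in_range is hypothetical, and the half-open range (start <= size < end) is my assumption; the original code used strict comparisons on both ends, which skips files whose size sits exactly on a boundary.

```python
import os


def count_files_in_range(root, start, end):
    """Count files under root whose size is in [start, end) bytes."""
    count = 0
    for dirpath, _dirnames, filenames in os.walk(root):
        for filename in filenames:
            # Look the size up once instead of twice per file.
            size = os.path.getsize(os.path.join(dirpath, filename))
            if start <= size < end:
                count += 1
    return count
```

The caller can then decide what to do with the number: print it, store it, or compare it in a test.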
It is common to take the parameters from the command line, so:
route = raw_input('Enter a location')
should become:
import sys
route = sys.argv[1]
This also makes the code compatible with Python 3, provided you add parentheses around the arguments of print.
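Reading sys.argv[1] directly raises an IndexError when no argument is given, so a small wrapper may be worth having. The sketch below is one possible shape; get_route and the usage message are invented for illustration.

```python
def get_route(argv):
    """Return the directory given on the command line.

    argv is expected to look like sys.argv: the script name followed
    by exactly one argument.
    """
    if len(argv) != 2:
        # SystemExit with a string prints it to stderr and exits with status 1.
        raise SystemExit('usage: {} DIRECTORY'.format(argv[0]))
    return argv[1]
```

In the script itself this would be used as route = get_route(sys.argv), after import sys.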
You are repeating yourself so much:
file_Dist(route, 0, 1024)
file_Dist(route,1024,4095)
file_Dist(route, 4096, 16383)
file_Dist(route, 16384, 65535)
file_Dist(route, 65536, 262143)
file_Dist(route, 262144, 1048576)
file_Dist(route, 1048577, 4194304)
file_Dist(route, 4194305, 16777216)
It would be better to use a loop:
file_Dist(route, 0, 1024)
for start in [1024, 4096, 16384, 65536, 262144, 1048576, 4194304]:
    end = start * 4
    file_Dist(route, start, end)
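Since every boundary is four times the previous one, the list of starting points does not have to be typed out by hand either. A minimal sketch, assuming the first range is 0 to 1024 bytes and each later range starts where the previous one ended (size_ranges and its parameters are hypothetical names):

```python
def size_ranges(first_end=1024, factor=4, count=8):
    """Return count (start, end) pairs: (0, first_end), then each
    range starts at the previous end and ends factor times later."""
    ranges = []
    start, end = 0, first_end
    for _ in range(count):
        ranges.append((start, end))
        start, end = end, end * factor
    return ranges
```

With the defaults this produces eight ranges ending at 16777216, and because each range starts exactly where the previous one ended, no file size falls into a gap between two ranges.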
You may want to import this script from another one. Using if __name__ == "__main__" will run the tests only if this script is the main file:
def test():
    file_Dist(route, 0, 1024)
    for start in [1024, 4096, 16384, 65536, 262144, 1048576, 4194304]:
        end = start * 4
        file_Dist(route, start, end)

if __name__ == "__main__":
    test()
Context
StackExchange Code Review Q#77008, answer score: 3