Number of files with specific file size ranges

Submitted by: @import:stackexchange-codereview··

Problem

I am trying to write a script that traverses a directory and its subdirectories and lists the number of files in each size range.

For example, 0 KB-1 KB: 3, 1 KB-4 KB: 4, 4 KB-16 KB: 4, 16 KB-64 KB: 11. The ranges go on in multiples of 4.

I am able to list the files, show their sizes in a human-readable format, and count the files in a size group. But I feel my code is very messy and nowhere near the standard.

import os
suffixes = ['B', 'KB', 'MB', 'GB', 'TB', 'PB']
route = raw_input('Enter a location')
def human_Readable(nbytes):
    if nbytes == 0: return '0 B'
    i = 0
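    while nbytes >= 1024 and i < len(suffixes)-1:
        nbytes /= 1024.
        i += 1
    f = ('%.2f' % nbytes).rstrip('0').rstrip('.')
    return '%s %s' % (f, suffixes[i])

def file_Dist(path, start, end):
    counter = 0
    for path, subdir, files in os.walk(path):
        for r in files:
            if os.path.getsize(os.path.join(path,r)) > start and os.path.getsize(os.path.join(path,r)) < end: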
                counter += 1
        print "Number of files greater than %s less than %s:" %(human_Readable(start), human_Readable(end)),  counter
file_Dist(route, 0, 1024)
file_Dist(route,1024,4095)
file_Dist(route, 4096, 16383)
file_Dist(route, 16384, 65535)
file_Dist(route, 65536, 262143)
file_Dist(route, 262144, 1048576)
file_Dist(route, 1048577, 4194304)
file_Dist(route, 4194305, 16777216)

Solution

-
suffixes = ['B', 'KB', 'MB', 'GB', 'TB', 'PB']: you never change the value of suffixes, so it should be a constant. The standard naming for constants in Python is ALL_UPPERCASE, so rename it to SUFFIXES.

-
You should name variables and functions in snake_case: human_Readable becomes human_readable and file_Dist becomes file_dist.

-
You wrote for r in files. Details count: the conventional one-letter identifier for a file is f, so use that.

-
Look at f = ('%.2f' % nbytes).rstrip('0').rstrip('.'): here f is a meaningless one-letter name. You should use a longer and more meaningful name.

-
In the while condition you check that i stays within the bounds of suffixes. I first suggested removing that check, on the grounds that no user has a file bigger than one petabyte (about 10**15 bytes), but it is very good that you do it: code that crashes on an unexpected condition should be avoided with error checking like this. (Thanks to @Veedrac for correcting me.)

f = ('%.2f' % nbytes).rstrip('0').rstrip('.')
return '%s %s' % (f, suffixes[i])


You are a C programmer, aren't you? :) Using "%s" string formatting in Python is considered old-fashioned; prefer str.format() instead.
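As a rough illustration of the points so far (constant naming, snake_case, a descriptive name instead of f, the bounds check, and str.format()), the helper could look something like the sketch below. It is written for Python 3, and the exact bound len(SUFFIXES) - 1 and the name index are assumptions, not taken from the original script.

```python
SUFFIXES = ['B', 'KB', 'MB', 'GB', 'TB', 'PB']


def human_readable(nbytes):
    """Return nbytes as a short string such as '1.5 KB'."""
    if nbytes == 0:
        return '0 B'
    index = 0
    # Stop at the last suffix so SUFFIXES[index] can never raise IndexError.
    while nbytes >= 1024 and index < len(SUFFIXES) - 1:
        nbytes /= 1024.0
        index += 1
    # Format with two decimals, then drop trailing zeros and a dangling dot.
    number = '{:.2f}'.format(nbytes).rstrip('0').rstrip('.')
    return '{} {}'.format(number, SUFFIXES[index])
```

With the bounds check in place, a value larger than the last suffix is simply reported in PB (for example, 1024**6 bytes comes out as "1024 PB") rather than crashing.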

print "Number of files greater than %s less than %s:" %(human_Readable(start), human_Readable(end)),  counter


You are printing from inside a function, which is not nice: a function should return values, not print them, so that the caller decides what to do with the result.
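A minimal sketch of a counting function that returns its result instead of printing it is shown below. The name count_files_in_range is hypothetical, and the half-open range (start <= size < end) is an assumption that differs from the strict comparisons in the original script; it is chosen here so that adjacent ranges neither overlap nor leave gaps.

```python
import os


def count_files_in_range(path, start, end):
    """Count files under path whose size satisfies start <= size < end."""
    counter = 0
    for root, _dirs, files in os.walk(path):
        for name in files:
            # Read the size once instead of calling getsize twice.
            size = os.path.getsize(os.path.join(root, name))
            if start <= size < end:
                counter += 1
    return counter
```

The caller can then print, log, or test the returned number as needed.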

It is common to take the parameters from the command line, so:

route = raw_input('Enter a location')


should become:

import sys
route = sys.argv[1]


This also makes the code compatible with Python 3, provided you add parentheses around the arguments of print.
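Reading sys.argv[1] directly raises an IndexError when the script is started without an argument. One possible way to handle that, sketched here with a hypothetical helper named get_route, is to exit with a short usage message instead:

```python
import sys


def get_route(argv):
    """Return the directory given as the first command-line argument."""
    if len(argv) < 2:
        # No argument supplied: explain usage instead of raising IndexError.
        raise SystemExit('usage: {} DIRECTORY'.format(argv[0]))
    return argv[1]
```

In the script this would be called as route = get_route(sys.argv).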

You are repeating yourself so much:

file_Dist(route, 0, 1024)
file_Dist(route,1024,4095)
file_Dist(route, 4096, 16383)
file_Dist(route, 16384, 65535)
file_Dist(route, 65536, 262143)
file_Dist(route, 262144, 1048576)
file_Dist(route, 1048577, 4194304)
file_Dist(route, 4194305, 16777216)


It would be better to use a loop:

file_Dist(route, 0, 1024)
for start in [1024, 4096, 16384, 65536, 262144, 1048576, 4194304]:
    end = start * 4
    file_Dist(route, start, end)
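Since the boundaries follow a fixed rule (each end is four times the previous one), the list itself could also be generated rather than typed out. The sketch below assumes a hypothetical helper named size_ranges with illustrative default parameters; it only builds the (start, end) pairs and leaves the counting to whatever function is used.

```python
def size_ranges(first_end=1024, factor=4, count=8):
    """Return (start, end) pairs: 0..first_end, then each end is factor times the last."""
    ranges = []
    start, end = 0, first_end
    for _ in range(count):
        ranges.append((start, end))
        # The next range begins where this one ends.
        start, end = end, end * factor
    return ranges
```

With the defaults this yields eight ranges, from (0, 1024) up to (4194304, 16777216), which matches the eight calls in the original script once the off-by-one boundaries are aligned.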


You may want to import this script from another module. Wrapping the calls in a function and guarding it with if __name__ == "__main__" runs them only when this script is the main file, not when it is imported:

def test():
    file_Dist(route, 0, 1024)
    for start in [1024, 4096, 16384, 65536, 262144, 1048576, 4194304]:
        end = start * 4
        file_Dist(route, start, end)

if __name__ == "__main__":
    test()
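Putting the suggestions together, an importable Python 3 version might be organised roughly as follows. This is only a sketch: the names BOUNDARIES, count_files and report are hypothetical, and the half-open ranges are an assumption rather than the behaviour of the original script.

```python
import os
import sys

# Range boundaries in bytes: 0, 1 KB, 4 KB, ... each 4 times the previous one.
BOUNDARIES = [0] + [1024 * 4 ** k for k in range(8)]


def count_files(path, start, end):
    """Count files under path with start <= size < end (half-open range)."""
    total = 0
    for root, _dirs, files in os.walk(path):
        for name in files:
            if start <= os.path.getsize(os.path.join(root, name)) < end:
                total += 1
    return total


def report(path):
    """Return one (start, end, count) tuple per size range."""
    pairs = zip(BOUNDARIES, BOUNDARIES[1:])
    return [(start, end, count_files(path, start, end)) for start, end in pairs]


if __name__ == "__main__":
    # Printing happens only here, so importing the module has no side effects.
    if len(sys.argv) == 2:
        for start, end, count in report(sys.argv[1]):
            print('{} <= size < {}: {}'.format(start, end, count))
    else:
        print('usage: {} DIRECTORY'.format(sys.argv[0]))
```

This version walks the directory tree once per range, as the original does; for large trees, collecting all sizes in a single walk and bucketing them afterwards would be a reasonable next step.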

Context

StackExchange Code Review Q#77008, answer score: 3
