patternpythonMinor
n largest files in a directory
largestdirectoryfiles
Problem
This is a script I wrote to find the n biggest files in a given directory (recursively):

    import heapq
    import os, os.path
    import sys
    import operator

    def file_sizes(directory):
        for path, _, filenames in os.walk(directory):
            for name in filenames:
                full_path = os.path.join(path, name)
                yield full_path, os.path.getsize(full_path)

    num_files, directory = sys.argv[1:]
    num_files = int(num_files)
    big_files = heapq.nlargest(
        num_files, file_sizes(directory), key=operator.itemgetter(1))
    print(*("{}\t{:>}".format(*b) for b in big_files))

It can be run as, eg:

    bigfiles.py 5 ~

Ignoring the complete lack of error handling, is there any obvious way to make this clearer, or at least more succinct? I am thinking about, eg, using namedtuples in file_sizes, but is there also any way to implement file_sizes in terms of a generator expression? (I'm thinking probably not without having two calls to os.path, but I'd love to be proven wrong :-)

Solution
You could replace your function with:

    file_names = (os.path.join(path, name) for path, _, filenames in os.walk(directory)
                  for name in filenames)
    file_sizes = ((name, os.path.getsize(name)) for name in file_names)

However, I'm not sure that really helps the clarity.
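
For what it's worth, here is a minimal sketch of how the two generator expressions could be wrapped back into a function, combined with the namedtuple idea from the question. The record name `FileSize` and its field names are my own illustrative assumptions, not something from the original script.

```python
import os
from collections import namedtuple

# Hypothetical record type; the name and fields are illustrative only.
FileSize = namedtuple("FileSize", ["path", "size"])


def file_sizes(directory):
    """Return a lazy iterable of FileSize(path, size) for files under directory."""
    # First generator: full path of every file that os.walk finds.
    file_names = (
        os.path.join(path, name)
        for path, _, filenames in os.walk(directory)
        for name in filenames
    )
    # Second generator: pair each path with its size, so os.path.join
    # is only evaluated once per file.
    return (FileSize(name, os.path.getsize(name)) for name in file_names)
```

With this, the sort key could be written as `operator.attrgetter("size")` instead of `operator.itemgetter(1)`, which may read a little more clearly.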
I found that doing this:

    big_files = heapq.nlargest(
        num_files, file_names, key=os.path.getsize)
    print(*("{}\t{:>}".format(b, os.path.getsize(b)) for b in big_files))

actually runs slightly quicker than your version.
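
To check that the two approaches agree, both can be put side by side as functions. This is only a sketch: the function names `iter_files`, `largest_by_pairs` and `largest_by_paths` are made up for illustration, and any speed difference will depend on the filesystem and on caching.

```python
import heapq
import operator
import os


def iter_files(directory):
    """Yield the full path of every file under directory (recursively)."""
    for path, _, filenames in os.walk(directory):
        for name in filenames:
            yield os.path.join(path, name)


def largest_by_pairs(directory, n):
    """Approach from the question: rank (path, size) tuples by size."""
    pairs = ((p, os.path.getsize(p)) for p in iter_files(directory))
    return heapq.nlargest(n, pairs, key=operator.itemgetter(1))


def largest_by_paths(directory, n):
    """Approach from the answer: rank bare paths, using getsize as the key."""
    paths = heapq.nlargest(n, iter_files(directory), key=os.path.getsize)
    # The size is looked up a second time, but only for the n winners.
    return [(p, os.path.getsize(p)) for p in paths]
```

Both functions return a list of `(path, size)` tuples, largest first. Something like `timeit.timeit(lambda: largest_by_paths(".", 5), number=10)` could be used to compare them on a real directory tree.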
Context
StackExchange Code Review Q#8958, answer score: 4