patternbashMinor
Directory's disk usage list
Viewed 0 times
usagedirectorydisklist
Problem
For my classes, I had to finish this task:
Directory's disk usage list
For indicated directory print a list of files and subdirectories in
descending order according to their total disk usage. Disk usage of a
simple file is equivalent to its size, for a subdirectory - a sum of
the sizes of all files in its filesystem branch (subtree). Printed
list format:
is in case of a file not being a directory an owner, otherwise, the
holder of a directory is the owner of the files having greatest total
disk usage in it. `` is a letter: d for a directory, - for a
standard file, l - for a symbolic link etc.
Below file disk usage list, print a summary:
Total disk usage of the files in directory's subtree, the list of
total disk usage of files in directory's subtree with respect to the
owners (sums over owners). CAUTION: during recursive directories
listing do not dereference symbolic links.
Here is my code:
Is it ok? Should I make any changes?
Directory's disk usage list
For indicated directory print a list of files and subdirectories in
descending order according to their total disk usage. Disk usage of a
simple file is equivalent to its size, for a subdirectory - a sum of
the sizes of all files in its filesystem branch (subtree). Printed
list format:
In the preceding print formatis in case of a file not being a directory an owner, otherwise, the
holder of a directory is the owner of the files having greatest total
disk usage in it. `` is a letter: d for a directory, - for a
standard file, l - for a symbolic link etc.
Below file disk usage list, print a summary:
Total disk usage of the files in directory's subtree, the list of
total disk usage of files in directory's subtree with respect to the
owners (sums over owners). CAUTION: during recursive directories
listing do not dereference symbolic links.
Here is my code:
#!/bin/bash
cd "$1"
ls -lA | awk 'NR != 1{
name=$8
for ( i=9; i/dev/null"
while ( find_proc | getline )
{
sum+=$1;usr_sum[$2]+=$1;
sum_all+=$1;usr_sum_all[$2]+=$1;
}
close(find_proc)
owner=""
owner_sum=-1
for ( iowner in usr_sum ) if ( usr_sum[iowner] > owner_sum ) {owner = iowner; owner_sum=usr_sum[iowner]}
print sum " " owner " d " name
}
}
END {
print "Space taken: " sum_all
for ( iowner in usr_sum_all ) print "\t user: " iowner " " usr_sum_all[iowner]
}' | sort -grIs it ok? Should I make any changes?
Solution
I just copied your code, pasted it in to a terminal, ran it, and it did exactly what it is supposed to do. Now, how to make it better?
You don't mention it in the question, but using
I certainly would not have solved this problem this way (I would have used perl).
Still, assuming your code meets the expectations for tool use (as in, you are supposed to use awk), there are some things that are off.
Single-line multi-statements
You have a number of 1-liners that you should have expanded. You already have the massive awk code-block, why mix the compact 1-liner and block statements like you have done?
The above should be:
Note, the use of block
These other statements should also be on their own lines (and the for statement should have a
Negative if
Your main if-block would be better if done as a positive check, not a negative check. your code is:
but would be more readable as:
Breathing space
Your code is suffocating since it has little breathing space. Many of your statements are compressed around the operators and other symbols. Consider lines like:
Which should be written as:
Performance
The multiple calls to
Still, it ran fast enough for me on my machine, so it is probably not a real problem.
You don't mention it in the question, but using
ls, awk, find, and sort together somehow seems unlikely to be the answer your tutor expected. You don't mention what your course is in, but I would have expected there to be a less 'cobbled together' solution, using perhaps bash only, or perl, python, or something else.I certainly would not have solved this problem this way (I would have used perl).
Still, assuming your code meets the expectations for tool use (as in, you are supposed to use awk), there are some things that are off.
Single-line multi-statements
You have a number of 1-liners that you should have expanded. You already have the massive awk code-block, why mix the compact 1-liner and block statements like you have done?
for ( iowner in usr_sum ) if ( usr_sum[iowner] > owner_sum ) {owner = iowner; owner_sum=usr_sum[iowner]}The above should be:
for ( iowner in usr_sum )
{
if ( usr_sum[iowner] > owner_sum )
{
owner = iowner
owner_sum=usr_sum[iowner]
}
}Note, the use of block
{} braces for the single if statement, and not using the semi-colon ; to separate statements on a single line.These other statements should also be on their own lines (and the for statement should have a
{} block):for ( i=9; i<=NF; i++) name=name " " $i
...
sum=0;delete usr_sum;
...
sum+=$1;usr_sum[$2]+=$1;
sum_all+=$1;usr_sum_all[$2]+=$1;Negative if
Your main if-block would be better if done as a positive check, not a negative check. your code is:
if (type!="d")
{
... do not a directory stuff
}
else
{
... do directory stuff
}but would be more readable as:
if (type == "d")
{
... do directory stuff
}
else
{
... do not a directory stuff
}Breathing space
Your code is suffocating since it has little breathing space. Many of your statements are compressed around the operators and other symbols. Consider lines like:
print $5,$3,type,name
sum_all+=$5
usr_sum_all[$3]+=$5Which should be written as:
print $5, $3, type, name
sum_all += $5
usr_sum_all[$3] += $5Performance
The multiple calls to
find may be hurting your performance. In general I can't help but think there's a better way to do this without all the nested OS calls.Still, it ran fast enough for me on my machine, so it is probably not a real problem.
Code Snippets
for ( iowner in usr_sum ) if ( usr_sum[iowner] > owner_sum ) {owner = iowner; owner_sum=usr_sum[iowner]}for ( iowner in usr_sum )
{
if ( usr_sum[iowner] > owner_sum )
{
owner = iowner
owner_sum=usr_sum[iowner]
}
}for ( i=9; i<=NF; i++) name=name " " $i
...
sum=0;delete usr_sum;
...
sum+=$1;usr_sum[$2]+=$1;
sum_all+=$1;usr_sum_all[$2]+=$1;if (type!="d")
{
... do not a directory stuff
}
else
{
... do directory stuff
}if (type == "d")
{
... do directory stuff
}
else
{
... do not a directory stuff
}Context
StackExchange Code Review Q#27603, answer score: 3
Revisions (0)
No revisions yet.