Fine tuning of an informative replacement for rm

Submitted by: @import:stackexchange-codereview

Problem

I want to replace rm with a more informative variant: I would like to see which files will be deleted, along with their sizes, and I would like this information to use the same coloring as ls. As an example (please imagine the coloring):

# remove -r foo/ bar/*
4K   bar/file1
1.2M foo/
6.1M bar/file2
remove -r 3 files, 7.3M [yn]? _


I built on suggestions found in another question, and now have a bash script that I am beginning to like. I have been using it for a few days without noticing any obvious errors. However, I would appreciate any help with making sure that the script will not misbehave in cases I did not foresee.

Here is my code for the coloring:

du_colored (){
    # read ls --color output into ls_colored_array
    # use \13 as a trick to handle names with spaces 
    read -d '\n' -r -a ls_colored_array <<< $(ls -Ad --color "$@" | tr " " "\13")

    if [[ ${#ls_colored_array[@]} = 0 ]]; then return 1; fi

    # - loop over the array and issue du -sh for every element (without coloring)
    # - exchange du's output with ls's
    # - finally sort the output
    for i in "${ls_colored_array[@]}"; do
        i=`echo $i | tr "\13" " "`
        printf '%s' "${i}" | sed -r "s/\x1B\[([0-9]{1,2}(;[0-9]{1,2})?)?[m|K]//g" | \
            xargs -n1 -0 du -sh | awk -v i="$i" '{ printf "%-4s ", $1; print i }'
    done | sort -h
}


and the removal script:

```
# Separate options to rm from 'real' arguments
for i in "$@"; do
    case $i in
        -*) options+=( "$i" ) ;;
        *) toRemove+=( "$i" ) ;;
    esac
done

# Print file list
# Abort if no files match
du_colored "${toRemove[@]}"
if [ "_$?" != "_0" ]; then exit 1; fi

# Print summary
count=$(find "${toRemove[@]}" 2>/dev/null | wc -l 2>/dev/null)
size=$(du -sch "${toRemove[@]}" 2>/dev/null | tail -1 | tr '\t' ' ')
plural="s"
if [ $count -eq 1 ]; then plural=""; fi
printf "remove ${options[@]} ($count file$plural, $size) [ny]? " ; read
if [ "_$REPLY" = "_y" ]; then
    /bin/rm ${options[@]} ${toRemove[@]}
fi
```

Solution

Bonus question: Does ls -C help?

-C     list entries by columns


You need -r to remove directories, so do you intend to impose this check as well when directories are specified?

plural="s"
if [ $count -eq 1 ]; then plural=""; fi


That can be rewritten as

[ $count -eq 1 ] && plural="" || plural="s"
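One caveat with this idiom: `A && B || C` is not a strict if/else, because C also runs when B fails. It is safe here only because a plain assignment always succeeds. A minimal sketch of an alternative follows; `plural_suffix` is a hypothetical helper name, not part of the original script.

```bash
# plural_suffix (hypothetical helper): print "" for a count of 1, "s" otherwise.
plural_suffix() {
    local count=$1
    if [ "$count" -eq 1 ]; then
        printf ''
    else
        printf 's'
    fi
}

count=3
# prints: remove 3 files
printf 'remove %d file%s\n' "$count" "$(plural_suffix "$count")"
```

Quoting `"$count"` also keeps the test well-formed if the variable ever ends up empty.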


Is the use of _ really necessary here?

if [ "_$?" != "_0" ]; then exit 1; fi
if [ "_$REPLY" = "_y" ]; then
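The prefix is a habit from very old test implementations, which could be confused by an operand that looks like an operator (a reply of `-n`, for instance). With quoted operands, the POSIX rules for a three-argument `[ ... ]` make it unnecessary, and bash's `[[ ... ]]` does not need it either. An exit status can also be tested directly instead of comparing `$?`. A small sketch, where `confirmed` is a hypothetical helper and not something from the original script:

```bash
# confirmed (hypothetical helper): succeed only if the reply is exactly "y".
confirmed() {
    [[ $1 == y ]]
}

# Test the exit status directly rather than comparing "$?" with 0.
# A reply that looks like an operator is handled without any prefix.
if ! confirmed "-n"; then
    echo "not confirmed"
fi
```

In the script itself, the same idea would read `du_colored "${toRemove[@]}" || exit 1`.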


edit: Bonus question, round 2...

You could try this for du_colored:

du_colored() {
    du -sh "$@" | awk -F'\t' -v COL=4 \
        '{if(++c>COL){c=1;print""}("ls -Ad --color \""$2"\"")|getline entry;
        printf "%6s %s\t",$1,entry}END{print""}' | \
    column -s $'\t' -t
}

Instead of parsing ls output carefully to feed into du, I started from the output of du and then replaced each entry with its ls -Ad --color output. COL fixes the number of 'columns' we want column to pick up afterwards, which in turn determines where the newlines are printed.

Finally, I feed the output to column as a final attempt at pretty-printing. \t is used as the delimiter throughout.

It still doesn't look as nice as the default non-piped ls output, but at least it's something I suppose...

edit 2: Bonus question, round 3...

I hope this is what's required... :)

du_colored() {
    awk -F'/' 'NR==FNR{k=$0;gsub(/\x1B\[[;[:digit:]]*m/,"",k);s[k]=$0}
        NR!=FNR{l=$0;for(i=3;i … C){c=1;print""}printf "/%6s/%s\t",$1,$2}
        END{print""}' | column -s $'\t' -t)
}

- Get the output of both ls -Ad --color "$*" and du -sh "$@", and pass both into awk using bash process substitution.

  • The du -sh command is in turn piped to an 'inner' awk command, similar to my earlier suggestion, so that column can render it properly. The crucial point is that the start of each line, the size and the file/directory entry are delimited with the / character, which is the safest delimiter because a file or directory name cannot contain it (paths with nested directories can, though).

- The 'main' awk command uses / to split each line into 1 + C * 2 columns (C=4 used above). Combining the NR==FNR and NR!=FNR conditions lets us handle the two 'files' differently.

  • For the ls output, i.e. NR==FNR, we build a map from each file/directory name (obtained by stripping the color codes with gsub(/\x1B\[[;[:digit:]]*m/,"",k)) to its colored output.

  • For the du output, we then substitute the file/directory name in the appropriate fields (after trimming the trailing spaces with gsub(/[ ]+$/,"",v)) with the colored output from the map.

  • After substituting each / character back to a space character, we reprint the entire line, which now contains the colored output.

This is arguably more efficient, as ls and du are each run only once on the arguments. The lesson learned here is that column doesn't play ball with most control characters (seemingly \t is the only exception). For some reason, sorting du's output still introduces minor quirks in my own testing, so I'm leaving that out for now.
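The two-input mapping idea can be sketched in isolation. This is a simplified illustration under stated assumptions, not the code from this answer: `colorize_sizes` is a hypothetical name, it reads two files (or process substitutions), it uses tab rather than / as the delimiter, and it skips the column layout entirely.

```bash
# colorize_sizes (hypothetical helper)
#   $1: colored names, one per line (e.g. from ls -Ad --color=always)
#   $2: "size<TAB>name" lines (e.g. from du -sh)
# Prints each size followed by the colored form of its name.
colorize_sizes() {
    awk -F'\t' '
        NR == FNR {                            # first input: colored names
            plain = $0
            gsub(/\033\[[0-9;]*m/, "", plain)  # strip SGR color sequences
            colored[plain] = $0                # map plain name -> colored name
            next
        }
        {                                      # second input: du-style lines
            name = ($2 in colored) ? colored[$2] : $2
            printf "%-4s %s\n", $1, name
        }
    ' "$1" "$2"
}

# Possible usage with process substitution (assumes GNU ls and du):
#   colorize_sizes <(ls -Ad --color=always -- "$@") <(du -sh -- "$@")
```

Because the lookup is keyed on the plain name, the order in which ls and du list the entries does not matter. Names containing tabs or newlines would still break this sketch.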



Code Snippets

-C     list entries by columns

plural="s"
if [ $count -eq 1 ]; then plural=""; fi

[ $count -eq 1 ] && plural="" || plural="s"

if [ "_$?" != "_0" ]; then exit 1; fi
if [ "_$REPLY" = "_y" ]; then

du_colored() {
    du -sh "$@" | awk -F'\t' -v COL=4 \
        '{if(++c>COL){c=1;print""}("ls -Ad --color \""$2"\"")|getline entry;
        printf "%6s %s\t",$1,entry}END{print""}' | \
    column -s $'\t' -t
}

Context

StackExchange Code Review Q#96297, answer score: 3
