Bash script that lowercases files

Submitted by: @import:stackexchange-codereview

Problem

I have the following bash script that:

  • Finds all files with a .cfc or .cfm extension and renames them to lowercase
  • Stores the relative paths of those files (filenames.txt)
  • Strips each path down to the bare name, without directory or extension (files.txt)
  • Loops through the 1500 files, looking for references to any of the other 1500 files and converting those references to lowercase

#!/bin/bash
# Search for references to JS function is all .cfm and .cfc functions
# Prompt to make sure
while true; do
    read -p "All .cfc and .cfm files in the theradoc/ directory and lower directories will be converted. Do you wish to continue? (y/n)" yn
case $yn in
    [Yy]* ) make install; break;;
    [Nn]* ) exit;;
    * ) echo "Please answer yes or no.";;
esac
done
echo "Renaming files..."
for f in `find theradoc/ -d -name '*.cfc'`;
    do mv -v $f `echo $f | tr '[A-Z]' '[a-z]'`
done
for f in `find theradoc/ -d -name '*.cfm'`;
    do mv -v $f `echo $f | tr '[A-Z]' '[a-z]'`
done
echo "Indexing file names..."
find theradoc/ -d -name '*.cfc' > filenames.txt
find theradoc/ -d -name '*.cfm' >> filenames.txt
echo "Editing file names..."
sed 's/theradoc.*\///g' filenames.txt > tmp.txt
sed 's/\.cf.*//g' tmp.txt > files.txt
rm tmp.txt
echo "Searching all files..."
a=($(wc filenames.txt))
lines=${a[0]}
count=0
while read fn; do
    echo "$fn | $count/$lines finished..."
    while read f; do
        perl -pi -e "s/$f/$f/gi" "$fn"
    done < files.txt
    count=$((count+1))
done < filenames.txt


Runtime: 4 hours

Hardware: MacBook 16GB RAM

I would like to reduce the runtime, since the script may need to run on other systems with the same files.

Solution

In my testing, constructing one Perl script and running it once per file is much faster (0.5s versus 3.6s) than starting a new Perl instance for each replacement:

while read f; do
    echo "s/$f/$f/gi;"
done < files.txt > s.pl

while read fn; do
    perl -pi s.pl "$fn"
    echo "$fn | $count/$lines finished..."
    count=$((count+1))
done < filenames.txt

rm s.pl


Rewriting the whole thing in Perl seems even faster (0.05s):

#! /usr/bin/perl
use warnings;
use strict;

use File::Find;

my $dir = 'theradoc2';

my %change;

find(sub {
    return unless -f;
    undef $change{$_};
    rename $_, lc $_;
}, $dir);

my $regex = join '|',
            map quotemeta,
            sort { length $b <=> length $a }
            keys %change;
find(sub {
    return unless -f;
    my $file = $_;
    open my $IN, '<', $file or die $!;
    open my $OUT, '>', "$file.new" or die $!;
    while (<$IN>) {
        s/($regex)/\L$1/g;
        print {$OUT} $_;
    }
    close $OUT or die $!;
    unlink $file or die $!;
    rename "$file.new", $file or die $!;
}, $dir);

Code Snippets

while read f; do
    echo "s/$f/$f/gi;" 
done < files.txt > s.pl

while read fn; do
    perl -pi s.pl "$fn"
    echo "$fn | $count/$lines finished..."
    count=$((count+1))
done < filenames.txt

rm s.pl

#! /usr/bin/perl
use warnings;
use strict;

use File::Find;

my $dir = 'theradoc2';

my %change;

find(sub {
    return unless -f;
    undef $change{$_};
    rename $_, lc $_;
}, $dir);

my $regex = join '|',
            map quotemeta,
            sort { length $b <=> length $a }
            keys %change;
find(sub {
    return unless -f;
    my $file = $_;
    open my $IN, '<', $file or die $!;
    open my $OUT, '>', "$file.new" or die $!;
    while (<$IN>) {
        s/($regex)/\L$1/g;
        print {$OUT} $_;
    }
    close $OUT or die $!;
    unlink $file or die $!;
    rename "$file.new", $file or die $!;
}, $dir);

Context

StackExchange Code Review Q#134684, answer score: 11
