patternMinor
Deleting most recent files by parsing filename
Problem
I have hundreds of .mp3 files in a single directory of the same naming format,
title_YYYY-MM-DD.mp3, with maybe 30 different titles. Here is an example with two different titles:
vision_am_2015-08-04.mp3
vision_am_2015-08-03.mp3
vision_am_2015-07-31.mp3
vision_am_2015-07-30.mp3
lum_pro_2015-08-04.mp3
lum_pro_2015-08-03.mp3
lum_pro_2015-08-01.mp3
lum_pro_2015-07-31.mp3
lum_pro_2015-07-30.mp3
lum_pro_2015-07-29.mp3
lum_pro_2015-07-28.mp3
lum_pro_2015-07-27.mp3
I need to keep the X most recent files for each title. Since the date format is YYYY-MM-DD, I figured that after building a data structure for the files, I can sort them in descending order, iterate through them, and safely delete every file after the Xth one.
Here is my idea:
my $num_to_keep = 2; # or get from @ARGV
$num_to_keep = $num_to_keep - 1; # last zero-based index to keep
my $dir = "/home/mp3files";
opendir my $DH, $dir or die "$! not open";
my $dateRE = qr/\d{4}-\d{2}-\d{2}/;
my $fileRE = qr/^.+_$dateRE\.mp3$/; # only mp3s
my @files = sort grep {/$fileRE/ && -f "$dir/$_"} readdir($DH);
closedir $DH; # directory handles are closed with closedir, not close
my %hash = ();
for my $file (reverse @files) { # reversed sort order: newest first per title
    my ($fname) = $file =~ m/(.*)?_$dateRE/;
    push(@{ $hash{$fname} }, $file);
}
for my $fname (sort keys %hash) {
    my @files = @{$hash{$fname}};
    print "\n\n\nFILE: $fname\n";
    for my $i (0 .. $#files) {
        if ($i > $num_to_keep) {
            unlink "$dir/$files[$i]";
        }else{
            print "\t\t\t\tI will keep this file $files[$i]\n";
        }
    }
}
This is working as I expected, but since I am using this to delete large numbers of files regularly, I would like an expert take on this. I do not want to accidentally delete the wrong files. Plus, I am interested in any general improvements or more elegant solutions.
Solution
You can reduce the number of loops, sorts, and matches, so this should perform faster:
my $num_to_keep = 2;
my $dir = "/home/mp3files";
opendir my $DH, $dir or die "$! $dir";
# only mp3s; captures the title and the date
my $fileRE = qr/(.+) _ (\d{4}-\d{2}-\d{2}) \.mp3$/x;
my %count;
# files to delete
my @files = map {
    # keep the first $num_to_keep files seen for each title, return the rest
    my $basename = $_->[1];
    (++$count{$basename} > $num_to_keep) ? $_->[0] : ();
}
sort {
    $b->[2] cmp $a->[2] # sort descending by date
}
map {
    # build [full path, title, date] for every matching plain file
    my @match = /$fileRE/;
    (@match && -f "$dir/$_") ? ["$dir/$_", @match] : ();
}
readdir($DH);
closedir $DH; # directory handles are closed with closedir, not close
unlink(@files);
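Since the main worry is deleting the wrong files, it may help to separate the selection step from the deletion step so the selection can be checked on plain filename lists first. The following is only a sketch under that assumption: `files_to_delete` is a hypothetical helper, not part of the answer above, and it works on bare filenames without touching the filesystem.

```perl
use strict;
use warnings;

# Hypothetical helper (not from the original answer): given a keep count and
# a list of bare filenames, return the names that fall outside the newest
# $num_to_keep files of each title. Nothing is deleted here.
sub files_to_delete {
    my ($num_to_keep, @names) = @_;
    die "keep count must be a non-negative integer\n"
        unless defined $num_to_keep && $num_to_keep =~ /^\d+$/;

    my $fileRE = qr/^(.+)_(\d{4}-\d{2}-\d{2})\.mp3$/;

    # Group the dates by title; names that do not match are ignored.
    my %dates_for;
    for my $name (@names) {
        my ($title, $date) = $name =~ $fileRE or next;
        push @{ $dates_for{$title} }, $date;
    }

    my @doomed;
    for my $title (sort keys %dates_for) {
        # YYYY-MM-DD sorts chronologically as a plain string; newest first.
        my @newest_first = sort { $b cmp $a } @{ $dates_for{$title} };
        # Drop the entries to keep; whatever remains is surplus.
        splice @newest_first, 0, $num_to_keep;
        push @doomed, map { "${title}_$_.mp3" } @newest_first;
    }
    return @doomed;
}

# Dry run on sample names taken from the question.
my @sample = qw(
    vision_am_2015-08-04.mp3 vision_am_2015-08-03.mp3 vision_am_2015-07-31.mp3
    lum_pro_2015-08-04.mp3 lum_pro_2015-08-03.mp3 notes.txt
);
print "would delete: $_\n" for files_to_delete(2, @sample);
# prints: would delete: vision_am_2015-07-31.mp3
```

Once the printed list looks right, the same list can be prefixed with the directory and passed to `unlink`, for example `unlink map { "$dir/$_" } files_to_delete($num_to_keep, @names)`. Comparing the return value of `unlink` with the number of files passed in shows whether any deletion failed.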
Context
StackExchange Code Review Q#99049, answer score: 4