patternMinor
Find common preamble of a list of strings
Viewed 0 times
preamblefindliststringscommon
Problem
Think of a set of text lines starting with a common string, e.g. indented code.
Note that after the first invocation chances are high that
I'm wondering if there is a better approach in comparing the two strings in the reduce block. The while loop does not seem to be the best approach. It's also less readable since it does not convey the intent of the code (find common preamble of
Update:
After feedback from amon here is an alternative:
I think this is an improvement in both performance and readability.
my $preamble = reduce {
my $len = min(length $a, length $b);
--$len while substr($a, 0, $len) ne substr($b, 0, $len);
return substr($a, 0, $len);
} @lines;Note that after the first invocation chances are high that
$a already has the final result.I'm wondering if there is a better approach in comparing the two strings in the reduce block. The while loop does not seem to be the best approach. It's also less readable since it does not convey the intent of the code (find common preamble of
$a and $b).Update:
After feedback from amon here is an alternative:
my $preamble = reduce {
my $len = min(length $a, length $b);
my ($current_prefix, $string) = (substr($a, 0, $len), substr($b, 0, $len));
while($current_prefix ne $string) {
chop $current_prefix;
chop $string;
}
return $current_prefix;
} @lines;I think this is an improvement in both performance and readability.
Solution
First of all, I think it is very good that you used
One problem in your code is that you keep using
Then, I would shorten the
I did not use the statement modifier form of
If you do not like
Either way, the concept of shortening the prefix by removing one character at a time until it fits should be easier to read than all the
Cutting the prefix to the length of the string would just be an optimization:
Put together, we would get the following code:
The code might be easier to understand for people who don't know
Oh look, it's shorter too!
reduce, as it clearly shows how the algorithm works. At least, to a reader who understands functional idioms.One problem in your code is that you keep using
$a and $b. These two names do not convey any meaning. We could do instead:my $prefix = reduce {
my ($current_prefix, $string) = ($a, $b);
...Then, I would shorten the
$current_prefix until it is at the beginning of the other $string:until (0 == index $string, $current_prefix) {
chop $current_prefix;
}I did not use the statement modifier form of
until or while – I don't believe that they make code much easier to read the way a postfix if or for can do. Note that the empty string occurs at the beginning of any string, so the termination condition of that loop is safe.If you do not like
index or chop, you could equivalently use the less efficientuntil ($string =~ /\A\Q$current_prefix/) {
$current_prefix =~ s/.\z//s;
}Either way, the concept of shortening the prefix by removing one character at a time until it fits should be easier to read than all the
substringing you used.Cutting the prefix to the length of the string would just be an optimization:
$current_prefix = substr $current_prefix, 0, length $string;Put together, we would get the following code:
my $prefix = reduce {
my ($current_prefix, $string) = ($a, $b);
# the prefix cannot be longer than the string
$current_prefix = substr $current_prefix, 0, length $string;
# remove characters from the prefix until it occurs at the beginning.
# "" is always a prefix, so the loop properly terminates.
until (0 == index $string, $current_prefix) {
chop $current_prefix;
}
return $current_prefix;
} @strings;The code might be easier to understand for people who don't know
reduce if you express it in the imperative form:my ($prefix, @strings) = @original_strings;
for my $string (@strings) {
# the prefix cannot be longer than the string
$prefix = substr $prefix, 0, length $string;
# remove characters from the prefix until it occurs at the beginning.
# "" is always a prefix, so the loop properly terminates.
until (0 == index $string, $prefix) {
chop $prefix;
}
}
# now $prefix is the prefix of all @original_stringsOh look, it's shorter too!
Code Snippets
my $prefix = reduce {
my ($current_prefix, $string) = ($a, $b);
...until (0 == index $string, $current_prefix) {
chop $current_prefix;
}until ($string =~ /\A\Q$current_prefix/) {
$current_prefix =~ s/.\z//s;
}$current_prefix = substr $current_prefix, 0, length $string;my $prefix = reduce {
my ($current_prefix, $string) = ($a, $b);
# the prefix cannot be longer than the string
$current_prefix = substr $current_prefix, 0, length $string;
# remove characters from the prefix until it occurs at the beginning.
# "" is always a prefix, so the loop properly terminates.
until (0 == index $string, $current_prefix) {
chop $current_prefix;
}
return $current_prefix;
} @strings;Context
StackExchange Code Review Q#37054, answer score: 3
Revisions (0)
No revisions yet.