patternMinor
Slow anagrams detection
Viewed 0 times
detectionslowanagrams
Problem
This is my code for this question on hackerrank.
The idea is to find the amount of substrings of a given word that are anagrams of each other.
The code works fine, but it times out with longer strings. I've downloaded the test cases to make sure the result is correct and it is, it just takes too long.
I can do this in C and pass all tests, but I was wondering if there is a faster alternative in Perl.
I also tried using numbers, i.e. multiplying the
That is much faster, but only works on small strings because it overflows quickly.
In this specific case, I'm not really that concerned about style, as much as performance. How can this be improved?
```
use strict;
use warnings;
use Data::Dumper;
sub get_all_substr {
my ($input_string) = @_;
my @results;
push @results, $input_string =~ /(?=(.{$_}))/g for 1 .. length($input_string);
return \@results;
}
sub is_anagram {
my ($string_a, $string_b) = @_;
my $len_a = length($string_a);
return 0 if (length($string_b) != $len_a);
my %vec_a;
my %vec_b;
foreach my $i (0..$len_a-1) {
$vec_a{substr($string_a, $i, 1)}++;
$vec_b{substr($string_b, $i, 1)}++;
}
return 0 if (scalar(keys(%vec_a)) != scalar(keys(%vec_b)));
foreach my $key (keys(%vec_a)) {
if (!defined($vec_b{$key}) || ($vec_a{$key} != $vec_b{$key})) {
return 0;
}
}
return 1;
}
my @results;
my $t = ;
chomp $t;
while($t--) {
my $input_string = ;
chomp $input_string;
my $count = 0;
my $strings;
$strings = get_all_substr($input_string);
my $max_size = scalar(@{$strings})-1;
foreach my $i (0..$max_size) {
foreach my $k ($i+1..$max_size) {
if (is_anagram($strings->[$k], $strings->[$i])) {
$count++;
}
}
}
push @results, $count;
}
The idea is to find the amount of substrings of a given word that are anagrams of each other.
The code works fine, but it times out with longer strings. I've downloaded the test cases to make sure the result is correct and it is, it just takes too long.
I can do this in C and pass all tests, but I was wondering if there is a faster alternative in Perl.
I also tried using numbers, i.e. multiplying the
ord()-97 of the characters in the two substrings, and then doing a single check on the number, to see if they are anagrams of each other.That is much faster, but only works on small strings because it overflows quickly.
In this specific case, I'm not really that concerned about style, as much as performance. How can this be improved?
```
use strict;
use warnings;
use Data::Dumper;
sub get_all_substr {
my ($input_string) = @_;
my @results;
push @results, $input_string =~ /(?=(.{$_}))/g for 1 .. length($input_string);
return \@results;
}
sub is_anagram {
my ($string_a, $string_b) = @_;
my $len_a = length($string_a);
return 0 if (length($string_b) != $len_a);
my %vec_a;
my %vec_b;
foreach my $i (0..$len_a-1) {
$vec_a{substr($string_a, $i, 1)}++;
$vec_b{substr($string_b, $i, 1)}++;
}
return 0 if (scalar(keys(%vec_a)) != scalar(keys(%vec_b)));
foreach my $key (keys(%vec_a)) {
if (!defined($vec_b{$key}) || ($vec_a{$key} != $vec_b{$key})) {
return 0;
}
}
return 1;
}
my @results;
my $t = ;
chomp $t;
while($t--) {
my $input_string = ;
chomp $input_string;
my $count = 0;
my $strings;
$strings = get_all_substr($input_string);
my $max_size = scalar(@{$strings})-1;
foreach my $i (0..$max_size) {
foreach my $k ($i+1..$max_size) {
if (is_anagram($strings->[$k], $strings->[$i])) {
$count++;
}
}
}
push @results, $count;
}
Solution
Here's my Perl solution to this problem, for comparison. It simply iterates through every non-empty substring of the original and sorts the characters of each one. The hash
Once the counts have been established, the problem requires that the number of possible unordered pairs be calculated for each set of equivalent substrings
Because all of N items may be paired with N-1 others we have N (N-1) possible pairs. But a pair like N0 N1 is the same as N1 N0, so we must divide that expression by two, so the number of possible unordered pairs of N items is N (N-1) / 2. That expression appears literally in the code below
%count keeps track of the count of equivalent substringsOnce the counts have been established, the problem requires that the number of possible unordered pairs be calculated for each set of equivalent substrings
Because all of N items may be paired with N-1 others we have N (N-1) possible pairs. But a pair like N0 N1 is the same as N1 N0, so we must divide that expression by two, so the number of possible unordered pairs of N items is N (N-1) / 2. That expression appears literally in the code below
use strict;
use warnings 'all';
my ($n) = split ' ', <>;
while ( <> ) {
my @s = /[a-z]/g;
my %count;
for my $i ( 0 .. $#s ) {
for my $j ( $i .. $#s ) {
my $ss = join '', sort @s[$i .. $j];
++$count{$ss};
}
}
my $total = 0;
for my $n ( values %count ) {
$total += $n * ($n-1) / 2;
}
print $total, "\n";
}Code Snippets
use strict;
use warnings 'all';
my ($n) = split ' ', <>;
while ( <> ) {
my @s = /[a-z]/g;
my %count;
for my $i ( 0 .. $#s ) {
for my $j ( $i .. $#s ) {
my $ss = join '', sort @s[$i .. $j];
++$count{$ss};
}
}
my $total = 0;
for my $n ( values %count ) {
$total += $n * ($n-1) / 2;
}
print $total, "\n";
}Context
StackExchange Code Review Q#139151, answer score: 2
Revisions (0)
No revisions yet.