patternMinor
Performance optimization in function for datastructure mapping
Problem
I want to optimize a Perl function which is frequently used in my application. The function creates a special datastructure from the results of
DBI::fetchall_arrayref, which looks like:

$columns = ['COLNAME_1','COLNAME_2','COLNAME_3'];
$rows = [ ['row_1_col_1', 'row_1_col_2', 'row_1_col_3'],
          ['row_2_col_1', 'row_2_col_2', 'row_2_col_3'],
          ['row_3_col_1', 'row_3_col_2', 'row_3_col_3']
        ];

The new datastructure must contain the data in the following form (all row values for every column in a single arrayref):

$retval = {
    row_count => 3,
    col_count => 3,
    COLNAME_1 => ['row_1_col_1', 'row_2_col_1', 'row_3_col_1' ],
    COLNAME_2 => ['row_1_col_2', 'row_2_col_2', 'row_3_col_2' ],
    COLNAME_3 => ['row_1_col_3', 'row_2_col_3', 'row_3_col_3' ]
}

The new datastructure is a hash of arrays and is used throughout the application. I cannot change the format (it's too widely used). I wrote a function for this conversion. I've already done some performance optimization after profiling my application, but it's not enough. The function currently looks like this:

sub reorganize($) {
    my ($self,$columns,$rows) = @_;
    my $col_count = scalar(@$columns);
    my $row_count = scalar(@$rows);
    my $col_index = 0;
    my $row_index = 0;
    my $retval = { # new datastructure
        row_count => $row_count,
        col_count => $col_count
    };
    # iterate through all columns
    for($col_index=0; $col_index<$col_count; $col_index++) {
        my $tmp = [];
        for($row_index=0; $row_index<$row_count; $row_index++) {
            $tmp->[$row_index] = $rows->[$row_index][$col_index];
        }
        # Assign the arrayref to the hash. The hash-key is the name of the column
        $retval->{$columns->[$col_index]} = $tmp;
    }
    return $retval;
}

My Question:
Is there a way to further optimize this function (maybe using $_[...])? I found some hints here on pages 18 and 19, but I don't have any experience using $_ in different contexts.
I have to say that the function listed above is the best I can do. There may be
Solution
The following code is about 35% faster (measured with Benchmark). The tricks:
- no anonymous array created for $tmp.
- explicit return removed.
- variables created in place where their value is needed.
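
The first trick relies on Perl's autovivification: assigning through an undefined scalar as if it were an array reference creates the array on demand, so the explicit [] is not needed. A minimal sketch follows; the variable names are illustrative only. The edge case at the end is an observation about Perl semantics, not something the answer discusses: when the loop body never runs, the scalar stays undefined instead of becoming an empty arrayref.

```perl
use strict;
use warnings;

# $tmp starts out undefined ...
my $tmp;
# ... and becomes an array reference on the first element assignment.
$tmp->[0] = 'row_1_col_1';
$tmp->[1] = 'row_2_col_1';
print ref($tmp), " with ", scalar(@$tmp), " elements\n";

# Edge case: with zero rows the loop body never executes,
# so nothing is autovivified and the scalar stays undef.
my @no_rows = ();
my $empty;
for (my $i = 0; $i < @no_rows; $i++) {
    $empty->[$i] = $no_rows[$i];
}
print defined($empty) ? "defined\n" : "undef\n";
```

If callers dereference every column unconditionally, an empty result set may therefore need separate handling once the explicit [] is dropped.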
Some of the tricks added just 3%; the first one seemed the most important. YMMV.
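
Since the gain depends on the data and the Perl version, it may be worth measuring on your own input. The following is a sketch of such a comparison using cmpthese from the core Benchmark module. The sub names baseline, tuned and with_map, the dropped $self argument, and the 10 x 1000 test data are all assumptions made for illustration; the my $tmp = [] line in the baseline is inferred from the first trick, and with_map is only one possible map-based formulation.

```perl
use strict;
use warnings;
use Benchmark qw(cmpthese);

# Baseline (assumed shape): explicit anonymous array and explicit return.
sub baseline {
    my ($columns, $rows) = @_;
    my $col_count = @$columns;
    my $row_count = @$rows;
    my $retval = { row_count => $row_count, col_count => $col_count };
    for (my $c = 0; $c < $col_count; $c++) {
        my $tmp = [];
        for (my $r = 0; $r < $row_count; $r++) {
            $tmp->[$r] = $rows->[$r][$c];
        }
        $retval->{ $columns->[$c] } = $tmp;
    }
    return $retval;
}

# Variant with the three tricks applied.
sub tuned {
    my ($columns, $rows) = @_;
    my $retval = {
        row_count => my $row_count = @$rows,
        col_count => my $col_count = @$columns,
    };
    for (my $c = 0; $c < $col_count; $c++) {
        my $tmp;    # autovivified on first assignment
        for (my $r = 0; $r < $row_count; $r++) {
            $tmp->[$r] = $rows->[$r][$c];
        }
        $retval->{ $columns->[$c] } = $tmp;
    }
    $retval
}

# Hypothetical map-based variant that uses $_ for the current row.
sub with_map {
    my ($columns, $rows) = @_;
    my $retval = { row_count => scalar @$rows, col_count => scalar @$columns };
    for my $c (0 .. $#$columns) {
        $retval->{ $columns->[$c] } = [ map { $_->[$c] } @$rows ];
    }
    return $retval;
}

# Made-up test data: 10 columns, 1000 rows.
my $columns = [ map { "COLNAME_$_" } 1 .. 10 ];
my $rows = [];
for my $r (1 .. 1000) {
    push @$rows, [ map { "row_${r}_col_$_" } 1 .. 10 ];
}

# Run each variant for at least one CPU second and print a comparison table.
cmpthese(-1, {
    baseline => sub { baseline($columns, $rows) },
    tuned    => sub { tuned($columns, $rows) },
    with_map => sub { with_map($columns, $rows) },
});
```

The table printed by cmpthese shows iterations per second and the relative difference between the variants; the numbers will vary with data shape and Perl build.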
I experimented with $_ and maps, too, but it seems the plain old C-style loop is the fastest.

sub faster {
    my ($self, $columns, $rows) = @_;
    my $retval = {
        row_count => my $row_count = @$rows,
        col_count => my $col_count = @$columns,
    };
    for (my $col_index = 0 ; $col_index < $col_count ; $col_index++) {
        my $tmp;
        for (my $row_index = 0 ; $row_index < $row_count ; $row_index++) {
            $tmp->[$row_index] = $rows->[$row_index][$col_index];
        }
        $retval->{$columns->[$col_index]} = $tmp;
    }
    $retval
}
Code Snippets
sub faster {
    my ($self, $columns, $rows) = @_;
    my $retval = {
        row_count => my $row_count = @$rows,
        col_count => my $col_count = @$columns,
    };
    for (my $col_index = 0 ; $col_index < $col_count ; $col_index++) {
        my $tmp;
        for (my $row_index = 0 ; $row_index < $row_count ; $row_index++) {
            $tmp->[$row_index] = $rows->[$row_index][$col_index];
        }
        $retval->{$columns->[$col_index]} = $tmp;
    }
    $retval
}
Context
StackExchange Code Review Q#44059, answer score: 3