patterncppMinor
Fastest C++ code to get common prefix length
Viewed 0 times
prefixlengthgetfastestcodecommon
Problem
In a C++ program, I have the following function to find the length of the longest common prefix between to char arrays
Is there a faster char array comparison code (for example, with some library function)? I wish to optimize this code for speed.
I've found the following code to be about 10 times faster that the original one I gave. Is there an even faster alternative?
a and b. String b is guaranteed to be shorter than string a.Is there a faster char array comparison code (for example, with some library function)? I wish to optimize this code for speed.
long len_common_prefix(char a[], char b[]) {
char* p = a;
char* q = b;
while ((*p == *q) && (*q)) {
p++;
q++;
};
return p-a;
}I've found the following code to be about 10 times faster that the original one I gave. Is there an even faster alternative?
long long int len_common_prefix(char a[], char b[]) {
long long int* p = (long long int*)a;
long long int* q = (long long int*)b;
int ratio = sizeof(long long int) / sizeof(char);
while (*p == *q) {
p++;
q++;
}
long long int long_long_diff = p-(long long int*)a;
char* p2 = (char*)p;
char* q2 = (char*)q;
while (*p2 == *q2) {
p2++;
q2++;
}
return long_long_diff * ratio + (p2 - (char*)p);
}Solution
The C++ way would be to call
If you absolutely have to improve on the base line performance, you can rewrite your second solution as two successive calls to
Live Example that prints 27 (the same as your two implementations).
What you are doing with your pointer cast to
Some issues that you have to be very careful about when doing pointer casting:
I'd like to see a reproducible benchmark before being able to comment on whether a 10x improvement is really possible on
std::mismatch and be done with it#include // mismatch
#include // assert
#include // ptrdiff_t
#include // strlen
#include // cout
#include // distance
std::ptrdiff_t len_common_prefix_base(char const a[], char const b[])
{
assert(std::strlen(b) <= std::strlen(a));
return std::distance(b, std::mismatch(b, b + std::strlen(b), a).first);
}If you absolutely have to improve on the base line performance, you can rewrite your second solution as two successive calls to
std::mismatch. First, by reinterpreting the char as long long int and calling std::mismatch at the block level. Secondly, after finding the first char in the first mismatching block (which could be the final partial block containing the remaining chars) and doing another std::mismatch over the remaining chars.std::ptrdiff_t len_common_prefix_10x(char const a[], char const b[])
{
assert(std::strlen(b) (a);
auto q = reinterpret_cast(b);
auto const n = std::strlen(b);
auto const num_blocks = n / sizeof(block_type);
auto block_mismatch = std::mismatch(q, q + num_blocks, p);
auto b2 = reinterpret_cast(block_mismatch.first);
auto a2 = reinterpret_cast(block_mismatch.second);
return std::distance(b, std::mismatch(b2, b + n, a2).first);
}
int main()
{
char const a[] = "hello world is a trivial exercise";
char const b[] = "hello world is a trivial example";
std::cout << len_common_prefix_base(a, b) << '\n';
std::cout << len_common_prefix_10x(a, b) << '\n';
}Live Example that prints 27 (the same as your two implementations).
What you are doing with your pointer cast to
long int is very fragile, at the very least it's implementation-defined at possibly undefined behavior (for which no warning is required, and that includes the case that your code runs 10x faster on your system but crashes after you turn on compiler optimizations or port it to another system).Some issues that you have to be very careful about when doing pointer casting:
- alignment, if your
char[]is part of astruct, you need to be sure it is aligned at 8-byte boundaries to avoid unaligned memory access from yourlong long intpointers.
- endianness, you need to make sure that the bytes within a
long long intare in the same order as thechar[]array
I'd like to see a reproducible benchmark before being able to comment on whether a 10x improvement is really possible on
std::mismatch.Code Snippets
#include <algorithm> // mismatch
#include <cassert> // assert
#include <cstddef> // ptrdiff_t
#include <cstring> // strlen
#include <iostream> // cout
#include <iterator> // distance
std::ptrdiff_t len_common_prefix_base(char const a[], char const b[])
{
assert(std::strlen(b) <= std::strlen(a));
return std::distance(b, std::mismatch(b, b + std::strlen(b), a).first);
}std::ptrdiff_t len_common_prefix_10x(char const a[], char const b[])
{
assert(std::strlen(b) <= std::strlen(a));
using block_type = long long int;
auto p = reinterpret_cast<block_type const*>(a);
auto q = reinterpret_cast<block_type const*>(b);
auto const n = std::strlen(b);
auto const num_blocks = n / sizeof(block_type);
auto block_mismatch = std::mismatch(q, q + num_blocks, p);
auto b2 = reinterpret_cast<char const*>(block_mismatch.first);
auto a2 = reinterpret_cast<char const*>(block_mismatch.second);
return std::distance(b, std::mismatch(b2, b + n, a2).first);
}
int main()
{
char const a[] = "hello world is a trivial exercise";
char const b[] = "hello world is a trivial example";
std::cout << len_common_prefix_base(a, b) << '\n';
std::cout << len_common_prefix_10x(a, b) << '\n';
}Context
StackExchange Code Review Q#104477, answer score: 7
Revisions (0)
No revisions yet.