patterncsharpMinor
Simplification of byte array comparison algorithm
Viewed 0 times
arraybytecomparisonalgorithmsimplification
Problem
I have an algorithm that evaluates in input
All of these conditions are dealt with in this algorithm. Having been my first pass at this algorithm, I know there are improvements that can be made. Not only am I looking for cleaner code and possibly refactoring to find re-usable bits, this needs to be as efficient as possible. Right now this will compare an 18 byte input array against an 1170 byte source array in 2ms.
Here is the current implementation of the algorithm:
...using .Net 4.5 if that helps.
Edit per comment
In all cases, the algorithm will return true if the source sequence contains a match against the full input array sequence. Each byte must match specifically in sequence except where a wildcard
This algorithm is intended to continue reading until end of the source array. If the end of the source array is reached before another full iterative match of the input array, the algorithm would return false. But the cal
byte[] with a source byte[]. Several conditions can be encountered:- Match found
- No match found
- End of input array
- End of source array
- Input wildcard (*) matches against any source byte
All of these conditions are dealt with in this algorithm. Having been my first pass at this algorithm, I know there are improvements that can be made. Not only am I looking for cleaner code and possibly refactoring to find re-usable bits, this needs to be as efficient as possible. Right now this will compare an 18 byte input array against an 1170 byte source array in 2ms.
Here is the current implementation of the algorithm:
private bool HasMatchBytes(byte[] bytes, out int length, ref int position) {
var hasMatch = false;
var isComplete = false;
var b = position;
var end = position;
// - this loop depends on internal advancement of the counter [v]
for (var v = 0; v = mByteStream.Length) {
hasMatch = false;
}
}
// with match and at end of bytes length algorithm is complete
if (hasMatch &&
v >= bytes.Length) {
isComplete = true;
end = ++b;
}
// if end of source reached then shut down algorithm
if (b >= mByteStream.Length) {
v = bytes.Length;
isComplete = true;
end = mByteStream.Length;
}
}
length = end - position;
return hasMatch;
}...using .Net 4.5 if that helps.
Edit per comment
In all cases, the algorithm will return true if the source sequence contains a match against the full input array sequence. Each byte must match specifically in sequence except where a wildcard
(*) is specified.This algorithm is intended to continue reading until end of the source array. If the end of the source array is reached before another full iterative match of the input array, the algorithm would return false. But the cal
Solution
-
If I understand this correctly, your code doesn't work. This is because when some characters match and then you find a character that doesn't match, you can't continue checking with the current character, you need to go back to the second character of the current partial match.
An example is
-
It looks like this method is part of a type that does other things. I think it would be cleaner if it was in a separate type, taking both arrays as parameters.
-
When reading the code, it's very easy to confuse
-
A method that's this complicated would benefit greatly from detailed XML documentation. That way, callers of the method don't need to read its code to know what it does.
If I understand this correctly, your code doesn't work. This is because when some characters match and then you find a character that doesn't match, you can't continue checking with the current character, you need to go back to the second character of the current partial match.
An example is
mByteStream = new byte[] { 0, 0, 1 }, bytes = new byte[] { 0, 1 }. For this input, the correct result is true, but your code returns false.-
It looks like this method is part of a type that does other things. I think it would be cleaner if it was in a separate type, taking both arrays as parameters.
-
When reading the code, it's very easy to confuse
bytes and mByteStream, b and v. Those names don't mean anything. Much better names would be for example needle, haystack, needleIndex and haystackIndex. Having different variables names hasMatch and isMatch is also very confusing.-
A method that's this complicated would benefit greatly from detailed XML documentation. That way, callers of the method don't need to read its code to know what it does.
Context
StackExchange Code Review Q#36418, answer score: 5
Revisions (0)
No revisions yet.