Replace sequence of strings in binary file

Problem

I searched for a method to find and replace string sequences in binary files, without any luck. The main requirement was that the method should not load the whole file into memory, but work in chunks instead. I am new to C#, so the code may not look polished, but it works fine. Does anyone have ideas on how this code could be improved, or does it have any flaws? P.S. Thanks go to Jon Skeet for the idea.

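```
public static void ReplaceTextInFile(string inFile, string find, string replace)
{
    if (find.Length != replace.Length)
        throw new ArgumentException("The length of find and replace strings must match!");

    const int chunkPrefix = 1024 * 10;
    var findBytes = GetBytes(find);
    var replaceBytes = GetBytes(replace);
    long chunkSize = findBytes.Length * chunkPrefix;
    var f = new FileInfo(inFile);
    // The lines from here to the start of the loop body were garbled in the
    // listing; they are reconstructed (assumed) from the variables used below.
    if (f.Length < chunkSize)
        chunkSize = f.Length;

    var readBuffer = new byte[chunkSize];

    using (Stream stream = File.Open(inFile, FileMode.Open))
    {
        int bytesRead;
        while ((bytesRead = stream.Read(readBuffer, 0, readBuffer.Length)) != 0)
        {
            var replacePositions = new List<int>();
            var matches = SearchBytePattern(findBytes, readBuffer, ref replacePositions);
            if (matches != 0)
                foreach (var replacePosition in replacePositions)
                {
                    var originalPosition = stream.Position;
                    stream.Position = originalPosition - bytesRead + replacePosition;
                    stream.Write(replaceBytes, 0, replaceBytes.Length);
                    stream.Position = originalPosition;
                }

            if (stream.Length == stream.Position) break;
            var moveBackByHalf = stream.Position - (bytesRead / 2);
            stream.Position = moveBackByHalf;
        }
    }
}

static public int SearchBytePattern(byte[] pattern, byte[] bytes, ref List<int> position)
{
    int matches = 0;
    for (int i = 0; i < bytes.Length; i++)
    {
        if (pattern[0] == bytes[i] && bytes.Length - i >= pattern.Length)
        {
            bool ismatch = true;
            for (int j = 1; j < pattern.Length && ismatch == true; j++)
            {
                if (bytes[i + j] != pattern[j])
                    ismatch = false;
            }
            if (ismatch)
            {
                position.Add(i);
                // The listing is cut off here; the rest is a minimal (assumed) completion.
                matches++;
            }
        }
    }
    return matches;
}
```
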
Solution

A few issues with the code:

  • You compare the string lengths, but then replace bytes. With UTF-8 encoding, which you're using, the byte counts can differ even when the string lengths match: if find = "aeiou" and replace = "áéíóú", you'll have findBytes.Length == 5 and replaceBytes.Length == 10.

  • You don't need to pass the position parameter by reference to SearchBytePattern, since you're not changing the reference, only calling methods on the list it points to.

  • In SearchBytePattern, the outermost loop doesn't need to go all the way to bytes.Length; it only needs to go to bytes.Length - pattern.Length + 1 (which would also simplify the inner "if").

  • stream.Read doesn't necessarily return the number of bytes you asked for; it can return fewer. Your code should be ready to handle that situation.
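
The four points above can be sketched roughly as follows. This is only an illustration, not the asker's or the reviewer's code: the class `BinaryReplaceSketch` and its methods `EnsureSameByteLength`, `FindAll`, and `ReadFully` are hypothetical names. Passing a `count` of valid bytes to the search is an added assumption, so that a short read does not leave stale bytes at the end of the buffer to be searched.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Text;

public static class BinaryReplaceSketch
{
    // Point 1: compare the encoded byte counts, not the string lengths.
    public static void EnsureSameByteLength(string find, string replace, Encoding encoding)
    {
        int findLen = encoding.GetByteCount(find);
        int replaceLen = encoding.GetByteCount(replace);
        if (findLen != replaceLen)
            throw new ArgumentException(
                $"Encoded lengths differ: {findLen} vs {replaceLen} bytes.");
    }

    // Points 2 and 3: no `ref` parameter (the list is simply returned), and the
    // outer loop stops at the last index where the pattern can still fit, so no
    // inner bounds check is needed. Only the first `count` bytes are searched.
    // Every match start is reported, including overlapping ones.
    public static List<int> FindAll(byte[] pattern, byte[] bytes, int count)
    {
        var positions = new List<int>();
        if (pattern.Length == 0) return positions;

        for (int i = 0; i <= count - pattern.Length; i++)
        {
            int j = 0;
            while (j < pattern.Length && bytes[i + j] == pattern[j]) j++;
            if (j == pattern.Length) positions.Add(i);
        }
        return positions;
    }

    // Point 4: keep calling Read until the buffer is full or the stream ends.
    // Returns the number of bytes actually placed in the buffer.
    public static int ReadFully(Stream stream, byte[] buffer)
    {
        int total = 0;
        while (total < buffer.Length)
        {
            int n = stream.Read(buffer, total, buffer.Length - total);
            if (n == 0) break; // end of stream
            total += n;
        }
        return total;
    }
}
```

In the chunk loop, one would call `ReadFully` in place of the single `stream.Read`, then pass its return value as `count` to `FindAll`, so the search never looks past the bytes that were actually read.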

Context

StackExchange Code Review Q#3226, answer score: 4
