
Regex to first match, then replace found matches

Submitted by: @import:stackexchange-codereview

Problem

In my C# program I am using regular expressions to:

  • Loop through a list of possible words in need of replacing.
  • For each word, find out whether the string I am given has any matches.
  • If it does, perform some (slightly costly) logic to create the replacement.
  • Perform the actual replacement.

My current code looks roughly as follows:

string toSearchInside; // The actual string I'm going to be replacing within
List<string> searchStrings; // The list of words to look for via regex

string pattern = @"([:@?]{0})";
string replacement;

foreach (string toMatch in searchStrings)
{
    var regex = new Regex(
                            string.Format(pattern, toMatch), 
                            RegexOptions.IgnoreCase
                            );
    var matches = regex.Matches(toSearchInside);

    if (matches.Count == 0)
        continue;

    replacement = CreateReplacement(toMatch);

    toSearchInside = regex.Replace(toSearchInside, replacement);
}


This works, but it seems somewhat inefficient because it uses the regex engine twice: once to find the matches (regex.Matches()) and once to replace them (regex.Replace()). Is there a way to simply replace the matches that were already found?

Also, it has been asked what is inside the CreateReplacement() method, since the work could possibly be done via a MatchEvaluator. It is actually a separate, fairly costly method and not really what I'm asking about here. My bigger question is how to deal with having to use the regex twice: once to find the matches and a second time to replace them.

I hope the question makes sense.

Solution

Regex.Matches returns a MatchCollection whose Match objects record the index and length of each match. You therefore don't have to run the regex engine again; you can do something like this:

string toSearchInside; // The actual string I'm going to be replacing within
List<string> searchStrings; // The list of words to look for via regex

string pattern = @"([:@?]{0})";
string replacement;

foreach (string toMatch in searchStrings)
{
    var regex = new Regex(
                            string.Format(pattern, toMatch), 
                            RegexOptions.IgnoreCase
                            );
    var matches = regex.Matches(toSearchInside);

    if (matches.Count == 0)
        continue;

    replacement = CreateReplacement(toMatch);

    // in case the replacement is of a different length we replace
    // from back to front to keep the match indices correct
    foreach (var match in matches.Cast<Match>().Reverse())
    {
        toSearchInside = toSearchInside.Replace(match.Index, match.Length, replacement);
    }
}


Unfortunately, the .NET Framework doesn't come with a positional Replace, so we have to create one. I did so as an extension method:

public static string Replace(this string s, int index, int length, string replacement)
{
    var builder = new StringBuilder();
    builder.Append(s.Substring(0,index));
    builder.Append(replacement);
    builder.Append(s.Substring(index + length));
    return builder.ToString();
}
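
To see the pieces working together, here is a minimal, self-contained sketch. The names StringExtensions, Demo, ReplaceAt and ReplaceAll, the sample input and the stand-in CreateReplacement are assumptions for illustration and are not part of the original code. Two deliberate differences from the code above: the positional replace is built from the framework's own Remove and Insert calls instead of a StringBuilder, and each search word is passed through Regex.Escape so that metacharacters in a word are matched literally.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Text.RegularExpressions;

public static class StringExtensions
{
    // Positional replace: drop `length` characters at `index`, then insert the replacement there.
    public static string ReplaceAt(this string s, int index, int length, string replacement)
    {
        return s.Remove(index, length).Insert(index, replacement);
    }
}

public static class Demo
{
    // Hypothetical stand-in for the costly method mentioned in the question.
    public static string CreateReplacement(string word)
    {
        return "<" + word.ToUpperInvariant() + ">";
    }

    public static string ReplaceAll(string input, IEnumerable<string> searchStrings)
    {
        foreach (string toMatch in searchStrings)
        {
            var regex = new Regex(
                string.Format(@"([:@?]{0})", Regex.Escape(toMatch)),
                RegexOptions.IgnoreCase);
            var matches = regex.Matches(input);

            if (matches.Count == 0)
                continue; // nothing found, so the costly call is skipped

            string replacement = CreateReplacement(toMatch);

            // Back to front: editing a later match never shifts the index of an earlier one.
            foreach (Match match in matches.Cast<Match>().Reverse())
            {
                input = input.ReplaceAt(match.Index, match.Length, replacement);
            }
        }
        return input;
    }

    public static void Main()
    {
        var words = new List<string> { "id", "name", "unused" };
        // prints: a=<ID>, b=<NAME>, c=<ID>
        Console.WriteLine(ReplaceAll("a=:id, b=@Name, c=:ID", words));
    }
}
```

Each positional replace still allocates a new string, so for inputs with very many matches it may be worth measuring this against a plain regex.Replace before settling on it.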


If you do this often and the match patterns don't change, you could consider two things:

  • Pre-compile the regular expressions
  • Pre-create the replacements
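
Both suggestions could be sketched roughly as follows, assuming the set of search words is fixed for the lifetime of the object. ReplacementCache, Entry and the createReplacement delegate are hypothetical names, not part of the original code. Note two swaps: replacements are created lazily on first match via Lazy<string> rather than all up front, and the replacing is done with Regex.Replace and a MatchEvaluator rather than the positional replace shown earlier, so each word costs a single regex pass.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

public sealed class ReplacementCache
{
    private sealed class Entry
    {
        public Regex Regex;
        public Lazy<string> Replacement;
    }

    private readonly List<Entry> entries = new List<Entry>();

    public ReplacementCache(IEnumerable<string> searchStrings, Func<string, string> createReplacement)
    {
        foreach (string word in searchStrings)
        {
            string captured = word; // local copy keeps the closure safe on compilers older than C# 5
            entries.Add(new Entry
            {
                // Built once and reused for every call to Apply.
                Regex = new Regex(
                    "([:@?]" + Regex.Escape(word) + ")",
                    RegexOptions.IgnoreCase | RegexOptions.Compiled),
                // The costly call runs at most once, the first time Value is read.
                Replacement = new Lazy<string>(() => createReplacement(captured))
            });
        }
    }

    public string Apply(string input)
    {
        foreach (Entry entry in entries)
        {
            // The evaluator is only invoked for actual matches, so words that
            // never occur never trigger the costly replacement logic.
            input = entry.Regex.Replace(input, m => entry.Replacement.Value);
        }
        return input;
    }
}
```

A usage example would be new ReplacementCache(searchStrings, CreateReplacement).Apply(toSearchInside). RegexOptions.Compiled trades a higher construction cost for faster matching, so it tends to pay off only when the same instances are reused many times.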

Code Snippets

string toSearchInside; // The actual string I'm going to be replacing within
List<string> searchStrings; // The list of words to look for via regex

string pattern = @"([:@?]{0})";
string replacement;

foreach (string toMatch in searchStrings)
{
    var regex = new Regex(
                            string.Format(pattern, toMatch), 
                            RegexOptions.IgnoreCase
                            );
    var matches = regex.Matches(toSearchInside);

    if (matches.Count == 0)
        continue;

    replacement = CreateReplacement(toMatch);

    // in case the replacement is of a different length we replace
    // from back to front to keep the match indices correct
    foreach (var match in matches.Cast<Match>().Reverse())
    {
        toSearchInside = toSearchInside.Replace(match.Index, match.Length, replacement);
    }
}

public static string Replace(this string s, int index, int length, string replacement)
{
    var builder = new StringBuilder();
    builder.Append(s.Substring(0,index));
    builder.Append(replacement);
    builder.Append(s.Substring(index + length));
    return builder.ToString();
}

Context

StackExchange Code Review Q#119519, answer score: 30
