HiveBrain v1.2.0
Get Started
← Back to all entries
patterncsharpMinor

A reliable way to remove consecutive appearances of a substring

Submitted by: @import:stackexchange-codereview··
0
Viewed 0 times
appearancesreliablesubstringwayremoveconsecutive

Problem

I'm attempting to write a piece of code that is supposed to remove consecutive appearances of a string (not a single character) in a StringBuilder.

It's extremely important that the method works well and does not remove or change anything that it shouldn't. Solid performance is a secondary requirement.

for example:

Input: "xxxxABCxxxxABCABCxxxxABABCxxABCABCABC"
Remove consecutive: "ABC"

Output: "xxxxABCxxxxABCxxxxABABCxxABC"


I've written a method that does this and testing shows that it works as expected, but it will be catastrophic if I use it and it results in changing the StringBuilder in an unexpected way - that's why I want another opinion on whether my code is really 'safe' in all cases:

public static void RemoveConsecutive(this StringBuilder sb, string value)
{
    if (sb == null)
        throw new ArgumentNullException("sb");
    if (value == null)
        throw new ArgumentNullException("value");
    if (value == string.Empty)
        throw new ArgumentException("value cannot be an empty string.", "value");

    bool justRemoved = false;
    for (int i = 0; i = sb.Length)
        throw new ArgumentOutOfRangeException("startIndex", "startIndex must be a valid index in the provided StringBuilder.");
    if (startIndex + str.Length > sb.Length)
        return false;
    for (int i = 0; i < str.Length; i++)
    {
        if (str[i] != sb[i + startIndex])
            return false;
    }
    return true;
}


While this piece of code does something rather trivial, I just want another set of eyes to look at it and possibly find faults I could not. Again, any sort of unexpected behavior might be catastrophic.

Solution

Your code looks like it does exactly what you want to me.

Some notes:

  • StringBuilder methods (which you're emulating by using an extension method) return the modified StringBuilder, so they can be used in a fluent manner. You should probably do the same.



  • I don't like changing the index of a for loop. Especially since i-- means “stay at the same index”, I think that's confusing. You should use a while loop instead.



-
You can shorten your code by quite a bit by using Regex:

public static string RemoveConsecutive(string text, string value)
{
    // validation omited

    string regex = string.Format(@"({0})+", Regex.Escape(value));
    return Regex.Replace(text, regex, match => value);
}


Shorter code is usually less error prone. But in this case, I'm not so sure about that, because there are some intricacies in this code:

  • Calling Escape() is crucial if value contains some regex special characters (e.g. *).



  • If the value was used directly as the replacement (instead of a delegate), it wouldn't work correctly if it contained any substitution (e.g. $0).

Code Snippets

public static string RemoveConsecutive(string text, string value)
{
    // validation omited

    string regex = string.Format(@"({0})+", Regex.Escape(value));
    return Regex.Replace(text, regex, match => value);
}

Context

StackExchange Code Review Q#21528, answer score: 2

Revisions (0)

No revisions yet.