HiveBrain v1.2.0
Get Started
← Back to all entries
patterncsharpMinor

Eliminating unwanted namespaces

Submitted by: @import:stackexchange-codereview··
0
Viewed 0 times
unwantedeliminatingnamespaces

Problem

I am trying to eliminate unwanted namespaces, and I know there's a better way to do this. This code works, but it's remedial and redundant.

string sPattern = "xmlns:d5p1=\"http://www.w3.org/2001/XMLSchema-instance\" d5p1:nil=\"true\"";
string sPattern2 = "d5p1:nil=\"true\" xmlns:d5p1=\"http://www.w3.org/2001/XMLSchema-instance\"";
string sPattern3 = "xmlns:d2p1=\"http://www.w3.org/2001/XMLSchema-instance\" d2p1:nil=\"true\"";

string[] fString = new string[sentences.Count()];
//foreach (string str in sentences)
for(int i=0; i < sentences.Count(); i++)
{
    Console.WriteLine("{0,24}", sentences[i]); // lines 0-24.
    if(System.Text.RegularExpressions.Regex.IsMatch(sentences[i], sPattern, System.Text.RegularExpressions.RegexOptions.IgnoreCase))
    {
        Console.WriteLine(" (match for '{0}' found)", sPattern);
        eString = sentences[i].Replace(sPattern, "");

    }
    else if(System.Text.RegularExpressions.Regex.IsMatch(sentences[i], sPattern2, System.Text.RegularExpressions.RegexOptions.IgnoreCase))   
    {
        Console.WriteLine(" (match for '{0}' found)", sPattern2);
        eString = sentences[i].Replace(sPattern2, "");
    }
    else if (System.Text.RegularExpressions.Regex.IsMatch(sentences[i], sPattern3, System.Text.RegularExpressions.RegexOptions.IgnoreCase))
    {
        Console.WriteLine(" (match for '{0}' found)", sPattern3);
        eString = sentences[i].Replace(sPattern3, "");
    }
    else
    {
        Console.WriteLine();
        eString = sentences[i];
    }

    fString[i] = eString;
}

// ok, let's look at the end result
foreach(string s in fString)
{

    Console.WriteLine(s);
}

// stop the app from shutting down before seeing results
Console.ReadLine();

Solution

Well, the first thing I'd do is put the patterns in a list, and loop through matching them. The second thing would be general cleanup; use namespaces where you can, one-liner if/else clauses don't need braces, use var where the type is obvious, etc etc. This gets rid of that if-elseif structure and makes the code much more concise.

As far as merging the patterns, you could match on a combination of two:

  • "xmlns:(d5p1|d2p1)=\".*?\"" (we don't care what the exact namespace definition is, just don't match greedily)



  • "(d5p1|d2p1):nil=\"true\""



If either of these patterns match, replace the matching text with an empty string:

List patterns = new List{
       "xmlns:(d5p1|d2p1)=\".*?\"",
       "(d5p1|d2p1):nil=\"true\"",
    };

    var output = new string[sentences.Count()];
    for(int i=0; i < sentences.Count(); i++)
    {
        Console.WriteLine("{0,24}", sentences[i]); // 24 chars, left-justified.
        var formattedSentence = sentences[i];                       

        foreach(var pattern in patterns)
        {
           var len = formattedSentence.Length;
           formattedSentence = Regex.Replace(formattedSentence, pattern, String.Empty,
                 RegexOptions.IgnoreCase)

           if(formattedSentence.Length != len)
              Console.WriteLine(" (match for '{0}' found)", pattern);                                  
        }

        //Finally, this may have left some unneeded whitespace which we can consolidate.
        formattedSentence = Regex.Replace(formattedSentence, "  +", " ");

        output[i] = formattedSentence;
    }

    // ok, let's look at the end result
    foreach(var sentence in output)
       Console.WriteLine(sentence);


One last thing. Hungarian notation - putting some type-specific flag at the beginning of your variable names - is EVIL in virtually any language used in an IDE, including .NET languages. IntelliSense tells you all you need to know about an object's type, if it happens to not be obvious via context; you shouldn't put it in the variable name. If you do, and the type changes, the variable name is now inconsistent and every usage of it needs to change (a task made easier by refactoring assistants like ReSharper, but if you have that you have absolutely no need for Hungarian).

Code Snippets

List<string> patterns = new List<string>{
       "xmlns:(d5p1|d2p1)=\".*?\"",
       "(d5p1|d2p1):nil=\"true\"",
    };

    var output = new string[sentences.Count()];
    for(int i=0; i < sentences.Count(); i++)
    {
        Console.WriteLine("{0,24}", sentences[i]); // 24 chars, left-justified.
        var formattedSentence = sentences[i];                       

        foreach(var pattern in patterns)
        {
           var len = formattedSentence.Length;
           formattedSentence = Regex.Replace(formattedSentence, pattern, String.Empty,
                 RegexOptions.IgnoreCase)

           if(formattedSentence.Length != len)
              Console.WriteLine(" (match for '{0}' found)", pattern);                                  
        }

        //Finally, this may have left some unneeded whitespace which we can consolidate.
        formattedSentence = Regex.Replace(formattedSentence, "  +", " ");

        output[i] = formattedSentence;
    }


    // ok, let's look at the end result
    foreach(var sentence in output)
       Console.WriteLine(sentence);

Context

StackExchange Code Review Q#14051, answer score: 4

Revisions (0)

No revisions yet.