Splitting a string into words or double-quoted substrings

Problem

I need to split a string into words while keeping double-quoted substrings together. This code shows what I'm after:

string baseString = "This is a \"Very Long Test\"";

string[] strings = baseString.Split(' ');

List<string> stringList = new List<string>();
string temp = String.Empty;
foreach (var s in strings)
{
    if (!String.IsNullOrWhiteSpace(temp))
    {
        if (s.EndsWith("\""))
        {
            string item = temp + " " + s;
            stringList.Add(item.Substring(1,item.Length - 2));
            temp = string.Empty;
        }
        temp = temp + " " + s;
    }
    else if (s.StartsWith("\""))
    {
        temp = s;
    }
    else
    {
        stringList.Add(s);
    }

}

stringList.ForEach(Console.WriteLine);


The output should be:

This
is
a
Very Long Test


Basically, given a string, it splits on spaces unless the words are grouped in double quotes, the same way the command line does it.

Is there a better way to write this code?

Solution

Seems like a job for a regular expression:

string baseString = "This is a \"Very Long Test\"";
var re = new Regex("(?<=\")[^\"]*(?=\")|[^\" ]+");
var strings = re.Matches(baseString).Cast<Match>().Select(m => m.Value).ToArray();


The regular expression (?<=")[^"]*(?=")|[^" ]+ matches one of two things: either a sequence of zero or more characters that are not " ([^"]*), preceded by a " ((?<=")) and followed by a " ((?=")), or a sequence of one or more characters that are neither " nor a space ([^" ]+).
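To make the two alternatives easier to follow, here is a minimal sketch that wraps the pattern in a helper and prints each token in brackets. The class name LookaroundDemo and the method Split are hypothetical names chosen for illustration, not part of the original answer.

```csharp
using System;
using System.Linq;
using System.Text.RegularExpressions;

public static class LookaroundDemo
{
    // Hypothetical helper (assumption, not from the original answer):
    // applies the pattern discussed above and returns the matched tokens.
    public static string[] Split(string input)
    {
        // Alternative 1: (?<=")[^"]*(?=")  text between two quotes; the quotes are not consumed.
        // Alternative 2: [^" ]+            a run of characters that are neither a quote nor a space.
        var re = new Regex("(?<=\")[^\"]*(?=\")|[^\" ]+");
        return re.Matches(input).Cast<Match>().Select(m => m.Value).ToArray();
    }

    public static void Main()
    {
        // Brackets make leading or trailing spaces in a token visible.
        foreach (var token in Split("This is a \"Very Long Test\""))
        {
            Console.WriteLine("[" + token + "]");
        }
    }
}
```

For the sample input this should print [This], [is], [a] and [Very Long Test], one per line.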

For the sample input, it gives the same output as your version. The code itself is much simpler, but the regular expression might be hard to understand, especially if you're not used to them.
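One caveat worth checking before relying on the lookaround pattern: because the lookarounds do not consume the quotes, text that sits between two quoted groups is also preceded and followed by a quote. As far as I can trace it, an input such as "a b" c "d e" then yields the middle token as " c " with its surrounding spaces. The sketch below is an assumed variant, not the pattern from the answer: it consumes the quotes as part of the match and reads the quoted text from a capture group. The names QuoteAwareSplitter and Split are hypothetical.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

public static class QuoteAwareSplitter
{
    // Assumed variant (not the answer's pattern): the quotes are part of the match,
    // and the text between them is captured in group 1.
    private static readonly Regex Token = new Regex("\"([^\"]*)\"|[^\" ]+");

    public static List<string> Split(string input)
    {
        var result = new List<string>();
        foreach (Match m in Token.Matches(input))
        {
            // Group 1 succeeds only when the quoted alternative matched;
            // otherwise the whole match is a bare word.
            result.Add(m.Groups[1].Success ? m.Groups[1].Value : m.Value);
        }
        return result;
    }

    public static void Main()
    {
        // Two quoted groups with a bare word between them.
        foreach (var token in Split("\"a b\" c \"d e\""))
        {
            Console.WriteLine("[" + token + "]");
        }
    }
}
```

With this variant the input above should split into a b, c and d e. Escaped quotes inside a quoted group are not handled; that would need a different pattern or a hand-written scanner.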

Code Snippets

string baseString = "This is a \"Very Long Test\"";
var re = new Regex("(?<=\")[^\"]*(?=\")|[^\" ]+");
var strings = re.Matches(baseString).Cast<Match>().Select(m => m.Value).ToArray();

Context

StackExchange Code Review Q#10826, answer score: 10
