Lazy String.Split
Problem
C#'s `String.Split` method comes from C# 2.0, and lazy operations weren't a feature back then. The task is to split a string according to a (single) separator. With `String.Split`, that looks like this:
```
string[] split = myString.Split(new string[] { separator }, StringSplitOptions.None);
```
Now, that's not bad, but if you want to apply more operations to that `string[]` (and you probably do), you'll need to loop over the whole array, basically iterating the string twice. Using the coroutine-like behaviour of the lazy `yield` keyword, you can (maybe) do more than one operation while iterating over the string only once.
```
public static IEnumerable<string> LazySplit(this string stringToSplit, string separator) {
    if (stringToSplit == null) throw new ArgumentNullException("stringToSplit");
    if (separator == null) throw new ArgumentNullException("separator");
    var lastIndex = 0;
    var index = -1;
    do {
        index = stringToSplit.IndexOf(separator, lastIndex);
        // The comparison below is incomplete in the source text.
        if (index = lastIndex) {
            yield return stringToSplit.Substring(lastIndex, index - lastIndex);
        }
        lastIndex = index + separator.Length;
    } while (index > 0);
}
```
While this does not have the "remove empty entries" option, using `myString.LazySplit(separator).Where(str => !String.IsNullOrWhiteSpace(str))` should do the job with an O(n) operation, or am I wrong here?
I'm not sure about the time complexity using co-routines, but for the functionality I've written some unit tests to be sure it's working:
```
[TestMethod]
public void LazyStringSplit() {
    var str = "ab;cd;;";
    var resp = str.LazySplit(";");
    var expected = new[] { "ab", "cd", "" };
    var result = resp.ToArray();
    CollectionAssert.AreEqual(expected, result);
}
[TestMethod]
public void LazyStringSplitEmptyString() {
    var str = "";
    var resp = str.LazySplit(";");
    var expected = new string[0];
    var result = resp.ToArray();
    CollectionAssert.AreEqual(expected, result);
}
[TestMethod]
public void Lazy
```
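The single-pass idea can be checked with a small sketch. It assumes nothing about the `LazySplit` above: `Segments` is a hypothetical stand-in that splits on a single character, `Produced` is a hypothetical counter added only to observe how many segments the iterator actually generated, and `FirstNonBlank` is a hypothetical helper that chains `Where` and `First` on top of it.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class DeferredDemo
{
    // Hypothetical counter: how many segments the iterator has produced so far.
    public static int Produced;

    // Hypothetical stand-in splitter (single-character separator, keeps empty segments).
    public static IEnumerable<string> Segments(string value, char separator)
    {
        var start = 0;
        for (var i = 0; i <= value.Length; i++)
        {
            // A segment ends at each separator and at the end of the string.
            if (i == value.Length || value[i] == separator)
            {
                Produced++;
                yield return value.Substring(start, i - start);
                start = i + 1;
            }
        }
    }

    // Filters blank segments and stops at the first match; nothing after it is scanned.
    public static string FirstNonBlank(string value, char separator)
    {
        Produced = 0;
        return Segments(value, separator)
            .Where(s => !string.IsNullOrWhiteSpace(s))
            .First();
    }
}
```

Calling `DeferredDemo.FirstNonBlank(";ab;cd;ef", ';')` returns `"ab"` and leaves `Produced` at 2: the iterator stopped after the second segment, so the tail of the string was never scanned. Each `Where` or `Select` chained on top adds a constant amount of work per segment rather than another pass over the string.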
Solution
Edge cases:
- `";abc".LazySplit(";")` will return an empty sequence. To match the behaviour of `";abc".Split(new char[] { ';' })` it should return the sequence `{ "", "abc" }`.
- `";abc".LazySplit("")` will return a sequence with a single item, the empty string. To match the behaviour of `";abc".Split(new char[] { })` it should return the sequence `{ ";abc" }`.
Here's how I would suggest writing it.
First, deal with the empty separator:
```
if (separator.Length == 0)
{
    yield return value;
    yield break;
}
```
Then have two variables, `start` and `end`, that refer to the start and end of the substring we want to extract.
```
var start = 0;
for (var end = value.IndexOf(separator); end != -1; end = value.IndexOf(separator, start))
{
    yield return value.Substring(start, end - start);
    start = end + separator.Length;
}
yield return value.Substring(start);
```
To make your unit tests match the behaviour of `string.Split`, you also want to change `LazyStringSplit` to have
```
var expected = new[] { "ab", "cd", "", "" };
```
and `LazyStringSplitEmptyString` to have
```
var expected = new string[] { "" };
```
If you want to test that your implementation matches the behaviour of `string.Split`, I would suggest introducing a helper method for the tests. Something like
```
var expected = value.Split(new string[] { separator }, StringSplitOptions.None);
CollectionAssert.AreEqual(expected, value.LazySplit(separator).ToArray());
```
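Putting the fragments of this answer together, the complete method might look like the sketch below. Several details are assumptions rather than part of the answer: the class name `StringExtensions`, the wrapper/iterator split (so the null checks carried over from the question run at call time instead of on first enumeration), the use of `StringComparison.Ordinal` (chosen because `string.Split` matches separators ordinally), and the hypothetical `MatchesStringSplit` helper, which mirrors the suggested test helper without depending on a test framework.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class StringExtensions // hypothetical class name
{
    public static IEnumerable<string> LazySplit(this string value, string separator)
    {
        // Validate eagerly; an iterator body would not run until enumeration starts.
        if (value == null) throw new ArgumentNullException("value");
        if (separator == null) throw new ArgumentNullException("separator");
        return LazySplitIterator(value, separator);
    }

    private static IEnumerable<string> LazySplitIterator(string value, string separator)
    {
        // An empty separator never matches, so the whole string is the only segment.
        if (separator.Length == 0)
        {
            yield return value;
            yield break;
        }

        var start = 0;
        // Ordinal comparison is an assumption (see the note above the block).
        for (var end = value.IndexOf(separator, StringComparison.Ordinal);
             end != -1;
             end = value.IndexOf(separator, start, StringComparison.Ordinal))
        {
            yield return value.Substring(start, end - start);
            start = end + separator.Length;
        }

        // Whatever follows the last separator, possibly the empty string.
        yield return value.Substring(start);
    }

    // Hypothetical helper: true when LazySplit agrees with string.Split.
    public static bool MatchesStringSplit(string value, string separator)
    {
        var expected = value.Split(new string[] { separator }, StringSplitOptions.None);
        return expected.SequenceEqual(value.LazySplit(separator));
    }
}
```

With this version, `"ab;cd;;".LazySplit(";")` yields `"ab"`, `"cd"`, `""`, `""`, and `"".LazySplit(";")` yields a single empty string, which is what the adjusted unit tests expect.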
Context
StackExchange Code Review Q#84163, answer score: 9