patterncsharpMinor
Partitioning a string into chunks
Viewed 0 times
intochunksstringpartitioning
Problem
Inspired by this question: Split a string into chunks of the same length
The code is designed to work on text elements rather than chars to avoid unicode problems.
And a couple of unit tests
I usually use Machine Specifications for unit tests so any tips on writing MSTest unit tests would be great especially naming conventions!
The method feels slightly too long to me but I couldn't see a nice way of splitting it up.
The code is designed to work on text elements rather than chars to avoid unicode problems.
public static class StringExtensions
{
public static IEnumerable Partition(this string value, int chunkSize)
{
if (value == null)
{
throw new ArgumentNullException(nameof(value));
}
if (chunkSize < 1)
{
throw new ArgumentOutOfRangeException(nameof(chunkSize));
}
var sb = new StringBuilder(chunkSize);
var enumerator = StringInfo.GetTextElementEnumerator(value);
while (enumerator.MoveNext())
{
sb.Append(enumerator.GetTextElement());
for (var i = 0; i < chunkSize - 1; i++)
{
if (!enumerator.MoveNext())
{
break;
}
sb.Append(enumerator.GetTextElement());
}
yield return sb.ToString();
sb.Length = 0;
}
}
}And a couple of unit tests
[TestMethod]
public void Partition_SplittingAnAsciiString_ShouldSplitTheStringIntoTheRequiredChunkSize()
{
string input = "123456";
string[] expected = { "123", "456" };
string[] actual = input.Partition(3).ToArray();
CollectionAssert.AreEqual(expected, actual);
}
[TestMethod]
public void Partition_SplittingAPartiallyDecomposedString_ShouldSplitTheStringIntoTheRequiredChunkSize()
{
string input = "éée\u0301";
string[] expected = { "é", "é", "e\u0301" };
string[] actual = input.Partition(1).ToArray();
CollectionAssert.AreEqual(expected, actual);
}I usually use Machine Specifications for unit tests so any tips on writing MSTest unit tests would be great especially naming conventions!
The method feels slightly too long to me but I couldn't see a nice way of splitting it up.
Solution
Nice and clean implementation. I tested this with the tests (the new ones) of my implementation and they all passed (skipping the tests with chunkSize == 0).
The code could be sligthly more readable by having some vertical space to group related code, for instance after validating the input.
You could improve this a little bit by just having this
after the validation, in this way you wouldn't need to create a
A slightly different approach could be to remove the
Unfortunately this makes the code 2 lines (3 with the vertical spacing) longer
Regarding naming of unit tests, I usually (not in the posted question of mine) use the pattern
UnitOfWork_StateUnderTest_ExpectedBehavior
like shown in the accepted answer over here: unit test naming best practices
The code could be sligthly more readable by having some vertical space to group related code, for instance after validating the input.
You could improve this a little bit by just having this
if (chunkSize == 1)
{
yield return value;
yield break;
}after the validation, in this way you wouldn't need to create a
StringBuilder nor having the enumerator. A slightly different approach could be to remove the
for loop. It makes the intent more clear (IMO) and removes the need to double check return value of enumerator.MoveNext(). Unfortunately this makes the code 2 lines (3 with the vertical spacing) longer
var sb = new StringBuilder(chunkSize);
var enumerator = StringInfo.GetTextElementEnumerator(value);
var counter = 0;
while (enumerator.MoveNext())
{
counter++;
sb.Append(enumerator.GetTextElement());
if (counter == chunkSize)
{
yield return sb.ToString();
sb.Length = 0;
counter = 0;
}
}
if (counter > 0)
{
yield return sb.ToString();
}Regarding naming of unit tests, I usually (not in the posted question of mine) use the pattern
UnitOfWork_StateUnderTest_ExpectedBehavior
like shown in the accepted answer over here: unit test naming best practices
Code Snippets
if (chunkSize == 1)
{
yield return value;
yield break;
}var sb = new StringBuilder(chunkSize);
var enumerator = StringInfo.GetTextElementEnumerator(value);
var counter = 0;
while (enumerator.MoveNext())
{
counter++;
sb.Append(enumerator.GetTextElement());
if (counter == chunkSize)
{
yield return sb.ToString();
sb.Length = 0;
counter = 0;
}
}
if (counter > 0)
{
yield return sb.ToString();
}Context
StackExchange Code Review Q#112004, answer score: 3
Revisions (0)
No revisions yet.