HiveBrain v1.2.0
Get Started
← Back to all entries
patterncsharpMinor

LINQ and string.Split do it yourself practice

Submitted by: @import:stackexchange-codereview··
0
Viewed 0 times
linqyourselfpracticesplitandstring

Problem

Problem Statement:


Implement a sentence scanning functionality to parse the sentence and return a string to concatenate each word with its number of occurrences, sorted by the number of occurrences in descending order.


For example, with the sentence "code review connects the world! share the code" , the function returns the string: {code: 2, the: 2, review: 1, connects: 1, world: 1, share: 1}.
The delimiters are customized as function input argument, for the above test case, the delimiters are " ,!".

Introduction of algorithm

I spent over an hour to write the algorithm first with time complexity \$O(nm)\$, where \$n\$ is the sentence's length and \$m\$ is delimiters length. I practiced using C#, tried to write a string.Split(string) by myself, and also learn to write LINQ for correct syntax by looking up the stackoverflow question to make the code more succinct.

I am still learning to write readable and clean code. The code passes two test case function calls in the main function. Please help me to improve.

```
using System;
using System.Collections.Generic;
using System.Diagnostics;
using System.Linq;
using System.Text;

class ScanArticle
{
static void Main(string[] args)
{
RunTestcaseOnSplit();
RunTestcase();
}

public static void RunTestcase()
{
var result = WordCountPractice(" code review connects the world! share the code", " ,?!");
Debug.Assert(result.CompareTo("{code: 2, the: 2, review: 1, connects: 1, world: 1, share: 1}") == 0);
}

///
/// test do it yourself string.Split(string)
/// Test case: " a,b,c", splitted words: new string[]{"a","b","c"}
/// Test case: " a,b,c,", splitted words: new string[]{"a","b","c"}
///
public static void RunTestcaseOnSplit()
{
var words = stringSplitDoItYourSelf(" a,b,c", " ,");
Debug.Assert(words[0].CompareTo("a") == 0);

var words2 = stringSplitDoItYourSelf(" a,b,c,", " ,");

Solution

There is not much LINQ in your code and this is a really long solution for such a simple task but let's try to change it.

You want to split a sentence like this one:

var sentence = " code review connects the world! share the code";


so why not use regex for this and split on every occurance of ,?!.. You then trim each word and filter the empty results out. Then you group each word igrnoring its case, sort the groups by count in a descending order and create strings.

var words = 
    Regex.Split(sentence, "[ ,?!.]")
    .Select(x => x.Trim())
    .Where(x => !string.IsNullOrEmpty(x))
    .GroupBy(x => x, StringComparer.OrdinalIgnoreCase)
    .OrderByDescending(g => g.Count());

var summary = $"{{{string.Join(", ", words.Select(g => $"{g.Key}: {g.Count()}"))}}}";


result

{code: 2, the: 2, review: 1, connects: 1, world: 1, share: 1}


Review

private static void concatenateContent(IList words, IEnumerable> wordCount)
{
    const string openParenthese = "{";
    const string closeParenthese = "}";
    const string colon = ": ";
    const string comma = ", ";

    words.Add(openParenthese);

    int length = wordCount.Count();
    int index = 0;

    foreach (var item in wordCount)
    {
        StringBuilder current = new StringBuilder();

        current.Append(item.Key + colon + item.Value);

        if (index < length - 1)
        {
            current.Append(comma);
        }

        words.Add(current.ToString());

        index++;
    }

    words.Add(closeParenthese);
}


You should not return a result via a parameter that is not a ref or out one because I'd be really shocked if you modified my list without giving me any hint that it will happen.

The IList words should be a return value not a parameter.

Using the StringBuilder current = new StringBuilder(); inside a loop does not make much sense and you use the + concatenation anyway item.Key + colon + item.Value. You might as well concatenate the string in the normal way because the comptiler will optimize it. The StringBuilder will become useful when you define it outside the loop and then build the string.

You don't need the index++; because you can just check the words.Count property.

Ideally this method could look like this:

private static string concatenateContent(IEnumerable> wordCount)
{
    const string openParenthese = "{";
    const string closeParenthese = "}";
    const string colon = ": ";
    const string comma = ", ";

    var result = new StringBuilder();
    result.Append(openParenthese);

    foreach (var x in wordCount.Select((item, index) => new { item, index }))
    {
        result
            .Append(x.index > 0 ? colon : string.Empty)
            .Append(x.item.Key)
            .Append(colon)
            .Append(x.item.Value);
    }

    result.Append(closeParenthese);

    return result.ToString();
}

Code Snippets

var sentence = " code review connects the world! share the code";
var words = 
    Regex.Split(sentence, "[ ,?!.]")
    .Select(x => x.Trim())
    .Where(x => !string.IsNullOrEmpty(x))
    .GroupBy(x => x, StringComparer.OrdinalIgnoreCase)
    .OrderByDescending(g => g.Count());

var summary = $"{{{string.Join(", ", words.Select(g => $"{g.Key}: {g.Count()}"))}}}";
{code: 2, the: 2, review: 1, connects: 1, world: 1, share: 1}
private static void concatenateContent(IList<string> words, IEnumerable<KeyValuePair<string, int>> wordCount)
{
    const string openParenthese = "{";
    const string closeParenthese = "}";
    const string colon = ": ";
    const string comma = ", ";

    words.Add(openParenthese);

    int length = wordCount.Count();
    int index = 0;

    foreach (var item in wordCount)
    {
        StringBuilder current = new StringBuilder();

        current.Append(item.Key + colon + item.Value);

        if (index < length - 1)
        {
            current.Append(comma);
        }

        words.Add(current.ToString());

        index++;
    }

    words.Add(closeParenthese);
}
private static string concatenateContent(IEnumerable<KeyValuePair<string, int>> wordCount)
{
    const string openParenthese = "{";
    const string closeParenthese = "}";
    const string colon = ": ";
    const string comma = ", ";

    var result = new StringBuilder();
    result.Append(openParenthese);

    foreach (var x in wordCount.Select((item, index) => new { item, index }))
    {
        result
            .Append(x.index > 0 ? colon : string.Empty)
            .Append(x.item.Key)
            .Append(colon)
            .Append(x.item.Value);
    }

    result.Append(closeParenthese);

    return result.ToString();
}

Context

StackExchange Code Review Q#159777, answer score: 7

Revisions (0)

No revisions yet.