HiveBrain v1.2.0
Get Started
← Back to all entries
patterncsharpMinor

Counting occurrences of different categories of characters

Submitted by: @import:stackexchange-codereview··
0
Viewed 0 times
countingoccurrencesdifferentcharacterscategories

Problem

This is a sort of follow up to my previous question:

Counting the number of character occurrences

This time I have written code that still counts the number of different characters in a string, but with the added ability to find categories of characters. For example, finding how many numbers are in the string, how many punctuation characters are in the string.

Is there anything I should be doing differently? Any improvements? Also, I'm trying to learn better OOP design and software design and best practice, so any advice on that would also be helpful.

A couple of notes:

  • From what I've read immutable objects are preferred, so I created the "results" class to have private setters and return a read only dictionary to stop something else from changing it.



  • Is the way I've created the CharacterCountResult object from CharacterCount a good thing to do? As in, am I doing it correctly?



static void Main(string[] args)
{
    var i = CharacterCount.Count(@"Hello, World! £$% ^ powejdoiwr3u?!?!!/1;';'\\z\\]p[\][z]z\,.,/???");

    // Demonstrating some of the avaliable properties
    Console.WriteLine("Alphanumeric: {0}\nLowercase: {1}\nUppercase: {2}\nPunctuation: {3}\nDigits: {4}\nSymbols: {5}",
        i.LetterAndDigitCount, i.LowercaseCount, i.UppercaseCount, i.PunctuationCount, i.DigitCount, i.SymbolCount);

    foreach (var character in i.GetCharacterDictionary())
    {
        Console.WriteLine("{0} - {1}", character.Key, character.Value);
    }
}


This is the class that counts the characters in the string:

```
class CharacterCount
{
public static CharacterCountResult Count(string stringToCount)
{
var tempDictionary = new Dictionary();

uint controlCount = 0;
uint highSurrogatecount = 0;
uint lowSurrogateCount = 0;
uint whiteSpaceCount = 0;
uint symbolCount = 0;
uint punctuationCount = 0;
uint separatorCount = 0;
uint letterCount = 0;
uint digitCount = 0;

Solution

The good:

  • Good naming, both in following conventions and in picking expressive names



  • Your approach is good procedurally- e.g. good use of the dictionary and in-build char methods.



  • Putting the results in their own class and making it immutable through the use of private setters and the ReadOnlyDictionary is smart.



In terms of areas for potential improvement, reviewing is made somewhat difficult because of the nature of the made-up requirements this code is fulfilling. To demonstrate why, here's my train of thought:

It's not very realistic that you always want to get all this information together, so the first thing might be to think how to extract each piece of information individually. The first step in doing this would be to extract out the individual counts into their own methods, like:

public int CountLetters(string stringToCount)
{
    var letterCount = 0;
    foreach(var character in stringToCount)
    {
        if (char.IsLetter(character))
        {
            letterCount++;
        }
    }
    return letterCount;
}


Unfortunately as you can see, this is pretty quickly going to end up with a load of really similar methods, all of which are frustratingly large. The solution to this is two-fold.

First, because the only difference between the methods is which char function we call, the use of Func can cut it down to just one method:

public int CountLetters(string stringToCount, Func predicate)
{
    var letterCount = 0;
    foreach(var character in stringToCount)
    {
        if (predicate(character))
        {
            letterCount++;
        }
    }
    return letterCount;
}


Second, we can use LINQ:

public int CountLetters(string stringToCount, Func predicate)
{
    return stringToCount.Count(predicate);
}


And now we find the method is so simple that it probably doesn't need to be its own method at all. For example, if in the middle of some other method I wanted to count how many characters in a string were letters I could just do:

var letterCount = myString.Count(char.IsLetter);


This goes back to what I was saying about this being difficult code to review, because it essentially exists to fulfill requirement which is already fulfilled so simply by the .NET framework.

However, if we go with the fiction that exposing the counts for all those different char methods for a single string is something that is done commonly throughout your program, then your approach is sensible. Using the LINQ-style counting above, you can remove the second foreach statement and all the ifs inside, replacing them with one-liners. You can also remove all the variable declarations and feed them right into the result constructor, since they are so simple:

return new CharacterCountResult(
    stringToCount.Count(char.IsControl), 
    //etc...


Going forward, my suggestion is that you'll learn more about design by tackling problems that are a little more realistic than this one, and they will probably garner more informative reviews, too.

Code Snippets

public int CountLetters(string stringToCount)
{
    var letterCount = 0;
    foreach(var character in stringToCount)
    {
        if (char.IsLetter(character))
        {
            letterCount++;
        }
    }
    return letterCount;
}
public int CountLetters(string stringToCount, Func<string, bool> predicate)
{
    var letterCount = 0;
    foreach(var character in stringToCount)
    {
        if (predicate(character))
        {
            letterCount++;
        }
    }
    return letterCount;
}
public int CountLetters(string stringToCount, Func<string, bool> predicate)
{
    return stringToCount.Count(predicate);
}
var letterCount = myString.Count(char.IsLetter);
return new CharacterCountResult(
    stringToCount.Count(char.IsControl), 
    //etc...

Context

StackExchange Code Review Q#63995, answer score: 7

Revisions (0)

No revisions yet.