Syllable-counting function

Submitted by: @import:stackexchange-codereview

Problem

I am working on a syllable-counting function for a text editor (it is accurate enough for my purposes). It already runs on its own thread, but I would like to know whether there is any optimization I can apply to make it more efficient.

private static int SyllableCount(string word)
{
    word = word.ToLower().Trim();
    int count = System.Text.RegularExpressions.Regex.Matches(word, "[aeiouy]+").Count;
    if ((word.EndsWith("e") || (word.EndsWith("es") || word.EndsWith("ed"))) && !word.EndsWith("le"))
        count--;
    return count;
}


It uses regular expressions, which this source says perform poorly in .NET applications. Is that the case? If not, are there any other optimizations I can make?

It does not lag much, but my application already uses around four threads just to keep up with various text-entry statistics, so I'm trying to trim the fat, so to speak.

Solution

Regex is likely to be slower here because it is a powerful, general-purpose engine: it can't assume you will only ever need the simple matching you are doing. This particular pattern is simple (no lookbehinds, etc.), but there is still some inherent overhead in using Regex, and you can trim it. Instead, iterate through the string and count the groups of consecutive vowels yourself, in linear time and with very little overhead.

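// Requires "using System.Linq;" for Contains on the array.
private static int SyllableCount(string word)
{
    word = word.ToLower().Trim();
    // Locals must be assigned before they are read, or the code will not compile.
    bool lastWasVowel = false;
    var vowels = new[] { 'a', 'e', 'i', 'o', 'u', 'y' };
    int count = 0;

    // A string is an IEnumerable<char>; convenient.
    foreach (var c in word)
    {
        if (vowels.Contains(c))
        {
            // Count only the first vowel in each run of consecutive vowels.
            if (!lastWasVowel)
                count++;
            lastWasVowel = true;
        }
        else
            lastWasVowel = false;
    }

    // Same suffix adjustment as the original function.
    if ((word.EndsWith("e") || word.EndsWith("es") || word.EndsWith("ed"))
          && !word.EndsWith("le"))
        count--;

    return count;
}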


I'd A/B test the above algorithm against the one you already have; you should see at least some performance increase. Note that although this version may well be faster, since it does exactly what you need and nothing more, it takes more lines of code to achieve the same result. That is Regex's real strength: powerful string analysis with very concise code.
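
One way to run that comparison is a small Stopwatch harness like the sketch below. It is only an illustration, not a rigorous benchmark: the class name SyllableBenchmark, the method names, the ApplySuffixRule helper, the sample words, and the iteration count are all made up for this example, and the loop version uses Array.IndexOf rather than LINQ's Contains to keep the dependencies minimal. Timings will vary by machine and runtime, so treat the numbers as a rough guide.

```csharp
using System;
using System.Diagnostics;
using System.Text.RegularExpressions;

public static class SyllableBenchmark
{
    private static readonly char[] Vowels = { 'a', 'e', 'i', 'o', 'u', 'y' };

    // Suffix rule shared by both versions, copied from the question.
    private static int ApplySuffixRule(string word, int count)
    {
        if ((word.EndsWith("e") || word.EndsWith("es") || word.EndsWith("ed"))
            && !word.EndsWith("le"))
            count--;
        return count;
    }

    // Regex version, as in the question.
    public static int CountWithRegex(string word)
    {
        word = word.ToLower().Trim();
        int count = Regex.Matches(word, "[aeiouy]+").Count;
        return ApplySuffixRule(word, count);
    }

    // Loop version, following the answer's approach.
    public static int CountWithLoop(string word)
    {
        word = word.ToLower().Trim();
        bool lastWasVowel = false;
        int count = 0;
        foreach (char c in word)
        {
            bool isVowel = Array.IndexOf(Vowels, c) >= 0;
            // Count only the first vowel in each run of consecutive vowels.
            if (isVowel && !lastWasVowel)
                count++;
            lastWasVowel = isVowel;
        }
        return ApplySuffixRule(word, count);
    }

    // Runs `iterations` passes over `words` and returns the elapsed milliseconds.
    public static long Time(Func<string, int> counter, string[] words, int iterations)
    {
        // Warm-up pass so one-time JIT and regex setup costs are not measured.
        foreach (string w in words)
            counter(w);

        var stopwatch = Stopwatch.StartNew();
        for (int i = 0; i < iterations; i++)
            foreach (string w in words)
                counter(w);
        stopwatch.Stop();
        return stopwatch.ElapsedMilliseconds;
    }

    public static void Main()
    {
        // Hypothetical sample input; use text representative of your editor's workload.
        string[] words = { "syllable", "counting", "function", "table", "queue", "jumped" };
        Console.WriteLine("Regex: {0} ms", Time(CountWithRegex, words, 100000));
        Console.WriteLine("Loop:  {0} ms", Time(CountWithLoop, words, 100000));
    }
}
```

Before trusting the timings, it is worth checking that both versions return the same count for every word in your sample, since a faster function that disagrees with the original is not a drop-in replacement.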

Context

StackExchange Code Review Q#9972, answer score: 4
