debugcsharpMinor
Find all words that can create a pyramid
Viewed 0 times
cancreateallwordspyramidthatfind
Problem
I wanted to write something that would find all pyramid words within a supplied text file or list of strings. An example of a pyramid word is the word "deadheaded". It has one of the letter "h", two of "a", three of "e", and four of "d".
This would form a pyramid if you were to stack the letters:
I'm looking for input regarding various things, such as design (is this constructed well for a user and easy to understand?), extensibility, using the right data structures, dealing with input/output...
Would this be better designed with more OOP in it? For example, I could have a base class with sub classes finding words with different shapes than a pyramid. But I'm not sure how much it would help in this design, as it seems fairly script-like with all the string manipulation.
```
public class PyramidWordsFinder
{
public PyramidWordsFinder()
{
this.pyramidWordList = new List();
}
public IList pyramidWordList { get; set; }
public IReadOnlyList LoadDictionaryFile(string filePath)
{
List wordsDictionary = new List();
Encoding enc = Encoding.GetEncoding(1250);
string[] lines = File.ReadAllLines(filePath, enc);
wordsDictionary.AddRange(lines.Where(x => x.Length > 2));
return wordsDictionary.AsReadOnly();
}
public void WriteWordListToFile(IEnumerable words, string fullFileName)
{
if (String.IsNullOrWhiteSpace(fullFileName))
{
throw new ArgumentException("The file name is not correctly formed");
}
File.WriteAllLines(fullFileName, words);
}
public IList FindPyramidWords(IReadOnlyList wordList)
{
foreach (var word in wordList)
{
// Quick checks before seeing if pyramid word
if (this.WordIsCorrectLength(word) && (word.ToLower().Distinct().Count() == this.GetMaximumLetterFrequency(word)))
{
if (this.DetermineIfWordIsPyramid(word))
This would form a pyramid if you were to stack the letters:
h
a a
e e e
d d d dI'm looking for input regarding various things, such as design (is this constructed well for a user and easy to understand?), extensibility, using the right data structures, dealing with input/output...
Would this be better designed with more OOP in it? For example, I could have a base class with sub classes finding words with different shapes than a pyramid. But I'm not sure how much it would help in this design, as it seems fairly script-like with all the string manipulation.
```
public class PyramidWordsFinder
{
public PyramidWordsFinder()
{
this.pyramidWordList = new List();
}
public IList pyramidWordList { get; set; }
public IReadOnlyList LoadDictionaryFile(string filePath)
{
List wordsDictionary = new List();
Encoding enc = Encoding.GetEncoding(1250);
string[] lines = File.ReadAllLines(filePath, enc);
wordsDictionary.AddRange(lines.Where(x => x.Length > 2));
return wordsDictionary.AsReadOnly();
}
public void WriteWordListToFile(IEnumerable words, string fullFileName)
{
if (String.IsNullOrWhiteSpace(fullFileName))
{
throw new ArgumentException("The file name is not correctly formed");
}
File.WriteAllLines(fullFileName, words);
}
public IList FindPyramidWords(IReadOnlyList wordList)
{
foreach (var word in wordList)
{
// Quick checks before seeing if pyramid word
if (this.WordIsCorrectLength(word) && (word.ToLower().Distinct().Count() == this.GetMaximumLetterFrequency(word)))
{
if (this.DetermineIfWordIsPyramid(word))
Solution
SRP
You should start by splitting this class into smaller pieces to satisfy the Single Responsibility Priciple. One class should do only one thing and yours is doing at least two:
This means you should:
The first module could be the
Keep the names clear but not too long if this is not necessary.
The second module is the
HashSet
Instead of having a list with predefined lengths, you can calculate them based on the max length and save them in a
LINQ
To determine whether a word is a pyramid you can use LINQ where you first group each letter and sort it in the ascending order. Then you check if the count is increasing by one:
StringComparer
With the
IEnumerable
Don't return a complete list of words if this isn't necessary. Use an iterator instead - by using LINQ or the
Example
You can now easily find all pyramid words with linq:
I'm aware of the fact that the LINQ solution might not be the fastest one but it's definitely the esiest one. If you don't notice any performance hit then why make it so complex? Premature optimization is the root of all evil.
The point is that you should separate each feature so that you can work on it and test it without affecting the others. If you can focus on one thing at a time you can better optimize it.
You should start by splitting this class into smaller pieces to satisfy the Single Responsibility Priciple. One class should do only one thing and yours is doing at least two:
- it reads and writes files
- it finds the pyramid words
This means you should:
- Create one module for reading and writing words into a file
- Let the second module only find pyramid words
The first module could be the
DictionaryFileclass DictionaryFile
{
public IReadOnlyList LoadDictionary(string filePath)
{
// ...
}
public void SaveDictionary(IEnumerable words, string fileName)
{
// ...
}
}Keep the names clear but not too long if this is not necessary.
The second module is the
PyramidWordFinder:class PyramidWordFinder
{
private readonly HashSet _lengths = new HashSet();
public PyramidWordFinder(int maxLength)
{
var length = 3;
var i = 1;
while (length x.ToString(), StringComparer.OrdinalIgnoreCase)
.Select(x => x.Count())
.OrderBy(x => x);
return groups.SequenceEqual(Enumerable.Range(1, groups.Length));
}
}HashSet
Instead of having a list with predefined lengths, you can calculate them based on the max length and save them in a
HashSet for faster lookup which is O(1) in comparison to the list where it's O(n).LINQ
To determine whether a word is a pyramid you can use LINQ where you first group each letter and sort it in the ascending order. Then you check if the count is increasing by one:
var groups = value
.GroupBy(x => x.ToString(), StringComparer.OrdinalIgnoreCase)
.Select(x => x.Count())
.OrderBy(x => x);
return groups.SequenceEqual(Enumerable.Range(1, groups.Length));StringComparer
With the
StringComparer.OrdinalIgnoreCase you don't need to call ToLower or ToUpper.IEnumerable
public IList FindPyramidWords(IReadOnlyList wordList)Don't return a complete list of words if this isn't necessary. Use an iterator instead - by using LINQ or the
yield return. What if you wanted just the first three words? You would need to calculate them all anyway. With deferred execution you can stop whereever you want to.Example
You can now easily find all pyramid words with linq:
var pwf = new PyramidWordFinder(maxLength: 55); // or use constant
var words = new string[] { /* words */ }; // from the file
var pyramidWords = words.Where(pwf.IsPyramidWord).ToList();I'm aware of the fact that the LINQ solution might not be the fastest one but it's definitely the esiest one. If you don't notice any performance hit then why make it so complex? Premature optimization is the root of all evil.
The point is that you should separate each feature so that you can work on it and test it without affecting the others. If you can focus on one thing at a time you can better optimize it.
Code Snippets
class DictionaryFile
{
public IReadOnlyList<string> LoadDictionary(string filePath)
{
// ...
}
public void SaveDictionary(IEnumerable<string> words, string fileName)
{
// ...
}
}class PyramidWordFinder
{
private readonly HashSet<int> _lengths = new HashSet<int>();
public PyramidWordFinder(int maxLength)
{
var length = 3;
var i = 1;
while (length <= maxLength)
{
_lengths.Add(length);
length = ++i + length + 1;
}
}
public bool IsPyramidWord(string value)
{
if (!_lengths.Contains(value.Length)) { return false; }
var groups = value
.GroupBy(x => x.ToString(), StringComparer.OrdinalIgnoreCase)
.Select(x => x.Count())
.OrderBy(x => x);
return groups.SequenceEqual(Enumerable.Range(1, groups.Length));
}
}var groups = value
.GroupBy(x => x.ToString(), StringComparer.OrdinalIgnoreCase)
.Select(x => x.Count())
.OrderBy(x => x);
return groups.SequenceEqual(Enumerable.Range(1, groups.Length));public IList<string> FindPyramidWords(IReadOnlyList<string> wordList)var pwf = new PyramidWordFinder(maxLength: 55); // or use constant
var words = new string[] { /* words */ }; // from the file
var pyramidWords = words.Where(pwf.IsPyramidWord).ToList();Context
StackExchange Code Review Q#149830, answer score: 6
Revisions (0)
No revisions yet.