Finding the most frequently used sequence of characters in a comma-delimited input string of words
Problem
I'm making a demo program for a job interview and I'd like to know if there is anything I can make work faster in my current solution. It's a C# console application that accepts input in the form of a single string with the words delimited by the `,` symbol. The task is to find the most commonly encountered combination of three characters throughout all the words. It's not specified what I should do if there are multiple triplets with the same rate of appearance, so I'm just returning whatever ends up sorted to the first place. Here is my take on this:

```
using System;
using System.Collections.Generic;
using System.Linq;

class Program
{
    static void Main(string[] args)
    {
        string input = args.Length > 0 ? args[0] : null;
        if (input == null) // demo input
            input = "Lorem,ipsum,etiam,habitasse,conubia,sed,habitasse,tristique,,erat,varius,vitae,nunc,vulputate,etiam,proin,,interdum,malesuada,nam,curabitur,nibh,pharetra,ultricies,elit,elementum,viverra,vehicula,lacinia,vestibulum,dapibus,,bibendum,vestibulum,quisque,potenti,dictum,ad,curabitur,neque,,taciti,consequat,malesuada,quisque,ultrices,scelerisque,in,fermentum,fringilla,per,ad.Tortor,habitasse,auctor,consequat,imperdiet,vel,iaculis,suscipit,torquent,,porta,eget,cubilia,cras,quisque,sociosqu,auctor,neque,,ac,dictum,elit,rhoncus,ornare,augue,cras,quis,tempor,sodales,congue,nulla,dictum,quisque,iaculis,magna,mattis,odio,,elementum,varius,turpis,pretium,consequat,gravida,ut,hendrerit,metus,,pulvinar,scelerisque,eu,et,neque,cubilia,mauris,elementum,porttitor,eleifend,vestibulum,luctus,id,diam,pellentesque,convallis,nisi,libero,ante,aliquam,maecenas,facilisis.Suscipit,posuere,gravida,luctus,cursus,erat,eleifend,,magna,tempor,iaculis,arcu,rutrum,viverra,lorem,,posuere,ipsum,leo,aenean,donec,praesent,mollis,phasellus,sociosqu,orci,magna,potenti,donec,curabitur,feugiat,,ultricies,integer,lacus,mollis,porta,consectetur,fames,dolor,,himenaeos,enim,quisque,dapibus,viverra,maecenas,nam,ac,eget,est,s
```
Solution
```
string[] trimmedWords = new string[totalWords];
for (int i = 0; i < totalWords; i++)
    trimmedWords[i] = words[i].Trim();
```

You should learn about LINQ. (Or maybe just learn more, since you're already using it elsewhere.) With it, you can write this as follows (possibly adding `ToArray()` if you require the result to be an array):

```
var trimmedWords = words.Select(w => w.Trim());
```

```
for (int j = 0; j < totalWords; j++)
{
    string word = trimmedWords[j];
```

This is the only place in the whole loop where you're using `j`, so you should have used `foreach` instead, since it's simpler.

```
if (!hs.Contains(triplet))
{
    hs.Add(triplet);
    dic[triplet] = 1;
}
else
    dic[triplet]++;
```

I don't see any reason for the `HashSet` here; `Dictionary` works just as well for deciding whether it contains something (use its `ContainsKey()` method).

```
var sortedList = dic.OrderByDescending(x => x.Value);
string output = sortedList.First().ToString();
```

If you want just the largest value, then sorting the whole collection is unnecessary. You could use `MaxBy()` from MoreLINQ to do this (or write one yourself, it's not hard).

You could also use LINQ instead of the `Dictionary` and `HashSet`; a fully LINQed solution would be:

```
input.Split(',')
    .Select(w => w.Trim())
    // Words shorter than three characters have no triplets; without this
    // filter, Enumerable.Range throws on a negative count for empty words.
    .Where(w => w.Length >= 3)
    .SelectMany(w => Enumerable.Range(0, w.Length - 2).Select(i => w.Substring(i, 3)))
    .GroupBy(t => t)
    .MaxBy(g => g.Count())
    .Key
```

(Though I'm not saying writing everything in a single LINQ expression is the best solution here.)
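
For comparison, here is a minimal sketch of how the non-LINQ suggestions might fit together: a `foreach` over the words, a single `Dictionary` for counting, and one pass to find the maximum instead of a sort. The names `TripletCounter` and `MostFrequent` are made up for this illustration and do not come from the original program, and the sketch assumes C# 7 or later for the `out int` declaration. Ties are resolved by whichever entry the dictionary happens to enumerate first, which is not a guaranteed order.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical helper, not part of the original program.
static class TripletCounter
{
    // Returns the most frequent 3-character substring and its count,
    // or null when no word has at least three characters.
    public static KeyValuePair<string, int>? MostFrequent(string input)
    {
        var counts = new Dictionary<string, int>();

        foreach (string rawWord in input.Split(','))
        {
            string word = rawWord.Trim();

            // The loop body never runs for words shorter than three characters.
            for (int i = 0; i + 3 <= word.Length; i++)
            {
                string triplet = word.Substring(i, 3);

                // TryGetValue leaves count at 0 when the key is missing,
                // so no separate HashSet or ContainsKey check is needed.
                counts.TryGetValue(triplet, out int count);
                counts[triplet] = count + 1;
            }
        }

        // Single pass over the counts instead of sorting the whole collection.
        KeyValuePair<string, int>? best = null;
        foreach (var pair in counts)
        {
            if (best == null || pair.Value > best.Value.Value)
                best = pair;
        }
        return best;
    }
}
```

Calling `TripletCounter.MostFrequent(input)` and printing the result when it is not null would stand in for the sort-then-`First()` step. On .NET 6 and later, `Enumerable.MaxBy` in `System.Linq` could replace the final loop.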
Context
StackExchange Code Review Q#59135, answer score: 6