Find most occurring word in a txt file
Problem
Assume that we have a .txt file that has one word per line. Find out the word that occurs the most.
Here's what I was able to write (I used an array of strings instead of a file in this example):
string[] source = { "test1", "test2", "test3", "test4", "test1", "test1", "test3" };
Dictionary<string, int> dic = source.Distinct().ToDictionary(p => p, p => 0);
var keys = new List<string>(dic.Keys);
foreach (string key in keys)
{
    dic[key] = source.Count(f => f == key);
}
int max = dic.Values.Max();
foreach (KeyValuePair<string, int> kvp in dic)
{
    if (kvp.Value == max)
    {
        Console.WriteLine(kvp.Key + " " + max);
        break;
    }
}
Questions:
- Can this be done in a better and more efficient way (speed/memory)?
- What if the file size is 10 GB? How would you do it differently from my approach?
Solution
You are counting each key separately, which means iterating through the entire list once for every distinct key. Instead, you can keep a running total for each key and iterate through the list only once:
string[] source = { "test1", "test2", "test3", "test4", "test1", "test1", "test3" };
Dictionary<string, int> dic = new Dictionary<string, int>();
foreach (string s in source)
{
    // ContainsKey looks the word up directly in the dictionary
    if (dic.ContainsKey(s))
        dic[s]++;      // word seen before: bump its running total
    else
        dic.Add(s, 1); // first occurrence of this word
}
EDIT: I did not include getting the max value, as what you have works for that and has already been rewritten by thantos.
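For the 10 GB case raised in the question, one possible approach is to stream the file line by line and track the current leader during the same pass, so no second loop over the dictionary is needed. The following is only a sketch under a few assumptions: the names WordCounter and MostFrequent are hypothetical, blank lines are skipped, comparison is case-sensitive, and ties go to the word that reached the top count first. File.ReadLines enumerates lazily, so the whole file is never held in memory at once.

```csharp
using System;
using System.Collections.Generic;
using System.IO;

static class WordCounter // hypothetical helper, not part of the original answer
{
    // Counts words in one pass and remembers the most frequent one seen so far.
    // Returns a pair with a null key and a count of 0 for an empty sequence.
    public static KeyValuePair<string, int> MostFrequent(IEnumerable<string> words)
    {
        var counts = new Dictionary<string, int>(StringComparer.Ordinal);
        string best = null;
        int bestCount = 0;

        foreach (string word in words)
        {
            // Assumption: blank lines are not words and are skipped.
            if (string.IsNullOrWhiteSpace(word))
                continue;

            int current;
            counts.TryGetValue(word, out current); // current stays 0 if the word is new
            current++;
            counts[word] = current;

            // Strictly greater: on a tie, the earlier leader is kept.
            if (current > bestCount)
            {
                best = word;
                bestCount = current;
            }
        }

        return new KeyValuePair<string, int>(best, bestCount);
    }
}

static class Program
{
    static void Main(string[] args)
    {
        string[] source = { "test1", "test2", "test3", "test4", "test1", "test1", "test3" };
        KeyValuePair<string, int> result = WordCounter.MostFrequent(source);
        Console.WriteLine(result.Key + " " + result.Value); // prints "test1 3"

        // With a file path as the first argument, the lines are streamed lazily.
        if (args.Length > 0)
        {
            KeyValuePair<string, int> fromFile = WordCounter.MostFrequent(File.ReadLines(args[0]));
            Console.WriteLine(fromFile.Key + " " + fromFile.Value);
        }
    }
}
```

Memory use here grows with the number of distinct words rather than with the file size, because the dictionary holds one entry per distinct word. If the vocabulary itself is too large to fit in memory, a different strategy (for example, sorting or partitioning on disk) would be needed.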
Code Snippets
string[] source = { "test1", "test2", "test3", "test4", "test1", "test1", "test3" };
Dictionary<string, int> dic = new Dictionary<string, int>();
foreach (string s in source)
{
    // ContainsKey looks the word up directly in the dictionary
    if (dic.ContainsKey(s))
        dic[s]++;      // word seen before: bump its running total
    else
        dic.Add(s, 1); // first occurrence of this word
}
Context
StackExchange Code Review Q#11254, answer score: 6