principlecsharpMinor
More efficient and readable nested loop to compare keywords
Viewed 0 times
readablemoreefficientloopnestedkeywordsandcompare
Problem
I've created an algorithm which weighs the relevance of a list of articles against two lists of keywords that correlate to attributes of the article.
It works great and is super efficient... but it's a mess. It's not terribly readable, so it's difficult to discern what's going.
The operation in pseudo code goes something like this:
What would be a better way to write the following code? I can't use
It works great and is super efficient... but it's a mess. It's not terribly readable, so it's difficult to discern what's going.
The operation in pseudo code goes something like this:
- Loop through every article in a list called articles
(List)
- For every article, loop through every role in a list of roles
(List)
- Check to see if the current article has any roles
(Article.Roles = List)
- If yes, then loop through each role in the article and try to match a role in the article to the role in the current loop
- If a match is found, add weight to the article. If the index of the role on the article and the role in the roles list are both index 0 (in primary position) add extra weight for two matching primaries
- Repeat for topics, but with no bonus for primary matches
What would be a better way to write the following code? I can't use
foreach except in one or two places, because I need to match indexes to know what value to add on a match.private static List WeighArticles(List articles, List roles, List topics, List industries)
{
var returnList = new List();
for (int currentArticle = 0; currentArticle 0)
{
for (int currentArticleRole = 0; currentArticleRole 0)
{
for (int currentArticleTopic = 0; currentArticleTopic Roles { get; set; }
public List Topics { get; set; }
public double Weight { get; set; }
}Solution
Note: last refactoring with rich domain model approach is at the bottom. But you can see my 3 refactoring steps which lead to nice object-oriented design.
You can use Linq to make your code more readable. Also you can give names for intermediate results:
I also would recommend to use named constants instead of magic numbers
Further improving readability could involve splitting this method into three methods:
Which could look this way:
Yes, its less efficient. But you always should chose between readability and performance:
Same way you can update articles with selected topics.
Final solution
Further refactoring can include moving weights calculation logic to article class, thus making it reach instead of anemic, and increasing cohesion:
Adding topics weights:
And your original code will look like:
I think it's a little more readable now :)
You can use Linq to make your code more readable. Also you can give names for intermediate results:
var articlesWithRoles = from a in articles
where a.Roles != null && a.Roles.Any()
select a;
var primaryRole = roles.First().ToLower();
var highWeightArticles = from a in articlesWithRoles
where a.Roles.First().ToLower() == primaryRole
select a;
foreach(var article in highWeightArticles)
article.Weight += 3;
var lowWeightArticles = from a in articlesWithRoles.Except(highWeightArticles)
from ar in a.Roles
join r in roles on ar.ToLower() equals r.ToLower()
select a;
foreach(var article in lowWeightArticles)
article.Weight += 1;
var articlesWithTopics = from a in articles
where a.Topics != null
from at in a.Topics
join t in topics on at.ToLower() equals t.ToLower()
select a;
foreach(var article in articlesWithTopics)
article.Weight += 0.8;I also would recommend to use named constants instead of magic numbers
3, 1, 0.8. Also maybe you don't need to create new list, because it will share same articles as list yo passed to method.Further improving readability could involve splitting this method into three methods:
AddRolesWeights(articles, roles);
AddPrimaryRoleWeights(articles, roles.First());
AddTopicsWeights(articles, topics);Which could look this way:
void AddRolesWeights(IEnumerable articles, IEnumerable roles)
{
const double RoleWeight = 1;
var nonPrimaryArticles =
from a in articles
where a.Roles != null && a.Roles.Any() &&
from ar in a.Roles
join r in roles on ar.ToLower() equals r.ToLower()
select a;
foreach(var article in primaryArticles)
article.Weight += RoleWeight;
}Yes, its less efficient. But you always should chose between readability and performance:
void AddPrimaryRoleWeights(IEnumerable articles, string primaryRole)
{
const double PrimaryArticleWeight = 2;
var primaryArticles =
from a in articlesWithRoles
where a.Roles != null && a.Roles.Any() &&
String.Equals(a.Roles[0], primaryRole, StringComparison.CurrentCultureIgnoreCase)
select a;
foreach(var article in primaryArticles)
article.Weight += PrimaryArticleWeight;
}Same way you can update articles with selected topics.
Final solution
Further refactoring can include moving weights calculation logic to article class, thus making it reach instead of anemic, and increasing cohesion:
// add this to Article class
public double Weight { get; private set; } // probably you don't need setter
public void AddWeights(IEnumerable roles)
{
const double RoleWeight = 1;
const double PrimaryRoleWeight = 3;
if (!roles.Any())
return;
if (Roles == null || !Roles.Any())
return;
var pirmaryRole = roles.First();
if (String.Equals(Roles[0], primaryRole, StringComparison.CurrentCultureIgnoreCase))
{
Weight += PrimaryRoleWeight;
return;
}
foreach(var role in roles)
if (Roles.Contains(role, StringComparer.CurrentCultureIgnoreCase))
Weight += RoleWeight;
}Adding topics weights:
public void AddWeights(IEnumerable topics)
{
const double TopicWeight = 0.8;
if (Topics == null || !Topics.Any() || !topics.Any())
return;
foreach(var topic in topics)
if (Topics.Contains(topic, StringComparer.CurrentCultureIgnoreCase))
Weight += TopicWeight;
}And your original code will look like:
foreach(var article in articles)
{
article.AddWeights(roles);
article.AddWeights(topics);
}I think it's a little more readable now :)
Code Snippets
var articlesWithRoles = from a in articles
where a.Roles != null && a.Roles.Any()
select a;
var primaryRole = roles.First().ToLower();
var highWeightArticles = from a in articlesWithRoles
where a.Roles.First().ToLower() == primaryRole
select a;
foreach(var article in highWeightArticles)
article.Weight += 3;
var lowWeightArticles = from a in articlesWithRoles.Except(highWeightArticles)
from ar in a.Roles
join r in roles on ar.ToLower() equals r.ToLower()
select a;
foreach(var article in lowWeightArticles)
article.Weight += 1;
var articlesWithTopics = from a in articles
where a.Topics != null
from at in a.Topics
join t in topics on at.ToLower() equals t.ToLower()
select a;
foreach(var article in articlesWithTopics)
article.Weight += 0.8;AddRolesWeights(articles, roles);
AddPrimaryRoleWeights(articles, roles.First());
AddTopicsWeights(articles, topics);void AddRolesWeights(IEnumerable<Article> articles, IEnumerable<Role> roles)
{
const double RoleWeight = 1;
var nonPrimaryArticles =
from a in articles
where a.Roles != null && a.Roles.Any() &&
from ar in a.Roles
join r in roles on ar.ToLower() equals r.ToLower()
select a;
foreach(var article in primaryArticles)
article.Weight += RoleWeight;
}void AddPrimaryRoleWeights(IEnumerable<Article> articles, string primaryRole)
{
const double PrimaryArticleWeight = 2;
var primaryArticles =
from a in articlesWithRoles
where a.Roles != null && a.Roles.Any() &&
String.Equals(a.Roles[0], primaryRole, StringComparison.CurrentCultureIgnoreCase)
select a;
foreach(var article in primaryArticles)
article.Weight += PrimaryArticleWeight;
}// add this to Article class
public double Weight { get; private set; } // probably you don't need setter
public void AddWeights(IEnumerable<Role> roles)
{
const double RoleWeight = 1;
const double PrimaryRoleWeight = 3;
if (!roles.Any())
return;
if (Roles == null || !Roles.Any())
return;
var pirmaryRole = roles.First();
if (String.Equals(Roles[0], primaryRole, StringComparison.CurrentCultureIgnoreCase))
{
Weight += PrimaryRoleWeight;
return;
}
foreach(var role in roles)
if (Roles.Contains(role, StringComparer.CurrentCultureIgnoreCase))
Weight += RoleWeight;
}Context
StackExchange Code Review Q#60837, answer score: 6
Revisions (0)
No revisions yet.