HiveBrain v1.2.0
Get Started
← Back to all entries
patterncsharpMinor

Getting matching items in lists based on nullable properties

Submitted by: @import:stackexchange-codereview··
0
Viewed 0 times
propertiesgettingnullableitemsbasedlistsmatching

Problem

I'm iterating through two lists to find matching objects. The objects have two string properties, Id1 and Id2 below. If Id1 is not null, empty or whitespace, I want to use it to find a matching object in the other list. Otherwise I want to make use of Id2.

If both Id1 and Id2 are set to null by the extension method on either p or p2, I want to skip the iteration, since I know that I won't find a match.

public static string NullIfEmptyOrWhiteSpace(this string value)
{
    return string.IsNullOrEmpty(value) ? null : value;
}


foreach (Product p2 in productList1)
{ 
    Product match = productList2.FirstOrDefault(p =>
    {
        string pId = p.Id1.NullIfEmptyOrWhiteSpace() ??
                     p.Id2.NullIfEmptyOrWhiteSpace();

        string p2Id = p2.Id1.NullIfEmptyOrWhiteSpace() ??
                      p2.Id2.NullIfEmptyOrWhiteSpace();

        if (pId == null || p2Id == null)
            return false;

        return pId.Equals(p2Id);
    });

    // do stuff with match
}


This seems to work, but it feels like I'm re-inventing the wheel here. Are there any built in methods that I can make use of to simplify the code? Perhaps a class that implements IEqualityComparer?

Example use case:

List list1 = new List()
{  
    new Product() { Id1 = "123", Id2 = null },
    new Product() { Id1 = "435", Id2 = null },
    new Product() { Id1 = null, Id2 = "123" }
}

List list12 = new List()
{  
    new Product() { Id1 = "123", Id2 = null },
    new Product() { Id1 = "456", Id2 = null },
    new Product() { Id1 = null, Id2 = "123" }
}

// do something with each matching item in the first list
// which would be item #1 and item #3 above

Solution

Let's start refactoring!

foreach (Product p2 in productList1)
{ 
    Product match = productList2.FirstOrDefault(p =>
    {
        string pId = p.Id1.NullIfEmptyOrWhiteSpace() ??
                     p.Id2.NullIfEmptyOrWhiteSpace();

        string p2Id = p2.Id1.NullIfEmptyOrWhiteSpace() ??
                      p2.Id2.NullIfEmptyOrWhiteSpace();

        if (pId == null || p2Id == null)
            return false;

        return pId.Equals(p2Id);
    });

    // do stuff with match
}


p2Id does not depend on p, so let's move it out of the lambda.

foreach (Product p2 in productList1)
{ 
    string p2Id = p2.Id1.NullIfEmptyOrWhiteSpace() ??
                  p2.Id2.NullIfEmptyOrWhiteSpace();

    Product match = productList2.FirstOrDefault(p =>
    {
        string pId = p.Id1.NullIfEmptyOrWhiteSpace() ??
                     p.Id2.NullIfEmptyOrWhiteSpace();

        if (pId == null || p2Id == null)
            return false;

        return pId.Equals(p2Id);
    });

    // do stuff with match
}


Now let's define a method on Product to remove the repetition. I called it GetKey, but there's probably a better name.

public string GetKey()
{
    return this.Id1.NullIfEmptyOrWhiteSpace() ?? this.Id2.NullIfEmptyOrWhiteSpace();
}

foreach (Product p2 in productList1)
{ 
    string p2Id = p2.GetKey();

    Product match = productList2.FirstOrDefault(p =>
    {
        string pId = p.GetKey();

        if (pId == null || p2Id == null)
            return false;

        return pId.Equals(p2Id);
    });

    // do stuff with match
}


It looks like you actually want to call IsNullOrWhiteSpace in NullIfEmptyOrWhiteSpace, so let's fix that.

public static string NullIfEmptyOrWhiteSpace(this string value)
{
    return string.IsNullOrWhiteSpace(value) ? null : value;
}


Actually, we don't need to go through productList2 if p2Id is null.

foreach (Product p2 in productList1)
{ 
    string p2Id = p2.GetKey();
    if (p2Id == null)
    {
        continue;
    }

    Product match = productList2.FirstOrDefault(p =>
    {
        string pId = p.GetKey();

        if (pId == null)
            return false;

        return pId.Equals(p2Id);
    });

    // do stuff with match
}


Given that we only use p2 for its key, we can rewrite this as

foreach (var key in productList1.Select(product => product.GetKey()).Where(key => key != null))
{
    Product match = productList2.FirstOrDefault(p =>
    {
        string pId = p.GetKey();

        if (pId == null)
            return false;

        return pId.Equals(key);
    });

    // do stuff with match
}


Now let's use a dictionary to replace the call to productList2.FirstOrDefault.

var firstProductWithKey = new Dictionary();
foreach (var product in productList2)
{
    var key = product.GetKey();
    if (key != null && !firstProductWithKey.ContainsKey(key))
    {
        firstProductWithKey.Add(key, product);
    }
}

if (!firstProductWithKey.Any())
{
    return;
}

foreach (var key in productList1.Select(product => product.GetKey()).Where(key => key != null))
{
    Product match;
    if (firstProductWithKey.TryGetValue(key, out match))
    {
        // do stuff with match
    }
}


So what have we gained? Hopefully the intention of the code is a bit clearer, but there's also a performance benefit.

Let's say that the lists have \$n\$ and \$m\$ elements respectively, and that there are no keys common to the two lists.

In the original code, we're evaluating each key in the first list \$m\$ times, and each key in the second list \$n\$ times. So \$2nm\$ keys in total.

In the refactored version, we evaluate each key in the first list once, and each key in the second list once, for \$n + m\$ keys. We're performing much less work, and generating much less garbage for the GC.

Code Snippets

foreach (Product p2 in productList1)
{ 
    Product match = productList2.FirstOrDefault(p =>
    {
        string pId = p.Id1.NullIfEmptyOrWhiteSpace() ??
                     p.Id2.NullIfEmptyOrWhiteSpace();

        string p2Id = p2.Id1.NullIfEmptyOrWhiteSpace() ??
                      p2.Id2.NullIfEmptyOrWhiteSpace();

        if (pId == null || p2Id == null)
            return false;

        return pId.Equals(p2Id);
    });

    // do stuff with match
}
foreach (Product p2 in productList1)
{ 
    string p2Id = p2.Id1.NullIfEmptyOrWhiteSpace() ??
                  p2.Id2.NullIfEmptyOrWhiteSpace();

    Product match = productList2.FirstOrDefault(p =>
    {
        string pId = p.Id1.NullIfEmptyOrWhiteSpace() ??
                     p.Id2.NullIfEmptyOrWhiteSpace();

        if (pId == null || p2Id == null)
            return false;

        return pId.Equals(p2Id);
    });

    // do stuff with match
}
public string GetKey()
{
    return this.Id1.NullIfEmptyOrWhiteSpace() ?? this.Id2.NullIfEmptyOrWhiteSpace();
}

foreach (Product p2 in productList1)
{ 
    string p2Id = p2.GetKey();

    Product match = productList2.FirstOrDefault(p =>
    {
        string pId = p.GetKey();

        if (pId == null || p2Id == null)
            return false;

        return pId.Equals(p2Id);
    });

    // do stuff with match
}
public static string NullIfEmptyOrWhiteSpace(this string value)
{
    return string.IsNullOrWhiteSpace(value) ? null : value;
}
foreach (Product p2 in productList1)
{ 
    string p2Id = p2.GetKey();
    if (p2Id == null)
    {
        continue;
    }

    Product match = productList2.FirstOrDefault(p =>
    {
        string pId = p.GetKey();

        if (pId == null)
            return false;

        return pId.Equals(p2Id);
    });

    // do stuff with match
}

Context

StackExchange Code Review Q#55778, answer score: 6

Revisions (0)

No revisions yet.