Merging IEnumerable and removing duplicates


Problem

I have a method that extracts data from XML and returns an IEnumerable<Foo>. I am trying to create a method that will merge the results from two different XML files into a single IEnumerable<Foo> and remove any duplicate data that may be found.

A duplicate in this case is defined as two Foo objects that have the same Name property, even if the other data is different. If there is a duplicate, I always want to keep the Foo from the first IEnumerable<Foo> and discard the one from the second.

Am I on the right track with this LINQ query, or is there a better way to accomplish what I am trying to do?

public IEnumerable<Foo> MergeElements(XElement element1, XElement element2)
{
    IEnumerable<Foo> firstFoos = GetXmlData(element1); // get and parse first set
    IEnumerable<Foo> secondFoos = GetXmlData(element2); // get and parse second set

    var result = firstFoos.Concat(secondFoos)
                          .GroupBy(foo => foo.Name)
                          .Select(grp => grp.First());

    return result;
}


The biggest concern I have with this code is that I am not certain the duplicate filtering rules I want will be guaranteed. I know Concat will just append secondFoos to the end of firstFoos, but when calling GroupBy(), will the resulting IGrouping object always have elements in the same order as the source data?

Solution

With things like this, it's usually best to look at the documentation.

Is your approach with GroupBy() going to work? Yes:


Elements in a grouping are yielded in the order they appear in source.
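
To see this ordering rule in practice, here is a minimal sketch of the question's query wrapped in a helper. The Foo class, its Source property, and the GroupByMerge name are hypothetical stand-ins (the real Foo is not shown in the question); Source exists only to make it visible which list a kept element came from.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical stand-in for the question's Foo type.
// Source is an assumption, used only to mark which list an item came from.
public class Foo
{
    public string Name { get; set; }
    public string Source { get; set; }
}

public static class GroupByMerge
{
    // Same query as in the question: concatenate, group by Name,
    // and keep the first element of each group.
    public static List<Foo> Merge(IEnumerable<Foo> first, IEnumerable<Foo> second) =>
        first.Concat(second)
             .GroupBy(foo => foo.Name)
             .Select(grp => grp.First())
             .ToList();
}
```

If the first list holds "a" and "b" and the second holds "b" and "c", the merged list contains "a", "b", "c", and the "b" entry is the one from the first list, because it appears earlier in the concatenated source.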

Can you use Distinct() to do the same thing more concisely? No (at least you shouldn't depend on it):


The Distinct method returns an unordered sequence that contains no duplicate values.

Is there some other way? Yes, you can use Union():


When the object returned by this method is enumerated, Union enumerates first and second in that order and yields each element that has not already been yielded.

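So, I think the best way to do it is Union(). The call below uses the default equality comparer, so it removes duplicates by Name only if Foo overrides Equals() and GetHashCode() to compare Name; otherwise, use the overload that takes an IEqualityComparer<Foo> comparing Name.

var result = firstFoos.Union(secondFoos);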
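
Since the question defines a duplicate by Name alone, Union() needs to be told how to compare Foo objects unless Foo already defines equality that way. The following is a sketch under that assumption; Foo, its Source property, FooNameComparer, and UnionMerge are hypothetical names introduced for illustration, not part of the original code.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical stand-in for the question's Foo type.
public class Foo
{
    public string Name { get; set; }
    public string Source { get; set; }
}

// Hypothetical comparer: two Foo objects are equal when their Name matches.
public sealed class FooNameComparer : IEqualityComparer<Foo>
{
    public bool Equals(Foo x, Foo y)
    {
        if (ReferenceEquals(x, y)) return true;
        if (x is null || y is null) return false;
        return string.Equals(x.Name, y.Name, StringComparison.Ordinal);
    }

    // Must agree with Equals: equal names produce equal hash codes.
    public int GetHashCode(Foo obj) =>
        obj?.Name is null ? 0 : StringComparer.Ordinal.GetHashCode(obj.Name);
}

public static class UnionMerge
{
    // Union enumerates first, then second, and yields each element
    // that the comparer has not already seen.
    public static List<Foo> Merge(IEnumerable<Foo> first, IEnumerable<Foo> second) =>
        first.Union(second, new FooNameComparer()).ToList();
}
```

On .NET 6 or later, firstFoos.UnionBy(secondFoos, foo => foo.Name) should give the same result without a separate comparer class.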

Context

StackExchange Code Review Q#11121, answer score: 8
