
Distinct by property


Problem

In Java 8, how can I filter a collection using the Stream API by checking the distinctness of a property of each object?

For example, I have a list of Person objects and I want to remove people with the same name:

persons.stream().distinct();


will use the default equality check for a Person object, so I need something like:

persons.stream().distinct(p -> p.getName());


Unfortunately, the distinct() method has no such overload. Without modifying the equality check inside the Person class, is it possible to do this succinctly?

Solution

Consider distinct to be a stateful filter. Here is a function that returns a predicate that maintains state about what it's seen previously, and that returns whether the given element was seen for the first time:

public static <T> Predicate<T> distinctByKey(Function<? super T, ?> keyExtractor) {
    Set<Object> seen = ConcurrentHashMap.newKeySet();
    return t -> seen.add(keyExtractor.apply(t));
}


Then you can write:

persons.stream().filter(distinctByKey(Person::getName))
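For a version that compiles and runs end to end, here is a minimal sketch. The Person class is a hypothetical stand-in (the question does not show it), and uniqueByName and DistinctByKeyDemo are names invented for illustration; only distinctByKey comes from the answer itself.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Function;
import java.util.function.Predicate;
import java.util.stream.Collectors;

public class DistinctByKeyDemo {

    // Hypothetical minimal Person; the real class is not shown in the question.
    static final class Person {
        private final String name;
        private final int age;

        Person(String name, int age) {
            this.name = name;
            this.age = age;
        }

        String getName() { return name; }
        int getAge() { return age; }
    }

    // The helper from the answer: a stateful predicate that returns true
    // only the first time a given key is seen.
    static <T> Predicate<T> distinctByKey(Function<? super T, ?> keyExtractor) {
        Set<Object> seen = ConcurrentHashMap.newKeySet();
        return t -> seen.add(keyExtractor.apply(t));
    }

    // Illustrative wrapper (assumed name): a fresh predicate is created on
    // every call, so state never leaks from one stream into the next.
    static List<Person> uniqueByName(List<Person> persons) {
        return persons.stream()
                .filter(distinctByKey(Person::getName))
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        List<Person> persons = Arrays.asList(
                new Person("Alice", 30),
                new Person("Bob", 25),
                new Person("Alice", 41));

        // Sequential stream, so the first Alice (age 30) is the one kept.
        for (Person p : uniqueByName(persons)) {
            System.out.println(p.getName() + " " + p.getAge());
        }
    }
}
```

Two things to keep in mind with this sketch: the returned predicate holds state, so it should be created anew for each stream rather than stored and reused, and ConcurrentHashMap.newKeySet() rejects null, so a key extractor that can return null would throw a NullPointerException.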


Note that if the stream is ordered and is run in parallel, this will preserve an arbitrary element from among the duplicates, rather than the first one in encounter order, which is what distinct() keeps.

(This is essentially the same as my answer to this question: Java Lambda Stream Distinct() on arbitrary key?)
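If keeping the first duplicate matters even on a parallel stream, one possible alternative is sketched below. This is a different technique from the stateful predicate, not part of the original answer: it collects into a LinkedHashMap with Collectors.toMap and a merge function that keeps the earlier value. The names FirstByKeyDemo and firstByKey are invented for illustration.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.function.Function;
import java.util.stream.Collectors;

public class FirstByKeyDemo {

    // Illustrative helper (assumed name): keeps the first element in
    // encounter order for each key, even though the stream is parallel.
    static <T, K> List<T> firstByKey(List<T> items,
                                     Function<? super T, ? extends K> keyExtractor) {
        return new ArrayList<>(items.parallelStream()
                .collect(Collectors.toMap(
                        keyExtractor,               // key to deduplicate on
                        Function.identity(),        // value is the element itself
                        (first, later) -> first,    // on a key collision keep the earlier element
                        LinkedHashMap::new))        // preserves insertion order of keys
                .values());
    }

    public static void main(String[] args) {
        List<String> words = Arrays.asList(
                "apple", "avocado", "banana", "blueberry", "cherry");

        // Deduplicate by first letter.
        System.out.println(firstByKey(words, s -> s.charAt(0)));
        // prints [apple, banana, cherry]
    }
}
```

The trade-off is that this approach is a terminal operation that materializes the whole result, whereas the filter-based predicate stays lazy and can be followed by further stream operations.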

Code Snippets

public static <T> Predicate<T> distinctByKey(Function<? super T, ?> keyExtractor) {
    Set<Object> seen = ConcurrentHashMap.newKeySet();
    return t -> seen.add(keyExtractor.apply(t));
}

persons.stream().filter(distinctByKey(Person::getName))

Context

Stack Overflow Q#23699371, score: 952
