Duplicate words in a text
Problem
Here is a simplified implementation to obtain the duplicate words in a text using lambda expressions.
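import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;
import java.util.stream.Collectors;

public class FindDuplicateWordsInText {
    public static Set<String> findDuplicateWordsInText(String text) {
        String[] words = text.split(" ");
        Set<String> duplicatesRemovedSet = new HashSet<>();
        // Set.add returns false when the word is already present, so the filter keeps only repeats.
        Set<String> duplicatesSet = Arrays.stream(words).filter(string -> !duplicatesRemovedSet.add(string))
                .collect(Collectors.toSet());
        return duplicatesSet;
    }
}
Solution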
Your use of the boolean return value of the Set.add() call is a clever way to check for your duplicates. The concept you have is good, and I can't think of a faster way.
Additionally, I like how you have used interface-based types on the left side of assignments (Set) and the concrete classes on the right (new HashSet<>()). People often put the concrete type on the left too, and it's good to see that you did not.
In terms of the Java streaming API, though, I can't help but feel that you missed an opportunity to improve the process by streaming the split. The Pattern class has a splitAsStream method which would reduce your latency on the first words.
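To make the splitAsStream suggestion concrete, here is a minimal sketch comparing the two ways of producing words. The class and method names (SplitAsStreamSketch, viaArray, viaStream, firstWord) are made up for illustration and are not part of the reviewed code. String.split has to build the whole array before the stream starts, whereas Pattern.splitAsStream hands out tokens as the pipeline asks for them, so a short-circuiting operation such as findFirst should not need to scan the rest of the text.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Optional;
import java.util.regex.Pattern;
import java.util.stream.Collectors;

class SplitAsStreamSketch {
    // Compiled once and reused; a single space here to mirror the original split(" ").
    private static final Pattern SPACE = Pattern.compile(" ");

    // Eager: the full String[] exists before the first word is streamed.
    static List<String> viaArray(String text) {
        return Arrays.stream(text.split(" ")).collect(Collectors.toList());
    }

    // Streamed: words are matched as the pipeline pulls them.
    static List<String> viaStream(String text) {
        return SPACE.splitAsStream(text).collect(Collectors.toList());
    }

    // Short-circuits after the first word has been produced.
    static Optional<String> firstWord(String text) {
        return SPACE.splitAsStream(text).findFirst();
    }

    public static void main(String[] args) {
        String text = "the quick the lazy the";
        System.out.println(viaArray(text).equals(viaStream(text))); // true for this input
        System.out.println(firstWord(text).orElse(""));             // the
    }
}
```

For short inputs the difference is unlikely to be measurable; the streamed form mostly pays off when the text is long or the pipeline can stop early.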
As an aside, a word should probably be split on contiguous whitespace, not just a single space (i.e. "\\s+" instead of " ").
Here's your code done differently:
private static final Pattern SPACE = Pattern.compile("\\s+");

public static Set<String> findDuplicateWordsInText(String text) {
    Set<String> duplicatesRemovedSet = new HashSet<>();
    return SPACE.splitAsStream(text)
            .filter(string -> !duplicatesRemovedSet.add(string))
            .collect(Collectors.toSet());
}
Code Snippets
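A small usage sketch for the method from the answer, wrapped in a hypothetical class (DuplicateWordsUsageSketch) so it can be run on its own. The sample text is invented; it mixes a double space, a tab and a newline to show why "\\s+" is a safer separator than " ". The local variable is renamed to seen purely for brevity.

```java
import java.util.HashSet;
import java.util.Set;
import java.util.regex.Pattern;
import java.util.stream.Collectors;

class DuplicateWordsUsageSketch {
    private static final Pattern SPACE = Pattern.compile("\\s+");

    static Set<String> findDuplicateWordsInText(String text) {
        Set<String> seen = new HashSet<>();
        return SPACE.splitAsStream(text)
                // add() returns false for a word that was already seen
                .filter(word -> !seen.add(word))
                .collect(Collectors.toSet());
    }

    public static void main(String[] args) {
        String text = "the cat  saw\tthe dog\nand the cat";
        // Contains "the" and "cat"; iteration order of a HashSet is unspecified.
        System.out.println(findDuplicateWordsInText(text));
        // A single-space split sees the double space as an empty word.
        System.out.println("the cat  saw".split(" ").length);    // 4
        System.out.println("the cat  saw".split("\\s+").length); // 3
    }
}
```

Note that the filter relies on a side effect on a plain HashSet, so this sketch assumes the stream stays sequential.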
private static final Pattern SPACE = Pattern.compile("\\s+");

public static Set<String> findDuplicateWordsInText(String text) {
    Set<String> duplicatesRemovedSet = new HashSet<>();
    return SPACE.splitAsStream(text)
            .filter(string -> !duplicatesRemovedSet.add(string))
            .collect(Collectors.toSet());
}
Context
StackExchange Code Review Q#100722, answer score: 4