patternjavaMinor
Data Stream as Disjoint intervals
Viewed 0 times
streamintervalsdatadisjoint
Problem
I was asked the following question:
Given a data stream input of non-negative integers a1, a2, ..., an,
..., summarize the numbers seen so far as a list of disjoint
intervals.
For example, suppose the integers from the data stream are 1, 3, 7, 2,
6, ..., then the summary will be:
My approach to this problem was to keep adding the numbers to a structure such as a TreeSet which will help with the in-order iteration. While iterating I can check whether the values are consecutive and if they are, I add them to the existing interval, otherwise I create a new interval. Here's the code I have to accomplish it:
It will be great to get some pointers on how to do this better.
Given a data stream input of non-negative integers a1, a2, ..., an,
..., summarize the numbers seen so far as a list of disjoint
intervals.
For example, suppose the integers from the data stream are 1, 3, 7, 2,
6, ..., then the summary will be:
[1, 1]
[1, 1], [3, 3]
[1, 1], [3, 3], [7, 7]
[1, 3], [7, 7]
[1, 3], [6, 7]My approach to this problem was to keep adding the numbers to a structure such as a TreeSet which will help with the in-order iteration. While iterating I can check whether the values are consecutive and if they are, I add them to the existing interval, otherwise I create a new interval. Here's the code I have to accomplish it:
public class SummaryRanges {
private TreeSet streamNums;
/** Initialize your data structure here. */
public SummaryRanges() {
streamNums= new TreeSet();
}
public void addNum(int val) {
if(!streamNums.contains(val)) {
streamNums.add(val);
}
}
public List getIntervals() {
ArrayList buffer = new ArrayList();
if(!streamNums.isEmpty()) {
int start = streamNums.first();
int end = start;
Iterator streamIter = streamNums.iterator();
if(streamIter.hasNext()) streamIter.next();
while(streamIter.hasNext()) {
int i = streamIter.next();
if(i == end + 1) {
end = i;
} else {
buffer.add(new Interval(start,end));
start = i;
end = i;
}
}
buffer.add(new Interval(start,end));
}
return buffer;
}
}It will be great to get some pointers on how to do this better.
Solution
Using a
Since you are already using a
This avoids having to
Inverting
You can reduce one level of nesting if you simply returned a
Interfaces over implementations and type inference
As illustrated above,
Java 8 stream-based processing
Since you are on Java 8, you can look towards its stream-based processing capabilities as the appropriate way of solving your problem. It's relatively simple to adapt your solution to reduce an integer stream to a
In summary, a
For illustration, this is how it can be used:
TreeSet effectivelySince you are already using a
TreeSet as the underlying data structure, you need not check if it contains the incoming number val or not in addNum(int) as any Set implementation will not store duplicate values. Also, to skip the first element in the TreeSet, you can do:int start = streamNums.first();
Iterator streamIter = streamNums.tailSet(start, false).iterator();This avoids having to
next() on the Iterator manually.Inverting
ifspublic List getIntervals() {
ArrayList buffer = new ArrayList();
if(!streamNums.isEmpty()) {
// ...
}
return buffer;
}You can reduce one level of nesting if you simply returned a
new ArrayList first:public List getIntervals() {
if(streamNums.isEmpty()) {
return new ArrayList<>();
}
List buffer = new ArrayList<>();
// ...
return buffer;
}Interfaces over implementations and type inference
As illustrated above,
buffer can be declared as a List instead of ArrayList, and at least you are not making the same mistake in the method return type declaration. Also, since Java 7, you can rely on type inference to eliminate mentioning the generic type a second time, by using the 'diamond operator' <>.Java 8 stream-based processing
Since you are on Java 8, you can look towards its stream-based processing capabilities as the appropriate way of solving your problem. It's relatively simple to adapt your solution to reduce an integer stream to a
List, or more specifically to create a Collector out of your solution:public class TreeInterval implements Collector> {
private final TreeSet cache = new TreeSet<>();
@Override
public BiConsumer accumulator() {
return (ti, integer) -> ti.cache.add(integer);
}
@Override
public Set characteristics() {
return Collections.singleton(Characteristics.UNORDERED);
}
@Override
public BinaryOperator combiner() {
return (a, b) -> { a.cache.addAll(b.cache); return a; };
}
private List toResult() {
if (cache.isEmpty()) {
return new ArrayList<>();
}
List buffer = new ArrayList<>();
int start = cache.first();
int end = start;
Iterator iterator = cache.tailSet(start, false).iterator();
while (iterator.hasNext()) {
int i = iterator.next();
if (i == end + 1) {
end = i;
} else {
buffer.add(new Interval(start, end));
start = i;
end = i;
}
}
buffer.add(new Interval(start, end));
return buffer;
}
@Override
public Function> finisher() {
return TreeInterval::toResult;
}
@Override
public Supplier supplier() {
return TreeInterval::new;
}
}In summary, a
Collector needs to know how to:- supply an accumulator: represented by the
supplier()method.
- accumulate an
Integer: represented by theaccumulator()method.
- combine the accumulators into one: represented by the
combiner()method.
- extract the results of the last accumulator to the required type: represented by the
finisher()method. In turn, your solution is used as a method reference within this.
- provide a set of characteristics that describes itself, for potential optimization: represented by the
characteristics()method.
For illustration, this is how it can be used:
public static void main(String[] args) {
IntStream.of(1, 3, 7, 2, 6)
.boxed()
.collect(new TreeInterval())
.forEach(System.out::println);
}
// Output
[1,3]
[6,7]- Construct an
IntStreamwith the values1, 3, 7, 2, 6.
- Box it into a
Stream.
- Collect the values using an instance of
TreeInterval.
- For each
Interval, print it usingSystem.out.println(Object).
Code Snippets
int start = streamNums.first();
Iterator<Integer> streamIter = streamNums.tailSet(start, false).iterator();public List<Interval> getIntervals() {
ArrayList<Interval> buffer = new ArrayList<Interval>();
if(!streamNums.isEmpty()) {
// ...
}
return buffer;
}public List<Interval> getIntervals() {
if(streamNums.isEmpty()) {
return new ArrayList<>();
}
List<Interval> buffer = new ArrayList<>();
// ...
return buffer;
}public class TreeInterval implements Collector<Integer, TreeInterval, List<Interval>> {
private final TreeSet<Integer> cache = new TreeSet<>();
@Override
public BiConsumer<TreeInterval, Integer> accumulator() {
return (ti, integer) -> ti.cache.add(integer);
}
@Override
public Set<Characteristics> characteristics() {
return Collections.singleton(Characteristics.UNORDERED);
}
@Override
public BinaryOperator<TreeInterval> combiner() {
return (a, b) -> { a.cache.addAll(b.cache); return a; };
}
private List<Interval> toResult() {
if (cache.isEmpty()) {
return new ArrayList<>();
}
List<Interval> buffer = new ArrayList<>();
int start = cache.first();
int end = start;
Iterator<Integer> iterator = cache.tailSet(start, false).iterator();
while (iterator.hasNext()) {
int i = iterator.next();
if (i == end + 1) {
end = i;
} else {
buffer.add(new Interval(start, end));
start = i;
end = i;
}
}
buffer.add(new Interval(start, end));
return buffer;
}
@Override
public Function<TreeInterval, List<Interval>> finisher() {
return TreeInterval::toResult;
}
@Override
public Supplier<TreeInterval> supplier() {
return TreeInterval::new;
}
}public static void main(String[] args) {
IntStream.of(1, 3, 7, 2, 6)
.boxed()
.collect(new TreeInterval())
.forEach(System.out::println);
}
// Output
[1,3]
[6,7]Context
StackExchange Code Review Q#129952, answer score: 2
Revisions (0)
No revisions yet.