patternMinor
Minimal date interval cover algorithm
Viewed 0 times
intervaldatealgorithmminimalcover
Problem
The problem involves date intervals filtered by days of week. For example, the filtered interval {2001 APR 1 - 2001 APR 30, 17} corresponds to all Mondays and Sundays between April 1 and April 30. Here is the problem: given a collection of filtered date intervals, find a minimal collection of such intervals that cover exactly the same set of dates.
Initially, I naively assumed that this can be solved by sorting the intervals by start dates and merging consecutive intervals, if possible. Here is a counterexample to this approach. Consider the date interval given by the pattern $56712345671234$, where each number stands for a day of week. Assign zero-based indices to the dates in this pattern. The set of dates covered by intervals $\{0-2, 567\}$, $\{3-9, 1234\}$, $\{10-13, 12\}$ can be covered by the collection $\{0-6, 1234567\}, \{7-13, 12\}$. The merging algorithm will fail to merge any intervals and will output three intervals as the minimal cover instead of two.
Is there a well-known algorithm for this problem? Can it be solved in $O(n \log n)$ where $n$ is the number of input intervals?
Initially, I naively assumed that this can be solved by sorting the intervals by start dates and merging consecutive intervals, if possible. Here is a counterexample to this approach. Consider the date interval given by the pattern $56712345671234$, where each number stands for a day of week. Assign zero-based indices to the dates in this pattern. The set of dates covered by intervals $\{0-2, 567\}$, $\{3-9, 1234\}$, $\{10-13, 12\}$ can be covered by the collection $\{0-6, 1234567\}, \{7-13, 12\}$. The merging algorithm will fail to merge any intervals and will output three intervals as the minimal cover instead of two.
Is there a well-known algorithm for this problem? Can it be solved in $O(n \log n)$ where $n$ is the number of input intervals?
Solution
This can be solved in $O(n \log n)$ time by a greedy algorithm. Take the earliest week you need to cover (that isn't covered yet). Pick the set of days-of-the-week that you need to cover in that week. Choose a filtered data interval that starts with that week, uses that mask, and extends as far as possible. Add that to your collection, and repeat.
I'll let you work out the proof that it is correct, via an exchange argument. How to prove greedy algorithm is correct may be helpful.
To make this run efficiently with a $O(n \log n)$ worst-case time, you'll need a proper data structure for storing the filtered intervals and figuring out how far you can extend an interval. I'll let you figure that out, but a hint: a balanced binary search tree provides $O(\log n)$ running time for all basic operations.
I'll let you work out the proof that it is correct, via an exchange argument. How to prove greedy algorithm is correct may be helpful.
To make this run efficiently with a $O(n \log n)$ worst-case time, you'll need a proper data structure for storing the filtered intervals and figuring out how far you can extend an interval. I'll let you figure that out, but a hint: a balanced binary search tree provides $O(\log n)$ running time for all basic operations.
Context
StackExchange Computer Science Q#90530, answer score: 2
Revisions (0)
No revisions yet.