patternMinor
Find a unit square containing most of the points
Viewed 0 times
containingfindthepointssquaremostunit
Problem
Given points $p_1,\ldots,p_n$ in $\mathbb{R}^2$, the task is to find an axis-aligned unit square containing the maximum number of points. I came up with and $O(n^3)$ algorithm as follows.
Observation 1: For $n\geq 2$, there is an optimal solution containing at least two points on the boundary not lying on a line parallel to the coordinate axes.
Pf. Suppose not, say it contains only 1 or all are on one boundary. Then we may move the rectangle along the boundary until we hit some point.
We can take all pairs and for each pair $p_i = (x_1,y_1)$ and $p_j = (x_2,y_2)$, where $x_1\neq x_2$ and $y_1\neq y_2$. Suppose wlog that $y_2>y_1$ and $x_2>x_1$. Other cases are symettrical. we try to find the intersection $a_1$ of lines $x=x_1$ and $y=y_2$ and try to place the upper left corner to $a_1$ and then intersection $a_2$ of lines $x=x_2$ and $y=y_1$ and place bottom right corner to $a_2$.
In this way, we get $O(1)$ candidate squares for each of $O(n^2)$ pairs of points. We can calculate the number of points contained in each of them in $O(n)$ time, givin us $O(n^3)$ algorithm.
For a sanity check, is the above algorithm correct?
Now, is there a faster way to solve this?
Observation 1: For $n\geq 2$, there is an optimal solution containing at least two points on the boundary not lying on a line parallel to the coordinate axes.
Pf. Suppose not, say it contains only 1 or all are on one boundary. Then we may move the rectangle along the boundary until we hit some point.
We can take all pairs and for each pair $p_i = (x_1,y_1)$ and $p_j = (x_2,y_2)$, where $x_1\neq x_2$ and $y_1\neq y_2$. Suppose wlog that $y_2>y_1$ and $x_2>x_1$. Other cases are symettrical. we try to find the intersection $a_1$ of lines $x=x_1$ and $y=y_2$ and try to place the upper left corner to $a_1$ and then intersection $a_2$ of lines $x=x_2$ and $y=y_1$ and place bottom right corner to $a_2$.
In this way, we get $O(1)$ candidate squares for each of $O(n^2)$ pairs of points. We can calculate the number of points contained in each of them in $O(n)$ time, givin us $O(n^3)$ algorithm.
For a sanity check, is the above algorithm correct?
Now, is there a faster way to solve this?
Solution
"A window" will also mean "an axis-aligned unit square".
As observed in the question, there is an optimal window such that
Hence we need to check only the windows with bottom left corners at $(x_i, y_j)$. There are $n^2$ windows (with possible duplicates among them) to check.
Here is a simple $O(n^2)$-time algorithm. Basically, we apply the sliding interval algorithm to check all windows with their bottom sides on the same line.
The input is $p_i=(x_i,y_i)$, $1\le i\le n$.
Now $p_1, p_2, \cdots$ are sorted by $x$-coordinate.
It takes $O(n\log n)$ time to sort. Each run of the sliding interval algorithm costs $O(n)$ time. All $n$ runs cost $O(n^2)$ time. Other computations are insignificant in time-cost. Hence the time-complexity of the algorithm is $O(n^2)$.
Since the "sliding" of the intervals is the same for all runs, we could speed up the algorithm somewhat. However, that alone does not reduce the time-complexity to below $O(n^2)$ because of work needed to track the (number of) points in each sliding interval.
As observed in the question, there is an optimal window such that
- either the bottom left corner is a given point.
- or the left side contains a given point not on the bottom side and the bottom side contains a given point not on the left side.
Hence we need to check only the windows with bottom left corners at $(x_i, y_j)$. There are $n^2$ windows (with possible duplicates among them) to check.
Here is a simple $O(n^2)$-time algorithm. Basically, we apply the sliding interval algorithm to check all windows with their bottom sides on the same line.
The input is $p_i=(x_i,y_i)$, $1\le i\le n$.
- Sort $p_i$'s by $x$-coordinate.
Now $p_1, p_2, \cdots$ are sorted by $x$-coordinate.
- For $i$ from $1$ to $n$ (which is meant for $y_i$), do the following.
- Initialize an empty queue that will be used to track the points in the sliding interval.
- Run the usual sliding interval algorithm on the array of numbers $x_1,x_2,\cdots, x_n$ using the dequeue and the sliding interval that will be $[x_1, x_1+1]$, $[x_2,x_2+1]$ and so on successively. However, when the interval is $[x_j, x_j+1]$, a number $x_k$ (with $k$ noted) in the dequeue will be counted if and only if $(x_k, y_k)$ is in the window with bottom left corner at $[x_j, y_i]$, i.e., $y_i\le y_k\le y_i+1$. Track the maximum number of counted numbers in the sliding interval.
- Return the maximum number of all maximum numbers obtained in step 2.2.
It takes $O(n\log n)$ time to sort. Each run of the sliding interval algorithm costs $O(n)$ time. All $n$ runs cost $O(n^2)$ time. Other computations are insignificant in time-cost. Hence the time-complexity of the algorithm is $O(n^2)$.
Since the "sliding" of the intervals is the same for all runs, we could speed up the algorithm somewhat. However, that alone does not reduce the time-complexity to below $O(n^2)$ because of work needed to track the (number of) points in each sliding interval.
Context
StackExchange Computer Science Q#160012, answer score: 3
Revisions (0)
No revisions yet.