HiveBrain v1.2.0
Get Started
← Back to all entries
patternMinor

Find point with smallest average distance to a set of given points

Submitted by: @import:stackexchange-cs··
0
Viewed 0 times
smallestpointwithpointsdistanceaveragefindgivenset

Problem

Someone recently shared with me the following problem (which I guess appeared in some kind of past coding contest):


Given $n$ points $P_i=(x_i,y_i)$ in the 2-dimensional plane, find the point $Q=(x,y)$ that minimizes the average distance from $P_i$ to $Q$.

In other words, we want to find $Q$ that minimizes the objective function $f(Q) = \sum_i d(P_i,Q)$. Here $d(\cdot,\cdot)$ represents the usual Euclidean (L2) distance. I'm told that in the coding contest this was phrased in terms of finding the best location to put a washroom somewhere in an office building, so that the average distance to all employees is minimized.

Is there an elementary solution to this problem? In other words, a polynomial-time algorithm that does not require knowledge of advanced methods (say, beyond what undergraduate computer science majors could be expected to understand).

If we were on the 1D real line rather than the 2D plane, this problem would be easy. In one dimension, the solution is exactly the median of $x_1,\dots,x_n$. However, this does not seem to generalize to the 2-dimensional plane. It's tempting to take $x$ to be the median of $x_1,\dots,x_n$ and $y$ to be the median of $y_1,\dots,y_n$, but this does not provide the optimal solution: a counterexample is to look at three points at three of the corners of a square. (The component-wise median is at a corner, but the optimal location is somewhere inside the square.)

It looks like this problem is a special case of the two-dimensional Euclidean $p$-median problem, for $p=1$. Apparently the two-dimensional Euclidean $p$-median problem is NP-hard when $p$ is part of the input (Megiddo and Supowit, SIAM J Computing 13(1) Feb 1984) -- but here we have the special case where $p=1$, so the NP-hardness result doesn't apply to this special case. I haven't been able to use this search phrase to help find an algorithm.

If the distance metric was the Manhattan (L1) distance, there would be a number of clean solutions. The

Solution

This point is known as the geometric median, also known as the 1-median.

Apparently, there is no simple algorithm for computing the geometric median. Instead, one must use some kind of numerical approximation. Wikipedia mentions Weiszfeld's algorithm, which appears to be a kind of iterative descent algorithm, and cites other more sophisticated algorithms.

This is a special case of the Weber problem, a facility location problem. Standard solvers for facilitation location problems be able to compute the geometric median for you, by framing this as an instance of the Weber problem.

Context

StackExchange Computer Science Q#30524, answer score: 7

Revisions (0)

No revisions yet.