Google Interview - Find Meeting Point (Manhattan distance)

I. Manhattan distance

In a city there are N persons. A person can walk only horizontally or vertically. Find a point that minimizes the sum of distances all persons walk to the point.

This is called the Manhattan distance, since a person can walk only horizontally or vertically, as on the street grid of Manhattan.

Let's assume the problem is 1-dimensional. Then for 2 persons, the point can be any point on the segment between the 2 persons. For 3 persons, the point is the middle person. For 4 persons, the point can be any point between the middle 2 persons. For 5 persons, the point is the middle person. In general, for an odd number of persons, it's the person in the middle; for an even number of persons, it's any point between the middle 2 persons.

This extends to 2 dimensions. The answer is the point (x, y), where x and y are the medians, taken independently, of all the xi and of all the yi for i = 1 to n. [1][2]
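The rule above can be sketched in a few lines of Python. This is a minimal illustration, not a reference implementation: the function name `meeting_point` is made up here, the input is assumed to be a non-empty list of (x, y) tuples, and for an even number of points it simply picks the lower of the two medians, which is one of several equally good answers.

```python
def meeting_point(points):
    """Return one point minimizing the total Manhattan distance to `points`.

    Assumes `points` is a non-empty list of (x, y) tuples.
    """
    xs = sorted(x for x, _ in points)
    ys = sorted(y for _, y in points)
    # Lower median index; with an even count, any value between the two
    # middle elements would give the same total distance.
    mid = (len(points) - 1) // 2
    return xs[mid], ys[mid]
```

Note that the returned point does not have to be one of the input points, because the x and y medians may come from different persons.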

 

The cool thing about the Manhattan distance is that the distance itself consists of two independent components: the distance along the x coordinate and the distance along the y coordinate. Thus you can solve two simpler tasks and merge their results to obtain the desired result.
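Written out in symbols (using (x, y) for the meeting point and (x_i, y_i) for the i-th person, as in the paragraph on medians), the independence is just a regrouping of the sum:

```latex
\sum_{i=1}^{n} \bigl( |x - x_i| + |y - y_i| \bigr)
  \;=\; \sum_{i=1}^{n} |x - x_i| \;+\; \sum_{i=1}^{n} |y - y_i|
```

The first sum depends only on x and the second only on y, so each can be minimized on its own.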

The task I speak of is: given points on a line, find the point on the line that minimizes the sum of the absolute distances to all the points. If there are many such points, find all of them (they always form a single segment, which is easy to prove). The segment is determined by the median or, potentially, the two medians of the set. By median I mean a point that has an equal number of points to its left and to its right. If the number of points is even, there is no such point, and you take the two middle points to form the segment.

Here I add examples of solutions of this simpler task:

Suppose the points on the line are:

-4 | | | 0 | 2 3 4
             ^

The solution is just a point and it is 2.

Now suppose the points on the line are:

-4 | | | 0 | 2 3
         ^---^

The whole segment [0, 2] is the solution of the problem.
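The two examples above can be reproduced with a small helper. This is only a sketch: `median_segment` is a name chosen for illustration, and it assumes a non-empty sequence of numbers. It sorts the values and returns the two middle elements, which coincide when the count is odd.

```python
def median_segment(values):
    """Return (lo, hi): every point in [lo, hi] minimizes the sum of
    absolute distances to `values`. For an odd count, lo == hi."""
    s = sorted(values)
    n = len(s)
    return s[(n - 1) // 2], s[n // 2]


print(median_segment([-4, 0, 2, 3, 4]))  # (2, 2): the single point 2
print(median_segment([-4, 0, 2, 3]))     # (0, 2): the whole segment [0, 2]
```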

You solve this task separately for the x and y coordinates and then merge the results to obtain the rectangle of points with minimum total distance.


EXAMPLE

And now comes an example of the solution for the initial task.

Imagine you want to find the points with the minimum total Manhattan distance to the set (0, 6), (1, 3), (3, 5), (3, 3), (4, 7), (2, 4).

You form the two simpler tasks:

For x:

0 1 2 3 3 4
    ^-^

And here the solution is the segment [2, 3] (note that the value 3 occurs twice, which the diagram above does not show in the most intuitive way).

For y:

3 3 4 5 6 7
    ^-^

Here the solution is the segment [4, 5].

Finally we get that the solution of the initial task is the rectangle with formula:

 2 <= x <= 3; 4 <= y <= 5 
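As a sanity check, the rectangle can be confirmed by brute force over the integer grid that covers the example. This is a verification sketch only (the grid bounds 0..7 are chosen by hand to fit this input, and `total_distance` is an illustrative name); it is not how you would solve the task in practice.

```python
from itertools import product

points = [(0, 6), (1, 3), (3, 5), (3, 3), (4, 7), (2, 4)]


def total_distance(p, pts):
    """Sum of Manhattan distances from p to every point in pts."""
    return sum(abs(p[0] - x) + abs(p[1] - y) for x, y in pts)


# Try every integer point with 0 <= x, y <= 7 and keep the best ones.
grid = list(product(range(0, 8), repeat=2))
best = min(total_distance(p, points) for p in grid)
minimizers = sorted(p for p in grid if total_distance(p, points) == best)

print(best)        # 15
print(minimizers)  # [(2, 4), (2, 5), (3, 4), (3, 5)]
```

The four integer minimizers are exactly the corners of the rectangle 2 <= x <= 3, 4 <= y <= 5; the non-integer points inside it reach the same total of 15.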

COMPLEXITY

As many people show interest in this post I decided to improve it a bit more.

Let's speak about complexity.

The complexity of the task is the same as the complexity of solving the simpler task (because, as already discussed, the solution consists of solving two simpler tasks). Many people will solve it by sorting and then picking out the medians. However, this takes O(n log n) time, where n is the number of points in the input set.

This can be improved by using a selection algorithm that finds the k-th smallest element directly (for example, std::nth_element in the C++ STL). Such an algorithm follows the same partitioning approach as quicksort but continues into only one side of the partition, and its running time is O(n) on average. Even in the case of two median points this remains linear (it needs two runs of the same algorithm), and thus the total complexity of the algorithm becomes O(n). The task cannot be solved asymptotically faster, because reading the input already takes O(n) time.
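To make the selection idea concrete, here is a rough Python sketch of randomized quickselect. It is not the STL's implementation; the names `quickselect` and `median_segment_linear` are invented for this example, and the list-based partitioning trades extra memory for readability. The expected running time is O(n), although an unlucky sequence of pivots can make it slower.

```python
import random


def quickselect(values, k):
    """Return the k-th smallest element of `values` (0-based).

    Expected O(n) time; assumes 0 <= k < len(values).
    """
    items = list(values)
    if not 0 <= k < len(items):
        raise IndexError("k out of range")
    while True:
        pivot = random.choice(items)
        smaller = [v for v in items if v < pivot]
        equal_count = sum(1 for v in items if v == pivot)
        if k < len(smaller):
            # The answer lies among the elements below the pivot.
            items = smaller
        elif k < len(smaller) + equal_count:
            # The answer is the pivot itself.
            return pivot
        else:
            # Skip everything <= pivot and search the larger elements.
            k -= len(smaller) + equal_count
            items = [v for v in items if v > pivot]


def median_segment_linear(values):
    """Median segment via two selection runs instead of a full sort."""
    n = len(values)
    return quickselect(values, (n - 1) // 2), quickselect(values, n // 2)
```

For the x coordinates of the example, `median_segment_linear([0, 1, 3, 3, 4, 2])` gives `(2, 3)`, the same segment as the sorting approach.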

 

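Note:

There may also be obstacles in the way, which need special handling. For example, a past Google interview question (from here) was:

There is a gym, represented as a grid of blocks. It contains exercise equipment as well as obstacles. Find the best position for a chair so that the total Manhattan distance from the chair to all the equipment is minimal.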
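With obstacles the median trick no longer applies, because the walking distance between two cells can exceed their plain Manhattan distance. One common way to handle this, sketched below under several assumptions that are not part of the original question, is to run a breadth-first search from every piece of equipment and add up the path lengths per cell. The grid encoding is hypothetical ("E" for equipment, "#" for an obstacle, "." for a free cell), the chair is assumed to go on a free cell, and equipment cells are assumed to block walking just like obstacles.

```python
from collections import deque


def best_chair_position(grid):
    """Return (total_distance, (row, col)) for the best free cell, or None
    if no free cell can reach every piece of equipment.

    `grid` is a list of equal-length strings using the hypothetical
    encoding "E" = equipment, "#" = obstacle, "." = free cell.
    """
    rows, cols = len(grid), len(grid[0])
    total = [[0] * cols for _ in range(rows)]    # summed walking distances
    reached = [[0] * cols for _ in range(rows)]  # how many BFS runs got here
    machines = [(r, c) for r in range(rows) for c in range(cols)
                if grid[r][c] == "E"]

    for start in machines:
        dist = {start: 0}
        queue = deque([start])
        while queue:
            r, c = queue.popleft()
            for nr, nc in ((r + 1, c), (r - 1, c), (r, c + 1), (r, c - 1)):
                if (0 <= nr < rows and 0 <= nc < cols
                        and grid[nr][nc] == "." and (nr, nc) not in dist):
                    dist[(nr, nc)] = dist[(r, c)] + 1
                    total[nr][nc] += dist[(nr, nc)]
                    reached[nr][nc] += 1
                    queue.append((nr, nc))

    candidates = [(total[r][c], (r, c))
                  for r in range(rows) for c in range(cols)
                  if grid[r][c] == "." and reached[r][c] == len(machines)]
    return min(candidates) if candidates else None
```

This costs one BFS per piece of equipment, so roughly O(k * rows * cols) for k machines. Ties are broken by the smallest (row, col), which is an arbitrary choice.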


II. Geometric median

The geometric median of a discrete set of sample points in a Euclidean space is the point minimizing the sum of distances to the sample points [3]. This no longer requires the path be horizontal or vertical.

Despite the simple form, the solution is much more complex than the similar problem of finding the center of mass, which minimizes the sum of the squares of distances of the points to the center. There is no simple formula for the solution.

For 2 points, it's any point on the line segment connecting the 2 points.

For 3 non-collinear points, the problem is known as Fermat's problem. Solution is in [3]. For 4 co-planar points, the solution is also in [3].

For more points, the solution can be approximated by numerical methods such as Weiszfeld's algorithm.
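As a rough illustration of Weiszfeld's algorithm in 2 dimensions: start from some estimate (the centroid here) and repeatedly replace it with the average of the sample points weighted by the inverse of their distance to the current estimate. The sketch below is a simplification, not a robust implementation. The function name, the iteration count and the tolerance are arbitrary choices, and when the estimate lands exactly on a sample point it simply stops, even though that point is not guaranteed to be the true median.

```python
import math


def weiszfeld(points, iterations=200, tol=1e-10):
    """Approximate the geometric median of 2-D `points` (a non-empty list
    of (x, y) pairs) by iteratively re-weighted averaging."""
    # Start from the centroid (center of mass).
    x = sum(px for px, _ in points) / len(points)
    y = sum(py for _, py in points) / len(points)
    for _ in range(iterations):
        num_x = num_y = denom = 0.0
        for px, py in points:
            d = math.hypot(x - px, y - py)
            if d < tol:
                # The update divides by d, so it is undefined here.
                # Simplification: stop and return the current estimate.
                return x, y
            w = 1.0 / d
            num_x += w * px
            num_y += w * py
            denom += w
        new_x, new_y = num_x / denom, num_y / denom
        if math.hypot(new_x - x, new_y - y) < tol:
            return new_x, new_y
        x, y = new_x, new_y
    return x, y
```

For the four corners of a square, such as `weiszfeld([(0, 0), (2, 0), (0, 2), (2, 2)])`, the result is the center of the square, which in this symmetric case coincides with the centroid.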

References:
[1] Shortest distance travel - common meeting point
[2] Algorithm to find point of minimum total distance from locations
[3] Geometric median
