Introduction
In this article, we find the maximum number of points lying on the same straight line.
Problem Statement
Given n points on a 2D plane, find the maximum number of points that lie on the same straight line.
Example 1
Input: [ [1,1], [2,2], [3,3] ]
Output: 3
Explanation
Example 2
Input: [ [1,1], [3,2], [5,3], [4,1], [2,3], [1,4] ]
Output: 4
Explanation
Solution
Let's simplify the problem and search the maximum number of points on a line passing through the point i . One can immediately notice that it's interesting to consider only the next points i + 1 .. N - 1 because the maximum number of points containing, for example, the point i - 2 was already found during the search of the maximum number of points on a line passing through the point i - 2. The idea is very simple: draw the lines passing through the point i and each of the next points. Save these lines is a hash table with a counter 2 = two points on this line.
Let's imagine now that points i < i + k < i + l are on the same line. Drawing a line through i and i + l, one would discover that this line is already tracked and hence one has to update a counter of points on this line count++. If the line is horizontal, i.e. y = c , one could use this constant c as a line key in a hash table of horizontal lines. The other lines could be represented as y = slope * x + c. The equation for the line passing through two points 1 and 2 are given
here. Hence, a slope value is sufficient to represent a unique line starting from a specific point i.
One might go ahead and use a float (or double) value to represent each unique slope. Indeed, this could work for most of the cases, but not all. One of the cases that a float value would not cut for the slope variable is that when two points form a vertical line. As we can see from the formula to calculate the slope value, we would encounter a divide-by-zero error. One might argument we could treat this as a special case, and assign a special value (say, zero) to represent the horizontal slope. However, a bigger problem is that the float and double values are intrinsically inaccurate, due to how these values are represented in the computer(click
here to know the problems in depth).
A simple fact to comprehend this limitation is that we could have infinite number of digits for a fraction number, we could only keep a limited number of digits as its float value in the computer. Therefore, it is not wise to use the float/double value to represent a unique slope, since they are not accurate.
As a reminder, two integers are co-primes, if and only if their greatest common divisor is 1. As one can see, due to the property of co-prime numbers, they can be used to represent the slope values of different lines.We now have the idea and even some important details (co-primes) to implement the algorithm,
- Initiate the maximum number of points max_count = 1.
- Iterate over all points i from 0 to N - 2 .
- For each point i find a maximum number of points max_count_i on a line passing through the point i:
- Initiate the maximum number of points on a line passing through the point i : count = 1.
- Iterate over next points j from i + 1 to N - 1. If j is a duplicate of i , update a number of duplicates for point i. If not: Save the line passing through the points i and j. Update count at each step.
- Return max_count_i = count + duplicates.
- Update the result max_count = max(max_count, max_count_i).
Below is a code sample illustrating this algorithm in python 3:
- class Solution(object):
- def maxPoints(self, points):
- def max_points_on_a_line_containing_point_i(i):
- def slope_coprime(x1, y1, x2, y2):
- delta_x, delta_y = x1 - x2, y1 - y2
- if delta_x == 0:
- return (0, 0)
- elif delta_y == 0:
- return (sys.maxsize, sys.maxsize)
- elif delta_x < 0:
- delta_x, delta_y = - delta_x, - delta_y
- gcd = math.gcd(delta_x, delta_y)
- slope = (delta_x / gcd, delta_y / gcd)
- return slope
-
- def add_line(i, j, count, duplicates):
- x1 = points[i][0]
- y1 = points[i][1]
- x2 = points[j][0]
- y2 = points[j][1]
- if x1 == x2 and y1 == y2:
- duplicates += 1
- elif y1 == y2:
- nonlocal horizontal_lines
- horizontal_lines += 1
- count = max(horizontal_lines, count)
- else:
- slope = slope_coprime(x1, y1, x2, y2)
- lines[slope] = lines.get(slope, 1) + 1
- count = max(lines[slope], count)
- return count, duplicates
- lines, horizontal_lines = {}, 1
- count = 1
- duplicates = 0
- for j in range(i + 1, n):
- count, duplicates = add_line(i, j, count, duplicates)
- return count + duplicates
- n = len(points)
- if n < 3:
- return n
- max_count = 1
- for i in range(n - 1):
- max_count = max(max_points_on_a_line_containing_point_i(i), max_count)
- return max_count
Time complexity : O(N x N)
Space complexity : O(N)