Well, this problem calls for dynamic programming. The key to any DP problem is to come up with the state equation. In this problem, we define the state to be the maximal size (side length) of the all-1 square whose bottom-right corner is at point (i, j), denoted as P[i][j]. Remember that we use size instead of area as the state (area = size^2).
Now let's try to come up with the formula for P[i][j].
First, it is obvious that for the topmost row (i = 0) and the leftmost column (j = 0), P[i][j] = matrix[i][j]. This is easily understood. Suppose the topmost row of matrix is [1, 0, 0, 1]. Then we immediately know that the first and last points can each form a square of size 1, while the two middle points cannot form any square, giving a size of 0. Thus, the topmost row of P is [1, 0, 0, 1], which is the same as that of matrix. The case is similar for the leftmost column. With this, the boundary conditions of this DP problem are solved.
Let's move to the more general case for P[i][j], in which i > 0 and j > 0. First of all, let's look at another simple case, in which matrix[i][j] = 0. It is obvious that P[i][j] = 0 too. Why? Since matrix[i][j] = 0, no square of 1s can contain matrix[i][j], so according to our definition of P[i][j], P[i][j] is also 0.
Now we are almost done. The only unsolved case is matrix[i][j] = 1. Let's see an example.
Suppose matrix = [[0, 1], [1, 1]]. It is obvious that P[0][0] = 0 and P[0][1] = P[1][0] = 1. What about P[1][1]? Well, to give a square of size larger than 1 at P[1][1], all three of its neighbors (left, up, left-up) should be non-zero, right? In this case, the left-up neighbor P[0][0] = 0, so P[1][1] can only be 1, which means that the square consists of the point itself.
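Now you are near the solution. In fact, P[i][j] = min(P[i - 1][j], P[i][j - 1], P[i - 1][j - 1]) + 1 in this case. The three neighbors bound how far a square ending at (i, j) can extend upward, leftward and diagonally, so the smallest of them is the limit.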
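The cases discussed so far (boundary points, matrix[i][j] = 0 and matrix[i][j] = 1) can be sketched as a small table-building routine. This is only an illustration: the helper name buildSizeTable and the use of vector<string> for the input are assumptions made to keep the example short, not part of the solutions in this post.

```cpp
#include <algorithm>
#include <cassert>
#include <string>
#include <vector>

// Hypothetical helper (name and input type are assumptions): returns the
// table P, where P[i][j] is the side length of the largest all-'1' square
// whose bottom-right corner is at (i, j).
std::vector<std::vector<int>> buildSizeTable(const std::vector<std::string>& matrix) {
    int m = static_cast<int>(matrix.size());
    int n = m ? static_cast<int>(matrix[0].size()) : 0;
    std::vector<std::vector<int>> P(m, std::vector<int>(n, 0));
    for (int i = 0; i < m; i++) {
        assert(static_cast<int>(matrix[i].size()) == n);  // assumes a rectangular input
        for (int j = 0; j < n; j++) {
            if (matrix[i][j] == '0') {
                P[i][j] = 0;  // a 0 point cannot end any square
            } else if (i == 0 || j == 0) {
                P[i][j] = 1;  // boundary: same value as matrix
            } else {
                // limited by the smallest of the up, left and left-up neighbors
                P[i][j] = std::min({P[i - 1][j], P[i][j - 1], P[i - 1][j - 1]}) + 1;
            }
        }
    }
    return P;
}
```

For the example above, buildSizeTable({"01", "11"}) gives the rows [0, 1] and [1, 1], so P[1][1] is 1 as argued; with {"11", "11"} the last entry becomes 2.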
Taking all these together, we have the following state equations.
1. P[0][j] = matrix[0][j] (topmost row);
2. P[i][0] = matrix[i][0] (leftmost column);
3. For i > 0 and j > 0: if matrix[i][j] = 0, P[i][j] = 0; if matrix[i][j] = 1, P[i][j] = min(P[i - 1][j], P[i][j - 1], P[i - 1][j - 1]) + 1.

Putting them into code, and maintaining a variable maxsize to record the maximum size of the square we have seen, we have the following (unoptimized) solution.
    #include <algorithm>
    #include <vector>
    using namespace std;  // the later snippets assume the same headers

    int maximalSquare(vector<vector<char>>& matrix) {
        if (matrix.empty() || matrix[0].empty()) return 0;
        int m = matrix.size(), n = matrix[0].size();
        // size[i][j]: side of the largest square whose bottom-right corner is (i, j)
        vector<vector<int>> size(m, vector<int>(n, 0));
        int maxsize = 0;
        // Topmost row: same as matrix.
        for (int j = 0; j < n; j++) {
            size[0][j] = matrix[0][j] - '0';
            maxsize = max(maxsize, size[0][j]);
        }
        // Leftmost column: same as matrix.
        for (int i = 1; i < m; i++) {
            size[i][0] = matrix[i][0] - '0';
            maxsize = max(maxsize, size[i][0]);
        }
        // General case; points with '0' keep their initial size of 0.
        for (int i = 1; i < m; i++) {
            for (int j = 1; j < n; j++) {
                if (matrix[i][j] == '1') {
                    size[i][j] = min(size[i - 1][j - 1], min(size[i - 1][j], size[i][j - 1])) + 1;
                    maxsize = max(maxsize, size[i][j]);
                }
            }
        }
        return maxsize * maxsize;
    }
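One way to gain some confidence in a DP solution like the one above is to compare its answers with a direct brute-force search on small inputs. The sketch below is not part of the original solutions; maximalSquareBrute is a hypothetical name, and the routine simply tries every top-left corner and every side length.

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

// Hypothetical reference implementation (an assumption, for checking only):
// far slower than the DP, but easy to verify by eye.
int maximalSquareBrute(const std::vector<std::vector<char>>& matrix) {
    if (matrix.empty() || matrix[0].empty()) return 0;
    int m = static_cast<int>(matrix.size());
    int n = static_cast<int>(matrix[0].size());
    for (const auto& row : matrix)
        assert(static_cast<int>(row.size()) == n);  // assumes a rectangular input
    int best = 0;
    for (int i = 0; i < m; i++) {
        for (int j = 0; j < n; j++) {
            // Grow the square with top-left corner (i, j) one step at a time.
            for (int len = 1; i + len <= m && j + len <= n; len++) {
                bool allOnes = true;
                for (int r = i; r < i + len && allOnes; r++)
                    for (int c = j; c < j + len; c++)
                        if (matrix[r][c] != '1') { allOnes = false; break; }
                if (!allOnes) break;  // any larger square here contains the same 0
                best = std::max(best, len);
            }
        }
    }
    return best * best;  // area, to match what maximalSquare returns
}
```

On small matrices, the value returned here should agree with maximalSquare; a mismatch would point to a mistake in the boundary handling or the recurrence.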
Now let's try to optimize the above solution. As can be seen, each time we update size[i][j], we only need size[i][j - 1] and size[i - 1][j - 1] (in the previous column, to the left) and size[i - 1][j] (in the current column). So we do not need to maintain the full m*n matrix. In fact, keeping two columns is enough. Now we have the following optimized solution.
    int maximalSquare(vector<vector<char>>& matrix) {
        int m = matrix.size();
        if (!m || matrix[0].empty()) return 0;
        int n = matrix[0].size();
        vector<int> pre(m, 0);  // sizes in the previous column
        vector<int> cur(m, 0);  // sizes in the current column
        int maxsize = 0;
        for (int i = 0; i < m; i++) {
            pre[i] = matrix[i][0] - '0';
            maxsize = max(maxsize, pre[i]);
        }
        for (int j = 1; j < n; j++) {
            cur[0] = matrix[0][j] - '0';
            maxsize = max(maxsize, cur[0]);
            for (int i = 1; i < m; i++) {
                if (matrix[i][j] == '1') {
                    cur[i] = min(cur[i - 1], min(pre[i - 1], pre[i])) + 1;
                    maxsize = max(maxsize, cur[i]);
                }
            }
            pre = cur;
            fill(cur.begin(), cur.end(), 0);
        }
        return maxsize * maxsize;
    }
As can be seen, line 21 of the above code (pre = cur;) copies a whole vector, which is a little unnecessary. We can simply maintain two pointers to the two vectors and just write and swap using the pointers, as in the following code.
    int maximalSquare(vector<vector<char>>& matrix) {
        if (matrix.empty() || matrix[0].empty()) return 0;
        int m = matrix.size(), n = matrix[0].size();
        vector<int> pre(m, 0), cur(m, 0);
        auto ppre = &pre, pcur = &cur;  // previous and current column
        int maxsize = 0;
        for (int i = 0; i < m; i++) {
            (*ppre)[i] = matrix[i][0] - '0';
            maxsize = max(maxsize, (*ppre)[i]);
        }
        for (int j = 1; j < n; j++) {
            (*pcur)[0] = matrix[0][j] - '0';
            maxsize = max(maxsize, (*pcur)[0]);
            for (int i = 1; i < m; i++) {
                if (matrix[i][j] == '1') {
                    (*pcur)[i] = min((*pcur)[i - 1], min((*ppre)[i - 1], (*ppre)[i])) + 1;
                    maxsize = max(maxsize, (*pcur)[i]);
                }
            }
            swap(ppre, pcur);  // swaps the pointers only, no element copy
            fill(pcur->begin(), pcur->end(), 0);
        }
        return maxsize * maxsize;
    }
Is the solution finished now? In fact, it can still be optimized! We need not maintain two vectors; one is enough. The idea is that we maintain two vectors simply to recover pre[i - 1], and this can be stored in a single variable, thus eliminating the need for a second full vector. Moreover, in the code above, we distinguish between the 0-th row and the other rows since the 0-th row has no row above it. In fact, we can treat all the m rows the same by padding a row of 0s on the top. Finally, we have the following short code :) If you find it hard to understand, try running it with pen and paper and notice how it realizes what the two-vector solution does using only one vector.
    int maximalSquare(vector<vector<char>>& matrix) {
        if (matrix.empty()) return 0;
        int m = matrix.size(), n = matrix[0].size();
        vector<int> dp(m + 1, 0);  // dp[0] is the padded 0 row
        int maxsize = 0, pre = 0;
        for (int j = 0; j < n; j++) {
            for (int i = 1; i <= m; i++) {
                int temp = dp[i];  // old dp[i]: left neighbor of this point
                if (matrix[i - 1][j] == '1') {
                    // dp[i]: left, dp[i - 1]: up, pre: left-up
                    dp[i] = min(dp[i], min(dp[i - 1], pre)) + 1;
                    maxsize = max(maxsize, dp[i]);
                }
                else dp[i] = 0;
                pre = temp;  // becomes the left-up neighbor of the next row
            }
        }
        return maxsize * maxsize;
    }
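If the pen-and-paper run is hard to follow, the roles of dp[i], dp[i - 1] and pre can also be checked mechanically. The sketch below runs the one-vector update next to a full padded table and asserts that the two agree at every step. The table full and the name maximalSquareChecked are assumptions added only for this check; the solution above keeps dp and pre alone.

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

// Sketch (hypothetical name): one-vector DP, cross-checked against a full table.
int maximalSquareChecked(const std::vector<std::vector<char>>& matrix) {
    if (matrix.empty() || matrix[0].empty()) return 0;
    int m = static_cast<int>(matrix.size());
    int n = static_cast<int>(matrix[0].size());
    // full[i][c] is the size for matrix point (i - 1, c - 1);
    // row 0 and column 0 are zero padding.
    std::vector<std::vector<int>> full(m + 1, std::vector<int>(n + 1, 0));
    std::vector<int> dp(m + 1, 0);
    int maxsize = 0, pre = 0;
    for (int j = 0; j < n; j++) {
        for (int i = 1; i <= m; i++) {
            // Before the update: dp[i] is the left neighbor and
            // dp[i - 1] is the upper neighbor.
            assert(dp[i] == full[i][j]);
            assert(dp[i - 1] == full[i - 1][j + 1]);
            // pre is the left-up neighbor from the second row on. For i == 1
            // it is left over from the previous column, but dp[0] == 0
            // already forces the minimum to 0 there.
            if (i >= 2) assert(pre == full[i - 1][j]);
            int temp = dp[i];
            if (matrix[i - 1][j] == '1') {
                dp[i] = std::min(dp[i], std::min(dp[i - 1], pre)) + 1;
                full[i][j + 1] =
                    std::min({full[i - 1][j + 1], full[i][j], full[i - 1][j]}) + 1;
                maxsize = std::max(maxsize, dp[i]);
            } else {
                dp[i] = 0;
            }
            pre = temp;
            assert(dp[i] == full[i][j + 1]);  // both versions agree
        }
    }
    return maxsize * maxsize;
}
```

Note the i == 1 case: pre is not reset between columns, and the code still works because the padded dp[0] is always 0.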
Well, some people even suggest dropping the pre variable and simply obtaining the information from the matrix, which gives the following code.
    int maximalSquare(vector<vector<char>>& matrix) {
        if (matrix.empty()) return 0;
        int m = matrix.size(), n = matrix[0].size();
        vector<int> dp(m + 1, 0);  // dp[0] is the padded 0 row
        int maxsize = 0;
        for (int j = 0; j < n; j++) {
            for (int i = 1; i <= m; i++) {
                if (matrix[i - 1][j] == '1') {
                    int k = min(dp[i], dp[i - 1]);  // min of left and up
                    // the top-left corner decides whether the square grows to k + 1
                    dp[i] = matrix[i - k - 1][j - k] == '1' ? k + 1 : k;
                    maxsize = max(maxsize, dp[i]);
                }
                else dp[i] = 0;
            }
        }
        return maxsize * maxsize;
    }
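The lookup matrix[i - k - 1][j - k] reads the top-left corner of the candidate square of side k + 1. The squares of side k ending at the left and upper neighbors already cover every other point of that candidate, so the corner is the only point left to test. The lookup also stays inside the matrix: a square of side k ending in the row above needs at least k rows above the current one, and similarly for columns. The sketch below makes that argument explicit with assertions; the name maximalSquareCornerLookup is hypothetical and the assertions are additions for illustration.

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

// Sketch (hypothetical name): the corner-lookup variant with bounds checks.
int maximalSquareCornerLookup(const std::vector<std::vector<char>>& matrix) {
    if (matrix.empty() || matrix[0].empty()) return 0;
    int m = static_cast<int>(matrix.size());
    int n = static_cast<int>(matrix[0].size());
    std::vector<int> dp(m + 1, 0);  // dp[0] is the padded 0 row
    int maxsize = 0;
    for (int j = 0; j < n; j++) {
        for (int i = 1; i <= m; i++) {
            if (matrix[i - 1][j] == '1') {
                int k = std::min(dp[i], dp[i - 1]);  // min of left and up
                // The corner of the (k + 1)-square lies inside the matrix.
                assert(i - k - 1 >= 0 && j - k >= 0);
                // For k == 0 the "corner" is the current point itself, which is '1'.
                dp[i] = matrix[i - k - 1][j - k] == '1' ? k + 1 : k;
                maxsize = std::max(maxsize, dp[i]);
            } else {
                dp[i] = 0;
            }
        }
    }
    return maxsize * maxsize;
}
```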