最长对称字符串问题/最长回文子串问题

题目:输入一个字符串,输出该字符串中对称的子字符串的最大长度。比如输入字符串“google”,由于该字符串里最长的对称子字符串是“goog”,因此输出4。

何海涛日志中给出算法是:先判断子字符串A是不是对称的。如果A不是对称的,那么向该子字符串两端各延长一个字符得到的字符串肯定不是对称的。如果A对称,那么我们只需要判断A两端延长的一个字符是不是相等的,如果相等,则延长后的字符串是对称的。因此在知道A是否对称之后,只需要O(1)的时间就能知道aAa是不是对称的。由于子字符串的长度可能是奇数也可能是偶数。长度是奇数的字符串是从只有一个字符的中心向两端延长出来,而长度为偶数的字符串是从一个有两个字符的中心向两端延长出来。因此我们的代码要把这种情况都考虑进去。由于总共有O(n)个字符,每个字符可能延长O(n)次,每次延长时只需要O(1)就能判断出是不是对称的,因此整个函数的时间效率是O(n的平方)。

可以用后缀树来求,复杂度是O(n)。

一个错误的做法:网上流传很广的一个错误的做法是,把原字符串逆序,求最大公子串,复杂度是n的平方。(求最大公子串的动态规划算法的子问题/子函数的定义和上面的子问题/子函数的定义非常相似,子函数f(n)并不是前n个元素上的函数,而是定义在第n个元素上的函数。为了求得前n个元素上的解,还需要对f(1),f(2),...,f(n)扫描一遍)。

为什么错误?举个反例S = “abacdfgdcaba”, S’ = “abacdgfdcaba”. The longest common substring between S and S’ is “abacd”. Clearly, this is not a valid palindrome. We could see that the longest common substring method fails when there exists a reversed copy of a non-palindromic substring in some other part of S. To rectify this, each time we find a longest common substring candidate, we check if the substring’s indices are the same as the reversed substring’s original indices. If it is, then we attempt to update the longest palindrome found so far; if not, we skip this and find the next candidate.

下面给出用动态规划解决此问题的代码。

Define P[ i, j ] ← true  iff the substring S i … S j is a palindrome, otherwise false.

Therefore,

P[ i, j ] ← ( P[ i+1, j-1 ]  and S i = S j )

The base cases are:

P[ i, i ] ← true
P[ i, i+1 ] ← ( S i = S i+1 )

This yields a straight forward DP solution, which we first initialize the one and two letters palindromes, and work our way up finding all three letters palindromes, and so on… This gives us a run time complexity of O(N2) and uses O(N2) space to store the table.

string longestPalindromeDP(string s) {
  int n = s.length();
  int longestBegin = 0;
  int maxLen = 1;
  bool table[1000][1000] = {false};
  for (int i = 0; i < n; i++) {
    table[i][i] = true;
  }
  for (int i = 0; i < n-1; i++) {
    if (s[i] == s[i+1]) {
      table[i][i+1] = true;
      longestBegin = i;
      maxLen = 2;
    }
  }
  for (int len = 3; len <= n; len++) {
    for (int i = 0; i < n-len+1; i++) {
      int j = i+len-1;
      if (s[i] == s[j] && table[i+1][j-1]) {
        table[i][j] = true;
        longestBegin = i;
        maxLen = len;
      }
    }
  }
  return s.substr(longestBegin, maxLen);
}

A simpler approach, O(N2) time and O(1) space:
In fact, we could solve it in O(N2) time without any extra space.

We observe that a palindrome mirrors around its center. Therefore, a palindrome can be expanded from its center, and there are only 2N-1 such centers.

You might be asking why there are 2N-1 but not N centers? The reason is the center of a palindrome can be in between two letters. Such palindromes have even number of letters (such as "abba") and its center are between the two 'b's.

Since expanding a palindrome around its center could take O(N) time, the overall complexity is O(N2).

string expandAroundCenter(string s, int c1, int c2) {
  int l = c1, r = c2;
  int n = s.length();
  while (l >= 0 && r <= n-1 && s[l] == s[r]) {
    l--;
    r++;
  }
  return s.substr(l+1, r-l-1);
}
 
string longestPalindromeSimple(string s) {
  int n = s.length();
  if (n == 0) return "";
  string longest = s.substr(0, 1);  // a single char itself is a palindrome
  for (int i = 0; i < n-1; i++) {
    string p1 = expandAroundCenter(s, i, i);
    if (p1.length() > longest.length())
      longest = p1;
 
    string p2 = expandAroundCenter(s, i, i+1);
    if (p2.length() > longest.length())
      longest = p2;
  }
  return longest;
}

这里还有一个线性时间算法。

你可能感兴趣的:(字符串,算法,google)