Dynamic Programming | Set 24 (Optimal Binary Search Tree)
Given a sorted array keys[0.. n-1] of search keys and an array freq[0.. n-1] of frequency counts, where freq[i] is the number of searches to keys[i]. Construct a binary search tree of all keys such that the total cost of all the searches is as small as possible.
Let us first define the cost of a BST. The cost of a BST node is level of that node multiplied by its frequency. Level of root is 1.
Example 1
Input: keys[] = {10, 12}, freq[] = {34, 50}
There can be following two possible BSTs
10 12
\ /
12 10
I II
Frequency of searches of 10 and 12 are 34 and 50 respectively.
The cost of tree I is 34*1 + 50*2 = 134
The cost of tree II is 50*1 + 34*2 = 118
Example 2
Input: keys[] = {10, 12, 20}, freq[] = {34, 8, 50}
There can be following possible BSTs
10 12 20 10 20
\ / \ / \ /
12 10 20 12 20 10
\ / / \
20 10 12 12
I II III IV V
Among all possible BSTs, cost of the fifth BST is minimum.
Cost of the fifth BST is 1*50 + 2*34 + 3*8 = 142
1) Optimal Substructure:
The optimal cost for freq[i..j] can be recursively calculated using following formula.
We need to calculate optCost(0, n-1) to find the result.
The idea of above formula is simple, we one by one try all nodes as root (r varies from i to j in second term). When we make rth node as root, we recursively calculate optimal cost from i to r-1 and r+1 to j.
We add sum of frequencies from i to j (see first term in the above formula), this is added because every search will go through root and one comparison will be done for every search.
http://www.geeksforgeeks.org/dynamic-programming-set-24-optimal-binary-search-tree/
关键:
如何计算整个优化BST的查找次数,这里不用按上面说的一层一层地去计算,
而是巧妙地计算:每增加一个根节点的时候,根节点下面的所有节点的访问次数都需要增加1倍,那么只需要把这些节点的总和加起来就可以了,因为根节点的访问次数也是一倍,所以可以连根节点也加进球,就得到最后的总访问频率了。
经验:一些看起来简单的小函数,要用最优的方法实现就不是那么容易了。
原网站的时间效率是O(n*n*n*n),原文也提到可以提高到O(n*n*n)。
下面给出两个程序,时间效率都提高到了O(n*n*n)。
第一个程序只需要修改一下计算sum的位置就可以了,
第二个程序增加计算了一个sumTable,记录了所有sum,实际运行效率会稍微快点。
int sum(int freq[], int i, int j)
{
int s = 0;
for (; i <= j; i++)
{
s +=freq[i];
}
return s;
}
int optimalSearchTree(int keys[], int freq[], int n)
{
vector<vector<int> > tbl(n, vector<int>(n, INT_MAX));
for (int i = 0; i < n; i++)
{
tbl[i][i] = freq[i];
}
for (int d = 2; d <= n; d++)
{
int i = 0, j = d-1;
for (int r = 0; r <= n-d; r++, i++, j++)
{
int s = sum(freq, i, j);
for (int k = i; k < d; k++)
{
int t = (k > i? tbl[i][k-1]:0)+(k < j? tbl[k+1][j]:0)+s;
tbl[i][j] = min(tbl[i][j], t);
}
}
}
return tbl[0][n-1];
}
//优化sum
vector<vector<int> > sumTable(int freq[], int n)
{
vector<vector<int> > tbl(n, vector<int>(n));
for (int i = 0; i < n; i++)
{
tbl[i][i] = freq[i];
}
for (int i = 0; i < n; i++)
{
for (int j = i+1; j < n; j++)
{
tbl[i][j] = tbl[i][j-1] + freq[j];
}
}
return tbl;
}
int optimalSearchTree_2(int keys[], int freq[], int n)
{
vector<vector<int> > tbl(n, vector<int>(n, INT_MAX));
for (int i = 0; i < n; i++)
{
tbl[i][i] = freq[i];
}
vector<vector<int> > sum_tbl = sumTable(freq, n);
for (int d = 2; d <= n; d++)
{
int i = 0, j = d-1;
for (int r = 0; r <= n-d; r++, i++, j++)
{
for (int k = i; k < d; k++)
{
int t = (k > i? tbl[i][k-1]:0)+(k < j? tbl[k+1][j]:0)+sum_tbl[i][j];
tbl[i][j] = min(tbl[i][j], t);
}
}
}
return tbl[0][n-1];
}
测试:
int main()
{
int keys[] = {10, 12, 20};
int freq[] = {34, 8, 50};
int n = sizeof(keys)/sizeof(keys[0]);
printf("Cost of Optimal BST is %d\n", optimalSearchTree(keys, freq, n));
printf("Cost of Optimal BST is %d\n", optimalSearchTree_2(keys, freq, n));
system("pause");
return 0;
}