[UVA 11427] Expect the Expected

Training Guide (《训练指南》), p. 141

Probability DP, math tricks


Some mathematical background. This problem asks you to compute the expected value of a random variable. If you haven’t seen those before, the simple definitions are as follows. A random variable is a variable that can have one of several values, each with a certain probability. The probabilities of each possible value are positive and add up to one. The expected value of a random variable is simply the sum of all its possible values, each multiplied by the corresponding probability. (There are some more complicated, more general definitions, but you won’t need them now.) For example, the value of a fair, 6-sided die is a random variable that has 6 possible values (from 1 to 6), each with a probability of 1/6. Its expected value is 1/6 + 2/6 + . . . + 6/6 = 3.5.

Now the problem. I like to play solitaire. Each time I play a game, I have probability p of solving it and probability (1 − p) of failing. The game keeps statistics of all my games – what percentage of games I have won. If I simply keep playing for a long time, this percentage will always hover somewhere around p ∗ 100%. But I want more.

Here is my plan. Every day, I will play a game of solitaire. If I win, I’ll go to sleep happy until the next day. If I lose, I’ll keep playing until the fraction of games I have won today becomes larger than p. At this point, I’ll declare victory and go to sleep. As you can see, at the end of each day, I’m guaranteed to always keep my statistics above the expected p ∗ 100%. I will have beaten mathematics!

If your intuition is telling you that something here must break, then you are right. I can’t keep doing this forever because there is a limit on the number of games I can play in one day. Let’s say that I can play at most n games in one day. How many days can I expect to be able to continue with my clever plan before it fails? Note that the answer is always at least 1 because it takes me a whole day of playing to reach a failure.

Input

The first line of input gives the number of cases, N. N test cases follow. Each one is a line containing p (as a fraction) and n. 1 ≤ N ≤ 3000, 0 ≤ p < 1. The denominator of p will be at most 1000. 1 ≤ n ≤ 100.

Output

For each test case, print a line of the form ‘Case #x: y’, where y is the expected number of days, rounded down to the nearest integer. The answer will always be at most 1000 and will never be within 0.001 of a round-off error case.

Sample Input

4

1/2 1

1/2 2

0/1 10

1/2 3

Sample Output

Case #1: 2

Case #2: 2

Case #3: 1

Case #4: 2


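While working on this problem I realized, painfully, that I had handed everything I learned about probability in high school back to my teachers, so it was time for some serious catching up... orz

It is easy to see that the days are independent of one another, so the probability that a day ends in failure depends only on p and n. Let P be that probability, i.e. the chance that the fraction of wins is still not above p after all n games. The plan lasts exactly x days when the first x-1 days succeed and day x fails, which happens with probability P*(1-P)^(x-1) and contributes x*P*(1-P)^(x-1) to the expectation. The answer, the expected number of days, is the sum of these contributions over all x.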


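That leaves two questions.


How to compute P.

At first I tried to derive a closed-form formula for P in terms of p and n. There were too many cases, and I went around in circles until I had thoroughly confused myself. When I came back to the problem a few days later, I realized that a straightforward DP does the job.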

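Let dp[ i ][ j ] be the probability of reaching the state where i games have been played today and j of them were won, without the win fraction ever having exceeded p.

Then dp[ i ][ j ] = dp[ i-1 ][ j ]*(1-p) + dp[ i-1 ][ j-1 ]*p

dp[ 0 ][ 0 ] = 1, and a state is dropped as soon as j / i > p, because the day ends in success there. A small trick: with p = a/b, keep a state only when a*i - b*j >= 0, which compares integers and avoids the precision errors of comparing doubles. P is then the sum of dp[ n ][ j ] over all surviving j.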
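
The recurrence above can be sketched as a standalone function. This is only a minimal sketch for illustration: the name failProbability and its parameters (a and b for p = a/b, n for the daily game limit) are assumptions, not part of the original solution.

```cpp
#include <cassert>
#include <vector>

// Hypothetical helper (name and signature are assumptions): probability that
// a day ends in failure, given p = a/b and at most n games per day.
double failProbability(int a, int b, int n)
{
    assert(b > 0 && a >= 0 && a < b && n >= 1);
    const double p = static_cast<double>(a) / b;
    // dp[i][j]: probability that i games were played, j were won, and the
    // win fraction never exceeded p along the way.
    std::vector<std::vector<double>> dp(n + 1, std::vector<double>(n + 1, 0.0));
    dp[0][0] = 1.0;
    for (int i = 1; i <= n; ++i) {
        // j/i <= p  <=>  a*i - b*j >= 0, checked in integers.
        for (int j = 0; j <= i && a * i - b * j >= 0; ++j) {
            dp[i][j] = dp[i - 1][j] * (1.0 - p);          // game i was lost
            if (j > 0) dp[i][j] += dp[i - 1][j - 1] * p;  // game i was won
        }
    }
    // The day fails if the fraction is still <= p after all n games.
    double fail = 0.0;
    for (int j = 0; j <= n && a * n - b * j >= 0; ++j) fail += dp[n][j];
    return fail;
}
```

For the sample case 1/2 3, failProbability(1, 2, 3) gives 0.375, so 1/P is about 2.67 and the printed answer is 2.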


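How to sum the per-day contributions.

In theory the plan can go on for infinitely many days, although with very small probability. I was convinced from the start that the series converges, but my math was not up to finding the limit. The naive approach is to pick a precision and cut the sum off once a day's contribution drops below it. Then I read the solution in the Training Guide and lost faith in my math all over again...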
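
The naive cutoff approach might look like the sketch below. The function name expectedDaysTruncated and the default precision eps = 1e-12 are assumptions chosen for illustration; the sketch also assumes 0 < P <= 1 and eps < P, so that the very first term is not cut off.

```cpp
#include <cassert>

// Hypothetical helper (name and default eps are assumptions): sums
// x * P * (1-P)^(x-1) for x = 1, 2, ... and stops at the first term < eps.
double expectedDaysTruncated(double P, double eps = 1e-12)
{
    assert(P > 0.0 && P <= 1.0 && eps > 0.0 && eps < P);
    double sum = 0.0;
    double survive = 1.0;  // (1-P)^(x-1): the first x-1 days all succeeded
    for (int x = 1; x <= 10000000; ++x) {  // hard cap as a safety net
        const double term = x * P * survive;
        if (term < eps) break;
        sum += term;
        survive *= 1.0 - P;
    }
    return sum;
}
```

The cutoff bounds a single term rather than the whole discarded tail, so the result is only an approximation, and the tail can be noticeably larger than eps when P is small.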

EX = P + 2*P*(1-P) + 3*P*(1-P)^2 + 4*P*(1-P)^3 + ......

Let s = EX/P = 1 + 2*(1-P) + 3*(1-P)^2 + 4*(1-P)^3 + ......  (1)

         (1-P)*s = (1-P) + 2*(1-P)^2 + 3*(1-P)^3 + 4*(1-P)^4 + ......  (2)

Subtracting (2) from (1) gives

EX = P*s = 1 + (1-P) + (1-P)^2 + (1-P)^3 + ......  = (1 - (1-P)^d ) / P  , where d is the number of days and tends to infinity

Since P > 0, (1-P)^d tends to 0, so EX = 1 / P
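
As a cross-check, the same value follows from a shorter argument that takes a different route from the series above: condition on the first day. One day is always spent; with probability 1-P it succeeds, and because the days are independent the process then starts over from scratch. Writing E for EX:

```latex
E = 1 + (1-P)\,E
\;\Longrightarrow\; P\,E = 1
\;\Longrightarrow\; E = \frac{1}{P}
```

This step assumes E is finite, which holds because P > 0: losing all n games in a day already has probability (1-p)^n > 0.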


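#include <cstdio>
#include <cstring>

double dp[110][110];	// dp[i][j]: i games played, j won, fraction never above p
int a,b,n;		// p = a/b, at most n games per day
double p,P;		// P: probability that a day ends in failure

int main()
{
	int T;
	scanf("%d", &T);
	for(int ck=1; ck<=T; ck++)
	{
		scanf("%d/%d %d\n", &a, &b, &n);
		p = a*1.0/(b*1.0);
		memset(dp, 0, sizeof(dp));
		P = 0;
		dp[0][0] = 1;
		for(int i=1; i<=n; i++)
		{
			// keep only states with j/i <= p, tested in integers as a*i-b*j >= 0
			for(int j=0; a*i-b*j>=0; j++)
			{
				dp[i][j] += dp[i-1][j]*(1-p);		// game i was lost
				if(j>0) dp[i][j] += dp[i-1][j-1]*p;	// game i was won
			}
		}
		// the day fails if the fraction is still <= p after all n games
		for(int i=0; a*n-b*i>=0; i++)
			P += dp[n][i];

		// expected number of days is 1/P, rounded down
		printf("Case #%d: %d\n", ck, (int)(1/P));
	}

	return 0;
}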

