Game DP Collection

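When working on game problems of this kind, it is easy to get stuck trying to find an optimal greedy strategy. I started this collection to gather such problems as a reminder to myself not to fall into that trap.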

These problems are usually solved with dynamic programming, implemented as a memoized search.
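To see why a greedy rule is not enough, here is a minimal sketch on a simplified single-pile variant (one row of cards, each move takes a card from either end). The helper names `bestScore` and `greedyScore` and the row `{1, 2, 100, 3}` are my own illustration and do not come from either problem below.

```cpp
#include <algorithm>
#include <vector>

// Hypothetical helper: best total the first player can collect from one row
// of cards when both players play optimally (take from either end).
int bestScore(const std::vector<int>& cards) {
    int n = static_cast<int>(cards.size());
    if (n == 0) return 0;
    std::vector<int> prefix(n + 1, 0);
    for (int i = 0; i < n; ++i) prefix[i + 1] = prefix[i] + cards[i];
    // dp[l][r] = best total for the player to move on cards[l..r]
    std::vector<std::vector<int>> dp(n, std::vector<int>(n, 0));
    for (int len = 1; len <= n; ++len) {
        for (int l = 0; l + len - 1 < n; ++l) {
            int r = l + len - 1;
            int total = prefix[r + 1] - prefix[l];
            // What the opponent gets after I take the left or the right end.
            int oppAfterLeft = (l + 1 <= r) ? dp[l + 1][r] : 0;
            int oppAfterRight = (l <= r - 1) ? dp[l][r - 1] : 0;
            dp[l][r] = total - std::min(oppAfterLeft, oppAfterRight);
        }
    }
    return dp[0][n - 1];
}

// Hypothetical helper: first player's total when both players always grab
// the larger of the two ends.
int greedyScore(const std::vector<int>& cards) {
    int l = 0, r = static_cast<int>(cards.size()) - 1;
    int score[2] = {0, 0};
    for (int turn = 0; l <= r; turn ^= 1) {
        if (cards[l] >= cards[r]) score[turn] += cards[l++];
        else score[turn] += cards[r--];
    }
    return score[0];
}
```

On the row `{1, 2, 100, 3}`, `bestScore` gives 101 while `greedyScore` gives 5: grabbing the 3 exposes the 100 to the opponent, whereas taking the 1 first forces the opponent to expose it.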




Play Game hdu-4597

Time Limit: 2000/1000 MS (Java/Others)    Memory Limit: 65535/65535 K (Java/Others)


Problem Description
Alice and Bob are playing a game. There are two piles of cards. There are N cards in each pile, and each card has a score. They take turns to pick up the top or bottom card from either pile, and the score of the card will be added to his total score. Alice and Bob are both clever enough, and will pick up cards to get as many scores as possible. Do you know how many scores can Alice get if he picks up first?
 

Input
The first line contains an integer T (T≤100), indicating the number of cases.
Each case contains 3 lines. The first line is the N (N≤20). The second line contains N integer a_i (1≤a_i≤10000). The third line contains N integer b_i (1≤b_i≤10000).
 

Output
For each case, output an integer, indicating the most score Alice can get.
 

Sample Input

2
1
23
53
3
10 100 20
2 4 3
 

Sample Output

53
105

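Problem summary: there are two piles of n cards each. Two players take turns drawing one card from either end (top or bottom) of either pile. Each card has a value; find the maximum total the first player can obtain.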

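Approach: the first thing to remember is that this is a game, so both sides always choose the strategy that benefits themselves the most. In other words, in the current position I draw the card that maximizes the total I end up with.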

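We use dp[l][r][ll][rr] to describe a state: the first pile still holds the cards in [l,r], the second pile still holds the cards in [ll,rr], and the stored value is the maximum total the player to move can obtain from this position. For each possible move we only need to evaluate: the value of the card I take + the sum of all cards left afterwards - the opponent's optimal result in the resulting state, and keep the largest of these. The recursion can bottom out when either 1 or 0 cards remain, depending on how the code is written; I use 0 cards as the base case.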
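The same recurrence written as a formula. Here S is shorthand introduced only for this note (the sum of all cards still on the table), and a_i, b_j are the card values from the input. Since the card I take plus the cards left afterwards always add up to S, maximizing my result is the same as minimizing the opponent's:

```latex
S(l,r,ll,rr) = \sum_{i=l}^{r} a_i + \sum_{j=ll}^{rr} b_j
\qquad \text{(an empty range contributes 0)}

dp(l,r,ll,rr) = S(l,r,ll,rr) - \min
\begin{cases}
dp(l+1,r,ll,rr),\ dp(l,r-1,ll,rr) & \text{if } l \le r \\
dp(l,r,ll+1,rr),\ dp(l,r,ll,rr-1) & \text{if } ll \le rr
\end{cases}

dp(l,r,ll,rr) = 0 \quad \text{if } l > r \text{ and } ll > rr
```

The minimum is taken over every move that is actually available, so a pile that is already empty simply contributes no candidates.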

#pragma comment(linker, "/STACK:1024000000,1024000000") 
#include <cstdio>
#include <cstring>
#include <algorithm>
using namespace std;

#define LL long long
#define ULL unsigned long long
int n;

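int num1[25],num2[25];      // card values of pile 1 and pile 2 (1-indexed)
int sum1[25],sum2[25];      // prefix sums of num1 and num2
int dp[25][25][25][25];     // memo table, -1 means "not computed yet"
// Best total the player to move can collect when pile 1 still holds [l,r]
// and pile 2 still holds [ll,rr]; l>r (or ll>rr) means that pile is empty.
int dfs(int l,int r,int ll,int rr){
    if(dp[l][r][ll][rr]!=-1) return dp[l][r][ll][rr];
    if(l>r&&ll>rr) return 0;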
    else if(ll>rr){
        if(l==r) return dp[l][r][ll][rr] = num1[l];
        else {
            dp[l][r][ll][rr] = max(dp[l][r][ll][rr],num1[l]+sum1[r]-sum1[l]+sum2[rr]-sum2[ll-1]-dfs(l+1,r,ll,rr));
            dp[l][r][ll][rr] = max(dp[l][r][ll][rr],num1[r]+sum1[r-1]-sum1[l-1]+sum2[rr]-sum2[ll-1]-dfs(l,r-1,ll,rr));
            return dp[l][r][ll][rr];
        }
    }        
    else if(l>r){
        if(ll==rr) return dp[l][r][ll][rr] = num2[ll];
        else {
            dp[l][r][ll][rr] = max(dp[l][r][ll][rr],num2[ll]+sum1[r]-sum1[l-1]+sum2[rr]-sum2[ll]-dfs(l,r,ll+1,rr));
            dp[l][r][ll][rr] = max(dp[l][r][ll][rr],num2[rr]+sum1[r]-sum1[l-1]+sum2[rr-1]-sum2[ll-1]-dfs(l,r,ll,rr-1));
            return dp[l][r][ll][rr];
        }
    }
    else {
        dp[l][r][ll][rr] = max(dp[l][r][ll][rr],num1[l]+sum1[r]-sum1[l]+sum2[rr]-sum2[ll-1]-dfs(l+1,r,ll,rr));
        dp[l][r][ll][rr] = max(dp[l][r][ll][rr],num1[r]+sum1[r-1]-sum1[l-1]+sum2[rr]-sum2[ll-1]-dfs(l,r-1,ll,rr));
        dp[l][r][ll][rr] = max(dp[l][r][ll][rr],num2[ll]+sum1[r]-sum1[l-1]+sum2[rr]-sum2[ll]-dfs(l,r,ll+1,rr));
        dp[l][r][ll][rr] = max(dp[l][r][ll][rr],num2[rr]+sum1[r]-sum1[l-1]+sum2[rr-1]-sum2[ll-1]-dfs(l,r,ll,rr-1));
        return dp[l][r][ll][rr];
    }    
    
    
}
int main(void){
    int t;
    scanf("%d",&t);
    while(t--){
        scanf("%d",&n);
        sum1[0]=sum2[0]=0;
        for(int i=1;i<=n;i++) scanf("%d",&num1[i]),sum1[i]=sum1[i-1]+num1[i];
        for(int i=1;i<=n;i++) scanf("%d",&num2[i]),sum2[i]=sum2[i-1]+num2[i];
        memset(dp,-1,sizeof(dp));
        printf("%d\n",dfs(1,n,1,n));
    }
    
    return 0;
}
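The branches in dfs above differ only in which moves are legal, so they can be folded into one expression of the form "total remaining minus the opponent's best". The following is a sketch of that variant, not the code I submitted. The names `aliceScore`, `best`, `cardA`, `cardB`, `preA`, `preB` and `memo` are made up for this example, and it assumes both piles have the same size n ≤ 20 with positive card values.

```cpp
#include <algorithm>
#include <climits>
#include <cstring>
#include <vector>

int cardA[25], cardB[25];      // 1-indexed card values of the two piles
int preA[25], preB[25];        // prefix sums of cardA and cardB
int memo[25][25][25][25];      // -1 means "not computed yet"

// Best total for the player to move when pile A holds [l,r] and pile B
// holds [ll,rr]; an empty pile is represented by l == r + 1 (or ll == rr + 1).
int best(int l, int r, int ll, int rr) {
    int& res = memo[l][r][ll][rr];
    if (res != -1) return res;
    if (l > r && ll > rr) return res = 0;
    int total = (preA[r] - preA[l - 1]) + (preB[rr] - preB[ll - 1]);
    int opp = INT_MAX;         // smallest result I can leave the opponent
    if (l <= r) {
        opp = std::min(opp, best(l + 1, r, ll, rr));
        opp = std::min(opp, best(l, r - 1, ll, rr));
    }
    if (ll <= rr) {
        opp = std::min(opp, best(l, r, ll + 1, rr));
        opp = std::min(opp, best(l, r, ll, rr - 1));
    }
    return res = total - opp;
}

// Assumes a.size() == b.size() <= 20.
int aliceScore(const std::vector<int>& a, const std::vector<int>& b) {
    int n = static_cast<int>(a.size());
    preA[0] = preB[0] = 0;
    for (int i = 1; i <= n; ++i) {
        cardA[i] = a[i - 1];
        cardB[i] = b[i - 1];
        preA[i] = preA[i - 1] + cardA[i];
        preB[i] = preB[i - 1] + cardB[i];
    }
    std::memset(memo, -1, sizeof(memo));
    return best(1, n, 1, n);
}
```

Calling `aliceScore({23}, {53})` and `aliceScore({10, 100, 20}, {2, 4, 3})` reproduces the two sample answers, 53 and 105.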


poj-1440
Varacious Steve
Time Limit: 3000MS   Memory Limit: 10000K

Description

Steve and Digit bought a box containing a number of donuts. In order to divide them between themselves they play a special game that they created. The players alternately take a certain, positive number of donuts from the box, but no more than some fixed integer. Each player's donuts are gathered on the player's side. The player that empties the box eats his donuts while the other one puts his donuts back into the box and the game continues with the "looser" player starting. The game goes on until all the donuts are eaten. The goal of the game is to eat the most donuts. How many donuts can Steve, who starts the game, count on, assuming the best strategy for both players? 

Write a program that: 

  • reads the parameters of the game from the standard input, 

  • computes the number of donuts Steve can count on, 

  • writes the result to the standard output. 

Input

The first and only line of the input contains exactly two integers n and m separated by a single space, 1 <= m <= n <= 100 - the parameters of the game, where n is the number of donuts in the box at the beginning of the game and m is the upper limit on the number of donuts to be taken by one player in one move.

Process to the end of file. 

Output

The output contains exactly one integer equal to the number of donuts Steve can count on.

Sample Input

5 2

Sample Output

3

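Problem summary: a box holds n donuts and two players take turns, each move taking at least one and at most m donuts. When the box becomes empty, the player who emptied it eats all the donuts he has collected, while the other player puts his own donuts back into the box and starts the next round. Find the maximum number of donuts the first player can eat by the time the game can no longer continue.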

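Approach: dp[a][b][c] stores the number of donuts I (the player to move) eventually eat from this state onward, where I currently hold a donuts, the opponent holds b, and c remain in the box. Only the a+b+c donuts still in play are counted.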

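When c>m, dp[a][b][c] = max(dp[a][b][c],a+b+c-dp[b][k+a][c-k]), where k is the number I take, ranging over [1,m]. dp[b][k+a][c-k] is the opponent's final result from the resulting state, so subtracting the opponent's optimum from the total a+b+c gives what I end up with.

When c<=m, k ranges over [1,c]. For k<c the equation is the same as above; for k=c I take everything that is left and empty the box:

dp[a][b][c] = max(dp[a][b][c],a+b+c-dp[0][0][b])

Here I win the round: I eat my a+c donuts, and a new round starts with the opponent's b donuts back in the box and the opponent moving first. The base case is reached when everything has been taken, i.e. dp[0][0][0] = 0.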


#pragma comment(linker, "/STACK:1024000000,1024000000") 
#include <cstdio>
#include <cstring>
#include <algorithm>
using namespace std;

#define LL long long
#define ULL unsigned long long
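int dp[105][105][105];	// memo table, -1 means "not computed yet"
int n,m;
// Donuts the player to move eventually eats out of the a+b+c still in play:
// a = donuts on the mover's side, b = on the opponent's side, c = left in the box.
int dfs(int a,int b,int c){
	if(dp[a][b][c]!=-1) return dp[a][b][c];
	if(a==0&&b==0&&c==0) return dp[a][b][c]=0;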
	else if(c>m){
		for(int i=1;i<=m;i++)
			dp[a][b][c] = max(dp[a][b][c],a+b+c-dfs(b,a+i,c-i));
		return dp[a][b][c];
	}
	else {
		for(int i=1;i<=c;i++)
			if(i==c)
				dp[a][b][c] = max(dp[a][b][c],a+b+c-dfs(0,0,b));
			else dp[a][b][c] = max(dp[a][b][c],a+b+c-dfs(b,a+i,c-i));
		return dp[a][b][c];
	}
}
int main(void){
	while(~scanf("%d%d",&n,&m)){
		memset(dp,-1,sizeof(dp));
		int ans = dfs(0,0,n);
		printf("%d\n",ans);
	}
	
	return 0;
}
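For checking small cases by hand it helps to have the recurrence as a plain function without the input loop. Below is a sketch of such a variant. `DonutGame` and `steveDonuts` are names invented for this example, and the memo table lives in a flattened local vector instead of the global array. The two branches of dfs are merged by letting k run up to min(m, c).

```cpp
#include <algorithm>
#include <vector>

// Hypothetical wrapper around the same recurrence as dfs(a, b, c).
struct DonutGame {
    int m;                   // maximum number of donuts per move
    int size;                // n + 1, side length of the flattened table
    std::vector<int> memo;   // flattened [a][b][c], -1 = not computed

    DonutGame(int n, int maxTake)
        : m(maxTake), size(n + 1), memo((n + 1) * (n + 1) * (n + 1), -1) {}

    // Donuts the player to move eats out of the a + b + c still in play.
    int solve(int a, int b, int c) {
        if (c == 0) return 0;            // only reached as state (0, 0, 0)
        int& res = memo[(a * size + b) * size + c];
        if (res != -1) return res;
        int best = 0;
        for (int k = 1; k <= std::min(m, c); ++k) {
            // k == c empties the box: the opponent restarts with his b donuts.
            int opp = (k == c) ? solve(0, 0, b) : solve(b, a + k, c - k);
            best = std::max(best, a + b + c - opp);
        }
        return res = best;
    }
};

int steveDonuts(int n, int m) {
    DonutGame game(n, m);
    return game.solve(0, 0, n);
}
```

`steveDonuts(5, 2)` returns 3, matching the sample.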

