Complexity: Zero-Sum Games, the Minimax Theorem, and LP Duality

Complexity of 2-Player Zero-sum Game

lecturer: Constantinos Daskalakis

Games and Equilibria

Penalty Shot Game

Dive \ Kick    Left      Right
Left           1, -1     -1, 1
Right          -1, 1     1, -1

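This zero-sum game has a mixed-strategy Nash equilibrium. Each cell lists (row payoff, column payoff). If the row player mixes with $x = (x_1, x_2)$ and the column player with $y = (y_1, y_2)$, the column player's expected payoff is $\sum_{i,j} c_{i,j} x_i y_j = x^T C y$ with $C = \begin{pmatrix} -1 & 1 \\ 1 & -1 \end{pmatrix}$, and the row player's is $x^T R y$ with $R = -C$. At the equilibrium both players play $(1/2, 1/2)$.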
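As a small sanity check of the expected-payoff formula, the sketch below evaluates $x^T C y$ with exact fractions. The function name `expected_payoff` and the variable names are illustrative choices, not part of the lecture.

```python
from fractions import Fraction

# Column player's payoff matrix C for the penalty shot game (rows: Left, Right).
C = [[-1, 1],
     [1, -1]]

def expected_payoff(M, x, y):
    """Return sum_{i,j} M[i][j] * x[i] * y[j], i.e. x^T M y."""
    return sum(M[i][j] * x[i] * y[j]
               for i in range(len(x)) for j in range(len(y)))

half = Fraction(1, 2)
uniform = [half, half]

# Against a uniform row player, every column strategy earns the same payoff,
# so the column player has nothing to gain by deviating from (1/2, 1/2).
print(expected_payoff(C, uniform, [1, 0]))   # column plays Left
print(expected_payoff(C, uniform, [0, 1]))   # column plays Right
print(expected_payoff(C, uniform, uniform))  # both mix uniformly
```

All three calls evaluate to 0, which is the value of the game.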

[von Neumann ‘28]: An equilibrium exists in every two-player zero-sum game (R+C=0)

[Dantzig '40s]: in fact, this follows from strong LP duality.

[Khachiyan '79]: hence an equilibrium can be computed in polynomial time.

[B. 56++]: dynamics converge.

Penalty Shot Game (not a zero-sum game)

Dive \ Kick    Left      Right
Left           2, -1     -1, 1
Right          -1, 1     1, -1

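Here, in the Nash equilibrium the column player mixes $(2/5, 3/5)$, which makes the row player indifferent between his two rows; the row player still plays $(1/2, 1/2)$, since the column player's payoffs are unchanged.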
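The equilibrium of this modified game can be recomputed from the indifference conditions: each player mixes so that the opponent is indifferent between their two pure strategies. The helper `mixed_equilibrium_2x2` below is a hypothetical name for a minimal sketch that assumes a fully mixed equilibrium exists (nonzero denominators).

```python
from fractions import Fraction

def mixed_equilibrium_2x2(R, C):
    """Fully mixed equilibrium of a 2x2 game via indifference conditions.

    Assumes such an equilibrium exists (the denominators are nonzero).
    Returns (x, y): the row and column players' mixed strategies.
    """
    # x makes the column player indifferent between the two columns:
    #   x1*C[0][0] + x2*C[1][0] = x1*C[0][1] + x2*C[1][1],  x1 + x2 = 1
    x1 = Fraction(C[1][1] - C[1][0],
                  C[0][0] - C[0][1] - C[1][0] + C[1][1])
    # y makes the row player indifferent between the two rows:
    #   y1*R[0][0] + y2*R[0][1] = y1*R[1][0] + y2*R[1][1],  y1 + y2 = 1
    y1 = Fraction(R[1][1] - R[0][1],
                  R[0][0] - R[0][1] - R[1][0] + R[1][1])
    return (x1, 1 - x1), (y1, 1 - y1)

R = [[2, -1], [-1, 1]]   # row player's payoffs in the non-zero-sum variant
C = [[-1, 1], [1, -1]]   # column player's payoffs (unchanged)
x, y = mixed_equilibrium_2x2(R, C)
print(x, y)
```

For the matrices above this gives $x = (1/2, 1/2)$ and $y = (2/5, 3/5)$.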

[Nash ‘50/’51]: An equilibrium exists in every finite game.

  • The proof uses Kakutani's/Brouwer's fixed point theorem, and no constructive proof has been found in 70+ years.
  • The same is true for economic equilibria, where different goods are supplied, agents maximize utility, and no good is over-demanded.

Equilibrium:

A pair $(x, y)$ of randomized strategies such that no player has an incentive to deviate if the other does not:

$x^T R y \ge x'^T R y, \ \forall x'$ and $x^T C y \ge x^T C y', \ \forall y'$
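The two inequalities can be checked mechanically. Because $x^T R y$ is linear in $x$ and $x^T C y$ is linear in $y$, it is enough to compare against pure-strategy deviations. The function `is_equilibrium` below is an illustrative sketch (the name is an assumption), using exact fractions to avoid rounding issues.

```python
from fractions import Fraction

def is_equilibrium(R, C, x, y):
    """Check that neither player gains by deviating to any pure strategy.

    By linearity, no pure deviation being profitable implies that no mixed
    deviation is profitable either.
    """
    n, m = len(R), len(R[0])
    row_value = sum(R[i][j] * x[i] * y[j] for i in range(n) for j in range(m))
    col_value = sum(C[i][j] * x[i] * y[j] for i in range(n) for j in range(m))
    # Row player's payoff from each pure row i against y.
    row_ok = all(sum(R[i][j] * y[j] for j in range(m)) <= row_value
                 for i in range(n))
    # Column player's payoff from each pure column j against x.
    col_ok = all(sum(C[i][j] * x[i] for i in range(n)) <= col_value
                 for j in range(m))
    return row_ok and col_ok

half = Fraction(1, 2)
R = [[1, -1], [-1, 1]]
C = [[-1, 1], [1, -1]]
print(is_equilibrium(R, C, [half, half], [half, half]))  # uniform mixing
print(is_equilibrium(R, C, [1, 0], [half, half]))        # row commits to Left
```

The first call returns `True`; the second returns `False`, because the column player would profit by always kicking Right.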

Minimax Theory

Minimax Theorem [von Neumann '28]

Suppose $X$ and $Y$ are compact (closed and bounded) convex sets, and $f: X \times Y \to \mathbb{R}$ is a continuous function that is convex-concave, i.e., $f(\cdot, y)$ is convex for every fixed $y$, and $f(x, \cdot)$ is concave for every fixed $x$. Then:

$\min_{x \in X} \max_{y \in Y} f(x,y) = \max_{y \in Y} \min_{x \in X} f(x,y)$
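For orientation, one direction of this equality holds for any function $f$ for which the extrema exist, without any convexity assumption; only the reverse inequality needs the hypotheses of the theorem. A short derivation of the easy direction:

```latex
% For every x' \in X and y' \in Y:
\min_{x \in X} f(x, y') \;\le\; f(x', y') \;\le\; \max_{y \in Y} f(x', y)
% The left side does not depend on x', the right side does not depend on y'.
% Taking the max over y' on the left and the min over x' on the right gives
\max_{y \in Y} \min_{x \in X} f(x, y) \;\le\; \min_{x \in X} \max_{y \in Y} f(x, y)
```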

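Proof sketch: every two-player zero-sum game has a Nash equilibrium

$(R, C)$, where $R$ and $C$ are both $n \times m$ payoff matrices

$R + C = 0$

$X = \Delta_n = \{ x : \sum_i x_i = 1,\ x_i \ge 0 \}$

$Y = \Delta_m$

  • In a zero-sum game, take $f(x,y) = x^T C y$

    • i.e., how much the row player pays the column player
  • Then $(x^*, y^*)$ is an equilibrium, where

    • $x^* \in \arg\min_{x \in X} \max_{y \in Y} f(x,y)$ and $y^* \in \arg\max_{y \in Y} \min_{x \in X} f(x,y)$

      $(x^*)^T C y^* \ge \min_x x^T C y^* = \max_y \min_x x^T C y = \min_x \max_y x^T C y = \max_y (x^*)^T C y \ge (x^*)^T C y^*$, so every inequality in the chain holds with equality.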

Existence of Equilibrium in Zero-Sum Game [von Neumann’28]

In two-player zero-sum (R+C=0) games, equilibrium always exists.


Proof:

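Let $f(x,y) = x^T C y$ (the payoff of the column player). Since $f$ is bilinear, it is continuous and convex-concave, and $X = \Delta_n$, $Y = \Delta_m$ are compact convex sets, so $f$ satisfies the Minimax theorem. Let $x^* \in \arg\min_{x \in X} \max_{y \in Y} f(x,y)$ and $y^* \in \arg\max_{y \in Y} \min_{x \in X} f(x,y)$, so that

$f(x^*, y^*) = (x^*)^T C y^* = \min_{x \in X} \max_{y \in Y} f(x,y) = \max_{y \in Y} \min_{x \in X} f(x,y)$

Then,
$(x^*)^T C y^* = \max_{y \in Y} f(x^*, y) \ge (x^*)^T C y, \ \forall y$
$(x^*)^T R y^* = -(x^*)^T C y^* = -\min_{x \in X} f(x, y^*) \ge x^T R y^*, \ \forall x$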


Presidential Elections

Clinton \ Trump    Morality    Tax Cuts
Economy            +3, -3      -1, +1
Society            -2, +2      +1, -1

Suppose Clinton commits to strategy (x1,x2)

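Trump's expected payoff from each of his pure strategies is

$E[\text{"Morality"}] = -3x_1 + 2x_2$

$E[\text{"Tax Cuts"}] = x_1 - x_2$

Trump: best-responds and obtains $\max(-3x_1 + 2x_2,\ x_1 - x_2)$

Clinton: receives $-\max(-3x_1 + 2x_2,\ x_1 - x_2) = \min(3x_1 - 2x_2,\ -x_1 + x_2)$, so she wants $(x_1, x_2) \in \arg\max \min(3x_1 - 2x_2,\ -x_1 + x_2)$, which is a maximin problem.

If Clinton is forced to commit to $(x_1, x_2)$, she solves $\arg\max_{(x_1,x_2)} \min(3x_1 - 2x_2,\ -x_1 + x_2)$, i.e. $\arg\max_x \min_j (x^T R)_j$, which is the linear program:

  • max $z$
  • s.t. $3x_1 - 2x_2 \ge z$
  • $-x_1 + x_2 \ge z$
  • $x_1 + x_2 = 1$
  • $x_1, x_2 \ge 0$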

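    No matter what Trump does, Clinton can guarantee $1/7$ to herself by playing $(3/7, 4/7)$.

    No matter what Clinton does, Trump can guarantee $-1/7$ to himself by playing $(2/7, 5/7)$.

    I.e., $(3/7, 4/7)$ is a best response to $(2/7, 5/7)$ and vice versa.

    The LPs of the two players are in fact duals of each other, so the matching values follow from strong linear programming duality; the same conclusion can also be reached from the Minimax theorem.

    Method 2: start directly from the minimax problem.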
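The guaranteed values for this game can be reproduced without an LP solver. With only two pure strategies, the guaranteed payoff is a concave piecewise-linear function of $x_1$, so the optimum sits at an endpoint or where two lines cross. The function `maximin_two_rows` is a hypothetical helper written for this note, not a general LP method.

```python
from fractions import Fraction

def maximin_two_rows(A):
    """Maximize min_j (x1*A[0][j] + x2*A[1][j]) over x1 + x2 = 1, x >= 0.

    A is a 2 x m payoff matrix for the maximizing player. Candidate optima
    are x1 = 0, x1 = 1, and every crossing point of two payoff lines.
    """
    m = len(A[0])
    candidates = {Fraction(0), Fraction(1)}
    for j in range(m):
        for k in range(j + 1, m):
            # Payoff against column j as a function of x1:
            #   A[1][j] + x1 * (A[0][j] - A[1][j])
            slope_diff = (A[0][j] - A[1][j]) - (A[0][k] - A[1][k])
            if slope_diff != 0:
                t = Fraction(A[1][k] - A[1][j], slope_diff)
                if 0 <= t <= 1:
                    candidates.add(t)

    def guaranteed(x1):
        return min(x1 * A[0][j] + (1 - x1) * A[1][j] for j in range(m))

    best = max(candidates, key=guaranteed)
    return (best, 1 - best), guaranteed(best)

clinton = [[3, -1], [-2, 1]]   # Clinton's payoffs: rows Economy, Society
trump = [[-3, 2], [1, -1]]     # Trump's payoffs: rows Morality, Tax Cuts
print(maximin_two_rows(clinton))
print(maximin_two_rows(trump))
```

For Clinton this yields the strategy $(3/7, 4/7)$ with value $1/7$, and for Trump $(2/7, 5/7)$ with value $-1/7$; the two values sum to zero, as expected in a zero-sum game.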
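To make the duality concrete, the two programs for the election game can be written side by side. Here $y = (y_1, y_2)$ denotes Trump's mixed strategy over (Morality, Tax Cuts) and $w$ is an auxiliary variable introduced for this note; both programs have optimal value $1/7$.

```latex
% Clinton (primal): maximize her guaranteed payoff z
\max_{x, z} \; z
\quad \text{s.t.} \quad
3x_1 - 2x_2 \ge z, \;\;
-x_1 + x_2 \ge z, \;\;
x_1 + x_2 = 1, \;\;
x_1, x_2 \ge 0

% Trump (dual): minimize the most Clinton can obtain, w
\min_{y, w} \; w
\quad \text{s.t.} \quad
3y_1 - y_2 \le w, \;\;
-2y_1 + y_2 \le w, \;\;
y_1 + y_2 = 1, \;\;
y_1, y_2 \ge 0

% Optimal solutions: x = (3/7, 4/7), z = 1/7 and y = (2/7, 5/7), w = 1/7.
```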
