Strategic Form of a 2-person game: The strategic form, or normal form, of a two-person zero-sum game is given by a triplet $(X, Y, A)$, where $X$ is a nonempty set of strategies of Player I, $Y$ is a nonempty set of strategies of Player II, and $A$ is a real-valued payoff function defined on $X \times Y$.
The interpretation is as follows. Simultaneously, Player I chooses $x \in X$ and Player II chooses $y \in Y$, each unaware of the choice of the other. Then their choices are made known and I wins the amount $A(x, y)$ from II.

Matrix Games: A finite two-person zero-sum game in strategic form, $(X, Y, A)$, is sometimes called a matrix game because the essential information of the game, the payoff function $A$, can be represented by a matrix.

If $X = \{x_1, \ldots, x_m\}$ and $Y = \{y_1, \ldots, y_n\}$, then by the game matrix or payoff matrix we mean the matrix $A = (a_{ij})$ where $a_{ij} = A(x_i, y_j)$.
Removing Dominated Strategies: We say the $i$th row of a matrix $A = (a_{ij})$ dominates the $k$th row if $a_{ij} \geq a_{kj}$ for all $j$. We say the $i$th row of $A$ strictly dominates the $k$th row if $a_{ij} > a_{kj}$ for all $j$. Similarly, the $j$th column of $A$ dominates (strictly dominates) the $k$th column if $a_{ij} \leq a_{ik}$ (resp. $a_{ij} < a_{ik}$) for all $i$.
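As a minimal illustration (not part of the original notes), the following Python sketch performs iterated removal of strictly dominated rows and columns; the function name and the example matrix are assumptions.

```python
import numpy as np

def remove_strictly_dominated(A):
    """Iteratively delete strictly dominated rows (bad for Player I) and
    strictly dominated columns (bad for Player II) of a payoff matrix A.
    Returns the reduced matrix and the surviving row/column indices."""
    A = np.asarray(A, dtype=float)
    rows, cols = list(range(A.shape[0])), list(range(A.shape[1]))
    changed = True
    while changed:
        changed = False
        # Row k is strictly dominated if some other row i has a_ij > a_kj for all remaining j.
        for k in list(rows):
            if any(i != k and np.all(A[i, cols] > A[k, cols]) for i in rows):
                rows.remove(k); changed = True
        # Column k is strictly dominated if some other column j has a_ij < a_ik for all remaining i.
        for k in list(cols):
            if any(j != k and np.all(A[rows, j] < A[rows, k]) for j in cols):
                cols.remove(k); changed = True
    return A[np.ix_(rows, cols)], rows, cols

# Hypothetical example: row 2 is strictly dominated by row 1; afterwards column 1 is dominated.
A = [[4, 1, 3],
     [2, 0, 1],
     [3, 2, 2]]
print(remove_strictly_dominated(A))   # reduced to rows {1,3} and columns {2,3} (1-indexed)
```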
Best Response: Let $(X, Y, A)$ be a matrix game. Let $x \in X$ be a strategy of Player I. Then $y \in Y$ is called a Best Response to $x$ if $A(x, y) \leq A(x, y')$ for every $y' \in Y$. Similarly, let $y \in Y$ be a strategy of Player II. Then $x \in X$ is called a Best Response to $y$ if $A(x, y) \geq A(x', y)$ for every $x' \in X$.
We may apply the following two principles to analyze the game.
Saddle points: If some entry $a_{ij}$ of the matrix $A$ has the property that (1) $a_{ij}$ is the minimum of the $i$th row, and (2) $a_{ij}$ is the maximum of the $j$th column, then we say $a_{ij}$ is a saddle point. If $a_{ij}$ is a saddle point, then Player I can win at least $a_{ij}$ by choosing row $i$, and Player II can keep her loss to at most $a_{ij}$ by choosing column $j$.
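A small Python sketch (illustrative, not from the notes) that locates all pure-strategy saddle points of a payoff matrix; the example matrix is an assumption.

```python
import numpy as np

def saddle_points(A):
    """Return all (i, j) such that A[i, j] is the minimum of row i
    and the maximum of column j (a pure-strategy saddle point)."""
    A = np.asarray(A, dtype=float)
    row_min = A.min(axis=1)            # worst case of each row for Player I
    col_max = A.max(axis=0)            # worst case of each column for Player II
    return [(i, j)
            for i in range(A.shape[0]) for j in range(A.shape[1])
            if A[i, j] == row_min[i] and A[i, j] == col_max[j]]

# Hypothetical example: the entry 3 at position (1, 1) is a saddle point.
print(saddle_points([[4, 2, 8],
                     [5, 3, 6],
                     [2, 1, 7]]))      # [(1, 1)]
```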
Maximin Principle: find the "risk" (worst-case payoff) of each strategy, and then choose the strategy with minimum risk, called the safety strategy.
Example: Suppose $A=\begin{pmatrix}-2 & 3\\ 3 & -4 \end{pmatrix}$. If Player I uses Row 1, the worst payoff is $-2$. If Player I uses Row 2, the worst payoff is $-4$. Therefore, the best of the worst cases is $-2$: $\max_i\min_j A(x_i, y_j)=-2$.
Mixed Strategies: Consider a finite 2-person zero-sum game, $(X, Y, A)$, with $m\times n$ matrix $A$. Let us take the strategy space $X$ to be the first $m$ integers, $X = \{1, 2, \ldots, m\}$, and similarly $Y = \{1, 2, \ldots, n\}$. A mixed strategy for Player I may be represented by a column vector $(p_1, p_2, \ldots, p_m)^T$ of probabilities that add up to 1. Similarly, a mixed strategy for Player II is an $n$-tuple $q = (q_1, q_2, \ldots, q_n)^T$.

The sets of mixed strategies of Players I and II will be denoted respectively by $X^*$ and $Y^*$:

$$X^* = \{p = (p_1, \ldots, p_m)^T \mid p_i \geq 0 \text{ for } i = 1, \ldots, m \text{ and } p_1 + \cdots + p_m = 1\}$$

$$Y^* = \{q = (q_1, \ldots, q_n)^T \mid q_j \geq 0 \text{ for } j = 1, \ldots, n \text{ and } q_1 + \cdots + q_n = 1\}$$

$p = (p_1, \ldots, p_m)^T$ means playing Row 1 with probability $p_1$, Row 2 with probability $p_2$, ..., and Row $m$ with probability $p_m$.

The $m$-dimensional unit vector $e_k \in X^*$, with a one in the $k$th component and zeros elsewhere, may be identified with the pure strategy of choosing row $k$.

We could, if we like, consider the game $(X, Y, A)$ in which the players are allowed to use mixed strategies as a new game $(X^*, Y^*, A)$, where $A(p, q) = p^T A q = \sum_i\sum_j p_i a_{ij} q_j$.
Utility Assumption: The basic premise of utility theory is that one should evaluate a payoff by its utility to the player rather than by its numerical monetary value.
Safety Strategies: For each $p \in X^*$, the worst payoff is $\min_{q\in Y^*} p^T A q$. The minimum is achieved at a pure strategy of Player II.
Theorem 1: There exists $p' \in X^*$ such that
$$V'=\min_{q\in Y^*}{p'}^T A q=\max_{p\in X^*} \min_{q\in Y^*}p^TAq$$
$p'$ is called the safety strategy of Player I, and the value $V'$ is called the lower value of the game.
The minimum is achieved by a pure strategy of Player II.
If Player I uses his safety strategy, then he is guaranteed to win at least $V'$ (on average) no matter what Player II does.
Theorem 2: There exists $q' \in Y^*$ such that
$$V''=\max_{p\in X^*}p^T A q'=\min_{q\in Y^*} \max_{p\in X^*}p^TAq$$
$q'$ is called the safety strategy of Player II, and the value $V''$ is called the upper value of the game.
If Player II uses his safety strategy, then he is guaranteed to lose at most $V''$ no matter what Player I does.
Theorem: $V'\leq V''$.
Minimax Theorem: $V' = V''$, and the common value $V=V'=V''$ is called the value of the game.
Theorem: For a $2\times 2$ game solved graphically, the two payoff lines intersect at some $p\in[0,1]$ iff the game has no saddle point.
Solving $2 \times n$ and $m \times 2$ games:
Example: Reduction by removing dominated strategies.
$$A=\begin{pmatrix}0&4&6\\ 5 &7&4\\9&6&3 \end{pmatrix}$$
We can eliminate Column 2, since it is dominated by a mixture of Columns 1 and 3:
$$\frac{1}{3}\begin{pmatrix}0\\ 5\\ 9\end{pmatrix}+\frac{2}{3}\begin{pmatrix}6\\ 4\\ 3\end{pmatrix}=\begin{pmatrix}4\\ 13/3\\ 5\end{pmatrix}\leq \begin{pmatrix}4\\ 7\\ 6\end{pmatrix}$$
Theorem: Maximin Principle $\Leftrightarrow$ Equilibrium Principle.
Lemma: Let $p \in X^*$ and let $J_p=\{j: \text{Column } j \text{ is a BR to } p\}$ be the set of best-response columns to $p$. Then $q=(q_1,\ldots,q_n)^T$ is a BR to $p$ iff $q_j>0$ implies $j \in J_p$.
The Lemma says that a best response of Player II to $p$ uses only the best-response columns to $p$. Similarly, a best response of Player I to $q$ uses only the best-response rows to $q$.
Theorem (Principle of Indifference): Let $p=(p_1,\ldots,p_m)^T \in X^*$ and $q=(q_1,\ldots,q_n)^T \in Y^*$ be optimal (safety) strategies of a game with value $V$. Then $p_i > 0$ implies $\sum_j a_{ij}q_j = V$, and $q_j > 0$ implies $\sum_i p_i a_{ij} = V$.
The Principle of Indifference gives the following recipe to find all the best response strategies to a given strategy :
Equilibrium Principle is equivalent to Maximin Principle
Theorem: Let $\langle p_1, q_1\rangle$ and $\langle p_2, q_2\rangle$ be two equilibria (optimal strategy pairs) of the game. Then $\langle p_1, q_2\rangle$ and $\langle p_2, q_1\rangle$ are also equilibria.
Equalizing strategy: a strategy whose expected payoff is the same against every pure strategy of the opponent. To verify that a strategy is equalizing, we only need to check that it equalizes on the opponent's pure strategies.
Magic Square Games: A magic square game is a matrix game whose payoff matrix has all row sums and all column sums equal. For a $4\times 4$ magic square game, $\langle\frac{1}{4},\frac{1}{4},\frac{1}{4},\frac{1}{4}\rangle$ is an equalizing strategy for Player I, and $\langle\frac{1}{4},\frac{1}{4},\frac{1}{4},\frac{1}{4}\rangle$ is an equalizing strategy for Player II.
Diagonal Games: A diagonal game is a game whose payoff matrix is a diagonal matrix.
Solution for diagonal games: if the diagonal entries $d_1,\ldots,d_n$ are all positive, the value is $V=\left(\sum_i 1/d_i\right)^{-1}$ and the equalizing strategy $p_i=V/d_i$ is optimal for both players.
Finding safety (optimal) strategies when the value is given: $\min_q p^TAq = V$ can be expressed as a system of $n$ inequalities, by noting that for a fixed $p$ the minimum of $p^TAq$ is attained at a pure strategy $q$.
Theorem: Let $A=(a_{ij})$ be a matrix game with value $V$. Then $p$ is a safety strategy for Player I iff the following $n$ inequalities are satisfied:
$$p_1a_{11}+\cdots+p_ma_{m1}\geq V,\;\;\ldots,\;\;p_1a_{1n}+\cdots+p_ma_{mn}\geq V$$
Symmetric Games: A game is symmetric if the rules do not distinguish between the players. Both players have the same options, and the payoff when Player I uses $i$ and Player II uses $j$ is the negative of the payoff when Player I uses $j$ and Player II uses $i$. Thus the payoff matrix can be arranged to be skew-symmetric, i.e. $A= -A^T$.

Theorem: A finite symmetric game has value zero. Any strategy that is optimal (safety) for one player is also optimal for the other.
Theorem: Let the value of the game be $V$, let $P$ be the set of optimal strategies of Player I, and let $Q$ be the set of optimal strategies of Player II. Then $P$ and $Q$ are compact convex polyhedra.

Let $C$ be a convex set. A point $x \in C$ is called an extreme point if and only if one cannot find $y, z \in C$, distinct from $x$, such that $x=(y + z)/2$. Note that the extreme points of a compact convex polyhedron are its vertices.
Essential submatrix : An essential submatrix of a matrix game is characterized by the following.
The Primal Problem of a linear programming problem is in the following form.
Choose $x_1,\ldots,x_m$ to minimize $c_1x_1 +\cdots+ c_mx_m$
subject to the constraints
$$a_{11}x_1 + \cdots + a_{m1}x_m \geq b_1,\;\;\ldots,\;\;a_{1n}x_1 + \cdots + a_{mn}x_m \geq b_n$$
and $x_i\geq 0$ for $i=1,\ldots,m$.
The Dual Problem to the primal problem is to choose $y_1,\ldots,y_n$ to maximize $b_1y_1 +\cdots+ b_ny_n$
subject to the constraints
$$a_{11}y_1 + \cdots +a_{1n}y_n \leq c_1,\;\;\ldots,\;\;a_{m1}y_1 + \cdots +a_{mn}y_n \leq c_m$$
and $y_j\geq 0$ for $j=1,\ldots,n$.
Theorem : (Duality Theorem of LP) The optimal value of the Primal Problem and the optimal value of the Dual Problem are equal
The Duality Theorem implies the Minimax Theorem in game theory.
Solution to the Primal Problem in a 2-Person Zero-Sum Game:
Choose $p_1,\ldots,p_m$ to maximize $v$
subject to the constraints
$$a_{11}p_1 + \cdots + a_{m1}p_m \geq v,\;\;\ldots,\;\;a_{1n}p_1 + \cdots + a_{mn}p_m \geq v,$$
$$p_1+\cdots+p_m=1,$$
and $p_i\geq 0$ for $i=1,\ldots,m$.
Assuming $v>0$ (which can always be arranged by adding a large enough constant to every entry of $A$), let $x_i=\frac{p_i}{v}$. Then the problem becomes:
Choose $x_1,\ldots,x_m$ to minimize $x_1+\cdots+x_m\;(=1/v)$
subject to the constraints
$$a_{11}x_1 + \cdots + a_{m1}x_m \geq 1,\;\;\ldots,\;\;a_{1n}x_1 + \cdots + a_{mn}x_m \geq 1,$$
and $x_i\geq 0$ for $i=1,\ldots,m$.
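A minimal Python sketch (not from the notes) of this LP approach using scipy.optimize.linprog; the shift constant used to force a positive value and the example matrix are assumptions.

```python
import numpy as np
from scipy.optimize import linprog

def solve_matrix_game(A):
    """Solve the zero-sum matrix game A for Player I via the LP above:
    minimize sum(x) subject to A^T x >= 1, x >= 0, where x_i = p_i / v."""
    A = np.asarray(A, dtype=float)
    shift = 1.0 - A.min() if A.min() <= 0 else 0.0   # make the value strictly positive
    B = A + shift
    m, n = B.shape
    # linprog minimizes c^T x subject to A_ub x <= b_ub; write B^T x >= 1 as -B^T x <= -1.
    res = linprog(c=np.ones(m), A_ub=-B.T, b_ub=-np.ones(n), bounds=[(0, None)] * m)
    v = 1.0 / res.x.sum()      # value of the shifted game
    p = res.x * v              # optimal (safety) mixed strategy for Player I
    return v - shift, p

# The 2x2 example used earlier: value 1/12, optimal p = (7/12, 5/12).
value, p = solve_matrix_game([[-2, 3], [3, -4]])
print(value, p)
```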
In the extensive form, games are sequential, interactive processes which move from one position to another in response to the wills of the players or the will of chance.
Games in Extensive Form : Three new concepts make their appearance in the extensive form of a game: the game tree, chance moves, and information sets
The Game Tree: The extensive form of a game is modeled using a directed graph, as seen in combinatorial games. The vertices represent positions of the game. The followers, $F(x)$, of a position $x$ are those positions that can be reached from $x$ in one move.
A path from a vertex $t_0$ to a vertex $t_1$ then represents the whole history of play in the game moving from position $t_0$ to $t_1$.
The Kuhn tree of a game: When a tree is used to define a game, the vertices are interpreted as positions and the edges as moves. Each nonterminal position is assigned either to the player responsible for choosing the next move or to chance.
Chance Moves : Many games involve chance moves. It is assumed that the players are aware of the probabilities of the various outcomes resulting from a chance move. To incorporate chance moves in the Kuhn tree, we introduce a new player called Chance. The moves from Chance carry different probabilities. The probabilities assigned to the edges leading out of each chance vertex must be displayed.
Play starts at the initial vertex and continues along one of the paths eventually ending in one of the terminal vertices. Each directed path from the root to terminal vertex describes a possible course of play.
Information: Another important aspect we must consider in studying the extensive form of games is the amount of information available to the players about past moves of the game. When a player choosing a move is uninformed about some of the previous moves (concealed moves, simultaneous moves, etc.), we draw little "balloons" around the set of vertices which the player making the choice cannot distinguish. These are called information sets, and the player in question must make the same choice for all vertices in the set (the player has the same information at all vertices in the same balloon).

Edges issuing from vertices in the same information set must be labeled the same, for example {paper, scissors, rock}.
In summary, the basic elements of a Kuhn tree are: the game tree, the assignment of each nonterminal vertex to a player or to chance (with the probabilities of chance moves displayed), the information sets, and the payoffs at the terminal vertices.
Games of Perfect Information : A game of perfect information is a game in extensive form in which each
information set of every player contains a single vertex.
Games of perfect recall : Games in which players remember all past information they once knew and all past moves they made are called games of perfect recall.
Games of complete information : Games in which both players know the rules of the game, that is, in which both players know the Kuhn tree, are called games of complete information.
Pure strategy : A pure strategy is a player’s complete plan for playing the game. It should cover every
contingency. A pure strategy for a Player is a rule that tells him exactly what move to make in each of his information sets. It should specify a particular edge leading out from each information set
If the sets of edges coming out of the information sets are $\{A,B\}, \{C,D\}, \{E,F\}$, the set of pure strategies is $\{A,B\}\times\{C,D\}\times\{E,F\}=\{ACE, ACF, ADE, ADF, BCE, BCF, BDE, BDF\}$.
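A tiny Python sketch (illustrative only) that enumerates these pure strategies as a Cartesian product of the per-information-set choices.

```python
from itertools import product

# One tuple of edge labels per information set (labels taken from the example above).
information_sets = [("A", "B"), ("C", "D"), ("E", "F")]

pure_strategies = ["".join(choice) for choice in product(*information_sets)]
print(pure_strategies)
# ['ACE', 'ACF', 'ADE', 'ADF', 'BCE', 'BCF', 'BDE', 'BDF']
```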
Reduced pure strategy : Specification of choices at all information sets except those that are eliminated by the previous moves.
Random payoffs : The actual outcome of the game for given pure strategies of the players depends on the chance moves selected, and is therefore a random quantity. We represent random payoffs by their expected values.
Game in strategic form (normal form) :
Set of reduced strategies for I: $\{AE, AF, BC, BD\}$
Set of reduced strategies for II: $\{ac, ad, bc, bd\}$
Strategic Form :
| | ac | ad | bc | bd |
|---|---|---|---|---|
| AE | (10,0) | | | |
| AF | (0,20) | | | |
| BC | (-2.3,2.2) | | | |
| BD | (-0.5,-2.3) | | | |
Strategic Equilibrium: A vector of pure strategy choices $(x_1, x_2, \ldots, x_n)$ with $x_i \in X_i$ for $i = 1,\ldots,n$ is said to be a pure strategic equilibrium, or PSE for short, if for all $i = 1, 2,\ldots,n$ and for all $x \in X_i$,
$$u_i(x_1, \ldots, x_{i-1}, x_i, x_{i+1}, \ldots, x_n) \geq u_i (x_1, \ldots, x_{i-1}, x, x_{i+1}, \ldots, x_{n})$$
This equation says that if the players other than player $i$ use their indicated strategies, then the best player $i$ can do is to use $x_i$. Such a pure strategy choice of player $i$ is called a best response to the strategy choices of the other players.
Labeling Algorithm for Finding All PSE's (2-person): in each column, label the entries where Player I's payoff is largest in that column; in each row, label the entries where Player II's payoff is largest in that row. A cell carrying both labels is a PSE.
Bad News : Many games have no PSE’s
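A small Python sketch (illustrative) of this best-response labeling check for a bimatrix game given by two payoff matrices; the example payoffs are those of the Crossing Game discussed later.

```python
import numpy as np

def pure_strategic_equilibria(A, B):
    """Return all (i, j) such that A[i, j] is maximal in column j (row i is a
    best response for Player I) and B[i, j] is maximal in row i (column j is
    a best response for Player II)."""
    A, B = np.asarray(A, float), np.asarray(B, float)
    return [(i, j)
            for i in range(A.shape[0]) for j in range(A.shape[1])
            if A[i, j] == A[:, j].max() and B[i, j] == B[i, :].max()]

A = [[-10, 5], [0, -1]]   # Player I's payoffs (rows Cross/Yield, columns Cross/Yield)
B = [[-10, 0], [5, -1]]   # Player II's payoffs
print(pure_strategic_equilibria(A, B))   # [(0, 1), (1, 0)]: (Cross,Yield) and (Yield,Cross)
```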
Method of Backward Induction: Start from any terminal vertex and trace back to the vertex leading to it. The player at this vertex will discard the edges with lower payoff. Then treat this vertex as a terminal vertex and repeat the process. In the end we obtain a path from the root to a terminal vertex.
Theorem : The path obtained by the method of backward induction defines a PSE.
If a strategic Form comes from an extensive form with perfect information then there exists at least one PSE by the method of Backward Induction.
Theorem : The PSE obtained by the method of backward induction satisfies stronger properties so that it is called a perfect pure strategy equilibrium or PPSE.
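A compact Python sketch (an illustration under assumed conventions, not the notes' procedure verbatim) of backward induction on a perfect-information game tree; the nested-dict tree encoding and the example payoffs are made up.

```python
def backward_induction(node):
    """Return (payoffs, path) for a perfect-information game tree.
    A leaf is a tuple of payoffs; an internal node is
    {'player': i, 'moves': {edge_label: subtree}}."""
    if isinstance(node, tuple):                 # terminal vertex
        return node, []
    i = node['player']
    best = None
    for label, child in node['moves'].items():
        payoffs, path = backward_induction(child)
        if best is None or payoffs[i] > best[0][i]:   # player i keeps the better edge
            best = (payoffs, [label] + path)
    return best

# Hypothetical two-stage game: Player 0 moves first, then Player 1.
tree = {'player': 0, 'moves': {
    'L': {'player': 1, 'moves': {'l': (2, 1), 'r': (0, 0)}},
    'R': {'player': 1, 'moves': {'l': (3, 0), 'r': (1, 2)}},
}}
print(backward_induction(tree))   # ((2, 1), ['L', 'l'])
```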
Subgame: A subgame of a game presented in extensive form is obtained by taking a vertex in the Kuhn tree together with all the edges and paths originating from this vertex.
Definition: A PSE of a game in extensive form is called a Perfect Pure Strategy Equilibrium (PPSE) if it is a PSE for all subgames.
Subgame perfection means that when making choices, a player looks forward and assumes that the choices that will subsequently be made, by himself and by others, will be rational. Threats which would be irrational to carry through are ruled out. It is precisely this kind of forward-looking rationality that is most suited to economic applications.
PSE allow players to make noncredible threats provided they never have to carry them out. Decisions on the equilibrium path are driven in part by what the players expect will happen off the equilibrium path.
Behavior Strategies : A behavior strategy is obtained if the Player makes a randomized choice independently at each of his/her information sets.
Example: Player I has 3 information sets. The sets of edges coming out of the information sets are $P=\{A,B\}, Q=\{E,F\}, R=\{C,D\}$.
A behavior strategy $(p, q, r)$ assigns probability $pqr$ to ACE, $p(1-q)r$ to ADE, $pq(1-r)$ to ACF, and $p(1-q)(1-r)$ to ADF. This is a mixed strategy for Player I.
Each behavior strategy induces a probability distribution on the set of terminal vertices.
Each mixed strategy induces a probability distribution on the set of terminal vertices.
Dimension of behavior strategies: Suppose a player has $m$ information sets with $k_i$ choices at the $i$th information set. Then the dimension of the set of behavior strategies for this player is $(k_1-1) + (k_2-1) +\cdots+ (k_m-1)$. Note that the dimension of the set of mixed strategies is $k_1 k_2\cdots k_m -1$.
Question: Can each mixed strategy be represented by an "equivalent" behavior strategy?
From the consideration of dimension, the answer should be negative without additional assumptions.
Theorem (Harold Kuhn) : For a game of perfect recall, mixed strategies are equivalent to behavior strategies.
General-Sum Strategic Form Games: The normal or strategic form of a two-person game is given by two sets $X$ and $Y$ of pure strategies of the players, and two real-valued functions $u_1(x, y)$ and $u_2(x, y)$ defined on $X \times Y$, representing the payoffs to the two players. If Player I chooses $x \in X$ and Player II chooses $y \in Y$, then Player I receives $u_1(x, y)$ and Player II receives $u_2(x, y)$.
The Maximin Principle does not apply to bimatrix games!
The theory is generally divided into two branches, the noncooperative theory and the cooperative theory.
Safety Levels: In a bimatrix game with $m\times n$ matrices $A$ and $B$, Player I can guarantee winning on average at least $v_1 =\max_p \min_j (p_1 a_{1j}+\cdots+p_m a_{mj})= \mathrm{Val}(A)$. This is called the safety level of Player I. A strategy $p$ that achieves the maximum is called a maxmin strategy for Player I.
Noncooperative Games, Strategic Equilibrium: A vector of mixed strategy choices (also called a strategy profile) $(p_1, p_2, \ldots, p_n)$ with $p_i \in X^*_i$ for $i =1,\ldots,n$ is said to be a strategic equilibrium, or SE for short, if for all $i = 1, 2,\ldots,n$ and for all $p \in X^*_i$,
$$u_i(p_1 , \ldots, p_{i-1}, p_i , p_{i+1}, \ldots, p_n ) \geq u_i(p_1 , \ldots, p_{i-1}, p, p_{i+1}, \ldots, p_n )\qquad (*)$$
Any mixed strategy $p_i$ that satisfies $(*)$ for all $p \in X^*_i$ is a best response of player $i$ to the mixed strategies of the other players.
Nash equilibria (Nash's Theorem): Every finite n-person game in strategic form has at least one strategic equilibrium; strategic equilibria are also called Nash equilibria.
Theorem: $\langle p, q\rangle$ is a SE iff $p$ is a best response to $q$ and $q$ is a best response to $p$.
Labeling algorithm to check for SE's: Let $[A,B]$ be a bimatrix game. Given a mixed strategy $p=(p_1,\ldots,p_m)$ for Player I and a mixed strategy $q=(q_1,\ldots,q_n)$ for Player II, there is a simple labeling algorithm to determine whether $\langle p, q\rangle$ is a SE: label with "+" each row $i$ with $p_i>0$ and each column $j$ with $q_j>0$; label with "BR" each row that is a best response to $q$ and each column that is a best response to $p$.
$\langle p,q\rangle$ is a SE whenever every "+" label is accompanied by a "BR" label.
Theorem: $\langle p, q\rangle$ is a SE iff $p, q, \alpha, \beta$ solve the following optimization problem (where $\Pi_k=(1,1,\ldots,1)^T$ denotes the $k$-vector of ones):
$$\max\; \left(p^T(A+B)q - \alpha - \beta\right)$$
subject to the constraints
$$Aq - \alpha \Pi_m \leq 0,\qquad p^T B - \beta\Pi_n^T\leq 0,$$
$$p^T \Pi_m-1=0,\qquad q^T \Pi_n -1=0,\qquad 0\leq p,\; 0\leq q.$$
Remark: The optimal value is 0.
Tetraskelion Method to find all SE's for $2\times 2$ games:
Duopoly : There are two firms sending out fishing boats every day to catch fish to sell in the market the same day. The day’s catch of fish is brought into port by 9:00 am and must be sold the same day. Each day is taken to be entirely separate in the sense that the two firms concern themselves with the profit of only the current day and engage in no longer-term planning at all. When they set sail, each has already decided on the catch to bring back that day, and neither knows what the other has decided. Perhaps each has an educated guess, but neither has spies in the other’s camp or any source of factual knowledge of the other’s commitments. Each of them, however, does know both cost functions, as well as the market demand function which is a single market price because the catches of the two firms are identical. The decision variables of the firms are their output levels.
Cournot (1838): Two firms produce the same product. Cost function for Firm 1: $C_1(q_1)=q_1$. Cost function for Firm 2: $C_2(q_2)=q_2$. Here $q_1, q_2$ are the quantities produced. Price function: $P(q)=25-q$, where $q$ is the total amount in the market.

Duopoly: Firm 1 and Firm 2 use quantity as their strategy. Profit to Firm 1: $P_1(q_1)=(25-q)q_1-q_1$; profit to Firm 2: $P_2(q_2)=(25-q)q_2-q_2$, where $q=q_1+q_2$. Firm 1 chooses $q_1$ to maximize its profit, and Firm 2 chooses $q_2$ to maximize its profit.

Stackelberg Leader and Follower: Suppose Firm 1 moves first and announces the quantity $q_1$ (its strategy) that it will produce. Knowing this, Firm 2 chooses $q_2$ to maximize its profit (as a BR): $q_2= \frac{1}{2}(24-q_1)$. Firm 1 knows this, so its profit is $(25-q)q_1-q_1$, where $q=q_1 + \frac{1}{2}(24-q_1)$.
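A short worked derivation (not spelled out in the notes) using the cost and price functions above: in the Cournot case, Firm $i$ maximizes $(24-q_1-q_2)q_i$, giving the best responses $q_1=\frac{1}{2}(24-q_2)$ and $q_2=\frac{1}{2}(24-q_1)$; solving simultaneously gives $q_1=q_2=8$, price $25-16=9$, and profit $(9-1)\cdot 8=64$ for each firm. In the Stackelberg case, Firm 1 maximizes $(12-\frac{1}{2}q_1)q_1$, giving $q_1=12$, $q_2=6$, price $7$, and profits $(7-1)\cdot 12 = 72$ for the leader and $(7-1)\cdot 6 = 36$ for the follower.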
Crossing Game :
| | Cross | Yield |
|---|---|---|
| Cross | (-10,-10) | (5,0) |
| Yield | (0,5) | (-1,-1) |
A traffic light might be installed. The traffic light is merely a randomization device which, with certain probabilities, tells each of the two drivers whether to Cross or to Yield. This is not much different from a pair of mixed strategies, except that now the two players' probability distribution on strategy pairs can be correlated. For example, the traffic light can be:
| | Cross | Yield |
|---|---|---|
| Cross | 0 | 0.55 |
| Yield | 0.40 | 0.05 |
Will the players comply or will they deviate from the traffic light’s instruction? If the traffic light is operated in a way that
both players see that it is in their best interest to comply then we have an EQUILIBRIUM! We will call this a Correlated
Equilibrium.
Suppose Player I knows that Player II will comply with the traffic light instructions. When Player I sees the signal to Cross, he will not deviate because he knows that (Cross, Cross) occurs with probability 0. When Player I sees the signal to Yield, he can compute the conditional probability of the instructions to Player II given that his instruction is Yield. Then,
$$P(\text{II to Cross}\mid \text{I to Yield})= \frac{P(\text{II to Cross},\ \text{I to Yield})}{P(\text{I to Yield})}=\frac{0.4}{0.45}=\frac{8}{9},$$
$$P(\text{II to Yield}\mid \text{I to Yield})= \frac{P(\text{II to Yield},\ \text{I to Yield})}{P(\text{I to Yield})}=\frac{0.05}{0.45}=\frac{1}{9}.$$
Then the expected payoff to Player I if he complies with the instruction to Yield is $0\cdot\frac{8}{9} + (-1)\cdot\frac{1}{9}= -\frac{1}{9}$. The expected payoff to Player I if he deviates from the instruction to Yield is $-10\cdot\frac{8}{9} + 5\cdot\frac{1}{9}=-\frac{75}{9}$. This is far inferior to his payoff if he complies with the instruction. The same holds for Player II.

Therefore, it is in the best interest of both players to comply with the instruction of the traffic light. This probability distribution over the outcomes is then called a Correlated Equilibrium.
Correlated Equilibrium: In general, let $[A, B]$ be an $m\times n$ bimatrix game where $A=(a_{ij}), B=(b_{ij})$. A probability distribution on the outcomes of this game is given by numbers $p_{ij}\geq 0$ with $\sum_{i,j}p_{ij}=1$: the instruction (Row $i$, Column $j$) is given to the players with probability $p_{ij}$. This probability distribution is called a Correlated Equilibrium (CE) if it is in the players' best interest to comply with the instruction.
$$P(\text{Column } j \text{ for II}\mid \text{Row } i \text{ for I})= \frac{P(\text{Column } j \text{ for II},\ \text{Row } i \text{ for I})}{P(\text{Row } i \text{ for I})} =\frac{p_{ij}}{\sum_k p_{ik}}.$$
For Player I to comply with the instruction to play Row $i$, the payoff must be at least as good as deviating and playing Row $r$ instead. This means
$$\sum_j \frac{p_{ij}}{\sum_k p_{ik}}a_{ij} \geq \sum_j \frac{p_{ij}}{\sum_k p_{ik}}a_{rj}\qquad\forall r \in\{1,2,\ldots,m\}.$$
Equivalently, and similarly for Player II,
$$\sum_j p_{ij} (a_{ij} - a_{rj})\geq 0 \quad\forall i,\; r=1,\ldots,m,$$
$$\sum_i p_{ij} (b_{ij} - b_{is})\geq 0 \quad\forall j,\; s=1,\ldots,n.$$
For an $m\times n$ game, there are $m(m-1) +n(n-1)$ inequalities characterizing a CE.
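A small Python sketch (illustrative, not from the notes) that checks these $m(m-1)+n(n-1)$ inequalities for a given distribution; the tolerance and the traffic-light example are assumptions.

```python
import numpy as np

def is_correlated_equilibrium(p, A, B, tol=1e-9):
    """Check the CE inequalities: no player gains by deviating from any instruction."""
    p, A, B = (np.asarray(x, dtype=float) for x in (p, A, B))
    m, n = A.shape
    for i in range(m):
        for r in range(m):          # Player I: comply with Row i rather than switch to Row r
            if np.dot(p[i, :], A[i, :] - A[r, :]) < -tol:
                return False
    for j in range(n):
        for s in range(n):          # Player II: comply with Column j rather than switch to Column s
            if np.dot(p[:, j], B[:, j] - B[:, s]) < -tol:
                return False
    return True

# Traffic-light distribution for the Crossing Game above.
A = [[-10, 5], [0, -1]]           # Player I's payoffs
B = [[-10, 0], [5, -1]]           # Player II's payoffs
p = [[0.0, 0.55], [0.40, 0.05]]   # probability of instruction (Row i, Column j)
print(is_correlated_equilibrium(p, A, B))   # True
```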
Proposition: Any SE is a CE (correlated equilibrium).
Proposition: The convex combination of two CE’s is a CE.
The corresponding payoff of a CE may be outside the convex hull of the corresponding payoff points of SEs.
Proposition: Let [A, -A] be a 2-person 0-sum game with value V. For any CE of [A, -A], the payoff vector is (V, -V).
Computation of CE: Find $(p_{ij})$ such that
$$\sum_j p_{ij} (a_{ij} - a_{rj})\geq 0\quad\forall i,\; r=1,\ldots,m,$$
$$\sum_i p_{ij} (b_{ij} - b_{is})\geq 0\quad\forall j,\; s=1,\ldots,n,$$
$$p_{ij}\geq 0, \quad\sum_{i,j} p_{ij} =1.$$
We can represent the above inequalities as finding an optimal strategy $(p_{ij})$ of the row chooser of a certain 2-person 0-sum game whose value is nonnegative. The payoff matrix of this game is an $mn \times (m(m-1) + n(n-1))$ matrix.
For a $2\times 2$ game, the columns of this auxiliary matrix correspond to the four deviation inequalities:

| | I: $1\to 2$ | I: $2\to 1$ | II: $1\to 2$ | II: $2\to 1$ |
|---|---|---|---|---|
| $p_{11}$ | $a_{11}-a_{21}$ | 0 | $b_{11}-b_{12}$ | 0 |
| $p_{12}$ | $a_{12}-a_{22}$ | 0 | 0 | $b_{12}-b_{11}$ |
| $p_{21}$ | 0 | $a_{21}-a_{11}$ | $b_{21}-b_{22}$ | 0 |
| $p_{22}$ | 0 | $a_{22}-a_{12}$ | 0 | $b_{22}-b_{21}$ |
Example : Hawk & Dove
| | Hawk | Dove |
|---|---|---|
| Hawk | $(\frac{1}{2}(V-D),\frac{1}{2}(V-D))$ | $(V,0)$ |
| Dove | $(0,V)$ | $(\frac{1}{2}V,\frac{1}{2}V)$ |
We are then studying the symmetric bimatrix game $[A, A^T]$. From game theory, we arrive at the concept of finding a strategy $p$ such that $\langle p, p\rangle$ is a SE (a symmetric equilibrium).
Evolutionarily Stable Strategy ($p^{ESS}$): A mutation occurs, producing a small number of players with mutant strategy $p^M$, $p^M \neq p^{ESS}$. The proportion of the population with this mutant strategy is $\varepsilon$. Then for a player using $p^{ESS}$ the expected payoff is
$$\pi^{ESS}=(1-\varepsilon)\pi(p^{ESS},p^{ESS})+\varepsilon \pi(p^{ESS},p^M)$$
For a player using $p^M$ the expected payoff is
$$\pi^M =(1-\varepsilon)\pi(p^M,p^{ESS})+\varepsilon \pi(p^M,p^M)$$
We want $\pi^M <\pi^{ESS}$. For all sufficiently small $\varepsilon>0$, this is guaranteed if for every $p^M \neq p^{ESS}$ either $\pi(p^{ESS},p^{ESS}) > \pi(p^M,p^{ESS})$, or $\pi(p^{ESS},p^{ESS}) = \pi(p^M,p^{ESS})$ and $\pi(p^{ESS},p^M) > \pi(p^M,p^M)$.
Theorem: Suppose the payoff matrix is $(A, A^T)$, and suppose the $j$th column of $A$ is such that its diagonal element $a_{jj}$ is (strictly) the largest in the column. Then playing the $j$th strategy is an ESS.
Necessary and sufficient condition for an ESS: Let $[A, A^T]$ be a given bimatrix game, where $A$ is an $n\times n$ matrix, and write $\pi(x,y)=x^TAy$. Let $p=(p_1,\ldots,p_n)$ be a mixed strategy. Then $p$ is an ESS iff for every strategy $q\neq p$: $\pi(p,p)\geq \pi(q,p)$, and whenever $\pi(p,p)=\pi(q,p)$ we have $\pi(p,q)>\pi(q,q)$.
Let $A$ be the Paper-Scissors-Rock payoff matrix. Then $[A, A^T]$ does not have an ESS.
An evolutionary process has two basic elements: a mutation mechanism and a selection mechanism. The ESS concept highlights mutation; replicator dynamics highlights the role of selection.

Replicator dynamics is formalized as a system of ODEs that does not include any mutation mechanism. Instead, robustness against mutation is indirectly taken care of by dynamic stability criteria.
Replicator Dynamics: Let $p=(p_1,\ldots,p_n)^T$ be a mixed strategy for a symmetric bimatrix game $[A, A^T]$, where $p_i$ denotes the proportion of the population using the $i$th strategy. The payoff to strategy $i$ against $p$ is $\sum_j a_{ij}p_j$. The payoff to strategy $p$ against $p$ is $\sum_i \sum_j a_{ij}p_ip_j$.
We assume the growth rate of the population using strategy i is proportional to the advantage of that strategy over the whole population. For simplicity, we assume the constant is 1. Therefore, growth rate of the population playing strategy i is
$$\sum_j a_{ij}p_j -\sum_{k,j} p_k a_{kj}p_j$$
Replicator Dynamics Equations: $\frac{d p_i}{dt} =\Big[\sum_j a_{ij}p_j -\sum_{k,j} p_k a_{kj}p_j\Big]p_i$, for $i=1,\ldots,n$.
An equilibrium point of the Replicator Dynamics Equations is a point $p$ at which the right-hand side vanishes.
The RHS is 0 iff $\sum_j a_{ij}p_j =\sum_{k,j} p_k a_{kj}p_j$ for every $i$ with $p_i\neq 0$, i.e. every strategy in the support of $p$ earns exactly the population-average payoff.
In particular, if $p$ is a BR to $p$ (i.e. $\langle p, p\rangle$ is a SE), then $p$ is an equilibrium point of the Replicator Dynamics Equations; the converse fails in general (every pure strategy, for instance, is a rest point).
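A small Python sketch (illustrative; the step size and the Rock-Paper-Scissors matrix are assumptions) that integrates the replicator equations with forward Euler steps.

```python
import numpy as np

def replicator_step(p, A, dt=0.01):
    """One forward-Euler step of dp_i/dt = [(Ap)_i - p^T A p] * p_i."""
    fitness = A @ p               # payoff of each pure strategy against p
    average = p @ fitness         # population-average payoff p^T A p
    p = p + dt * (fitness - average) * p
    return p / p.sum()            # renormalize to stay on the simplex

# Rock-Paper-Scissors: the trajectory circles the interior equilibrium (1/3, 1/3, 1/3).
A = np.array([[0., -1., 1.],
              [1., 0., -1.],
              [-1., 1., 0.]])
p = np.array([0.5, 0.3, 0.2])
for _ in range(1000):
    p = replicator_step(p, A)
print(p)
```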
Repeated Games: Introduce a discount factor $\beta$. A payoff $x$ received at the $n$th game is worth $\beta^{n-1}x$.
Finite Automata :
1-memory automata: The player can only remember what happened in the previous move and makes his decision based on that: $f: \{CC, CD, DC, DD\}\rightarrow\{C, D\}$. When both players are using 1-memory automata, we can use the technique of transition matrices. Each row of the matrix is a probability vector giving the probabilities of transiting to the 4 possible states (CC, CD, DC, DD).
| | CC | CD | DC | DD |
|---|---|---|---|---|
| CC | $p_{11}$ | $p_{12}$ | | |
| CD | $p_{21}$ | | | |
| DC | | | | |
| DD | | | | |
Let $v^T = (p_1,p_2,p_3,p_4)$ be the probability vector of the outcomes of the previous game. Then the distribution for the next game is $(v^TP)^T =P^Tv$.
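A small Python sketch (the transition matrix is hypothetical) that iterates $v\mapsto P^Tv$ to approximate the long-run distribution over the four states.

```python
import numpy as np

# Hypothetical 4x4 transition matrix over the states (CC, CD, DC, DD):
# row s gives the probabilities of moving from state s to each next state.
P = np.array([[0.90, 0.05, 0.05, 0.00],
              [0.10, 0.20, 0.30, 0.40],
              [0.10, 0.30, 0.20, 0.40],
              [0.05, 0.05, 0.05, 0.85]])

v = np.array([0.25, 0.25, 0.25, 0.25])   # initial distribution over the states
for _ in range(200):
    v = P.T @ v                          # next-game distribution, (v^T P)^T = P^T v
print(v)                                 # approximate long-run state frequencies
```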
PR: Permanent Retaliation
Cooperate until, if ever, the opponent defects; then defect forever.
Tit-for-Tat: Cooperate on the first move, then repeat your opponent's previous move.
A strategy is said to be Nice if it starts by cooperating and is never the first to defect.
Let S be a nice strategy such that $\langle S, S\rangle$ is a SE. Then $\beta$ must be sufficiently large.
Let S be a nice strategy such that $\langle S, S\rangle$ is a SE. Then S must be provokable by a defection of the opponent: when the opponent plays D, S must retaliate by playing D at some later move.
In the cooperative theory, we allow communication between the players and also allow binding agreements to be made. This requires some outside mechanism to enforce the agreements.

The cooperative theory is divided into two classes of problems depending on whether or not there is a mechanism for transfer of utility from one player to the other: transferable utility (TU) and nontransferable utility (NTU).
Feasible Sets of Payoff Vectors: For any game it is important to know the set of feasible payoff vectors for the players. For a noncooperative game, this set is called the noncooperative feasible set.
When players cooperate in a bimatrix game with matrices $[A, B]$, they may agree to achieve a payoff vector equal to any of the $mn$ points $(a_{ij}, b_{ij})$ for $i=1,\ldots,m$ and $j=1,\ldots,n$. They may also agree to any probability mixture of these points.

The NTU feasible set is the convex hull of the $mn$ points $(a_{ij}, b_{ij})$ for $i=1,\ldots,m$ and $j=1,\ldots,n$.
The distinguishing feature of the TU case is that the players may make side payments of utility as part of the agreement.

Thus the whole line of slope $-1$ through each point $(a_{ij}, b_{ij})$ is part of the TU feasible set.
A feasible payoff vector $(v_1, v_2)$ is said to be Pareto optimal if for any feasible payoff vector $(v_1', v_2')$ with $v_1' \geq v_1$ and $v_2' \geq v_2$, we have $(v_1', v_2') =(v_1,v_2)$.
The TU Solution: If the players come to an agreement, then rationality implies that they will agree to play so as to achieve the largest possible total payoff, $\sigma =\max_{i,j} (a_{ij} + b_{ij})$, as the payoff to be divided between them.
Suppose now that the players have selected their threat strategies, say $p$ for Player I and $q$ for Player II. The resulting payoff pair, denoted $D=(D_1, D_2)$, lies in the NTU feasible set and is called the disagreement point or threat point.

Player I will accept no less than $D_1$ and Player II will accept no less than $D_2$, since these can be achieved if no agreement is reached.

They will split evenly the excess of $\sigma$ over $D_1 + D_2$.

The resolution point is then $\phi = (\phi_1, \phi_2)= \left(\frac{1}{2} (\sigma+ D_1- D_2), \frac{1}{2} (\sigma- D_1 + D_2)\right)$. To select the threat optimally, Player I wants to maximize $D_1 - D_2$ and Player II wants to minimize it.
Let $\delta =\mathrm{Val}(A - B) ={p^*}^T (A - B)q^*$; then $\phi = \left(\frac{1}{2}(\sigma+\delta),\frac{1}{2}(\sigma-\delta)\right)$.

All such disagreement points must lie on the same 45-degree line.
Nash Bargaining Model: A bargaining problem is given by $(S, u^*, v^*)$, where $S$ is a compact convex NTU-feasible set and $(u^*, v^*)\in S$ is the threat point.

Given an NTU-feasible set $S$ and a threat point or status-quo point $(u^*, v^*) \in S$, the problem is to find a point $(u^\#, v^\#) =f(S, u^*, v^*)$, to be considered a "fair and reasonable outcome" or "solution" of the game, for an arbitrary compact convex set $S$ and point $(u^*, v^*) \in S$.

Nash Axioms for $f(S, u^*, v^*) =(u^\#, v^\#)$: feasibility, Pareto optimality, symmetry, independence of irrelevant alternatives, and invariance under change of location and scale.
Theorem: There exists a unique function $f$ satisfying the Nash axioms. Moreover, if there exists a point $(u, v)\in S$ such that $u> u^*$ and $v> v^*$, then $f(S, u^*, v^*)$ is the point of $S$ that maximizes $(u- u^*)(v -v^*)$ among points of $S$ with $u\geq u^*$ and $v\geq v^*$. The quantity $(u- u^*)(v -v^*)$ is called the Nash product.
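As a small worked example (a hypothetical feasible set, not from the notes): if $S=\{(u,v): u\geq 0,\ v\geq 0,\ u+2v\leq 4\}$ and $(u^*,v^*)=(0,0)$, then maximizing the Nash product $uv$ on the boundary $u+2v=4$ means maximizing $(4-2v)v$, which is largest at $v=1$; hence $f(S,0,0)=(2,1)$.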
Kalai-Smorodinsky solution: For any bargaining problem $(S, (d_1, d_2))$, define $I_1 = \max \{u: \text{for some } v,\ (u,v)\in S,\ d_1\leq u,\ d_2\leq v \}$ and $I_2 = \max \{v: \text{for some } u,\ (u, v)\in S,\ d_1 \leq u,\ d_2 \leq v \}$. The Kalai-Smorodinsky solution $(u_1, u_2)$ is the point on the Pareto optimal boundary such that $(u_1- d_1)/(I_1- u_1) =(u_2- d_2) /(I_2- u_2)$.
Suppose that, at a given stage of the bargaining process, Player I's last offer corresponded to the payoff vector $v=(v_1 , v_2)$ whereas Player II's last offer corresponded to the payoff vector $w=(w_1 , w_2)$, where $v$ and $w$ belong to the Pareto optimal boundary of the cooperative feasible set, with $v_1>w_1$ but $v_2<w_2$.

If Player I accepts his opponent's last offer $w$, then he will get payoff $w_1$. On the other hand, if he insists on his own last offer $v$, then he will obtain either the payoff $v_1$ or the disagreement payoff $d_1$ (if Player II sticks to his last offer).

Suppose Player I assigns probability $p$ to the latter possibility and probability $(1-p)$ to the former. Then he must conclude that by insisting on his last offer $v$ and making no further concession he will get the expected payoff $(1-p)v_1+pd_1$, whereas by being conciliatory and accepting Player II's last offer $w$ he will get $w_1$. Therefore, Player I can rationally stick to his last offer only if $(1-p) v_1 + pd_1 \geq w_1$, i.e. if $p \leq (v_1 -w_1 ) /(v_1-d_1) =r_1$.
We will call $r_1$, $r_2$ the two players' risk limits, expressed as ratios of differences in payoffs.
Zeuthen suggests that the next concession must always come from the player with a smaller risk limit
Theorem: If the two players follow Zeuthen's Principle, then the next concession will always be made by the player whose last offer is associated with the smaller Nash product (unless both offers have equal Nash products, in which case both players have to make concessions).
A coalition, $S$, is defined to be a subset of $N$, $S\subset N$, and the set of all coalitions is denoted by $2^N$. We also speak of the empty set, $\emptyset$, as a coalition, the empty coalition. The set $N$ is also a coalition, called the grand coalition.

The coalitional form of an $n$-person game is given by the pair $(N, v)$, where $v$ is a real-valued function, called the characteristic function of the game, defined on the set $2^N$ of all coalitions (subsets of $N$), and satisfying (i) $v(\emptyset)=0$, and (ii) superadditivity: if $S$ and $T$ are disjoint coalitions, then $v(S\cup T)\geq v(S)+v(T)$.
We define $v(S)$ for each $S\in 2^N$ as the value of the 2-person zero-sum game obtained when the coalition $S$ acts as one player and the complementary coalition, $S^C = N -S$, acts as the other player, and where the payoff to $S$ is $\sum_{i\in S}u_i(x_1,\ldots,x_n)$, the total of the payoffs to the players in $S$.

The value $v(S)$ is the analogue of the safety level. It represents the total amount that coalition $S$ can guarantee for itself. Such a $v$ is a characteristic function.

A game in coalitional form is said to be constant-sum if $v(S)+v(S^C) =v(N)$ for all coalitions $S\in 2^N$. It is said to be zero-sum if, in addition, $v(N) =0$.
A payoff vector $x = (x_1, x_2, \ldots, x_n)$ is said to be group rational or efficient if $x_1+\cdots+x_n = v(N)$.

A payoff vector $x$ is said to be individually rational if $x_i \geq v(\{i\})$ for all $i$.

An imputation is a payoff vector that is group rational and individually rational.

A game in coalitional form is said to be inessential if $\sum_i v(\{i\})=v(N)$, and essential if $\sum_i v(\{i\})\neq v(N)$.
An imputation $x$ is said to be unstable through a coalition $S$ if $v(S) >\sum_{i\in S}x_i$; $x$ is unstable if it is unstable through some coalition, and stable otherwise.
The set, $C$, of stable imputations is called the core: $C =\{x = (x_1,\ldots, x_n) :\sum_{i\in N} x_i = v(N) \text{ and } \sum_{i\in S}x_i \geq v(S) \text{ for all } S\subset N\}$.
The core of an essential n-person constant-sum game is empty.
The Shapley value assigns to each game in coalitional form a unique vector of payoffs, called the value. The $i$th entry of the value vector may be considered a measure of the value or power of the $i$th player in the game.

$\varphi_i(v)$ represents the worth or value of player $i$ in the game with characteristic function $v$. The axioms of fairness are placed on the function $\varphi$.
The Shapley Axioms for $\varphi(v)$: efficiency ($\sum_{i\in N}\varphi_i(v)=v(N)$), symmetry, the dummy axiom (if $v(S\cup\{i\})=v(S)$ for every coalition $S$, then $\varphi_i(v)=0$), and additivity ($\varphi(v+w)=\varphi(v)+\varphi(w)$).
Example: Let $S\subset N$. Define a characteristic function $w_S$ by $w_S(T)= 1$ if $S\subset T$, and $w_S(T)=0$ otherwise.
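A short Python sketch (illustrative; the three-player characteristic function is made up) that computes the Shapley value by averaging each player's marginal contribution over all orderings of the players.

```python
from itertools import permutations

def shapley_value(players, v):
    """phi_i = average over all orderings of v(predecessors of i plus i) - v(predecessors of i)."""
    players = list(players)
    phi = {i: 0.0 for i in players}
    orderings = list(permutations(players))
    for order in orderings:
        seen = frozenset()
        for i in order:
            phi[i] += v(seen | {i}) - v(seen)   # marginal contribution of player i
            seen = seen | {i}
    return {i: phi[i] / len(orderings) for i in players}

# Hypothetical 3-player game: a coalition is worth 1 iff it has at least 2 players.
def v(S):
    return 1.0 if len(S) >= 2 else 0.0

print(shapley_value([1, 2, 3], v))   # {1: 1/3, 2: 1/3, 3: 1/3} by symmetry
```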