博弈论(上)

Introduction

Three mathematical models: Extensive Form, Strategic Form, Coalition Form.

Two Person 0-Sum Games

Games with only two players in which one player wins what the other player loses.

Strategic Form

The strategic form, or normal form, of a two-person zero-sum game is given by a triplet (X, Y, A), where

(1) X is a nonempty set, the set of strategies of Player I

(2) Y is a nonempty set, the set of strategies of Player II

(3) A is a real-valued function defined on X × Y.

(Thus, A(x, y) is a real number for every x X and every y ⊆ Y)

The interpretation is as follows:

Simultaneously, Player I chooses x ⊆ X and Player II chooses y ⊆ Y, each unaware of the choice of the other.

Then their choices are made known and I wins the amount A(x, y) from II.

If A is negative, I pays the absolute value of this amount to II.

Thus, A(x, y) represents the winnings of I and the losses of II.

Matrix Games

A finite two-person zero-sum game in strategic form, (X, Y, A), is sometimes called a matrix game because the essential information of the game, the payoff function A, can be represented by a matrix.

If X = x1,...,xm{x_1, . . . , x_m} and Y = y1,...,yn{y_1, . . . , y_n}, then by the game matrix or payoff matrix we mean the matrix

A=(a11a1nam1amn)A=\begin{pmatrix} a_{11} & \cdots & a_{1n}\\ \vdots & & \vdots\\ a_{m1} & \cdots & a_{mn} \end{pmatrix}

where aij=A(xi,yj),a_{ij} = A(x_i, y_j),

In this form, Player I chooses a row, Player II chooses a column, and II pays I the entry in the chosen row and column.

Note that the entries of the matrix are the winnings of the row chooser and losses of the column chooser.

Player I: the Row choser.

Player II: the Column chooser.

Dominated Strategies

Definition: We say the ithi^{th} row of a matrix A=(aij)A = (a_{ij}) dominates the kth row if aijakja_{ij} \geq a_{kj} for all j. We say the ithi^{th} row of A strictly dominates the kthk^{th} row if aijakja_{ij} a_{kj} for all j.

Similarly, the jthj^{th} column of A dominates (strictly dominates) the kthk^{th} column if aijaika_{ij} \leq a_{ik} (resp. aij<aika_{ij} < a_{ik}) for all i.

Best Response

Definition: Let (X, Y, A) be a matrix game. Let x ⊆ X be a strategy of Player I. Then, y ⊆ Y is called a Best Response (BR) to x if A(x, y) ≤ A (x, y’) for any y’ ⊆ Y.

Equivalently, Let y ⊆ Y be a strategy of Player II. Then, x ⊆ X is called a Best Response (BR) to y if A(x, y) ≥ A (x’, y) for any x’ ⊆ X.

Equilibrium Principle: Best Responses to each other.

Maximum Principle: Safety First.

Saddle points

Pure Strategic Equilibrium, PSE, for 2-person 0-sum games.

If some entry aija_{ij} of the matrix A has the property that

(1) aija_{ij} is the minimum of the ithi^{th} row, and

(2) aija_{ij} is the maximum of the jthj^{th} column,

then we say aija_{ij} is a saddle point. If aija_{ij} is a saddle point, then Player I can then win at least aija_{ij} by choosing row i, and Player II can keep her loss to at most aija_{ij} by choosing column j.

<Row i, Column j> is then an equilibrium pair, i.e. BR to each other.

Maximin Principle

Maximin Principle means to find the “risk” for each strategy and then find the strategy, called the safety strategy, with minimum risk.

Mixed Strategies

Consider a finite 2-person zero-sum game, (X, Y, A), with m×n matrix, A. Let us take the strategy space X to be the first m integers, X = {1, 2, . . .,m}, and similarly, Y = {1, 2, . . . , n}.

A mixed strategy for Player I may be represented by a column vector, (p1,p2,...,pm)T(p_1, p_2, . . . , p_m )^T of probabilities that add up to 1.

Similarly, a mixed strategy for Player II is an n-tuple q=(q1,q2,...,qn)T\mathbf q = (q_1, q_2, . . . , q_n)^T.

The sets of mixed strategies of players I and II will be denoted respectively by X*, Y*:

X* = {p\{ \mathbf p = (p1,...,pm)T(p_1, . . . , p_m )^T : pip_i ≥ 0, for i = 1, . . . , m and p1+...+pmp_1 +...+ p_m =1}\}

Y* = {q\{ \mathbf q = (q1,...,qn)T(q_1, . . . , q_n)^T : qjq_j ≥ 0, for j = 1, . . . , n and q1+...+qnq_1 +...+ q_n =1}\}

p=(p1,p2,...,pm)T\mathbf p = (p_1, p_2, . . . , p_m)^T means playing Row 1 with probability p1p_1, playing Row 2 with probability p2p_2, ... , playing Row m with probability pmp_m.

The m-dimensional unit vector ek\mathbf e_k ⊆ X* with a one at the kthk^{th} component and zeros elsewhere may be identified with the pure strategy of choosing row k.

We may consider the set of Player I’s pure strategies, X, to be a subset of X*.

Similarly, Y may be considered to be a subset of Y*.

We could if we like consider the game (X, Y, A) in which the players are allowed to use mixed strategies as a new game (X*, Y*,A), where A(p,q)=pTAq=ijpiaijqjA(\mathbf p, \mathbf q) = \mathbf p^T\mathbf A \mathbf q = \sum\limits_i\sum\limits_j p_ia_{ij}q_j.

Safety Strategies

For each p ⊆ X*, the worst payoff is minqYpTAq\min\limits_{q\in Y*}p^TAq. The minimum is achieved at a pure strategy of Player II.

Lemma: For each fixed p, the minimum of A(p, q) as a function of q on Y* is achieved at a vertex of Y*, or, equivalently, at a pure strategy.

Theorem: There exists p~X\tilde p\in X^* such that minqYp~TAq=maxpXminqYpTAq\min\limits_{q\in Y^*}\tilde p^TAq = \max\limits_{p\in X^*}\min\limits_{q\in Y^*}p^TAq, p~\tilde p is called the safety strategy of Player I and V=minqYp~TAq=maxpXminqYpTAq\underline V = \min\limits_{q\in Y^*}\tilde p^TAq = \max\limits_{p\in X^*}\min\limits_{q\in Y^*}p^TAq is called the lower value of the game.

Theorem: There exists q~Y\tilde q\in Y^* such that minpXpTAq~=minqYmaxpXpTAq=V\min\limits_{p\in X^*}p^TA\tilde q = \min\limits_{q\in Y^*}\max\limits_{p\in X^*}p^TAq = \overline V. q~\tilde q is called the safety strategy of Player II and V\overline V is called the upper value of the game.

Theorem: VV\underline V \leq \overline V .

MiniMax Theorem

minqYmaxpXpTAq=maxpXminqYpTAq,i.e.V=V\min\limits_{q\in Y^*}\max\limits_{p\in X^*}p^TAq = \max\limits_{p\in X^*}\min\limits_{q\in Y^*}p^TAq, i.e. \underline V = \overline V. Then, V=V=VV = \underline V = \overline V is called the value of the game.

Because of the MiniMax Theorem, the safety strategies are also called the optimal strategies.

2-strategy games

safety strategies: A=(abdc)A=\begin{pmatrix} a & b \\ d & c \end{pmatrix}

p=cdcd+ab,1p=abcd+abp=\frac{|c-d|}{|c-d|+|a-b|}, 1-p=\frac{|a-b|}{|c-d|+|a-b|}

q=cbcb+ad,1q=adbc+adq=\frac{|c-b|}{|c-b|+|a-d|}, 1-q=\frac{|a-d|}{|b-c|+|a-d|}

Solving 2×n and m×2 games:

(i) Check for saddle point first. If there is a saddle point then we solve the game already.

(ii) Suppose there is no saddle point.

Using the graphical method, we can find the safety strategy of Player I for 2 × n games. Since the Maximin occurs at the intersection of two lines, Player II, knowing that Player I will play his/her safety strategy, will play the two Columns giving rise to the two lines. The situation then reduces to 2 × 2 games.

Maximin Principle \Leftrightarrow Equilibrium Principle

Lemma: Let p belong to X* and we let JpJ_p={j: Column j is a BR to p}, the set of best response columns to p.

Then, q=(q1,...,qn)T\mathbf q=(q_1,...,q_n)^T a BR to p iff qjq_j>0 implies j belongs to JpJ_p.

Let q belong to Y* and we let IqI_q={i: Row i is a BR to q}, the set of best response rows to q.

Then, p=(p1,...,pm)T\mathbf p=(p_1,...,p_m)^T a BR to q iff pip_i>0 implies i belongs to IqI_q.

Theorem(Principle of Indifference): Let p=(p1,...,pm)T\mathbf p=(p_1,...,p_m)^T lie in X* and q=(q1,...,qn)T\mathbf q=(q_1,...,q_n)^T lie in Y* such that <p,q> is an equilibrium pair, i.e. BR to each other. Let pTAq=Vp^TAq=V. Then, ai1q1+...+ainqn=Va_{i1}q_1+...+a_{in}q_n=V for all i for which pip_i>0 and a1jp1+...+amjpm=Va_{1j}p_1+...+a_{mj}p_m=V for all j for which qjq_j>0.

Theorem(Exchange Principle): Let <p1p_1, q1q_1>, <p2p_2, q2q_2> be equilibrium pairs. Then, <p1p_1, q2q_2> is also an equilibrium pair.

Definition: p belong to X* is called an equalizing strategy whenever A(p,q)=A(p,q’) for any q, q’ in Y*. q belong to Y* is called an equalizing strategy whenever A(p’,q)=A(p,q) for any p, p’ in X*.

Theorem: Suppose p, q are equalizing strategies for Player I, Player II respectively. Then, p, q are BR to each other and hence <p,q> is an equilibrium pair.

Finding safety (Optimal) strategies when the value is given

Theorem: Let A=(aija_{ij}) be a matrix game with value V. The p is a safety strategy for Player I iff the following n inequalities are satisfied:

p1a11+...+pmam1Vp_1a_{11}+...+p_ma_{m1}\geq V

\vdots \qquad \vdots \qquad \qquad \vdots

p1a1n+...+pmamnVp_1a_{1n}+...+p_ma_{mn}\geq V

Theorem: A finite symmetric game(A=ATA=-A^T) has value zero. Any strategy optimal (safety) for one player is also optimal for the other.

It is equivalent to study the following optimization problem: minimize (x1+...+xmx_1+...+x_m) subject to the constraints:

1x1a11+...+xmam11\leq x_1a_{11} + ... + x_ma_{m1}

\vdots \qquad \vdots \qquad \qquad \vdots

1x1a1n+...+xmamn1\leq x_1a_{1n} + ... + x_ma_{mn}

xi0 for i=1,...,mx_i \geq 0\ for\ i = 1,...,m

This is the Primal Problem of a linear programming problem. The optimal value of this problem is 1/V.

Games in Extensive Form

The Extensive Form of a Game

The extensive form, is built on the basic notions of position and move. This is a most detailed description of a game. It tells exactly which player should move, what are the choices, the outcomes, the information of the players at every stage, and so on.

In the extensive form, games are sequential, interactive processes which moves from one position to another in response to the wills of the players or the whims of chance. It will justify and give meaning to more abstract concepts such as strategy.

Three new concepts make their appearance in the extensive form of a game: the game tree, chance moves, and information sets.

The Game Tree

The Kuhn tree of a game: When a tree is used to define a game, the vertices are interpreted as positions and the edges as moves. Each nonterminal position is assigned either to the player responsible to choosing the next move or to chance.

Chance Moves: To incorporate chance moves in the Kuhn tree, we introduce a new player called Chance (Dame Fortune). The moves from Chance carry different probabilities. The probabilities assigned to the edges leading out of each chance vertex must be displayed.

Information: When a player choosing a move is uninformed about some of the previous moves (concealed moves, simultaneous moves etc.), we draw little “balloons” around set of vertices which the player making the choice cannot discriminate.

These are called information sets, and the player in question must make the same choice for all vertices in the set. Edges issuing from vertices in the information set must be labeled the same.

Game of Perfect Information

A game of perfect information is a game in extensive form in which each information set of every player contains a single vertex.

Games in which players remember all past information they once knew and all past moves they made are called games of perfect recall.

Games in which both players know the rules of the game, that is, in which both players know the Kuhn tree, are called games of complete information.

Games in which one or both of the players do not know some of the payoffs, or some of the probabilities of chance moves, or some of the information sets, or even whole branches of the tree, are called games with incomplete information.

Reduction of a Game in Extensive Form to Strategic Form

Reduced pure strategy: Specification of choices at all information sets except those that are eliminated by the previous moves.

Random payoffs: The actual outcome of the game for given pure strategies of the players depends on the chance moves selected, and is therefore a random quantity. We represent random payoffs by their expected values.

Pure Strategic Equilibrium: A vector of Pure Strategy choices (x1,x2,...,xn)(x_1, x_2, ... , x_n) with xiXix_i \in X_i for i = 1, . . . , n is said to be a Pure Strategic Equilibrium, or PSE for short, if for all i = 1, 2, . . . , n, and for all xXix \in X_i, ui(x1,...,xi1,xi,xi+1,...,xn)ui(x1,...,xi1,x,xi+1,...,xn)u_i (x_1, ... , x_{i-1}, x_i, x_{i+1}, ... , x_n) \geq u_i (x_1, ... , x_{i-1}, x, x_{i+1}, ... , x_n).

Extensive Games of perfect information always have at least one PSE that may be found by the method of backward induction. Method of Backward Induction: Starting from any terminal vertex and trace back to the vertex leading to it. The player at this vertex will discard those edges with lower payoff. Then, treat this vertex as a terminal vertex and repeat the process. Then, we get a path from the root to a terminal vertex.

Definition: A subgame of a game presented in extensive form is obtained by taking a vertex in the Kuhn tree and all the edges and paths originated from this vertex.

Definition: A PSE of a game in extensive form is called a Perfect Pure Strategy Equilibrium (PPSE) if it is a PSE for all subgames.

Theorem: The path obtained by the method of backward induction defines a PPSE.

Behavior Strategy: A behavior strategy is obtained if the Player makes a randomized choice independently at each of his/her information sets.

Dimension of behavior strategies: Suppose the Player has m information sets such that there are kik_i choices at the ithi^{th} information set. Then, the dimension of behavior strategies for this player is (k11)+(k21)+...+(km1)(k_1-1) + (k_2-1) + ... + (k_m-1). Note that the space of mixed strategies is (k1k2...km1)(k_1k_2...k_m-1).

Theorem (Harold Kuhn): For a game of perfect recall, mixed strategies are equivalent to behavior strategies.

Bimatrix Games

Two-Person General-Sum Games/General-Sum Strategic Form Games

The normal or strategic form of a two person game is given by two sets X and Y of pure strategies of the players, and two real-valued functions u1(x,y)u_1(x, y) and u2(x,y)u_2(x, y) defined on X × Y, representing the payoffs to the two players. If Player I chooses x ⊆ X and Player II chooses y ⊆ Y, then Player I receives u1(x,y)u_1(x, y) and Player II receives u2(x,y)u_2(x, y).

A finite two-person game in strategic form can be represented as a matrix of ordered pairs, sometimes called a bimatrix. The first component of the pair represents Player I’s payoff and the second component represents Player II’s payoff.

If m and n representing the number of pure strategies of the two players, the game may be represented by two m × n matrices A = (aija_{ij}) and B = (bijb_{ij}).

The interpretation here is that if Player I chooses row i and Player II chooses column j, then Player I wins aija_{ij} and Player II wins bijb_{ij}, where aija_{ij} and bijb_{ij} are the elements in the ithi^{th} row, jthj^{th} column of A and B respectively.

The game of bimatrix in the above is represented as [A,B].

The theory is generally divided into two branches, the noncooperative theory and the cooperative theory.

In the non-cooperative theory, either the players are unable to communicate before decisions are made, or if such communication is allowed, the players are forbidden or are otherwise unable to make a binding agreement on a joint choice of strategy. The main non-cooperative solution concept is the strategic equilibrium (SE).

In the cooperative theory, it is assumed that the players are allowed to communicate before the decisions are made. They may make threats and counter-threats, proposals and counter-proposals, and hopefully come to some compromise. They may jointly agree to use certain strategies, and it is assumed that such an agreement can be made binding.

The cooperative theory itself breaks down into two branches, depending on whether or not the players have comparable units of utility and are allowed to make monetary side payments in units of utility as an incentive to induce certain strategy choices. The corresponding solution concept is called:

• TU(transferable utility) cooperative value if side payments are allowed.

• NTU(non-transferable utility) cooperative value if side payments are forbidden or otherwise unattainable.

Safety Levels

In a bimatrix game with m×n matrices A and B, Player I can guarantee winning on the average at least vI=maxpminj(p1a1j+...+pmamj)=Val(A)v_I = \max\limits_p \min\limits_j (p_1a_{1j}+...+p_m a_{mj})= Val(A).

This is called the safety level of Player I. Player I can achieve this payoff without considering the payoff matrix of Player II. A strategy, p, that achieves the maximum is called a maxmin strategy for Player I.

Strategic Equilibrium

A finite n-person game in strategic form is given by n nonempty finite sets, X1,X2,...,XnX_1, X_2, ... , X_n, and n real-valued functions u1,u2,...,unu_1, u_2, ... , u_n, defined on X1×X2×...×XnX_1 \times X_2 \times ... \times X_n. The set XiX_i represents the pure strategy set of player i and ui(x1,x2,...,xn)u_i(x_1, x_2, ... , x_n) represents the payoff to player i when the pure strategy choices of the players are x1,x2,...,xnx_1, x_2, ... , x_n, with xjXjx_j \in X_j for j = 1, 2, . . . , n.

We denote the set of probabilities over k points by PkP_k: PkP_k = {p=(p1,...,pk)\{ \mathbf p = (p_1, ... , p_k) : pi0p_i \geq 0 for i = 1, ... , k, and p1+...+pk=1}p_1 + ... + p_k = 1\}.

Let mim_i denote the number of pure strategy choices of player i, so that the set XiX_i has mim_i elements. Then the set of mixed strategies of player i is just PmiP_{m_i}. It is denoted by XiX^*_i where Xi=PmiX^*_i = P_{m_i}.

We denote the set of elements of XiX_i by the first mim_i integers, Xi={1,2,...,mi}X_i = \{1, 2, ..., m_i\}.

Suppose that for i = 1, 2, ... , n, Player i uses pi=(pi1,pi2,...,pimi)Xi\mathbf p_i = (p_{i1}, p_{i2}, ... , p_{im_i}) \in X^*_i. Then the average payoff to player j is uj(p1,...,pn)=i1...inp1i1...pninuj(i1,...,in)u_j(\mathbf p_1,...,\mathbf p_n)=\sum\limits_{i_1}...\sum\limits_{i_n}p_{1i_1}...p_{ni_n}u_j(i_1,...,i_n)

Definition: A vector of mixed strategy choices (also called a strategy profile) (p1,p2,...,pn)(\mathbf p_1, \mathbf p_2, ... , \mathbf p_n) with piXi\mathbf p_i \in X_i for i =1, . . . , n is said to be a strategic equilibrium, or SE for short, if for all i = 1, 2, . . . , n, and for all pXi,ui(p1,...,pi1,pi,pi+1,...,pn)ui(p1,...,pi1,p,pi+1,...,pn)\mathbf p \in X^*_i, u_i(\mathbf p_1, ... , \mathbf p_{i-1}, \mathbf p_i, \mathbf p_{i+1}, ... , \mathbf p_n) \geq u_i(\mathbf p_1, ... , \mathbf p_{i-1}, \mathbf p, \mathbf p_{i+1}, ... , p_n).(*)

Any mixed strategy pi\mathbf p_i that satisfies (*) for all pXi\mathbf p \in X^*_i is a best response of player i to the mixed strategies of the other players.

Theorem(Nash Equilibrium): Every finite n-person game in strategic form has at least one strategic equilibrium.

Theorem: Let [A, B] be a bimatrix game. Let <p,q> be a SE. Then, pTAqvI,pTBqvIIp^TAq \geq v_I , p^TBq \geq v_{II}.

Theorem: <p,q> is a SE whenever the following two conditions are valid:

(i) pi>0p_i>0 implies Row i is a BR to q.

(ii) qj>0q_j>0 implies Column j is a BR to p.

Labeling algorithm to check for SE’s:

• Put the label BR to the best response rows to q

• Put the label + at the ithi^{th} row where pip_i is positive

• Put the label BR to the best response columns to p

• Put the label + at the jthj^{th} column where qjq_j is positive

• <p,q> is a SE whenever a + label is accompanied by a BR label

Proposition: p is a BR to q whenever p, α\alpha satisfy the following optimization problem:

Max (pTAqαp^TAq - \alpha) subject to the following constraints:

Aqα1m0,pT1m1=0,0pAq - \alpha \mathbf 1_m \leq 0, p^T \mathbf 1_m - 1=0, 0 \leq p

Theorem: <p,q> is a SE iff p,q, α, β satisfy the following optimization problem:

Max(pT(A+B)qαβ)Max (p^T(A+B)q - \alpha - \beta) subject to the following constraints:

Aqα1m0,pT1m1=0,pTBβ1nT0,qT1n1=0,0q,0pAq - \alpha \mathbf 1_m \leq 0, p^T \mathbf 1_m - 1=0, p^TB - \beta \mathbf 1_n^T \leq 0, q^T \mathbf 1_n - 1=0, 0\leq q, 0\leq p

Tetraskelion Method to find all SE’s for 2 × 2 games:

• For each mixed strategy (p, 1-p) of Player I, we use the method as in 0-zero sum games to find the best response columns.

• Plot this in the unit square as a graph parameterized by p, 0 ≤ p ≤ 1, on x-axis.

• For each mixed strategy (q, 1-q) of Player II, we use the method as in 0-zero sum games to find the best response rows.

• Plot this in the unit square in the above as a graph parameterized by q, 0 ≤ q ≤ 1, on y-axis.

• The SE’s are the intersections of the two graphs.

Duopoly

Cournot Duopoly:

Two firms produce the same product.

Cost function for Firm 1: C1(q1)=q1C_1(q_1)=q_1

Cost function for Firm 2: C2(q2)=q2C_2(q_2)=q_2

q1,q2q_1, q_2 are the quantities produced.

Price function: P(q)=25-q, where q is the total amount in the market.

Monopoly:

Profit = P(q)=(25q)qq=24qq2P(q) = (25-q)q - q = 24q - q^2

Find q to maximize profit.

Duopoly:

Profit to Firm 1: P1(q1)=(25q)q1q1P_1(q_1)=(25-q)q_1-q_1, where q=q1+q2q=q_1+q_2.

Profit to Firm 2: P2(q2)=(25q)q2q2P_2(q_2)=(25-q)q_2-q_2.

Firm 1 will find q1q_1 to maximize profit.

Firm 2 will find q2q_2 to maximize profit.

Stackelberg Leader and Follower:

Suppose Firm 1 moves first.

Firm 1 announces the quantity q1q_1 (strategy) that it will produce.

Knowing this, Firm 2 will choose its quantity q2q_2 to maximize profit (as BR).

Firm 1 is called the Stackelberg Leader and Firm 2 called the follower.

Bertrand Duopoly:

In economics, there is also a concept of Bertrand Equilibrium for duopoly.

In this case, the firms will choose price as strategies.

Potential Games

Congestion Games: Any game where a collection of homogenous agents have to choose from a finite set of alternatives, and where the payoff of a player depends on the number of players choosing each alternative, is a congestion game.

Potential Game: We consider a finite n-person game in startegic form. Let {1, ..., n} be the set of players. For Player I, i=1, ..., n, let XiX_i be its set of pure startegies and let ui=X1×...×XnRu_i = X_1 \times ... \times X_n \rightarrow R be the payoff function.

A function P: X1×...×XnRX_1 \times ... \times X_n \rightarrow R, is called a potential function iff for any x1X1,...,xi1Xi1,x,zXi,xi+1Xi+1,...,xnXnx_1 \in X_1,...,x_{i-1}\in X_{i-1},x,z\in X_i,x_{i+1}\in X_{i+1},...,x_n\in X_n, we have ui(x1,...,xi1,x,xi+1,...,xn)ui(x1,...,xi1,z,xi+1,...,xn)=P(x1,...,xi1,x,xi+1,...,xn)P(x1,...,xi1,z,xi+1,...,xn)u_i(x_1,...,x_{i-1},x,x_{i+1},...,x_n)-u_i(x_1,...,x_{i-1},z,x_{i+1},...,x_n)=P(x_1,...,x_{i-1},x,x_{i+1},...,x_n)-P(x_1,...,x_{i-1},z,x_{i+1},...,x_n)

A game with a potential function is called a potential game.

Theorem: Any potential game admits a PSE.

Correlated Equilibrium

Solution Concept for games with communication but no binding agreement.

In general, let [A, B] be a an m×n bimatrix game where A=(aija_{ij}), B=(bijb_{ij}). A probability distribution of the outcomes of this game is given by pij,pij0,i,jpij=1p_{ij}, p_{ij}\geq0, \sum\limits_{i,j}p_{ij}=1. This means that the instruction of playing (Row i, Column j) to the players with probability pijp_{ij}. This probability distribution is called a Correlated Equilibrium (CE) if it is for the best interest for players to comply with the instruction.

Inequalities characterizing CE: Prob(Column j for IIRow i for I)=Prob(Column j for II,Row i for I)Prob(Row i for I)=pijkpikProb(Column\ j\ for\ II | Row\ i\ for\ I) = \frac{Prob(Column\ j\ for\ II, Row\ i\ for\ I)}{Prob(Row\ i\ for\ I)} =\frac{p_{ij}}{\sum\limits_k p_{ik}}.

For Player I to comply with the instruction to play Row i, the payoff must be better than deviating the instruction in playing Row r instead. This means jpijkpikaijjpijkpikarj\sum\limits_j\frac{p_{ij}}{\sum\limits_k p_{ik}}a_{ij} \geq \sum\limits_j\frac{p_{ij}}{\sum\limits_k p_{ik}}a_{rj}, for r =1,...,m.

Equivalently, jpij(aijarj)0\sum\limits_j p_{ij} (a_{ij} - a_{rj})\geq0, for r=1,...,m.

Definition of CE: A probability distribution pijp_{ij} over the outcomes of the m×n bimatrix game [A, B] is called a Correlated Equilibrium whenever jpij(aijarj)0\sum\limits_j p_{ij} (a_{ij} - a_{rj})\geq0 for i, r=1,...m and ipij(bijbis)0\sum\limits_i p_{ij} (b_{ij} - b_{is})\geq0 for j, s=1,...n.

Theorem: Let p=(pip_i), q=(qjq_j) be mixed strategies for Player I and Player II respectively. Then, (piqjp_iq_j) is a CE if and only if <p,q> is a SE.

Corollary: Suppose (pijp_{ij}) is a CE such that pijp_{ij}=1 for i=k and j=r, and pijp_{ij}=0 otherwise. Then, <Row k, Col r> is a PSE.

Theorem: The convex combination of two CE’s is a CE.

Theorem: Let (u1,u2),(u1,u2)(u_1, u_2), (u_1', u_2') be payoffs to two SE’s. Then, any convex combination of these two payoffs is the payoff of a certain CE.

Theorem: The payoff of each player from a CE is better than their own safety level.

Theorem: Let [A, -A] be a 2-person 0-sum game with value V. For any CE of [A, -A], the payoff vector is (V, -V).

Theorem: The Principle of Elimination of Dominated Strategies is valid for computing CEs.

Theorem: Let [A, -A] be a 2-person 0-sum game. Then, all CE of [A, -A] are derived from SE.

results matching ""

    No results matching ""

    results matching ""

      No results matching ""