6.Adversarial search（Minmax and α

In AI, a game* is often a deterministic, turn-taking, two- player, zero-sum game of perfect information:
1.deterministic
2.two agents
3.whose action alternate
4.utility values are opposite e.g. (+1,-1) fully observable

Defination of Game

A Game consists of:
1.sets of players P, states S (board and player to play), and moves M.
2.an initial state s0 ∈ S which specifies how the game is set up.
3.Player(s)∈ P : defines the player to move in state s.
4.Moves(s)∈ 2M : defines the set of legal moves in state s.
5.Result(s,m)∈ S: defines the result of perfoming move m in state s.
6.Terminal(s)∈ B: the terminal test says whether the game is over.
7.Utility(s,p)∈ R: the utility function gives a numeric value to terminal states from the point of view of a given player, e.g. {+1, −1, 0} for chess.

A Game is defined by an initial state, a successor function, a terminal test, and a utility function

Defination of Minimax

Perfect play for deterministic, two-player, zero-sum, perfect-information games.

Idea: choose move to position with highest minimax value
= best achievable utility against best possible opponent

Computing the Minimax value:

Apply utility function to each leaf of the game tree
Back-up values from the leaves through inner nodes up to the root:
(a) min node: compute the min of its children values
(b) max node: compute the max of its children values
At the root: choose the move leading to the child of highest value

Minimax-Value(s) =
Utility(s, max)                           if Terminal(s)
maxm∈Moves(s) Minimax-Value(Result(s, m)) if Player(s) = max 
minm∈Moves(s) Minimax-Value(Result(s, m)) if Player(s) = min

Minimax algorithm:
function Minimax-Decision(state) returns a move
function Max-Value(state) returns a utility value
function Min-Value(state) returns a utility value

The minimax algorithm select optimal actions for two-player zero-sum games of perfect information by a depth first exploration of the game-tree

α–β pruning

A parent node passes its current values for α and β to its children in turn. A child passes back up its value v to the parent.
The parent compares v to α (min) or β (max) to decide whether to prune the child’s sibling and if so return v to the parent.
Otherwise, it updates its current values for α (max) or β (min) using v and go on.

Example of α–β pruning

Alpha-beta pruning does not compromise optimality but increases efficiency by eliminating provably irrelevant subtrees.
Good move ordering improves effectiveness of pruning. Perfect ordering (unachievable): increasing order for max and decreasing order for min.

Changes to Minmax(in realtime situation)

Limit search depth and estimate expected utility

1.Use Cutoff test instead of Terminal test
– Cutoff(s,d): true iff the state s encountered at depth d in the tree must be considered as a leaf (or s is terminal).

2.Use Eval instead of Utility
– Eval(s,p) i.e., evaluation function that estimates the expected utility of cutoff state s wrt player p, and correlates with chances of winning

Minimax-Value(s, d) =
Eval(s, max)                                       if Cutoff(s, d)
maxm∈Moves(s)D-Minimax-Value(Result(s,m),d+1)      ifPlayer(s)=max 
minm∈Moves(s) D-Minimax-Value(Result(s, m), d + 1) if Player(s) = min

It is not feasible to consider the whole game tree (even with alpha-beta), so we need to cut the search off at some point and apply an evaluation function that gives an estimate of the expected utility of a state

6.Adversarial search（Minmax and α–β pruning ）

Defination of Game

Defination of Minimax

α–β pruning

Changes to Minmax(in realtime situation)

Limit search depth and estimate expected utility

你可能感兴趣的:(6.Adversarial search（Minmax and α–β pruning ）)