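Adversarial search grew out of game playing.
A game can be defined as a search problem:
The figure below shows the game tree for tic-tac-toe (utility values are given from MAX's point of view; MIN's values are exactly the opposite).
This game tree is small: it has at most $9! = 362880$ terminal nodes, which is tiny compared with real games such as chess.
With limited resources, the search examines as many nodes as it can (rather than all of them) to decide how the player should move (the optimal move, or merely a reasonable one?).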
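The $9!$ figure counts every ordering of the nine squares and ignores games that end early with a win, so it is an upper bound. As a rough check, the sketch below enumerates the tic-tac-toe game tree by brute force and counts the terminal nodes. The board encoding and the helper names `winner` and `count_terminal_nodes` are assumptions made for this illustration only.

```python
from math import factorial

# The eight winning lines on a 3x3 board stored as a flat list of 9 cells.
LINES = [(0, 1, 2), (3, 4, 5), (6, 7, 8),
         (0, 3, 6), (1, 4, 7), (2, 5, 8),
         (0, 4, 8), (2, 4, 6)]

def winner(board):
    """Return "X" or "O" if that player has three in a row, else None."""
    for a, b, c in LINES:
        if board[a] is not None and board[a] == board[b] == board[c]:
            return board[a]
    return None

def count_terminal_nodes(board, player):
    """Count the terminal nodes in the subtree where `player` moves next."""
    if winner(board) is not None or all(cell is not None for cell in board):
        return 1  # a win or a full board ends the game
    total = 0
    for i in range(9):
        if board[i] is None:
            board[i] = player
            total += count_terminal_nodes(board, "O" if player == "X" else "X")
            board[i] = None  # undo the move before trying the next one
    return total

terminal_nodes = count_terminal_nodes([None] * 9, "X")
print(terminal_nodes, "terminal nodes; upper bound 9! =", factorial(9))
```

The count comes out at 255168, comfortably below the $9!$ bound.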
Algorithm idea:
$$
\mathrm{MINIMAX}(s)=\begin{cases}
\mathrm{UTILITY}(s) & \text{if } \mathrm{TERMINAL\text{-}TEST}(s)\\
\max_{a\in \mathrm{ACTIONS}(s)}\mathrm{MINIMAX}(\mathrm{RESULT}(s,a)) & \text{if } \mathrm{PLAYER}(s)=\mathrm{MAX}\\
\min_{a\in \mathrm{ACTIONS}(s)}\mathrm{MINIMAX}(\mathrm{RESULT}(s,a)) & \text{if } \mathrm{PLAYER}(s)=\mathrm{MIN}
\end{cases}
$$
Illustration:
Algorithm:
Function MINIMAX-DECISION(state) returns an action
    return arg max_{a in ACTIONS(state)} MIN-VALUE(RESULT(state,a))
Function MAX-VALUE(state) returns a utility value
    if TERMINAL-TEST(state) then return UTILITY(state)
    v <--- -oo
    for each a in ACTIONS(state) do
        v <--- MAX(v, MIN-VALUE(RESULT(state,a)))
    return v
Function MIN-VALUE(state) returns a utility value
    if TERMINAL-TEST(state) then return UTILITY(state)
    v <--- +oo
    for each a in ACTIONS(state) do
        v <--- MIN(v, MAX-VALUE(RESULT(state,a)))
    return v
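The pseudocode above can be sketched in Python. To keep the example small, it assumes the game tree is given explicitly as nested lists, where a number is a terminal utility for MAX and a list holds the successors of a state; a real game would supply its own ACTIONS, RESULT, and TERMINAL-TEST. The sample tree is a hypothetical two-ply game.

```python
def is_terminal(node):
    """In this sketch a terminal state is simply a number (its utility)."""
    return isinstance(node, (int, float))

def max_value(node):
    """Minimax value of a node where MAX is to move."""
    if is_terminal(node):
        return node
    return max(min_value(child) for child in node)

def min_value(node):
    """Minimax value of a node where MIN is to move."""
    if is_terminal(node):
        return node
    return min(max_value(child) for child in node)

def minimax_decision(root):
    """Return the index of the root action with the highest MIN-VALUE."""
    return max(range(len(root)), key=lambda i: min_value(root[i]))

# MAX chooses one of three moves; MIN then replies with one of three moves.
tree = [[3, 12, 8], [2, 4, 6], [14, 5, 2]]
print(minimax_decision(tree), max_value(tree))  # action 0, value 3
```

MIN would answer the three root moves with 3, 2, and 2 respectively, so MAX picks the first move and the root value is 3.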
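Characteristics:
Recursive algorithm (it descends all the way to the leaves, then backs the minimax values up through the tree as the recursion unwinds)
Performs a complete depth-first exploration of the game tree
Time complexity: $O(b^m)$, where $b$ is the number of legal moves at each point and $m$ is the maximum depth of the tree
Space complexity: $O(bm)$ if all actions are generated at once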
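To make the two bounds concrete, the sketch below walks a uniform tree depth-first and records how many nodes it visits and how deep the recursion goes. The function `explore` and its `stats` dictionary are illustrative assumptions, not part of the minimax algorithm itself.

```python
def explore(b, m, depth=0, stats=None):
    """Depth-first walk of a uniform tree with branching factor b and depth m."""
    if stats is None:
        stats = {"nodes": 0, "max_depth": 0}
    stats["nodes"] += 1
    stats["max_depth"] = max(stats["max_depth"], depth)
    if depth < m:
        for _ in range(b):  # one recursive call per successor
            explore(b, m, depth + 1, stats)
    return stats

stats = explore(b=3, m=4)
print(stats)  # nodes = 1 + 3 + 9 + 27 + 81 = 121, max_depth = 4
```

The node count is the geometric sum $(b^{m+1}-1)/(b-1)$, which is dominated by $b^m$. The recursion never holds more than $m+1$ frames, each with at most $b$ successors, which is where the $O(bm)$ space bound comes from.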
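Why pruning is needed: the number of states minimax has to examine grows exponentially with the depth of the tree. $\alpha$-$\beta$ pruning returns the same decision as minimax while skipping branches that cannot affect it.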
Function ALPHA-BETA-SEARCH(state) returns an action
    v <--- MAX-VALUE(state, -oo, +oo)
    return the action in ACTIONS(state) with value v
Function MAX-VALUE(state, alpha, beta) returns a utility value
    if TERMINAL-TEST(state) then return UTILITY(state)
    v <--- -oo
    for each a in ACTIONS(state) do
        v <--- MAX(v, MIN-VALUE(RESULT(state,a), alpha, beta))
        if v >= beta then return v
        alpha <--- MAX(alpha, v)
    return v
Function MIN-VALUE(state, alpha, beta) returns a utility value
    if TERMINAL-TEST(state) then return UTILITY(state)
    v <--- +oo
    for each a in ACTIONS(state) do
        v <--- MIN(v, MAX-VALUE(RESULT(state,a), alpha, beta))
        if v <= alpha then return v
        beta <--- MIN(beta, v)
    return v
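A minimal Python sketch of the pseudocode above, under the same simplifying assumption as before: the tree is given as nested lists whose numeric leaves are utilities for MAX. The root loop is written out explicitly so that the chosen action can be returned alongside its value; that detail is a choice made for this sketch.

```python
import math

def is_terminal(node):
    return isinstance(node, (int, float))

def max_value(node, alpha, beta):
    if is_terminal(node):
        return node
    v = -math.inf
    for child in node:
        v = max(v, min_value(child, alpha, beta))
        if v >= beta:        # MIN above would never allow this branch
            return v
        alpha = max(alpha, v)
    return v

def min_value(node, alpha, beta):
    if is_terminal(node):
        return node
    v = math.inf
    for child in node:
        v = min(v, max_value(child, alpha, beta))
        if v <= alpha:       # MAX above already has something at least as good
            return v
        beta = min(beta, v)
    return v

def alpha_beta_search(root):
    """Return (index of the best root action, its value) for MAX."""
    best_action, best_value = None, -math.inf
    alpha = -math.inf
    for i, child in enumerate(root):
        v = min_value(child, alpha, math.inf)
        if v > best_value:
            best_action, best_value = i, v
        alpha = max(alpha, best_value)
    return best_action, best_value

tree = [[3, 12, 8], [2, 4, 6], [14, 5, 2]]
print(alpha_beta_search(tree))  # (0, 3), the same decision as plain minimax
```

In the second subtree the search stops after seeing the leaf 2, because MIN can already hold MAX to at most 2 there while MAX has 3 available elsewhere.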
Characteristics: the effectiveness of $\alpha$-$\beta$ pruning depends heavily on the order in which states are examined.
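The effect of ordering can be seen by counting how many leaves the search evaluates. The sketch below is an illustration on hypothetical nested-list trees: both trees contain the same three subtrees with the same leaf values, and only the order of moves differs. The single-function `alpha_beta` with a `maximizing` flag and a `counter` list is an assumption of this example.

```python
import math

def alpha_beta(node, alpha, beta, maximizing, counter):
    """Alpha-beta value of node; counter[0] tracks evaluated leaves."""
    if isinstance(node, (int, float)):
        counter[0] += 1
        return node
    if maximizing:
        v = -math.inf
        for child in node:
            v = max(v, alpha_beta(child, alpha, beta, False, counter))
            if v >= beta:
                return v
            alpha = max(alpha, v)
        return v
    v = math.inf
    for child in node:
        v = min(v, alpha_beta(child, alpha, beta, True, counter))
        if v <= alpha:
            return v
        beta = min(beta, v)
    return v

def leaves_examined(tree):
    """Return (root value, number of leaves evaluated)."""
    counter = [0]
    value = alpha_beta(tree, -math.inf, math.inf, True, counter)
    return value, counter[0]

# Best moves first: the strongest subtree for MAX and MIN's best replies lead.
good_order = [[3, 12, 8], [2, 4, 6], [2, 5, 14]]
# Worst moves first: the same subtrees and leaves, in an unhelpful order.
bad_order = [[6, 4, 2], [14, 5, 2], [12, 8, 3]]

print(leaves_examined(good_order))  # (3, 5)
print(leaves_examined(bad_order))   # (3, 9)
```

Both searches return the value 3, but the well-ordered tree needs 5 leaf evaluations while the badly ordered one needs all 9.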
Algorithm:
Function ALPHA-BETA-SEARCH(state) returns an action
    v <--- MAX-VALUE(state, -oo, +oo, 0)
    return the action in ACTIONS(state) with value v
Function MAX-VALUE(state, alpha, beta, depth) returns a utility value
    if CUTOFF-TEST(state, depth) then return EVAL(state)
    v <--- -oo
    for each a in ACTIONS(state) do
        v <--- MAX(v, MIN-VALUE(RESULT(state,a), alpha, beta, depth+1))
        if v >= beta then return v
        alpha <--- MAX(alpha, v)
    return v
Function MIN-VALUE(state, alpha, beta, depth) returns a utility value
    if CUTOFF-TEST(state, depth) then return EVAL(state)
    v <--- +oo
    for each a in ACTIONS(state) do
        v <--- MIN(v, MAX-VALUE(RESULT(state,a), alpha, beta, depth+1))
        if v <= alpha then return v
        beta <--- MIN(beta, v)
    return v
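A hedged Python sketch of the cutoff version follows. It again assumes a nested-list tree. The `limit` parameter and the `evaluate` heuristic (the recursive average of a node's successors) are hypothetical stand-ins for a real CUTOFF-TEST and EVAL, chosen only so that the example runs on its own.

```python
import math

def is_terminal(node):
    return isinstance(node, (int, float))

def evaluate(node):
    """Hypothetical EVAL: exact at leaves, else the average of the successors' estimates."""
    if is_terminal(node):
        return node
    estimates = [evaluate(child) for child in node]
    return sum(estimates) / len(estimates)

def cutoff_test(node, depth, limit):
    """Stop at terminal states or once the depth limit is reached."""
    return is_terminal(node) or depth >= limit

def max_value(node, alpha, beta, depth, limit):
    if cutoff_test(node, depth, limit):
        return evaluate(node)
    v = -math.inf
    for child in node:
        v = max(v, min_value(child, alpha, beta, depth + 1, limit))
        if v >= beta:
            return v
        alpha = max(alpha, v)
    return v

def min_value(node, alpha, beta, depth, limit):
    if cutoff_test(node, depth, limit):
        return evaluate(node)
    v = math.inf
    for child in node:
        v = min(v, max_value(child, alpha, beta, depth + 1, limit))
        if v <= alpha:
            return v
        beta = min(beta, v)
    return v

def cutoff_search_value(root, limit):
    """Value of the root for MAX when the search is cut off at depth `limit`."""
    return max_value(root, -math.inf, math.inf, 0, limit)

tree = [[3, 12, 8], [2, 4, 6], [14, 5, 2]]
print(cutoff_search_value(tree, limit=2))  # 3: the search reaches every leaf
print(cutoff_search_value(tree, limit=1))  # about 7.67: an estimate, not the true value
```

With `limit=1` the search stops one ply early and trusts the heuristic, so the returned value differs from the true minimax value of 3. How far it differs depends entirely on the quality of the evaluation function.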
Characteristics: