Minimax evaluation function

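Hi everyone, I'm currently taking the CS50AI course. The first assignment is to create a tictactoe AI that uses the minimax function. My question is: as I understand it, there has to be a static evaluation of the game position. I tried to write something like this in pseudocode: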

if next move is a winning move:
    return 10 points
elif opponent is going to win, so stop him:
    return 8 points

and so on. But when I looked at other people's minimax functions, there was nothing like that.

def minimax(board):
    """
    Returns the optimal action for the current player on the board.
    """
    currentactions = actions(board)
    if player(board) == X:
        vT = -math.inf
        move = set()
        for action in currentactions:
            v, count = maxvalue(result(board,action), 0)
            if v > vT:
                vT = v
                move = action
    else:
        vT = math.inf
        move = set()
        for action in currentactions:
            v, count = minvalue(result(board,action), 0)
            if v < vT:
                vT = v
                move = action
    print(count)
    return move

def maxvalue(board, count):
    """
    Calculates the max value of a given board recursively together with minvalue
    """

    if terminal(board): return utility(board), count+1

    v = -math.inf
    posactions = actions(board)

    for action in posactions:
        vret, count = minvalue(result(board, action), count)
        v = max(v, vret)

    return v, count+1

def minvalue(board, count):
    """
    Calculates the min value of a given board recursively together with maxvalue
    """

    if terminal(board): return utility(board), count+1

    v = math.inf
    posactions = actions(board)

    for action in posactions:
        vret, count = maxvalue(result(board, action), count)
        v = min(v, vret)

    return v, count+1

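This is the minimax function from sikburn's tictactoe implementation. I don't understand what the maxvalue and minvalue functions return. Can anyone clarify the logic for me? By the way, the terminal() function checks whether the game is over (a win or a tie), and the result() function takes a board and an action as input and returns the resulting board. Thanks for all your help.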

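In the utility function (not included in the code), you presumably assign 1 to an X win, -1 to an O win, and 0 otherwise. That is why no hand-written scoring of intermediate positions is needed: the minimax function calls minvalue and maxvalue recursively for every possible move until the game ends, whether in a tie or a win, and only then calls utility to get a value. Each value is passed back up the recursion, with maxvalue keeping the largest child value and minvalue the smallest, so together they guarantee that X and O always choose the optimal action.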
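For illustration, here is a minimal sketch of what such a utility function could look like. It assumes the CS50AI-style board (a 3x3 list of lists holding "X", "O", or None); winner() is a hypothetical helper that is not part of the code in the question.

```python
# Assumption: the board is a 3x3 list of lists holding "X", "O", or None.
X, O, EMPTY = "X", "O", None

def winner(board):
    """Hypothetical helper: return "X" or "O" for three in a row, else None."""
    lines = [list(row) for row in board]                              # rows
    lines += [[board[r][c] for r in range(3)] for c in range(3)]      # columns
    lines.append([board[i][i] for i in range(3)])                     # main diagonal
    lines.append([board[i][2 - i] for i in range(3)])                 # anti-diagonal
    for line in lines:
        if line[0] is not EMPTY and line[0] == line[1] == line[2]:
            return line[0]
    return None

def utility(board):
    """Score a finished game: 1 if X won, -1 if O won, 0 for a tie."""
    w = winner(board)
    if w == X:
        return 1
    if w == O:
        return -1
    return 0
```

Only finished games are ever scored this way; every other position gets its value from the recursion.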

Don't forget to check whether the board is terminal at the start of the minimax function, and return None if it is.

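Swap the minvalue and maxvalue calls in minimax: for X, call minvalue (after X moves it is O's turn, so X needs to know how O will respond), and for O, call maxvalue (for the same reason).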
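Putting the two fixes above together, a sketch of the corrected minimax might look like the following. The helpers player, actions, result, terminal, utility and winner are my own stand-ins, written under the assumption of a 3x3 list-of-lists board holding "X", "O", or None; they are not sikburn's originals, and the count bookkeeping is left out to keep it short.

```python
X, O, EMPTY = "X", "O", None

def player(board):
    """X moves first, so X is to move when both sides have equal marks."""
    flat = [cell for row in board for cell in row]
    return X if flat.count(X) == flat.count(O) else O

def actions(board):
    return [(r, c) for r in range(3) for c in range(3) if board[r][c] is EMPTY]

def result(board, action):
    """Return a copy of the board with the current player's mark placed."""
    new = [row[:] for row in board]
    new[action[0]][action[1]] = player(board)
    return new

def winner(board):
    lines = [list(row) for row in board]
    lines += [[board[r][c] for r in range(3)] for c in range(3)]
    lines += [[board[i][i] for i in range(3)], [board[i][2 - i] for i in range(3)]]
    for a, b, c in lines:
        if a is not EMPTY and a == b == c:
            return a
    return None

def terminal(board):
    return winner(board) is not None or not actions(board)

def utility(board):
    return {X: 1, O: -1, None: 0}[winner(board)]

def maxvalue(board):
    """Best value X (the maximizer) can force from this position."""
    if terminal(board):
        return utility(board)
    return max(minvalue(result(board, a)) for a in actions(board))

def minvalue(board):
    """Best value O (the minimizer) can force from this position."""
    if terminal(board):
        return utility(board)
    return min(maxvalue(result(board, a)) for a in actions(board))

def minimax(board):
    if terminal(board):
        return None  # no move to make on a finished game
    if player(board) == X:
        # After X moves it is O's turn, so each child is scored with minvalue.
        return max(actions(board), key=lambda a: minvalue(result(board, a)))
    # After O moves it is X's turn, so each child is scored with maxvalue.
    return min(actions(board), key=lambda a: maxvalue(result(board, a)))
```

For example, with X to move on [[X, X, None], [O, O, None], [None, None, None]], this returns (0, 2), the immediate win.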

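If you want to see what is computed at each iteration, you can print something like f"Minvalue: {v}, Iteration: {count+1}" and f"Maxvalue: {v}, Iteration: {count+1}" at the end of the minvalue and maxvalue functions, just before returning the values. I think that makes it easier to understand.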
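Here is a small sketch of that tracing on a late-game board, so the output stays short. The helpers are my own assumptions (3x3 list-of-lists board holding "X", "O", or None), and place() is a hypothetical stand-in for result() that takes the mark explicitly.

```python
import math

X, O, EMPTY = "X", "O", None
LINES = ([[(r, c) for c in range(3)] for r in range(3)]        # rows
         + [[(r, c) for r in range(3)] for c in range(3)]      # columns
         + [[(i, i) for i in range(3)], [(i, 2 - i) for i in range(3)]])

def winner(board):
    for line in LINES:
        a, b, c = (board[r][col] for r, col in line)
        if a is not EMPTY and a == b == c:
            return a
    return None

def actions(board):
    return [(r, c) for r in range(3) for c in range(3) if board[r][c] is EMPTY]

def terminal(board):
    return winner(board) is not None or not actions(board)

def utility(board):
    return {X: 1, O: -1, None: 0}[winner(board)]

def place(board, action, mark):
    """Hypothetical stand-in for result(): copy the board and place mark."""
    new = [row[:] for row in board]
    new[action[0]][action[1]] = mark
    return new

def maxvalue(board, count):
    if terminal(board):
        return utility(board), count + 1
    v = -math.inf
    for action in actions(board):
        vret, count = minvalue(place(board, action, X), count)
        v = max(v, vret)
    print(f"Maxvalue: {v}, Iteration: {count + 1}")
    return v, count + 1

def minvalue(board, count):
    if terminal(board):
        return utility(board), count + 1
    v = math.inf
    for action in actions(board):
        vret, count = maxvalue(place(board, action, O), count)
        v = min(v, vret)
    print(f"Minvalue: {v}, Iteration: {count + 1}")
    return v, count + 1

# O to move with two empty squares left.
board = [[X, O, X],
         [O, O, X],
         [None, X, None]]
value, visited = minvalue(board, 0)
```

Running this prints "Maxvalue: 1, Iteration: 2", then "Maxvalue: 0, Iteration: 4", then "Minvalue: 0, Iteration: 5". O avoids the line where X wins (value 1) and picks the tie (value 0). Terminal boards return before the print, so they are counted but produce no line.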

I hope this clarifies your doubts.
