我试图找出与计算机一起播放Gomoku(5乘5版的Tictactoe)的算法。在这种情况下,我发现最常用的算法是min-max(或alpha-beta),但是这些算法对我来说太难处理了。因此,我决定使用以下代码,这些代码非常易于理解但耗时。它显示了计算机如何做出合理的选择。
//------------------------------------------------------------
// computer_move() checks for all legal moves in the current |
// position. then for each of them it calls the dfs_search() |
// function to get the moves' score. And finally it returns |
// the move with the best score. |
//------------------------------------------------------------
int computer_move() //
{
int best_move; // best move so far
int best_score = -100; // best score so far
int score; // current score
int i;
for (i = 0; i < 16; ++i) { //
if (pos[i] == EMPTY) { // if a legal move can be made
pos[i] = COMPUTER; // mark the move
score = -dfs_search(HUMAN); //
pos[i] = EMPTY; // take back the move
if (score > best_score) {
best_score = score;
best_move = i;
}
}
}
printf("Computer's move: %dn", best_move);
return best_move; // return the best move found
}
//------------------------------------------------------------
// dfs_search() gets the side to move, find all legal moves |
// for him, and for each move, it recursively calls itself. |
// It returns a score for the position. |
// This recursive function continues on each variation until |
// the game's end is found in that variation. Which means |
// that the variation continues until check_result() returns |
// a value other than CONTINUE. |
// Note that this can be done in tic-tac-toe, since it's a |
// deterministic game. For games like chess or checkers we |
// can't continue the variation until reaching the game's end |
// so we have to cut the variation at some point. |
//------------------------------------------------------------
int dfs_search(int player) //
{
int best_score = -100;
int score;
int result;
int i;
result = check_result(player);
if (result != CONTINUE) return result; // return the result
for (i = 0; i < 16; ++i) {
if (pos[i] == EMPTY) {
pos[i] = player;
score = -dfs_search(CHANGE_PLAYER(player)); //
pos[i] = EMPTY;
if (score > best_score)
best_score = score;
}
}
return best_score; // return the best score
}
对于3乘3个矩阵,它效果很好。但是,对于4乘4,留下一块石头花费太长了。由于长时间消费的原因是前三到四个决定,我认为仅仅使计算机只能围绕人的最后选择(点)搜索最佳点才是一个解决方案。在前三到四个决定之后,上面的正式算法将在剩下的几个点上效果很好。您如何看待它?并提供一些修改当前算法的建议。
您正在尝试解决整个游戏树。在3x3板上有9个!=树上的362880节点,该节点足够小,可以处理计算机,但是在4x4板上有16个!= 2092278988000(20.9万亿)节点,这太多了,无法在合理的时间内访问。
考虑实现搜索算法,该算法可以返回当前位置得分的合理估算,而无需求解整个游戏树。对于Gomoku,我建议蒙特卡洛树搜索。它已成功地应用于许多GO等游戏,并在其香草形式中不需要您按照固定深度最深度搜索及其变体(例如Alpha-beta)编写静态评估功能。