Java Minimax Alpha-Beta Pruning Recursion Return
Source: http://stackoverflow.com/questions/15447580/
License: this content is provided under CC BY-SA 4.0. You are free to use and share it, but you must attribute it to the original authors on Stack Overflow.
Asked by sage88
I am trying to implement minimax with alpha-beta pruning for a checkers game in Java. My minimax algorithm works perfectly. My code runs with the alpha-beta code in place. Unfortunately, when I play 1000 games vs the standard minimax algorithm, the alpha-beta algorithm always comes out behind by 50 games or so.
Since alpha-beta pruning should not reduce the quality of the moves, just the time it takes to find them, something has to be wrong. However, I have taken out pen and paper, drawn hypothetical leaf node values, and used my algorithm to predict whether it will calculate the correct best move, and there don't appear to be any logic errors. I used the tree from this video: Alpha-Beta Pruning to trace my algorithm. It logically should make all of the same choices, and therefore be a functioning implementation.
I have also put print statements into the code (they have been removed here to reduce clutter); values appear to be returned correctly, and pruning does happen. Despite my best efforts I have been unable to find where the logic error lies. This is my third different attempt at implementing this, and all of them have had the same issue.
I can't post the full code here because it's much too long, so I have included the methods that are relevant to the error. I'm not certain, but I suspect the problem is in the non-recursive move() method. I can't find a logical error in it, though, so I'd just be thrashing around in it without rhyme or reason, probably making things worse rather than better.
Is there a trick to recovering multiple integer values from recursive calls in a for loop? It works fine with both my minimax and negamax implementations, but alpha-beta pruning seems to produce some strange results.
@Override
public GameState move(GameState state)
{
    int alpha = -INFINITY;
    int beta = INFINITY;
    int bestScore = -Integer.MAX_VALUE;
    GameTreeNode gameTreeRoot = new GameTreeNode(state);
    GameState bestMove = null;
    for(GameTreeNode child: gameTreeRoot.getChildren())
    {
        if(bestMove == null)
        {
            bestMove = child.getState();
        }
        alpha = Math.max(alpha, miniMax(child, plyDepth - 1, alpha, beta));
        if(alpha > bestScore)
        {
            bestMove = child.getState();
            bestScore = alpha;
        }
    }
    return bestMove;
}
private int miniMax(GameTreeNode currentNode, int depth, int alpha, int beta)
{
    if(depth <= 0 || terminalNode(currentNode.getState()))
    {
        return getHeuristic(currentNode.getState());
    }
    if(currentNode.getState().getCurrentPlayer().equals(selfColor))
    {
        for(GameTreeNode child: currentNode.getChildren())
        {
            alpha = Math.max(alpha, miniMax(child, depth - 1, alpha, beta));
            if(alpha >= beta)
            {
                return beta;
            }
        }
        return alpha;
    }
    else
    {
        for(GameTreeNode child: currentNode.getChildren())
        {
            beta = Math.min(beta, miniMax(child, depth - 1, alpha, beta));
            if(alpha >= beta)
            {
                return alpha;
            }
        }
        return beta;
    }
}
//Checks to see if the node is terminal
private boolean terminalNode(GameState state)
{
    if(state.getStatus().equals(win) || state.getStatus().equals(lose) || state.getStatus().equals(draw))
    {
        return true;
    }
    else
    {
        return false;
    }
}
Accepted answer by Salvador Dali
You already fixed your problem, but the problem you encountered is pretty common. Whenever you build a part of the algorithm for an AI agent, you have to test it properly. Once your minimax algorithm is correct, you can generate many random trees and check whether the alpha-beta results match the minimax results. For example, in Python you can do it this way:
from random import randint

class Node:
    def __init__(self, data, children):
        self.data = data
        self.children = children

def generateTree(depth, branching):
    # Leaves get random values; each pass groups `branching` nodes under a new parent.
    total = branching ** depth
    values = [randint(-100, 100) for _ in range(total)]
    level = [Node(values[i], []) for i in range(total)]
    for _ in range(depth):
        total //= branching
        level = [Node(None, level[i * branching:(i + 1) * branching]) for i in range(total)]
    return level[0], values
Now you can generate many random trees and compare the results.
# negamax and alpha_beta_negamax stand for your own implementations under test
tree, values = generateTree(depth, branching)
print(negamax(tree, depth, 1) == alpha_beta_negamax(tree, depth, float('-inf'), float('inf'), 1))
Do not forget that minimax and alpha-beta return just the best value, whereas in a real game what you are interested in is a move. It is straightforward to modify them so that they return a move, but it is up to the developer to decide how the move is returned. This is because there can be many moves that lead to the best solution (you can return the first one or the last one; the most common choice is to find all such moves and return a random one).
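One way to collect every best move at the root without being misled by pruning can be sketched in Java as follows. This is only an illustration under stated assumptions: `TieNode`, `RootSearch`, `bestMoves` and `pickMove` are hypothetical names, the tree is a toy structure rather than the asker's `GameTreeNode`, and scores are assumed to stay well away from `Integer.MIN_VALUE`. The idea is that a value returned strictly inside the (alpha, beta) window is exact, so each root move is searched with a lower bound one below the current best score.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Random;

// Hypothetical minimal tree node, used only for this illustration.
final class TieNode {
    final int value;               // used only when the node is a leaf
    final List<TieNode> children;  // empty for leaves

    private TieNode(int value, List<TieNode> children) {
        this.value = value;
        this.children = children;
    }

    static TieNode leaf(int value) {
        return new TieNode(value, new ArrayList<>());
    }

    static TieNode of(TieNode... children) {
        return new TieNode(0, Arrays.asList(children));
    }
}

final class RootSearch {
    // Fail-soft alpha-beta: a result strictly inside (alpha, beta) is exact.
    static int alphaBeta(TieNode node, int alpha, int beta, boolean maximizing) {
        if (node.children.isEmpty()) {
            return node.value;
        }
        int best = maximizing ? Integer.MIN_VALUE : Integer.MAX_VALUE;
        for (TieNode child : node.children) {
            int score = alphaBeta(child, alpha, beta, !maximizing);
            if (maximizing) {
                best = Math.max(best, score);
                alpha = Math.max(alpha, best);
            } else {
                best = Math.min(best, score);
                beta = Math.min(beta, best);
            }
            if (alpha >= beta) {
                break; // remaining children cannot change the result
            }
        }
        return best;
    }

    // Indices of all root moves whose exact score equals the best score.
    static List<Integer> bestMoves(TieNode root) {
        List<Integer> best = new ArrayList<>();
        int bestScore = Integer.MIN_VALUE;
        for (int i = 0; i < root.children.size(); i++) {
            // Lower bound one below the best score, so that a returned value
            // equal to bestScore lies inside the window and is therefore exact.
            int alpha = bestScore == Integer.MIN_VALUE ? bestScore : bestScore - 1;
            int score = alphaBeta(root.children.get(i), alpha, Integer.MAX_VALUE, false);
            if (best.isEmpty() || score > bestScore) {
                bestScore = score;
                best.clear();
                best.add(i);
            } else if (score == bestScore) {
                best.add(i);
            }
        }
        return best;
    }

    // Picks one of the tied moves; pass a seeded Random for repeatable tests.
    static int pickMove(List<Integer> candidates, Random rng) {
        return candidates.get(rng.nextInt(candidates.size()));
    }
}
```

If the lower bound were `bestScore` itself, a root move whose opponent replies are 3 and then 1 would be cut off after the first reply and report 3, so it would look tied with a genuine 3 even though its true value is 1. Picking randomly among such false ties is one way an alpha-beta player can end up weaker than plain minimax.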
In your case the problem was with the randomness of the returned values, so during testing a good approach is to fix the randomness.
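The random-tree check described above can also be sketched in Java, the language of the question. This is a minimal illustration, not the asker's code: `TreeNode`, `AlphaBetaCheck` and `countMismatches` are hypothetical names, the tree is a plain value tree instead of a checkers position, and a fixed seed is assumed so that every run sees the same trees.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Random;

// Hypothetical stand-alone tree node; not the asker's GameTreeNode.
final class TreeNode {
    final int value;                // only meaningful for leaves
    final List<TreeNode> children;  // empty for leaves

    TreeNode(int value, List<TreeNode> children) {
        this.value = value;
        this.children = children;
    }
}

final class AlphaBetaCheck {
    // Builds a uniform tree with random leaf values in [-100, 100].
    static TreeNode randomTree(int depth, int branching, Random rng) {
        if (depth == 0) {
            return new TreeNode(rng.nextInt(201) - 100, new ArrayList<>());
        }
        List<TreeNode> children = new ArrayList<>();
        for (int i = 0; i < branching; i++) {
            children.add(randomTree(depth - 1, branching, rng));
        }
        return new TreeNode(0, children);
    }

    // Reference implementation: plain minimax without pruning.
    static int minimax(TreeNode node, boolean maximizing) {
        if (node.children.isEmpty()) {
            return node.value;
        }
        int best = maximizing ? Integer.MIN_VALUE : Integer.MAX_VALUE;
        for (TreeNode child : node.children) {
            int score = minimax(child, !maximizing);
            best = maximizing ? Math.max(best, score) : Math.min(best, score);
        }
        return best;
    }

    // Implementation under test: fail-soft alpha-beta.
    static int alphaBeta(TreeNode node, int alpha, int beta, boolean maximizing) {
        if (node.children.isEmpty()) {
            return node.value;
        }
        int best = maximizing ? Integer.MIN_VALUE : Integer.MAX_VALUE;
        for (TreeNode child : node.children) {
            int score = alphaBeta(child, alpha, beta, !maximizing);
            if (maximizing) {
                best = Math.max(best, score);
                alpha = Math.max(alpha, best);
            } else {
                best = Math.min(best, score);
                beta = Math.min(beta, best);
            }
            if (alpha >= beta) {
                break; // cutoff: the remaining children cannot matter
            }
        }
        return best;
    }

    // Returns the number of random trees on which the two searches disagree.
    static int countMismatches(int trees, int depth, int branching, long seed) {
        Random rng = new Random(seed); // fixed seed keeps the test repeatable
        int mismatches = 0;
        for (int i = 0; i < trees; i++) {
            TreeNode root = randomTree(depth, branching, rng);
            int expected = minimax(root, true);
            int actual = alphaBeta(root, Integer.MIN_VALUE, Integer.MAX_VALUE, true);
            if (expected != actual) {
                mismatches++;
            }
        }
        return mismatches;
    }

    public static void main(String[] args) {
        System.out.println("mismatches: " + countMismatches(1000, 4, 3, 42L));
    }
}
```

A nonzero mismatch count points at the pruning logic itself. If the root values always agree but game results still differ, the difference is more likely in how the move is chosen from those values.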
Answered by Adrian
I noticed you said you found the problem, but shouldn't the minimax alpha-beta pruning be:
if it is MAX's turn to move
    for child in children
        result = alphaBetaMinimax(child, alpha, beta)
        if result > alpha
            alpha = result
            if node is root
                bestMove = operator of child
        if alpha >= beta
            return alpha
    return alpha
if it is MIN's turn to move
    for child in children
        result = alphaBetaMinimax(child, alpha, beta)
        if result < beta
            beta = result
            if node is root
                bestMove = operator of child
        if beta <= alpha
            return beta
    return beta
You wrote:
if alpha >= beta
    return beta
return alpha
Answered by gknicker
On March 16, 2013, sage88 asked:
Is there a trick to recovering multiple integer values from recursive calls in a for loop? It works fine with both my minimax and negamax implementations, but alpha-beta pruning seems to produce some strange results.
In alpha-beta pruning, the only output value of interest is a node's score: the final value of beta in a min node is considered for the alpha value of its parent max node; likewise, the final value of alpha in a max node is considered for the beta value of its parent min node. Therefore:
The answer to your question is the algorithm itself, as it's the most relevant trick.
That said, there are two errors in your implementation: 1) As Adrian Blackburn originally pointed out, it's incorrectly returning alpha from a min node and vice versa, thereby skewing its accuracy; 2) It's giving up pruning opportunities by prematurely considering the parent alpha or beta in the current node's value. This version fixes the return values and maximizes pruning:
private int miniMax(GameTreeNode currentNode, int depth, int alpha, int beta) {
    if (depth <= 0 || terminalNode(currentNode.getState())) {
        return getHeuristic(currentNode.getState());
    }
    if (currentNode.getState().getCurrentPlayer().equals(selfColor)) {
        int currentAlpha = -INFINITY;
        for (GameTreeNode child : currentNode.getChildren()) {
            currentAlpha = Math.max(currentAlpha, miniMax(child, depth - 1, alpha, beta));
            alpha = Math.max(alpha, currentAlpha);
            if (alpha >= beta) {
                return alpha;
            }
        }
        return currentAlpha;
    }
    int currentBeta = INFINITY;
    for (GameTreeNode child : currentNode.getChildren()) {
        currentBeta = Math.min(currentBeta, miniMax(child, depth - 1, alpha, beta));
        beta = Math.min(beta, currentBeta);
        if (beta <= alpha) {
            return beta;
        }
    }
    return currentBeta;
}
Thanks for contributing a fun and interesting question :)
For more fun, here's a clarification of your move() method, removing a redundant call to Math.max():
@Override
public GameState move(GameState state) {
    GameState bestMove = null;
    int bestScore = -INFINITY;
    GameTreeNode gameTreeRoot = new GameTreeNode(state);
    for (GameTreeNode child : gameTreeRoot.getChildren()) {
        int alpha = miniMax(child, plyDepth - 1, bestScore, INFINITY);
        if (alpha > bestScore || bestMove == null) {
            bestMove = child.getState();
            bestScore = alpha;
        }
    }
    return bestMove;
}
Finally (even more fun), just a suggestion: a method name change to clarify the intent of terminalNode(), though I would move this into GameState so it could be called with no parameters:
private boolean isTerminal(GameState state) {
    //return Is.any(state.getStatus(), win, lose, draw);
    return state.getStatus().equals(win)
        || state.getStatus().equals(lose)
        || state.getStatus().equals(draw);
}
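Moving the check into the state class, as suggested above, might look like the sketch below. The real GameState is not shown in the question, so everything here is an assumption: `GameStatus` is a hypothetical enum standing in for the win, lose and draw values, and `GameStateSketch` is a stand-in class, not the asker's type.

```java
import java.util.EnumSet;
import java.util.Set;

// Hypothetical status type; the question does not show how status is represented.
enum GameStatus { IN_PROGRESS, WIN, LOSE, DRAW }

// Hypothetical stand-in for the asker's GameState.
final class GameStateSketch {
    // Every status that ends the game.
    private static final Set<GameStatus> TERMINAL =
            EnumSet.of(GameStatus.WIN, GameStatus.LOSE, GameStatus.DRAW);

    private final GameStatus status;

    GameStateSketch(GameStatus status) {
        this.status = status;
    }

    GameStatus getStatus() {
        return status;
    }

    // No parameters needed: the state answers for itself.
    boolean isTerminal() {
        return TERMINAL.contains(status);
    }
}
```

With this shape, the check in the search becomes `currentNode.getState().isTerminal()`, and adding a new game-ending status only touches the set in one place.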
Answered by DanLatimer
To just answer your question:
Is there a trick to recovering multiple integer values from recursive calls in a for loop?
Yes, in Java you would need to pass an object into the recursive function call, then modify the contents of that object. After the function returns you will be able to access the modified values.
E.g.:
class ToBeReturned {
    int returnValue1;
    int returnValue2;
    int returnValue3;
}
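As a rough illustration of that pattern, the sketch below passes a mutable holder through an alpha-beta recursion so the caller can read extra values afterwards. The names `SearchStats` and `StatsDemo` are hypothetical, and for brevity the tree is assumed to be a complete binary tree whose leaf values sit in an array with a power-of-two length.

```java
// Mutable holder that the recursion fills in (hypothetical name).
final class SearchStats {
    int leavesEvaluated;
    int cutoffs;
}

final class StatsDemo {
    // Alpha-beta over a complete binary tree whose leaves are leaves[lo..hi).
    // The score is the return value; the extra counters travel in stats.
    static int alphaBeta(int[] leaves, int lo, int hi, int alpha, int beta,
                         boolean maximizing, SearchStats stats) {
        if (hi - lo == 1) {
            stats.leavesEvaluated++;
            return leaves[lo];
        }
        int mid = (lo + hi) / 2;
        int[][] halves = {{lo, mid}, {mid, hi}}; // the two children
        int best = maximizing ? Integer.MIN_VALUE : Integer.MAX_VALUE;
        for (int i = 0; i < halves.length; i++) {
            int score = alphaBeta(leaves, halves[i][0], halves[i][1],
                                  alpha, beta, !maximizing, stats);
            if (maximizing) {
                best = Math.max(best, score);
                alpha = Math.max(alpha, best);
            } else {
                best = Math.min(best, score);
                beta = Math.min(beta, best);
            }
            if (alpha >= beta) {
                if (i + 1 < halves.length) {
                    stats.cutoffs++; // count only cutoffs that skip a child
                }
                break;
            }
        }
        return best;
    }

    public static void main(String[] args) {
        SearchStats stats = new SearchStats();
        int score = alphaBeta(new int[] {3, 5, 2, 9}, 0, 4,
                              Integer.MIN_VALUE, Integer.MAX_VALUE, true, stats);
        System.out.println(score + " " + stats.leavesEvaluated + " " + stats.cutoffs);
    }
}
```

Because Java passes object references by value, every level of the recursion updates the same `SearchStats` instance, and the caller sees the totals once the top-level call returns.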
Answered by Ales Dolecek
To achieve the best pruning results you should implement some kind of move ordering. In chess it is usually captures or checks that are searched first. These kinds of moves tend to change the evaluation most, so they have a great impact on pruning. In checkers it might be taking the opponent's stones or promoting your own stones on the 8th rank (sorry, I do not know the terms used).
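A simple form of move ordering can be sketched as follows. The `Move` class, its `capture` and `promotion` flags, and the priority weights are all assumptions made for illustration; the question does not show how moves are represented, and a real engine would tune the ordering.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical move description; the real game would supply its own type.
final class Move {
    final String name;
    final boolean capture;
    final boolean promotion;

    Move(String name, boolean capture, boolean promotion) {
        this.name = name;
        this.capture = capture;
        this.promotion = promotion;
    }
}

final class MoveOrdering {
    // Higher priority means "search earlier". The weights are arbitrary.
    static int priority(Move move) {
        return (move.capture ? 2 : 0) + (move.promotion ? 1 : 0);
    }

    // Returns a sorted copy: captures and promotions first, quiet moves last.
    // List.sort is stable, so moves with equal priority keep their order.
    static List<Move> ordered(List<Move> moves) {
        List<Move> sorted = new ArrayList<>(moves);
        sorted.sort((a, b) -> Integer.compare(priority(b), priority(a)));
        return sorted;
    }
}
```

The search would then iterate over `MoveOrdering.ordered(moves)` instead of the raw move list. Ordering does not change the value alpha-beta returns; it only affects how early cutoffs happen.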