Java Alpha-Beta Move Ordering

Disclaimer: this page is a translation of a popular StackOverflow question and is provided under the CC BY-SA 4.0 license. If you use or share it, you must follow the same CC BY-SA license and attribute it to the original authors (not me). StackOverflow original: http://stackoverflow.com/questions/9964496/

Alpha-beta move ordering

java algorithm artificial-intelligence minimax alpha-beta-pruning

Asked by amit

I have a basic implementation of alpha-beta pruning, but I have no idea how to improve the move ordering. I have read that it can be done with a shallow search, iterative deepening, or by storing the best moves in a transposition table.

Any suggestions on how to implement one of these improvements in this algorithm?

public double alphaBetaPruning(Board board, int depth, double alpha, double beta, int player) {
    // Horizon reached: score the position with the static evaluation function.
    if (depth == 0) {
        return board.evaluateBoard();
    }

    Collection<Move> children = board.generatePossibleMoves(player);
    if (player == 0) {
        // Maximizing player: try to raise alpha.
        for (Move move : children) {
            Board tempBoard = new Board(board);
            tempBoard.makeMove(move);
            int nextPlayer = next(player);
            double result = alphaBetaPruning(tempBoard, depth - 1, alpha, beta, nextPlayer);
            if (result > alpha) {
                alpha = result;
                if (depth == this.origDepth) {
                    this.bestMove = move; // remember the best root move
                }
            }
            if (alpha >= beta) {
                break; // beta cutoff: the minimizer will never allow this line
            }
        }
        return alpha;
    } else {
        // Minimizing player: try to lower beta.
        for (Move move : children) {
            Board tempBoard = new Board(board);
            tempBoard.makeMove(move);
            int nextPlayer = next(player);
            double result = alphaBetaPruning(tempBoard, depth - 1, alpha, beta, nextPlayer);
            if (result < beta) {
                beta = result;
                if (depth == this.origDepth) {
                    this.bestMove = move; // remember the best root move
                }
            }
            if (beta <= alpha) {
                break; // alpha cutoff: the maximizer will never allow this line
            }
        }
        return beta;
    }
}

// Player ids in this code base are 0 (max) and 4 (min).
public int next(int player) {
    if (player == 0) {
        return 4;
    } else {
        return 0;
    }
}

Answered by amit

  • Node reordering with a shallow search is trivial: calculate the heuristic value for each child of the state before recursively checking them. Then sort these children by their values [descending for a max vertex, ascending for a min vertex], and recursively invoke the algorithm on the sorted list. The idea is that if a state is good at a shallow depth, it is more likely to be good at a deeper search as well, and if that is true - you will get more prunings (see the first sketch after this list).

    The sorting should be done before this loop [in both the if and else clauses]:

    for (Move move : children) {

  • Storing moves is also trivial - many states are calculated twice. When you finish calculating any state, store it [with the depth of the calculation! it is important!] in a HashMap. The first thing you do when you start a calculation on a vertex is to check whether it was already calculated - and if it was, return the cached value. The idea behind it is that many states are reachable from different paths, so this way you can eliminate redundant calculations (see the second sketch after this list).

    The change should be made in the first line of the method [something like if (cache.containsKey(new State(board, depth, player))) return cache.get(new State(board, depth, player)); ] [excuse the lack of elegance and efficiency - just explaining an idea here].
    You should also add cache.put(...) before each return statement.
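
A minimal sketch of the shallow-search ordering, assuming the Board and Move types from the question; the helper name sortForPlayer and the depth-0 scoring are illustrative choices, not part of the original code:

    // Sort children by a one-ply evaluation so likely-good moves are searched first.
    // Requires: import java.util.*;
    private List<Move> sortForPlayer(Board board, Collection<Move> children, int player) {
        List<Move> ordered = new ArrayList<>(children);
        Map<Move, Double> score = new HashMap<>();
        for (Move move : ordered) {
            Board tempBoard = new Board(board);
            tempBoard.makeMove(move);
            score.put(move, tempBoard.evaluateBoard()); // shallow (depth-0) estimate
        }
        // The max player (0) wants high scores first; the min player wants low scores first.
        Comparator<Move> byScore = Comparator.comparingDouble(score::get);
        ordered.sort(player == 0 ? byScore.reversed() : byScore);
        return ordered;
    }

With this helper, both loops become for (Move move : sortForPlayer(board, children, player)) and the rest of the algorithm stays unchanged.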

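A minimal sketch of the caching idea, assuming Board implements consistent equals and hashCode; the State class and the cache field are hypothetical names used only for illustration:

    // Cache key: position + remaining search depth + player to move.
    // Requires: import java.util.*;
    final class State {
        final Board board;
        final int depth;
        final int player;

        State(Board board, int depth, int player) {
            this.board = board;
            this.depth = depth;
            this.player = player;
        }

        @Override
        public boolean equals(Object o) {
            if (!(o instanceof State)) return false;
            State s = (State) o;
            return depth == s.depth && player == s.player && board.equals(s.board);
        }

        @Override
        public int hashCode() {
            return 31 * (31 * board.hashCode() + depth) + player;
        }
    }

    private final Map<State, Double> cache = new HashMap<>();

    // At the top of alphaBetaPruning:
    //     State key = new State(board, depth, player);
    //     Double cached = cache.get(key);
    //     if (cached != null) return cached;
    // And before each return: cache.put(key, alpha) in the max branch,
    // cache.put(key, beta) in the min branch.

One caveat on this simple cache: a value obtained after a cutoff is only a bound on the true score, so full transposition tables also record whether each entry is exact, a lower bound, or an upper bound.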

Answered by Salvador Dali

First of all, one has to understand the reasoning behind move ordering in an alpha-beta pruning algorithm. Alpha-beta produces the same result as minimax, but in a lot of cases can do it faster because it does not search through irrelevant branches.

It is not always faster, because it is not guaranteed to prune; in fact, in the worst case it will not prune at all, will search exactly the same tree as minimax, and will be slower because of the alpha/beta bookkeeping. In the best case (maximal pruning) it allows you to search a tree 2 times as deep in the same time. For a random tree it can search 4/3 times deeper in the same time.

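A quick back-of-the-envelope for the "2 times as deep" figure, assuming a uniform branching factor b: minimax visits on the order of b^d leaf nodes, while alpha-beta with perfect move ordering visits roughly b^(ceil(d/2)) + b^(floor(d/2)) - 1 of them (the Knuth-Moore best case). For a fixed node budget N this gives:

    \[
        N \approx b^{d_{\text{minimax}}} \;\Rightarrow\; d_{\text{minimax}} \approx \log_b N,
        \qquad
        N \approx b^{d_{\alpha\beta}/2} \;\Rightarrow\; d_{\alpha\beta} \approx 2 \log_b N,
    \]

so with perfect ordering the reachable depth roughly doubles for the same time budget.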

Move ordering can be implemented in a couple of ways:

  1. You have a domain expert who gives you suggestions about which moves are better. For example, in chess, promoting a pawn or capturing a high-value piece with a lower-value piece are on average good moves. In checkers it is better to capture more checkers in a move than fewer, and it is better to create a queen. So your move generation function returns the better moves first (a sketch of this follows the list).
  2. You get a heuristic of how good a move is by evaluating the position at a depth one level smaller (your shallow search / iterative deepening): you calculate the evaluation at depth n-1, sort the moves, and then evaluate at depth n.
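
A minimal sketch of the first approach, where isPromotion, isCapture, and capturedValue are hypothetical Move accessors and the priority weights are arbitrary - the actual features depend on the game:

    // Score each move with cheap static domain rules and search the best-scoring first.
    // Requires: import java.util.*;
    private static int staticPriority(Move move) {
        if (move.isPromotion()) return 1000;                     // promotions first
        if (move.isCapture()) return 100 + move.capturedValue(); // then captures, biggest victim first
        return 0;                                                // quiet moves last
    }

    private List<Move> orderByHeuristic(Collection<Move> children) {
        List<Move> ordered = new ArrayList<>(children);
        ordered.sort(Comparator.comparingInt((Move m) -> staticPriority(m)).reversed());
        return ordered;
    }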

The second approach you mentioned has nothing to do with move ordering. It has to do with the fact that the evaluation function can be expensive and many positions are evaluated many times. To bypass this, you can store the value of a position in a hash table once it has been calculated and reuse it later.
