Java: how to implement an efficient Alpha-Beta pruning game search tree?

Note: this page is based on a translation of a popular Stack Overflow question, provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must do so under the same license, cite the original URL and author information, and attribute it to the original authors (not me). Original Stack Overflow question: http://stackoverflow.com/questions/15626660/

Time: 2020-10-31 20:18:07  Source: igfitidea

How to implement efficient Alpha-Beta pruning Game Search Tree?

Tags: java, android, artificial-intelligence, game-theory, alpha-beta-pruning

Asked by chRyNaN

I'm trying to learn about artificial intelligence and how to implement it in a program. The easiest place to start is probably with simple games (in this case Tic-Tac-Toe) and Game Search Trees (recursive calls; not an actual data structure). I found this very useful video of a lecture on the topic.

The problem I'm having is that the first call to the algorithm takes an extremely long time (about 15 seconds) to execute. I've placed debugging log output throughout the code, and it seems to be calling parts of the algorithm an excessive number of times.

Here's the method for choosing the best move for the computer:

    public Best chooseMove(boolean side, int prevScore, int alpha, int beta){
        Best myBest = new Best();
        Best reply;

        if (prevScore == COMPUTER_WIN || prevScore == HUMAN_WIN || prevScore == DRAW){
            myBest.score = prevScore;
            return myBest;
        }

        if (side == COMPUTER){
            myBest.score = alpha;
        }else{
            myBest.score = beta;
        }
        Log.d(TAG, "Alpha: " + alpha + " Beta: " + beta + " prevScore: " + prevScore);
        Move[] moveList = myBest.move.getAllLegalMoves(board);
        for (Move m : moveList){
            String choice;
            if (side == HUMAN){
                choice = playerChoice;
            }else if (side == COMPUTER && playerChoice.equals("X")){
                choice = "O";
            }else{
                choice = "X";
            }
            Log.d(TAG, "Current Move: column- " + m.getColumn() + " row- " + m.getRow());
            int p = makeMove(m, choice, side);
            reply = chooseMove(!side, p, alpha, beta);
            undoMove(m);
            if ((side == COMPUTER) && (reply.score > myBest.score)){
                myBest.move = m;
                myBest.score = reply.score;
                alpha = reply.score;
            }else if((side == HUMAN) && (reply.score < myBest.score)){
                myBest.move = m;
                myBest.score = reply.score;
                beta = reply.score;
            }//end of if-else statement
            if (alpha >= beta) return myBest;
        }//end of for loop
        return myBest;
    }

The makeMove method makes the move if the spot is empty and returns a value (-1: human win, 0: draw, 1: computer win, -2 or 2: otherwise). I believe the error may be in the getAllLegalMoves method, though:

    public Move[] getAllLegalMoves(String[][] grid){
        //I'm unsure whether this method really belongs in this class or in the grid class, though, either way it shouldn't matter.
        items = 0;
        moveList = null;
        Move move = new Move();

        for (int i = 0; i < 3; i++){
            for(int j = 0; j < 3; j++){
                Log.d(TAG, "At Column: " + i + " At Row: " + j);
                if(grid[i][j] == null || grid[i][j].equals("")){
                    Log.d(TAG, "Is Empty");
                    items++;
                    if(moveList == null || moveList.length < items){
                        resize();
                    }//end of second if statement
                    move.setRow(j);
                    move.setColumn(i);
                    moveList[items - 1] = move;
                }//end of first if statement
            }//end of second loop
        }//end of first loop
        for (int k = 0; k < moveList.length; k++){
            Log.d(TAG, "Count: " + k + " Column: " + moveList[k].getColumn() + " Row: " + moveList[k].getRow());
        }
        return moveList;
    }

    private void resize(){
        Move[] b = new Move[items];
        for (int i = 0; i < items - 1; i++){
            b[i] = moveList[i];
        }
        moveList = b;
    }
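
A note on the listing above: `move` is created once, before the loops, and that same object is stored in every slot of `moveList`, so all entries end up holding the row and column of the last empty cell found. Below is a minimal sketch of one way to avoid this, allocating a new move per empty cell and collecting them in an `ArrayList` instead of resizing an array by hand. The `Move` class here is a hypothetical stand-in with only `row` and `column` fields, not the asker's actual class, and `LegalMovesSketch` is an illustrative name.

```java
import java.util.ArrayList;
import java.util.List;

class LegalMovesSketch {

    // Hypothetical stand-in for the question's Move class.
    static final class Move {
        final int row;
        final int column;

        Move(int row, int column) {
            this.row = row;
            this.column = column;
        }
    }

    // Returns one distinct Move object per empty cell of a 3x3 grid.
    // As in the question, grid is indexed as grid[column][row].
    static List<Move> getAllLegalMoves(String[][] grid) {
        List<Move> moves = new ArrayList<>();
        for (int column = 0; column < 3; column++) {
            for (int row = 0; row < 3; row++) {
                String cell = grid[column][row];
                if (cell == null || cell.isEmpty()) {
                    moves.add(new Move(row, column)); // a new object each time
                }
            }
        }
        return moves;
    }

    public static void main(String[] args) {
        String[][] grid = new String[3][3];
        grid[1][1] = "X";
        for (Move m : getAllLegalMoves(grid)) {
            System.out.println("column " + m.column + ", row " + m.row);
        }
    }
}
```

Because the list is a local variable rather than a shared field, recursive calls cannot overwrite the move list of their caller.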

To sum it all up: what is causing my call to choose the best move to take so long? What am I missing? Is there an easier way to implement this algorithm? Any help or suggestions will be greatly appreciated, thanks!

Answer by Patashu

A minimax tree with alpha beta pruning should be visualized as a tree, each node of the tree being a possible move that many turns into the future, and its children being all the moves that can be taken from it.

To be as fast as possible and guarantee you'll only need space linear in the number of moves you're looking ahead, you do a depth-first search and 'sweep' from one side to the other. That is, if you imagine the whole tree being constructed, your program would actually only construct a single strand from leaf to root at a time, and discard any parts of it that it is done with.

I'm just going to copy the Wikipedia pseudocode at this point because it's really, really succinct and clear:

function alphabeta(node, depth, α, β, Player)         
    if  depth = 0 or node is a terminal node
        return score
    if  Player = MaxPlayer
        for each child of node
            α := max(α, alphabeta(child, depth-1, α, β, not(Player) ))     
            if β ≤ α
                break                             (* Beta cut-off *)
        return α
    else
        for each child of node
            β := min(β, alphabeta(child, depth-1, α, β, not(Player) ))     
            if β ≤ α
                break                             (* Alpha cut-off *)
        return β

Notes:

- 'for each child of node': Rather than editing the state of the current board, create an entirely new board that is the result of applying the move. By using immutable objects, your code will be less prone to bugs and easier to reason about in general.

- To use this method, call it for every possible move you can make from the current state, passing depth - 1, -Infinity for alpha and +Infinity for beta. Each of these calls should start with the other player (the one who did not just move) to play. The move whose call returns the highest value is the best one to take.

Conceptually, it's very simple. If you code it right, you never instantiate more than (depth) boards at once, you never consider pointless branches, and so on.
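
The pseudocode and notes above can be sketched in Java roughly as follows. This is only an illustration, not the answerer's code: the class name `AlphaBetaSketch`, the 9-cell `int[]` board (1 for the maximizing player, -1 for the minimizing player, 0 for empty) and the helpers `withMove` and `bestMove` are assumptions made for the example. When the depth limit is reached on an unfinished board, the sketch simply scores it as 0.

```java
class AlphaBetaSketch {

    // The eight winning lines of a 3x3 board stored as cells 0..8.
    static final int[][] LINES = {
        {0, 1, 2}, {3, 4, 5}, {6, 7, 8},
        {0, 3, 6}, {1, 4, 7}, {2, 5, 8},
        {0, 4, 8}, {2, 4, 6}
    };

    // 1 if the maximizing player owns a line, -1 for the minimizing player, else 0.
    static int winner(int[] board) {
        for (int[] line : LINES) {
            int v = board[line[0]];
            if (v != 0 && v == board[line[1]] && v == board[line[2]]) {
                return v;
            }
        }
        return 0;
    }

    static boolean isFull(int[] board) {
        for (int cell : board) {
            if (cell == 0) {
                return false;
            }
        }
        return true;
    }

    // Returns a new board with the move applied; the argument is left untouched.
    static int[] withMove(int[] board, int cell, int player) {
        int[] next = board.clone();
        next[cell] = player;
        return next;
    }

    static int alphabeta(int[] board, int depth, int alpha, int beta, boolean maxPlayer) {
        int w = winner(board);
        if (depth == 0 || w != 0 || isFull(board)) {
            return w; // assumption: an unfinished board at depth 0 counts as 0
        }
        for (int cell = 0; cell < 9; cell++) {
            if (board[cell] != 0) {
                continue;
            }
            if (maxPlayer) {
                alpha = Math.max(alpha,
                        alphabeta(withMove(board, cell, 1), depth - 1, alpha, beta, false));
            } else {
                beta = Math.min(beta,
                        alphabeta(withMove(board, cell, -1), depth - 1, alpha, beta, true));
            }
            if (beta <= alpha) {
                break; // cut-off: the other side already has a better option elsewhere
            }
        }
        return maxPlayer ? alpha : beta;
    }

    // Tries every move for the maximizing player with a full window and
    // returns the cell with the highest value, or -1 if the board is full.
    static int bestMove(int[] board, int depth) {
        int best = -1;
        int bestScore = Integer.MIN_VALUE;
        for (int cell = 0; cell < 9; cell++) {
            if (board[cell] != 0) {
                continue;
            }
            int score = alphabeta(withMove(board, cell, 1), depth - 1,
                    Integer.MIN_VALUE, Integer.MAX_VALUE, false);
            if (score > bestScore) {
                bestScore = score;
                best = cell;
            }
        }
        return best;
    }

    public static void main(String[] args) {
        System.out.println("Best opening cell: " + bestMove(new int[9], 9));
    }
}
```

Since every child board is a fresh copy, at most one board per level of the recursion is alive at any time, which matches the "no more than (depth) boards" remark.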

Answer by meriton

I am not going to profile your code for you, but since this is such a nice coding kata, I wrote a small AI for tic-tac-toe:

import java.math.BigDecimal;

public class Board {

    /**
     * -1: opponent
     * 0: empty
     * 1: player
     */
    int[][] cells = new int[3][3];

    /**
     * the best move calculated by eval(), or -1 if no more moves are possible
     */
    int bestX, bestY;

    int winner() {
        // row
        for (int y = 0; y < 3; y++) {
            if (cells[0][y] == cells[1][y] && cells[1][y] == cells[2][y]) {
                if (cells[0][y] != 0) {
                    return cells[0][y];
                }
            }
        }

        // column
        for (int x = 0; x < 3; x++) {
            if (cells[x][0] == cells[x][1] && cells[x][1] == cells[x][2]) {
                if (cells[x][0] != 0) {
                    return cells[x][0];
                }
            }
        }

        // 1st diagonal
        if (cells[0][0] == cells[1][1] && cells[1][1] == cells[2][2]) {
            if (cells[0][0] != 0) {
                return cells[0][0];
            }
        }

        // 2nd diagonal
        if (cells[2][0] == cells[1][1] && cells[1][1] == cells[0][2]) {
            if (cells[2][0] != 0) {
                return cells[2][0];
            }
        }

        return 0; // nobody has won
    }

    /**
     * @return 1 if side wins, 0 for a draw, -1 if opponent wins
     */
    int eval(int side) {
        int winner = winner();
        if (winner != 0) {
            return side * winner;
        } else {
            int bestX = -1;
            int bestY = -1;
            int bestValue = Integer.MIN_VALUE;
        loop:
            for (int y = 0; y < 3; y++) {
                for (int x = 0; x < 3; x++) {
                    if (cells[x][y] == 0) {
                        cells[x][y] = side;
                        int value = -eval(-side);
                        cells[x][y] = 0;

                        if (value > bestValue) {
                            bestValue = value;
                            bestX = x;
                            bestY = y;
                            if (bestValue == 1) {
                                // it won't get any better, we might as well stop thinking
                                break loop;
                            }
                        }
                    }
                }
            }
            this.bestX = bestX;
            this.bestY = bestY;
            if (bestValue == Integer.MIN_VALUE) {
                // there were no moves left, it must be a draw!
                return 0;
            } else {
                return bestValue;
            }
        }
    }

    void move(int side) {
        eval(side);
        if (bestX == -1) {
            return;
        }
        cells[bestX][bestY] = side;
        System.out.println(this);

        int w = winner();
        if (w != 0) {
            System.out.println("Game over!");
        } else {
            move(-side);
        }
    }

    @Override
    public String toString() {
        StringBuilder sb = new StringBuilder();
        char[] c = {'O', ' ', 'X'};
        for (int y = 0; y < 3; y++) {
            for (int x = 0; x < 3; x++) {
                sb.append(c[cells[x][y] + 1]);
            }
            sb.append('\n');
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        long start = System.nanoTime();
        Board b = new Board();
        b.move(1);
        long end = System.nanoTime();
        System.out.println(new BigDecimal(end - start).movePointLeft(9));
    }
}

The astute reader will have noticed I don't use alpha/beta cut-off. Still, on my somewhat dated notebook, this plays through a game in 0.015 seconds ...
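
For comparison, here is a minimal sketch of how an alpha-beta window could be added to a negamax search shaped like `eval` above. It is not part of the answer's code: the class `NegamaxAlphaBeta` and the method `search` are hypothetical names, and the sketch returns only the value of the position, not the best move.

```java
class NegamaxAlphaBeta {

    // Same encoding as Board above: -1 opponent, 0 empty, 1 player.
    final int[][] cells = new int[3][3];

    int winner() {
        for (int i = 0; i < 3; i++) {
            // line with fixed first index i
            if (cells[i][0] != 0 && cells[i][0] == cells[i][1] && cells[i][1] == cells[i][2]) {
                return cells[i][0];
            }
            // line with fixed second index i
            if (cells[0][i] != 0 && cells[0][i] == cells[1][i] && cells[1][i] == cells[2][i]) {
                return cells[0][i];
            }
        }
        int c = cells[1][1];
        boolean diag = cells[0][0] == c && c == cells[2][2];
        boolean anti = cells[2][0] == c && c == cells[0][2];
        return (c != 0 && (diag || anti)) ? c : 0;
    }

    /**
     * Value of the position for side: 1 win, 0 draw, -1 loss.
     * Branches that cannot change the result inside (alpha, beta) are skipped.
     */
    int search(int side, int alpha, int beta) {
        int winner = winner();
        if (winner != 0) {
            return side * winner;
        }
        int best = Integer.MIN_VALUE;
        for (int y = 0; y < 3; y++) {
            for (int x = 0; x < 3; x++) {
                if (cells[x][y] != 0) {
                    continue;
                }
                cells[x][y] = side;
                // The opponent sees the window negated and swapped.
                int value = -search(-side, -beta, -alpha);
                cells[x][y] = 0;

                best = Math.max(best, value);
                alpha = Math.max(alpha, best);
                if (alpha >= beta) {
                    return best; // cut-off: the opponent will avoid this line anyway
                }
            }
        }
        // No empty cell and no winner means a draw.
        return best == Integer.MIN_VALUE ? 0 : best;
    }

    public static void main(String[] args) {
        NegamaxAlphaBeta board = new NegamaxAlphaBeta();
        // Any bounds strictly outside the score range -1..1 work as the initial window.
        System.out.println("Value for the first player: " + board.search(1, -2, 2));
    }
}
```

The `break loop` in `eval` above is the special case of this cut-off where the side to move has already found a win.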

Not having profiled your code, I can't say for certain what the problem is. However, logging each possible move at every node in the search tree might have something to do with it.
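
One way to see how much work the search does without paying for a log call at every node is to count visited positions and print a single summary at the end. A minimal sketch, assuming a plain negamax over a 9-cell `int[]` board; `NodeCountSketch` and its `nodes` counter are hypothetical names, not code from the question or the answers.

```java
class NodeCountSketch {

    static long nodes = 0; // incremented once per visited position

    static final int[][] LINES = {
        {0, 1, 2}, {3, 4, 5}, {6, 7, 8},
        {0, 3, 6}, {1, 4, 7}, {2, 5, 8},
        {0, 4, 8}, {2, 4, 6}
    };

    // 1 or -1 if that side owns a line, else 0.
    static int winner(int[] board) {
        for (int[] line : LINES) {
            int v = board[line[0]];
            if (v != 0 && v == board[line[1]] && v == board[line[2]]) {
                return v;
            }
        }
        return 0;
    }

    // Plain negamax with no pruning and no logging inside the recursion.
    static int eval(int[] board, int side) {
        nodes++;
        int w = winner(board);
        if (w != 0) {
            return side * w;
        }
        int best = Integer.MIN_VALUE;
        for (int cell = 0; cell < 9; cell++) {
            if (board[cell] != 0) {
                continue;
            }
            board[cell] = side;
            best = Math.max(best, -eval(board, -side));
            board[cell] = 0;
        }
        return best == Integer.MIN_VALUE ? 0 : best; // full board without a winner: draw
    }

    public static void main(String[] args) {
        long start = System.nanoTime();
        int value = eval(new int[9], 1);
        long micros = (System.nanoTime() - start) / 1_000;
        // One summary line instead of one log call per node.
        System.out.println("value=" + value + " nodes=" + nodes + " time=" + micros + "us");
    }
}
```

Comparing the count and the timing with and without the per-node log statements would show whether logging is where the time goes.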
