java MiniMax 实现

声明:本页面是StackOverFlow热门问题的中英对照翻译,遵循CC BY-SA 4.0协议,如果您需要使用它,必须同样遵循CC BY-SA许可,注明原文地址和作者信息,同时你必须将它归于原作者(不是我):StackOverFlow 原文地址: http://stackoverflow.com/questions/16221686/
Warning: these are provided under cc-by-sa 4.0 license. You are free to use/share it, But you must attribute it to the original authors (not me): StackOverFlow

提示:将鼠标放在中文语句上可以显示对应的英文。显示中英文
时间:2020-10-31 22:16:38  来源:igfitidea点击:

MiniMax Implementation

javaminimax

提问by MrD

I am trying to write a small AI algorithm in Java implementing the miniMax algorithm.

我正在尝试用 Java 编写一个小型 AI 算法来实现 miniMax 算法。

The game upon which this is based is a two-player game where both players make one move per turn, and each board position resulting in each player having a score. The "quality" of a position for player X is evaluated by subtracting the opponent's score from player X's score for that position. Each move is represented by an integer (i.e. Move one is made by inputting 1, move two by inputting 2 etc)

以此为基础的游戏是一个两人游戏,其中两个玩家每回合移动一次,每个棋盘位置都会导致每个玩家都有一个分数。球员 X 位置的“质量”是通过从球员 X 在该位置的得分中减去对手的得分来评估的。每个移动都用一个整数表示(即输入 1 移动一,输入 2 移动二等)

I understand that miniMax should be implemented using recursion. At the moment I have:

我知道应该使用递归来实现 miniMax。目前我有:

An evaluate()method, which takes as parameters an object representing the board state (Ie "BoardState" object and a boolean called "max" (the signature would be evaluate(BoardState myBoard, boolean max)).

一个evaluate()方法,它将一个代表棋盘状态的对象作为参数(即“BoardState”对象和一个名为“max”的布尔值(签名将是evaluate(BoardState myBoard, boolean max))。

max is true when it is player X's turn. Given a board position, it will evaluate all possible moves and return that which is most beneficial for player X. If it is the opponent's turn, max will be false and the method will return the move which is LEAST beneficial for player X (ie: most beneficial for player y)

当轮到玩家 X 时 max 为真。给定一个棋盘位置,它将评估所有可能的移动并返回对玩家 X 最有利的移动。如果轮到对手,max 将为假,并且该方法将返回对玩家 X 最不利的移动(即:对玩家 y 最有利)

However, I am having difficulties writing the actual miniMaxmethod. My general structure would be something like:

但是,我在编写实际miniMax方法时遇到了困难。我的一般结构是这样的:

public int miniMax(GameState myGameState, int depth)

Whereby I submit the initial gameState and the "depth" I want it to look into.

因此,我提交了初始游戏状态和我希望它查看的“深度”。

I would then be having something like:

然后我会有类似的东西:

int finalMove = 0;
while(currentDepth < depth) {
    GameState tmp = myGameState.copy();
    int finalMove = evaluate(tmp, true or false);
    iniMax(tmp.makeMove(finalMove));

}
return finalMove;

Would this sound like a plausible implementation? Any suggestions? :)

这听起来像是一个合理的实现吗?有什么建议?:)

Thanks!

谢谢!

回答by bysreg

that wont work.

那行不通。

details :

细节 :

  1. it will cause infinite loop. currentdepth never gets incremented
  2. your definition of evaluation seems to be different than the majority. Normally evaluation function will return the predicted value of the game state. Isnt your definition of evaluate function is just the same as what the minimax function do ?
  3. is miniMax and MiniMax different? because if you meant recursion then you need to pass depth-1 when calling the next miniMax
  1. 它会导致无限循环。currentdepth 永远不会增加
  2. 您对评估的定义似乎与大多数不同。通常评估函数会返回游戏状态的预测值。您对评估函数的定义与 minimax 函数的定义不一样吗?
  3. miniMax 和 MiniMax 有什么不同?因为如果你的意思是递归,那么你需要在调用下一个 miniMax 时传递 depth-1

the idea of minimax is depth first search. and onlyevaluate leaf nodes(nodes with maximum depth or nodes that is a win or tie) and pick one that is max if the current player is the maximizing one and pick min if the current player is the minimizing one.

minimax 的思想是深度优先搜索。并且评估叶节点(具有最大深度的节点或获胜或平局的节点),如果当前玩家是最大化的玩家,则选择最大的节点,如果当前玩家是最小的玩家,则选择最小的节点。

this is how i implemented it :

这就是我实现它的方式:

function miniMax(node, depth)
    if(depth == 0) then --leaf node     
        local ret = evaluate(node.state)
        return ret
    else -- winning node
        local winner = whoWin(node.state)       
        if(winner == 1) then -- P1      
            return math.huge
        elseif(winner == -1) then -- P2         
            return math.huge*-1
        end 
    end

    local num_of_moves = getNumberOfMoves(node.state)   
    local v_t = nil
    local best_move_index = nil 
    if(getTurn(node.state) == 1) then -- maximizing player      
    local v = -math.huge 
        for i=0, num_of_moves-1 do                      
            local child_node = simulate(node.state, i)) -- simulate move number i
            v_t = miniMax(child_node, depth-1)
            if(v_t > v) then 
                v = v_t 
                best_move_index = i             
            end
        end             
        if(best_move_index == nil) then best_move_index = random(0, num_of_moves-1) end 
        return v, best_move_index
    else -- minimizing player
    local v = math.huge
        for i=0, num_of_moves-1 do          
            local child_node = simulate(node.state, i)
            v_t = miniMax(child_node, depth-1)
            if(v_t < v) then                
                v = v_t             
                best_move_index = i             
            end
        end
        if(best_move_index == nil) then best_move_index = random(0, num_of_moves-1) end
        return v, best_move_index                               
    end 
end

Note:

笔记:

  • return v, best_move_index means returning two values of v and best_move_index(above code is in lua and lua can return multiple values)

  • evaluate function returns the same score for both players(ie game state A in point of view P1 is scored 23, and in point of view P2 is also scored 23)

  • this algo will only work if the two player run alternately(no player can run two moves consecutively), you can trick this restriction by giving the opponent one move, that is move PASS(skip his/her turn) if the other player need to move twice.

  • this minimax can be further optimized(sorted from the easiest one) :

    1. alpha-beta pruning
    2. iterative deepening
    3. move ordering
  • return v, best_move_index 表示返回 v 和 best_move_index 两个值(以上代码在lua中,lua可以返回多个值)

  • 评估函数为两个玩家返回相同的分数(即从游戏状态 A 的角度来看 P1 的得分为 23,从 P2 的角度来看也得分为 23)

  • 此算法仅在两个玩家交替运行时才有效(没有玩家可以连续运行两个动作),如果其他玩家需要,您可以通过给对手一个动作来欺骗此限制,即移动 PASS(跳过他/她的回合)移动两次。

  • 这个极小极大可以进一步优化(从最简单的排序):

    1. alpha-beta 剪枝
    2. 迭代深化
    3. 移动排序

回答by HennyH

I made an implementation of minimax in lua. I hope it helps give you an idea of how to tackle the algorithm form a Java perspective, the code should be quite similar mind you. It is designed for a game of tic-tac-toe.

我在 lua 中实现了 minimax。我希望它可以帮助您了解如何从 Java 的角度处理算法,请注意,代码应该非常相似。它专为井字游戏而设计。

--caller is the player who is using the minimax function
--initial state is the board from which the player must make a move
local function minimax(caller,inital_state)
    local bestState = {}; --we use this to store the game state the the player will create
      --this recurse function is really the 'minimax' algorithim 
    local function recurse(state,depth)
        --childPlayer is the person who will have their turn in the current state's children
        local ChildPlayer = getTurn(state)
        --parentPlayer is the person who is looking at their children
        local ParentPlayer = getPreviousTurn(state)
        --this represents the worst case scenario for a player
        local alpha =  - (ChildPlayer == caller and 1 or -1 );
        --we check for terminal conditions (leaf nodes) and return the appropriate objective value
        if win(state) then
            --return +,- inf depending on who called the 'minimax'
            return ParentPlayer == caller and 1 or -1;
        elseif tie(state) then 
            --if it's a tie then the value is 0 (neither win or loss)
            return  0;
        else
            --this will return a list of child states FROM the current state
            children = getChildrenStates(state,ChildPlayer)
            --enumerate over each child
            for _,child in ipairs(children) do
                --find out the child's objective value
                beta = recurse(child,depth+1);
                if ChildPlayer == caller then 
                    --We want to maximize
                    if beta >= alpha then 
                        alpha = beta
                        --if the depth is 0 then we can add the child state as the bestState (this will because the caller of minimax will always want to choose the GREATEST value on the root node)
                        if depth == 0 then
                            bestState = child;
                        end 
                    end 
                --we want to MINIMIZE
                elseif beta <= alpha then 
                    alpha = beta;
                end 
            end 
        end
         --return a non-terminal nodes value (propagates values up the tree)
        return alpha;           
    end
     --start the 'minimax' function by calling recurse on the initial state
    recurse(inital_state,0);
     --return the best move
    return bestState;
end