Java: How should I design a good evaluation function for Connect 4?
Note: this page reproduces a popular StackOverflow question under the CC BY-SA 4.0 license. You are free to use and share it, but you must attribute it to the original authors (not me): StackOverflow
Original question: http://stackoverflow.com/questions/10985000/
How should I design a good evaluation function for Connect 4?
Asked by dragonmnl
I have a Java implementation of the game "Connect 4" (with a variable number of columns and rows).
This implementation uses (according to the user's choice) either the Minimax algorithm or Minimax with alpha-beta pruning, with a maximum search depth of maxDepth.
My problem now is the design of a good evaluation function for the state of the board (this is the value returned at maxDepth).
The value is between -100 (worst choice; it corresponds to a losing situation) and 100 (best choice; it corresponds to a winning situation), where 0 is supposed to be a "draw" situation.
So far I have implemented two functions (I give pseudo-code because the real code is very long):
1)
- no win / no loss
--> if the board is full ==> draw (0)
--> if the board isn't full ==> unsure situation (50)
- win
--> if I win: 100
--> if the opponent wins: -100
2)
For me:
- InARow[0] = maximum number of my pieces in a row HORIZONTALLY
- InARow[1] = maximum number of my pieces in a row VERTICALLY
- InARow[2] = maximum number of my pieces in a row on an ascending DIAGONAL
- InARow[3] = maximum number of my pieces in a row on a descending DIAGONAL
For the opponent:
- InARow2[0] = maximum number of the opponent's pieces in a row HORIZONTALLY
- InARow2[1] = maximum number of the opponent's pieces in a row VERTICALLY
- InARow2[2] = maximum number of the opponent's pieces in a row on an ascending DIAGONAL
- InARow2[3] = maximum number of the opponent's pieces in a row on a descending DIAGONAL
value = (100 * (InARow[0] + InARow[1] + InARow[2] + InARow[3])) / 16 - (100 * (InARow2[0] + InARow2[1] + InARow2[2] + InARow2[3])) / 16
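For reference, function 2 could look roughly like the Java sketch below. The asker's real code is not shown, so everything here is an assumption made for illustration: the class name `InARowEvaluator`, the `char[][]` board with `'.'` for an empty cell, row 0 being the top row, and the cap of 4 on each run length (added so the result stays within -100..100).

```java
final class InARowEvaluator {
    // Directions as {rowStep, colStep}: horizontal, vertical,
    // ascending diagonal, descending diagonal.
    // Assumption: board[0] is the TOP row, so "ascending" means the
    // row index decreases while the column index increases.
    private static final int[][] DIRS = {{0, 1}, {1, 0}, {-1, 1}, {1, 1}};

    /** Longest run of `player` pieces in each direction, capped at 4. */
    static int[] longestRuns(char[][] board, char player) {
        int rows = board.length, cols = board[0].length;
        int[] best = new int[DIRS.length];
        for (int d = 0; d < DIRS.length; d++) {
            int dr = DIRS[d][0], dc = DIRS[d][1];
            for (int r = 0; r < rows; r++) {
                for (int c = 0; c < cols; c++) {
                    // Walk from (r, c) while we stay on the board and on our pieces.
                    int len = 0, rr = r, cc = c;
                    while (rr >= 0 && rr < rows && cc >= 0 && cc < cols
                            && board[rr][cc] == player) {
                        len++;
                        rr += dr;
                        cc += dc;
                    }
                    best[d] = Math.max(best[d], Math.min(len, 4));
                }
            }
        }
        return best;
    }

    /** value = 100 * sum(mine) / 16 - 100 * sum(theirs) / 16, as in function 2. */
    static int evaluate(char[][] board, char me, char opponent) {
        int mine = 0, theirs = 0;
        for (int v : longestRuns(board, me)) mine += v;
        for (int v : longestRuns(board, opponent)) theirs += v;
        return (100 * mine) / 16 - (100 * theirs) / 16;
    }
}
```

Calling `InARowEvaluator.evaluate(board, 'R', 'Y')` scores the position from red's point of view; swapping the last two arguments gives yellow's view.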
I need to design a third (and, if possible, better) function. Any suggestions?
Thank you in advance.
Accepted answer by Geoffrey De Smet
Just count the number of possible four-in-a-rows that each player can still make, and subtract one count from the other.
For example, both players start with a score of 7*4 (horizontal) + 4*7 (vertical) + 4*4 (diagonal up) + 4*4 (diagonal down) = 88 (these counts correspond to a 7x7 board). If red puts a piece in the bottom-left corner, then yellow loses 1 + 1 + 1 + 0 = 3 points. But if red puts a piece in the middle of the bottom row instead, yellow loses 4 + 1 + 1 + 1 = 7 points.
Of course, if any player wins, then the score of the other player is -infinity, regardless of the system above.
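The counting idea in this answer can be sketched in Java as follows. This is only an illustration under stated assumptions, not the answerer's code: the class name `OpenLineEvaluator`, the `char[][]` board with `'.'` for empty cells, and row 0 as the top row are all hypothetical. Win detection (the -infinity case) is left out and would have to be checked separately before calling `evaluate`.

```java
final class OpenLineEvaluator {
    // Directions as {rowStep, colStep}: horizontal, vertical,
    // diagonal up, diagonal down (assuming board[0] is the top row).
    private static final int[][] DIRS = {{0, 1}, {1, 0}, {-1, 1}, {1, 1}};

    /**
     * Counts the 4-cell lines that contain no piece of `blocker`,
     * i.e. the lines the other player can still complete.
     */
    static int openLines(char[][] board, char blocker) {
        int rows = board.length, cols = board[0].length, count = 0;
        for (int[] d : DIRS) {
            for (int r = 0; r < rows; r++) {
                for (int c = 0; c < cols; c++) {
                    // Skip start cells whose line would leave the board.
                    int endR = r + 3 * d[0], endC = c + 3 * d[1];
                    if (endR < 0 || endR >= rows || endC < 0 || endC >= cols) continue;
                    boolean open = true;
                    for (int k = 0; k < 4 && open; k++) {
                        open = board[r + k * d[0]][c + k * d[1]] != blocker;
                    }
                    if (open) count++;
                }
            }
        }
        return count;
    }

    /** My remaining lines minus the opponent's remaining lines. */
    static int evaluate(char[][] board, char me, char opponent) {
        return openLines(board, opponent) - openLines(board, me);
    }
}
```

On an empty 7x7 board `openLines` returns 88 for either colour, matching the starting score in the example; after red plays the bottom-left corner, the count of lines free of red pieces drops by 3.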
Answer by jeff
You have the base cases ironed out: my win = 100 points, my loss = -100, tie = 0. You can drop the "unsure" case; it does not reflect the "goodness" of the board. So now you need to fill in the gaps. Cases you want to consider and assign values to:
- I have X in a row (if I have 3 in a row, that's better than only 2 in a row; your function should favor adding to longer rows over shorter ones)
- My opponent has X in a row (likewise, the more he/she has in a row, the worse off we are)
- Count how many rows you are filling in (placing a piece and forming 2 rows of 3 is better than placing a piece and only forming one row of 3)
- Count how many rows you are blocking (similarly, if you can drop a piece and block two of the opponent's rows of 3, that's better than blocking a single row of 2)
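One possible way to turn the cases above into a single number is to score every 4-cell window by how many pieces one player has in it, ignoring windows that contain both colours (those are blocked). The Java sketch below is an assumption-laden illustration, not the answerer's method: the class name `WindowScoreEvaluator`, the `char[][]` board with `'.'` for empty cells, and the `WEIGHT` values are all made up and would need tuning.

```java
final class WindowScoreEvaluator {
    // Directions as {rowStep, colStep}: horizontal, vertical, two diagonals.
    private static final int[][] DIRS = {{0, 1}, {1, 0}, {-1, 1}, {1, 1}};
    // Hypothetical weights, indexed by the number of pieces one player
    // has in an unblocked window: longer partial rows count for more.
    private static final int[] WEIGHT = {0, 1, 3, 9};

    static int evaluate(char[][] board, char me, char opponent) {
        int rows = board.length, cols = board[0].length, score = 0;
        for (int[] d : DIRS) {
            for (int r = 0; r < rows; r++) {
                for (int c = 0; c < cols; c++) {
                    int endR = r + 3 * d[0], endC = c + 3 * d[1];
                    if (endR < 0 || endR >= rows || endC < 0 || endC >= cols) continue;
                    int mine = 0, theirs = 0;
                    for (int k = 0; k < 4; k++) {
                        char cell = board[r + k * d[0]][c + k * d[1]];
                        if (cell == me) mine++;
                        else if (cell == opponent) theirs++;
                    }
                    if (mine == 4) return 100;    // base case: my win
                    if (theirs == 4) return -100; // base case: my loss
                    if (theirs == 0) score += WEIGHT[mine];        // my open window
                    else if (mine == 0) score -= WEIGHT[theirs];   // opponent's open window
                    // Windows holding both colours are blocked and score nothing.
                }
            }
        }
        // Keep non-terminal scores strictly inside the win/loss values.
        return Math.max(-99, Math.min(99, score));
    }
}
```

Because a blocked window contributes nothing, a move that blocks two of the opponent's rows of 3 removes two negative terms at once, which is how the "blocking" case shows up in the score.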
Answer by user3771690
Here are two separate evaluation functions for Connect 4:
- One basic evaluation function is the one suggested in another answer: calculate the number of possible four-in-a-rows the player can still make and subtract the opponent's count from it. You can give different weights or importance to blocks that already have three tiles compared to blocks that have only one tile.
- Another, stronger evaluation function can be built using the concept of threats. A threat is a square that connects 4 when a tile is dropped there by the opponent. You can simply return the difference in the number of threats held by each player, but we can do much better by filtering out useless threats (such as a threat directly above an opponent's threat, or all threats above a square that is a threat for both players) and even by assigning a bonus to some threats (such as the lowermost threat in a column, or 2 consecutive threats by the same player).
The weights can be tuned by hand or, for a larger project, learned automatically.
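A minimal Java sketch of the plain threat difference is shown below. It deliberately leaves out the filtering of useless threats and the bonuses described above, and it is not the answerer's code: the class name `ThreatEvaluator`, the `char[][]` board with `'.'` for empty cells, and the choice to count every empty square (whether or not it is currently playable) are assumptions made for illustration.

```java
final class ThreatEvaluator {
    static final char EMPTY = '.';
    // One step per line direction as {rowStep, colStep}; each is also
    // walked in reverse, so four entries cover all eight neighbours.
    private static final int[][] DIRS = {{0, 1}, {1, 0}, {-1, 1}, {1, 1}};

    /** Number of consecutive `player` pieces starting next to (r, c) in one direction. */
    private static int countDir(char[][] board, int r, int c, int dr, int dc, char player) {
        int n = 0;
        r += dr;
        c += dc;
        while (r >= 0 && r < board.length && c >= 0 && c < board[0].length
                && board[r][c] == player) {
            n++;
            r += dr;
            c += dc;
        }
        return n;
    }

    /** True if a `player` piece on the empty square (r, c) would connect four. */
    static boolean completesFour(char[][] board, int r, int c, char player) {
        for (int[] d : DIRS) {
            int run = 1 + countDir(board, r, c, d[0], d[1], player)
                        + countDir(board, r, c, -d[0], -d[1], player);
            if (run >= 4) return true;
        }
        return false;
    }

    /** Counts empty squares where `player` would connect four. */
    static int countThreats(char[][] board, char player) {
        int threats = 0;
        for (int r = 0; r < board.length; r++) {
            for (int c = 0; c < board[0].length; c++) {
                if (board[r][c] == EMPTY && completesFour(board, r, c, player)) threats++;
            }
        }
        return threats;
    }

    /** Unfiltered difference in the number of threats. */
    static int evaluate(char[][] board, char me, char opponent) {
        return countThreats(board, me) - countThreats(board, opponent);
    }
}
```

The filters and bonuses from the answer could be layered on top of `countThreats`, for example by scanning each column from the bottom and discarding squares above a shared threat.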