C++ 决策树实现问题:在代码中思考
声明:本页面是StackOverFlow热门问题的中英对照翻译,遵循CC BY-SA 4.0协议,如果您需要使用它,必须同样遵循CC BY-SA许可,注明原文地址和作者信息,同时你必须将它归于原作者(不是我):StackOverFlow
原文地址: http://stackoverflow.com/questions/5646120/
Warning: these are provided under cc-by-sa 4.0 license. You are free to use/share it, But you must attribute it to the original authors (not me):
StackOverFlow
C++ Decision Tree Implementation Question: Think In Code
提问by CodingImagination
I've been coding for a few years but I still haven't gotten the hang of pseudo-coding or actually thinking things out in code yet. Due to this problem, I'm having trouble figuring out exactly what to do in creating a learning Decision Tree.
我已经编码了几年,但我仍然没有掌握伪编码的窍门,也没有真正在代码中思考问题。由于这个问题,我很难弄清楚在创建学习决策树时要做什么。
Here are a few sites that I have looked at and trust me there were plenty more
以下是我看过的一些网站,相信我还有更多
Along with several books such as Ian Millington's AI for Games which includes a decent run-down of the different learning algorithms used in decision trees and Behavioral Mathematics for Game Programming which is basically all about Decision Trees and theory. I understand the concepts for a decision tree along with Entropy, ID3 and a little on how to intertwine a genetic algorithm and have a decision tree decide the nodes for the GA. They give good insight, but not what I am looking for really.
连同几本书,例如 Ian Millington 的 AI for Games,其中包括对决策树中使用的不同学习算法和游戏编程行为数学的详细介绍,这基本上都是关于决策树和理论的。我了解决策树的概念以及熵、ID3 以及如何交织遗传算法并让决策树决定 GA 的节点的一些概念。他们提供了很好的洞察力,但不是我真正想要的。
I do have some basic code that creates the nodes for the decision tree, and I believe I know how to implement actual logic but it's no use if I don't have a purpose to the program or have entropy or a learning algorithm involved.
我确实有一些为决策树创建节点的基本代码,并且我相信我知道如何实现实际逻辑,但是如果我对程序没有目的或者没有熵或涉及学习算法,那么它就没有用。
What I am asking is, can someone help me figure out what I need to do to create this learning decision tree. I have my nodes in a class of their own flowing through functions to create the tree, but how would I put entropy into this and should it have a class, a struct I'm not sure how to put it together. Pseudo-code and an Idea of where I am going with all this theory and numbers. I can put the code together if only I knew what I needed to code. Any guidance would be appreciated.
我想问的是,有人能帮我弄清楚我需要做什么来创建这个学习决策树。我有我的节点在他们自己的类中通过函数来创建树,但是我将如何将熵放入其中,如果它有一个类,一个结构我不知道如何将它放在一起。伪代码和我将如何处理所有这些理论和数字的想法。只要我知道我需要编码什么,我就可以把代码放在一起。任何指导将不胜感激。
How Would I Go About This, Basically.
基本上,我将如何处理这个问题。
Adding a learning algorithm such as ID3 and Entropy. How should it be set up?
添加学习算法,如 ID3 和 Entropy。应该如何设置?
Once I do have this figured out on how to go about all this,I plan to implement this into a state machine that goes through different states in a game/simulation format. All of this is already set up, I just figure this could be stand-alone and once I figure it out I can just move it to the other project.
一旦我弄清楚如何解决所有这些问题,我计划将其实现到一个状态机中,该状态机以游戏/模拟格式经历不同的状态。所有这些都已经设置好了,我只是认为这可以是独立的,一旦我弄清楚了,我就可以将它移到另一个项目中。
Here is the source code I have for now.
这是我现在拥有的源代码。
Thanks in advance!
提前致谢!
Main.cpp:
主.cpp:
int main()
{
//create the new decision tree object
DecisionTree* NewTree = new DecisionTree();
//add root node the very first 'Question' or decision to be made
//is monster health greater than player health?
NewTree->CreateRootNode(1);
//add nodes depending on decisions
//2nd decision to be made
//is monster strength greater than player strength?
NewTree->AddNode1(1, 2);
//3rd decision
//is the monster closer than home base?
NewTree->AddNode2(1, 3);
//depending on the weights of all three decisions, will return certain node result
//results!
//Run, Attack,
NewTree->AddNode1(2, 4);
NewTree->AddNode2(2, 5);
NewTree->AddNode1(3, 6);
NewTree->AddNode2(3, 7);
//Others: Run to Base ++ Strength, Surrender Monster/Player,
//needs to be made recursive, that way when strength++ it affects decisions second time around DT
//display information after creating all the nodes
//display the entire tree, i want to make it look like the actual diagram!
NewTree->Output();
//ask/answer question decision making process
NewTree->Query();
cout << "Decision Made. Press Any Key To Quit." << endl;
//pause quit, oh wait how did you do that again...look it up and put here
//release memory!
delete NewTree;
//return random value
//return 1;
}
Decision Tree.h:
决策树.h:
//the decision tree class
class DecisionTree
{
public:
//functions
void RemoveNode(TreeNodes* node);
void DisplayTree(TreeNodes* CurrentNode);
void Output();
void Query();
void QueryTree(TreeNodes* rootNode);
void AddNode1(int ExistingNodeID, int NewNodeID);
void AddNode2(int ExistingNodeID, int NewNodeID);
void CreateRootNode(int NodeID);
void MakeDecision(TreeNodes* node);
bool SearchAddNode1(TreeNodes* CurrentNode, int ExistingNodeID, int NewNodeID);
bool SearchAddNode2(TreeNodes* CurrentNode, int ExistingNodeID, int NewNodeID);
TreeNodes* m_RootNode;
DecisionTree();
virtual ~DecisionTree();
};
Decisions.cpp:
决定.cpp:
int random(int upperLimit);
//for random variables that will effect decisions/node values/weights
int random(int upperLimit)
{
int randNum = rand() % upperLimit;
return randNum;
}
//constructor
//Step 1!
DecisionTree::DecisionTree()
{
//set root node to null on tree creation
//beginning of tree creation
m_RootNode = NULL;
}
//destructor
//Final Step in a sense
DecisionTree::~DecisionTree()
{
RemoveNode(m_RootNode);
}
//Step 2!
void DecisionTree::CreateRootNode(int NodeID)
{
//create root node with specific ID
// In MO, you may want to use thestatic creation of IDs like with entities. depends on how many nodes you plan to have
//or have instantaneously created nodes/changing nodes
m_RootNode = new TreeNodes(NodeID);
}
//Step 5.1!~
void DecisionTree::AddNode1(int ExistingNodeID, int NewNodeID)
{
//check to make sure you have a root node. can't add another node without a root node
if(m_RootNode == NULL)
{
cout << "ERROR - No Root Node";
return;
}
if(SearchAddNode1(m_RootNode, ExistingNodeID, NewNodeID))
{
cout << "Added Node Type1 With ID " << NewNodeID << " onto Branch Level " << ExistingNodeID << endl;
}
else
{
//check
cout << "Node: " << ExistingNodeID << " Not Found.";
}
}
//Step 6.1!~ search and add new node to current node
bool DecisionTree::SearchAddNode1(TreeNodes *CurrentNode, int ExistingNodeID, int NewNodeID)
{
//if there is a node
if(CurrentNode->m_NodeID == ExistingNodeID)
{
//create the node
if(CurrentNode->NewBranch1 == NULL)
{
CurrentNode->NewBranch1 = new TreeNodes(NewNodeID);
}
else
{
CurrentNode->NewBranch1 = new TreeNodes(NewNodeID);
}
return true;
}
else
{
//try branch if it exists
//for a third, add another one of these too!
if(CurrentNode->NewBranch1 != NULL)
{
if(SearchAddNode1(CurrentNode->NewBranch1, ExistingNodeID, NewNodeID))
{
return true;
}
else
{
//try second branch if it exists
if(CurrentNode->NewBranch2 != NULL)
{
return(SearchAddNode2(CurrentNode->NewBranch2, ExistingNodeID, NewNodeID));
}
else
{
return false;
}
}
}
return false;
}
}
//Step 5.2!~ does same thing as node 1. if you wanted to have more decisions,
//create a node 3 which would be the same as this maybe with small differences
void DecisionTree::AddNode2(int ExistingNodeID, int NewNodeID)
{
if(m_RootNode == NULL)
{
cout << "ERROR - No Root Node";
}
if(SearchAddNode2(m_RootNode, ExistingNodeID, NewNodeID))
{
cout << "Added Node Type2 With ID " << NewNodeID << " onto Branch Level " << ExistingNodeID << endl;
}
else
{
cout << "Node: " << ExistingNodeID << " Not Found.";
}
}
//Step 6.2!~ search and add new node to current node
//as stated earlier, make one for 3rd node if there was meant to be one
bool DecisionTree::SearchAddNode2(TreeNodes *CurrentNode, int ExistingNodeID, int NewNodeID)
{
if(CurrentNode->m_NodeID == ExistingNodeID)
{
//create the node
if(CurrentNode->NewBranch2 == NULL)
{
CurrentNode->NewBranch2 = new TreeNodes(NewNodeID);
}
else
{
CurrentNode->NewBranch2 = new TreeNodes(NewNodeID);
}
return true;
}
else
{
//try branch if it exists
if(CurrentNode->NewBranch1 != NULL)
{
if(SearchAddNode2(CurrentNode->NewBranch1, ExistingNodeID, NewNodeID))
{
return true;
}
else
{
//try second branch if it exists
if(CurrentNode->NewBranch2 != NULL)
{
return(SearchAddNode2(CurrentNode->NewBranch2, ExistingNodeID, NewNodeID));
}
else
{
return false;
}
}
}
return false;
}
}
//Step 11
void DecisionTree::QueryTree(TreeNodes* CurrentNode)
{
if(CurrentNode->NewBranch1 == NULL)
{
//if both branches are null, tree is at a decision outcome state
if(CurrentNode->NewBranch2 == NULL)
{
//output decision 'question'
///////////////////////////////////////////////////////////////////////////////////////
}
else
{
cout << "Missing Branch 1";
}
return;
}
if(CurrentNode->NewBranch2 == NULL)
{
cout << "Missing Branch 2";
return;
}
//otherwise test decisions at current node
MakeDecision(CurrentNode);
}
//Step 10
void DecisionTree::Query()
{
QueryTree(m_RootNode);
}
////////////////////////////////////////////////////////////
//debate decisions create new function for decision logic
// cout << node->stringforquestion;
//Step 12
void DecisionTree::MakeDecision(TreeNodes *node)
{
//should I declare variables here or inside of decisions.h
int PHealth;
int MHealth;
int PStrength;
int MStrength;
int DistanceFBase;
int DistanceFMonster;
////sets random!
srand(time(NULL));
//randomly create the numbers for health, strength and distance for each variable
PHealth = random(60);
MHealth = random(60);
PStrength = random(50);
MStrength = random(50);
DistanceFBase = random(75);
DistanceFMonster = random(75);
//the decision to be made string example: Player health: Monster Health: player health is lower/higher
cout << "Player Health: " << PHealth << endl;
cout << "Monster Health: " << MHealth << endl;
cout << "Player Strength: " << PStrength << endl;
cout << "Monster Strength: " << MStrength << endl;
cout << "Distance Player is From Base: " << DistanceFBase << endl;
cout << "Disntace Player is From Monster: " << DistanceFMonster << endl;
//MH > PH
//MH < PH
//PS > MS
//PS > MS
//DB > DM
//DB < DM
//good place to break off into different decision nodes, not just 'binary'
//if statement lower/higher query respective branch
if(PHealth > MHealth)
{
}
else
{
}
//re-do question for next branch. Player strength: Monster strength: Player strength is lower/higher
//if statement lower/higher query respective branch
if(PStrength > MStrength)
{
}
else
{
}
//recursive question for next branch. Player distance from base/monster.
if(DistanceFBase > DistanceFMonster)
{
}
else
{
}
//DECISION WOULD BE MADE
//if statement?
// inside query output decision?
//cout << <<
//QueryTree(node->NewBranch2);
//MakeDecision(node);
}
//Step.....8ish?
void DecisionTree::Output()
{
//take repsective node
DisplayTree(m_RootNode);
}
//Step 9
void DecisionTree::DisplayTree(TreeNodes* CurrentNode)
{
//if it doesn't exist, don't display of course
if(CurrentNode == NULL)
{
return;
}
//////////////////////////////////////////////////////////////////////////////////////////////////
//need to make a string to display for each branch
cout << "Node ID " << CurrentNode->m_NodeID << "Decision Display: " << endl;
//go down branch 1
DisplayTree(CurrentNode->NewBranch1);
//go down branch 2
DisplayTree(CurrentNode->NewBranch2);
}
//Final step at least in this case. A way to Delete node from tree. Can't think of a way to use it yet but i know it's needed
void DecisionTree::RemoveNode(TreeNodes *node)
{
//could probably even make it to where you delete a specific node by using it's ID
if(node != NULL)
{
if(node->NewBranch1 != NULL)
{
RemoveNode(node->NewBranch1);
}
if(node->NewBranch2 != NULL)
{
RemoveNode(node->NewBranch2);
}
cout << "Deleting Node" << node->m_NodeID << endl;
//delete node from memory
delete node;
//reset node
node = NULL;
}
}
TreeNodes.h:
树节点.h:
using namespace std;
//tree node class
class TreeNodes
{
public:
//tree node functions
TreeNodes(int nodeID/*, string QA*/);
TreeNodes();
virtual ~TreeNodes();
int m_NodeID;
TreeNodes* NewBranch1;
TreeNodes* NewBranch2;
};
TreeNodes.cpp:
树节点.cpp:
//contrctor
TreeNodes::TreeNodes()
{
NewBranch1 = NULL;
NewBranch2 = NULL;
m_NodeID = 0;
}
//deconstructor
TreeNodes::~TreeNodes()
{ }
//Step 3! Also step 7 hah!
TreeNodes::TreeNodes(int nodeID/*, string NQA*/)
{
//create tree node with a specific node ID
m_NodeID = nodeID;
//reset nodes/make sure! that they are null. I wont have any funny business #s -_-
NewBranch1 = NULL;
NewBranch2 = NULL;
}
回答by LumpN
Please correct me if I'm wrong but judging from the images at http://dms.irb.hr/tutorial/tut_dtrees.phpand http://www.decisiontrees.net/?q=node/21the actual decision logic should go in the nodes, not in the tree. You could model that by having polymorphic nodes, one for each decision there is to be made. With a bit of change to the tree construction and a minor minor modification for the decision delegation your code should be fine.
如果我错了,请纠正我,但从http://dms.irb.hr/tutorial/tut_dtrees.php和http://www.decisiontrees.net/?q=node/21 上的图像判断实际决策逻辑应该在节点中,而不是在树中。您可以通过具有多态节点来建模,每个决策都有一个。对树结构进行一些更改并对决策委托进行一些小的修改,您的代码应该没问题。
回答by indigo
Basically you need to break everything down into stages and then modularise each part of the algorithm that you are trying to implement.
基本上,您需要将所有内容分解为多个阶段,然后对您尝试实现的算法的每个部分进行模块化。
You can do this in pseudo-code or in code itself using functions/classes and stubs.
您可以使用函数/类和存根在伪代码或代码本身中执行此操作。
Each part of the algorithm you can then code up in some function and even test that function before adding it all together. You will basically end up with various functions or classes that are used for specific purposes in the algorithm implementation. So in your case for tree construction you will have one function that calculates entropy at a node, another that partitions the data into subsets at each node, etc.
然后,您可以在某个函数中编写算法的每个部分,甚至可以在将其添加到一起之前测试该函数。您基本上会得到用于算法实现中特定目的的各种函数或类。因此,在您构建树的情况下,您将拥有一个计算节点熵的函数,另一个将数据划分为每个节点的子集,等等。
I am talking in the general case here rather than specifically with respect to decision tree construction. Check out the book on Machine Learning by Mitchell if you need specific information on decision tree algorithms and the steps involved.
我在这里谈论的是一般情况,而不是专门针对决策树构建。如果您需要有关决策树算法和所涉及步骤的特定信息,请查看 Mitchell 撰写的关于机器学习的书。
回答by MARK
The pseudocode to implement a decision tree is as follows
实现决策树的伪代码如下
createdecisiontree(data, attributes)
Select the attribute a with the highest information gain
for each value v of the attribute a
Create subset of data where data.a.val==v ( call it data2)
Remove the attribute a from the attribute list resulting in attribute_list2
Call CreateDecisionTree(data2, attribute_list2)
You will have to store nodes at each level with some code like
您必须使用一些代码在每个级别存储节点,例如
decisiontree[attr][val]=new_node
决策树[属性][val]=new_node