Entropy and Gain for Decision Tree Greater Than 1
kinglaplace
Hi,
I am a newbie in data mining. I would like to build a decision tree to make predictions for my case, which has 9 output classes. When I calculate entropy and gain manually, I get values greater than 1. How can I solve this? Also, where can I see the entropy and gain results in RapidMiner, so I can compare them with my manual calculation?
Thank you.
Answers
Hello @kinglaplace, welcome to the community. Can you please post your XML process (see "Read before Posting" on the right when you reply)? And have you looked at the videos on decision tree modeling (see "Creating a Decision Tree Model" here)?
Scott
Thank you for your help. Here is my training data. How do I choose the best model for my data?
Hi @kinglaplace,
To choose the best model for your data, I recommend the tool Automatic model selection and optimization (Pavithra_Rao).
This tool helps choose the best model (the one with the best performance) among several optimized models.
I ran this tool on your data to benchmark 3 models (Decision Tree, Random Forest, Gradient Boosted Tree).
It seems that Gradient Boosted Tree is the best: accuracy = correct predictions / total predictions = 89.60% (mean), but this is very close to the performance of the Decision Tree.
Note: you must also consider other performance metrics, such as recall and precision.
Here is the process:
Now you can experiment yourself with other models and/or other optimization settings for the current models.
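To make the accuracy formula above concrete, and to show how precision and recall are computed per class, here is a minimal sketch in plain Python. The labels and the helper names `accuracy` and `precision_recall` are hypothetical and only for illustration; this is not the RapidMiner process and not your data.

```python
def accuracy(y_true, y_pred):
    """Accuracy = correct predictions / total predictions."""
    correct = sum(t == p for t, p in zip(y_true, y_pred))
    return correct / len(y_true)


def precision_recall(y_true, y_pred, cls):
    """Precision and recall for one class, treating it as the positive class."""
    true_pos = sum(t == cls and p == cls for t, p in zip(y_true, y_pred))
    predicted = sum(p == cls for p in y_pred)   # how often the model predicted cls
    actual = sum(t == cls for t in y_true)      # how often cls really occurs
    precision = true_pos / predicted if predicted else 0.0
    recall = true_pos / actual if actual else 0.0
    return precision, recall


# Hypothetical labels, for illustration only.
y_true = ["a", "a", "a", "b", "b", "c"]
y_pred = ["a", "a", "b", "b", "b", "c"]

print(accuracy(y_true, y_pred))
for cls in sorted(set(y_true)):
    print(cls, precision_recall(y_true, y_pred, cls))
```

With several classes, a model can have good accuracy while one class has poor recall, which is why it helps to look at the per-class values as well.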
Regards,
Lionel
Thank you for the information. For the decision tree, I tried calculating entropy and gain manually, but the values are greater than 1. In every reference I have read, the maximum for both is 1. How can I display entropy and gain in RapidMiner, so I can compare them with my manual results? Also, almost all the decision tree examples I find have two output classes, but my case has 8. Can a decision tree be used with more than two output classes?
Thank you.
Hi @kinglaplace,
It seems to me that RapidMiner does not display the entropy and the gain in the results. There is the "cross-entropy", which is calculated by the Performance (Classification) operator, but it is a measure of the model's performance and, in my opinion, different from what you are looking for.
A decision tree can of course be used with 8 output classes.
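On the values greater than 1: with base-2 logarithms, the maximum entropy for k classes is log2(k), so the limit of 1 only holds for two classes. With 8 classes the maximum is 3 bits, and information gain can be as large as the parent entropy. Here is a minimal sketch in plain Python, assuming Shannon entropy with log base 2; the data and the helper names `entropy` and `information_gain` are hypothetical, and this is not taken from RapidMiner's implementation.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy in bits (log base 2) of a list of class labels."""
    total = len(labels)
    counts = Counter(labels)
    return -sum((c / total) * math.log2(c / total) for c in counts.values())


def information_gain(labels, feature_values):
    """Parent entropy minus the weighted entropy of the subsets split by a feature."""
    total = len(labels)
    subsets = {}
    for value, label in zip(feature_values, labels):
        subsets.setdefault(value, []).append(label)
    remainder = sum(len(s) / total * entropy(s) for s in subsets.values())
    return entropy(labels) - remainder


# Hypothetical data, for illustration only.
two_class = ["yes", "no"] * 4            # 2 equally frequent classes
eight_class = list("ABCDEFGH")           # 8 equally frequent classes
feature = [0, 0, 1, 1, 2, 2, 3, 3]       # hypothetical attribute with 4 values

print(entropy(two_class))                      # maximum for 2 classes: log2(2)
print(entropy(eight_class))                    # maximum for 8 classes: log2(8)
print(information_gain(eight_class, feature))  # gain can exceed 1 as well
```

If you want values between 0 and 1 to compare with your references, you can divide the entropy by log2(k), where k is the number of classes.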
Regards,
Lionel