Predicting project continuity based on variables
Hi RapidMiner community! I'm a new user and am having difficulties with my first analysis.
I have a data set with lots of projects, each with about 15 numerical ratings from 1 to 5. I also know whether each project is successful or not. I would like to use RapidMiner to compare the projects and find which ratings are the most important ones for successful projects.
To do this, I'm using a Decision Tree operator and setting the target role of the variable Success (0 or 1) as label.
However, when I try to run it I get the error that "Decision tree does not have sufficient capabilities for the given data set: numerical label is not supported". Please refer to the screenshots for more info.
Should I use a different operator? If not, what am I doing wrong?
I greatly appreciate your advice!
Lukas
Best Answer
MartinLiebig (Administrator, Moderator, Employee, RapidMiner Certified Analyst, RapidMiner Certified Expert, University Professor; Posts: 3,404; RM Data Scientist)
Hi Lukas!
Welcome to the community.
The standard RM Decision Tree does indeed not support numerical labels. You can either use another algorithm like GBT (Gradient Boosted Trees) or simply "cast" your numerical 0 and 1 into nominals. This can be done in various ways, e.g. with a Generate Attributes operator and the str() function.
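To make the "cast" idea concrete outside of RapidMiner, here is a minimal Python sketch of what turning a numerical 0/1 label into a nominal one amounts to: the numbers become category strings, so a classifier treats them as two classes rather than as a quantity. The rows, the rating column names, and the helper cast_label_to_nominal are made up for illustration and are not part of RapidMiner.

```python
# Hypothetical example rows: two ratings plus a numeric Success label.
rows = [
    {"rating_a": 4, "rating_b": 2, "Success": 1},
    {"rating_a": 1, "rating_b": 5, "Success": 0},
    {"rating_a": 3, "rating_b": 3, "Success": 1},
]


def cast_label_to_nominal(rows, label="Success"):
    """Return copies of the rows with the label stored as a string category."""
    return [{**row, label: str(row[label])} for row in rows]


nominal_rows = cast_label_to_nominal(rows)
labels = [row["Success"] for row in nominal_rows]
print(labels)  # the labels are now the strings "1" and "0", not numbers
```

The rating columns stay numerical; only the label changes type, which is all the Decision Tree needs.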
~Martin
- Sr. Director Data Solutions, Altair RapidMiner -
Dortmund, Germany
Answers
If you already have a success label defined as a 0/1 numerical attribute, then probably the easiest way to convert it is to use the "Numerical to Binominal" operator. Make sure you check "include special attributes" and select your label attribute. You can set min to 0 and max to any number less than 1, and it will convert your 0/1 number into a false/true binominal variable in one easy step.
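As a rough illustration of the min/max rule described above, here is a small Python sketch. It assumes that values inside the range from min to max (taken here as inclusive) become false and everything else becomes true; the function name and the sample values are hypothetical and only mimic the behaviour, they are not RapidMiner code.

```python
def numerical_to_binominal(values, min_value=0.0, max_value=0.0):
    """Map values inside [min_value, max_value] to "false", all others to "true".

    The inclusive bounds are an assumption made for this sketch.
    """
    return ["false" if min_value <= v <= max_value else "true" for v in values]


success = [0, 1, 1, 0]  # made-up 0/1 labels
converted = numerical_to_binominal(success, min_value=0.0, max_value=0.5)
print(converted)
```

With min at 0 and max anywhere below 1, a label of 0 falls inside the range and a label of 1 falls outside it, which is why any max less than 1 works.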
Lindon Ventures
Data Science Consulting from Certified RapidMiner Experts
Thanks for your advice! It worked.