Optimize grid with X-validation and Performance Costs with wrong optimum
Hi!
I'm trying to optimize a k-nearest-neighbour model inside an X-Validation. The performance is measured by the Performance (Costs) operator, and the X-Validation delivers an average value for the misclassification costs.
When I put the whole process inside the Optimize Parameters (Grid) operator, the performance seems to have no impact on the selection of the optimal parameters. By logging every run of the process, I can identify parameter settings with much better average costs than the one selected. Can anyone give me a hint on what I'm doing wrong?
Thanks in advance
Steffen
Best Answer
MartinLiebig | Administrator, Moderator, Employee, RapidMiner Certified Analyst, RapidMiner Certified Expert, University Professor | Posts: 3,404 | RM Data Scientist
Dear Steffen,
You need to set the class ordering. The behaviour without it is odd: I would have expected an error rather than a silently wrong result. If you set the class ordering in Performance (Costs), it works. See the attached process.
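To illustrate why the ordering matters, here is a minimal Python sketch. It is not RapidMiner code: the class names, the cost matrix, the labels, the `average_cost` helper, and the row/column convention are all assumptions made up for this example. The point is only that a cost matrix is indexed by position, so the same predictions get a different average cost depending on which class ends up in which row and column.

```python
def average_cost(true_labels, predicted_labels, cost_matrix, class_order):
    """Mean misclassification cost over all examples.

    Assumed convention for this sketch: cost_matrix[i][j] is the cost of
    predicting class_order[j] when the true class is class_order[i].
    """
    # Map each class name to its row/column position in the cost matrix.
    index = {label: position for position, label in enumerate(class_order)}
    total = sum(
        cost_matrix[index[true]][index[predicted]]
        for true, predicted in zip(true_labels, predicted_labels)
    )
    return total / len(true_labels)


# Hypothetical data: one kind of error is five times as expensive as the other.
cost_matrix = [[0, 1],
               [5, 0]]
true_labels = ["yes", "yes", "yes", "no"]
predicted_labels = ["no", "no", "yes", "yes"]

# Same predictions, same matrix, two different class orderings.
cost_intended = average_cost(true_labels, predicted_labels, cost_matrix, ["yes", "no"])
cost_swapped = average_cost(true_labels, predicted_labels, cost_matrix, ["no", "yes"])

print(cost_intended, cost_swapped)
```

If the ordering is left implicit, whichever order the tool happens to pick decides which of these two numbers is reported, and a parameter search that minimizes that number can settle on a different optimum than the one you intended.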
Another comment: the minimum found by the log and the one found by the optimization are different. The reason is that one uses the macro average and the other the micro average of the performance (the unweighted and the weighted average over the k folds, respectively). On a bigger data set, this should not make a difference.
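The difference between the two averages can be sketched in a few lines of Python. The per-fold numbers below are invented for illustration and do not come from the attached process; the sketch only shows the arithmetic, not how RapidMiner computes it internally.

```python
# Hypothetical per-fold results: (summed misclassification cost, number of examples).
folds = [(6.0, 4), (3.0, 3), (3.0, 3)]

# Macro average: average the per-fold mean costs, every fold counts equally.
macro = sum(cost / n for cost, n in folds) / len(folds)

# Micro average: pool all examples, so larger folds carry more weight.
micro = sum(cost for cost, _ in folds) / sum(n for _, n in folds)

print(macro, micro)  # the two values differ because the folds have unequal sizes

# With equally sized folds the two averages coincide.
equal_folds = [(6.0, 4), (4.0, 4), (2.0, 4)]
macro_equal = sum(cost / n for cost, n in equal_folds) / len(equal_folds)
micro_equal = sum(cost for cost, _ in equal_folds) / sum(n for _, n in equal_folds)
```

On a small data set, a one-example difference in fold size is a large relative difference, which is one plausible reason the two averages can rank parameter settings differently there.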
~Martin
Spoiler
<connect from_op="Optimize Parameters (Grid)" from_port="performance" to_port="result 1"/>
<connect from_op="Optimize Parameters (Grid)" from_port="parameter" to_port="result 2"/>
- Sr. Director Data Solutions, Altair RapidMiner -
Dortmund, Germany
Answers
Could you post your process here for us to investigate that? Thx
<connect from_op="Optimize Parameters (Grid)" from_port="performance" to_port="result 1"/>
<connect from_op="Optimize Parameters (Grid)" from_port="parameter" to_port="result 2"/>
Dear Martin,
thank you for your help! Now it's working!
-Steffen