Classification accuracy (stacking)
keesloutebaan
Member, Posts: 2, Newbie
Hey there,
I am currently working on a polynominal (multi-class) classification project. The goal is to reach the highest possible accuracy.
I found out that the 'deep learning' and the 'gradient boosted trees' operators work really well.
Now I want to find out whether stacking can improve the performance. I tried a few combinations, but the performance drops every time.
Can someone maybe tell me if there are any important rules to take into account when it comes to stacking? When is it helpful and what settings are then required?
Thanks a lot
Best Answer
BalazsBarany (Administrator, Moderator, Employee, RapidMiner Certified Analyst, RapidMiner Certified Expert), Posts: 949, Unicorn
Hi,
the idea behind ensemble models like Stacking is that they improve the performance of non-perfect learners. But they can also create more complex, overfitted models.
Both GBT and to some extent Deep Learning are already complex ensemble models.
Stacking can only improve on them if all of the following hold: the base models have some systematic bias or error source, their errors differ from one another, and the stacking model can identify the right base model in most of the cases where their predictions differ.
If any of these assumptions does not hold, as is likely in your case, stacking or any other model combination won't improve the result.
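One rough way to check these assumptions is to compare the per-example predictions of the two base models on a hold-out set. The sketch below is plain Python with made-up labels and predictions; the lists `y_true`, `pred_gbt`, `pred_dl` and the function `error_overlap` are illustrative names, not part of RapidMiner. It reports how often the models disagree and the "oracle" accuracy, i.e. what a perfect combiner would reach if it always picked the correct model whenever at least one of them is right.

```python
def accuracy(y_true, y_pred):
    """Fraction of examples where the prediction matches the true label."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def error_overlap(y_true, pred_a, pred_b):
    """Compare two models' predictions on the same hold-out examples."""
    n = len(y_true)
    # Cases where the two models predict different classes.
    disagree = sum(a != b for a, b in zip(pred_a, pred_b))
    # Cases where at least one model is correct (upper bound for a combiner
    # that can only choose between the two predictions).
    either_right = sum(t == a or t == b
                       for t, a, b in zip(y_true, pred_a, pred_b))
    return {
        "acc_a": accuracy(y_true, pred_a),
        "acc_b": accuracy(y_true, pred_b),
        "disagreement": disagree / n,
        "oracle_acc": either_right / n,
    }


# Hypothetical hold-out labels and predictions (made up for illustration).
y_true   = ["a", "b", "c", "a", "b", "c", "a", "b", "c", "a"]
pred_gbt = ["a", "b", "c", "a", "b", "a", "a", "b", "c", "b"]
pred_dl  = ["a", "b", "c", "a", "b", "a", "a", "c", "c", "b"]

report = error_overlap(y_true, pred_gbt, pred_dl)
print(report)
```

In this made-up example the oracle accuracy is no higher than the accuracy of the better model: the two models make the same mistakes, so no combiner could gain anything. A clear gap between the two values is necessary, though not sufficient, for stacking to help.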
Regards,
Balázs
Answers
As with any tree method, you can apply prepruning and postpruning.
Prepruning applies to the decision before creating a new split. The parameters maximal depth and min rows restrict this, giving you a less complex (and maybe less overfitted) tree.
Postpruning is deciding after a split has been created. The parameter min split improvement applies a statistical test to each split result and decides whether it was worth it. This again reduces the tree complexity.
That said, GBT (like random forest) is meant to reduce the overfitting problem of decision trees, so it is entirely possible that your model won't become better by changing these settings. (Because it's already coping well with possibly overfitted trees.)
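The difference between the prepruning and postpruning checks described above can be sketched in plain Python. The function names and the simple threshold logic are assumptions for illustration only; this is not how RapidMiner implements these parameters internally, and the real improvement criterion may be more involved.

```python
def may_split(depth, n_rows, maximal_depth, min_rows):
    """Prepruning: decide BEFORE searching for a split whether one is allowed."""
    if depth >= maximal_depth:
        return False
    # Assumption: min_rows is a per-leaf minimum, so a node needs at least
    # twice that many rows to produce two valid children.
    if n_rows < 2 * min_rows:
        return False
    return True


def keep_split(error_before, error_after, min_split_improvement):
    """Postpruning: decide AFTER evaluating a split whether it was worth it."""
    if error_before <= 0:
        return False  # nothing left to improve
    # Simplified criterion: relative error reduction achieved by the split.
    improvement = (error_before - error_after) / error_before
    return improvement >= min_split_improvement


# Hypothetical values, chosen only to show both outcomes.
print(may_split(depth=3, n_rows=100, maximal_depth=5, min_rows=10))  # allowed
print(may_split(depth=5, n_rows=100, maximal_depth=5, min_rows=10))  # too deep
print(keep_split(0.40, 0.39, min_split_improvement=0.05))  # tiny gain, dropped
print(keep_split(0.40, 0.30, min_split_improvement=0.05))  # clear gain, kept
```

Tightening either check gives smaller trees; whether that helps accuracy depends on the data.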
For other options, see the documentation. They might be very data dependent.
Regards,
Balázs