Combination of AdaBoost and SMOTE operators
ozgeozyazar
Hi!
I am currently dealing with imbalanced data in my classification problem. I have read a couple of articles that use boosting-based ensembles. I would like to use a combination of AdaBoost and SMOTE, but I am not sure whether I have applied it correctly. Please find the XML version below. If you have any ideas to improve this process, or if you believe the algorithms are applied incorrectly, could you please help me?
Özge
<operator activated="true" class="process" compatibility="8.0.001" expanded="true" name="Process">
<operator activated="false" class="multiply" compatibility="8.0.001" expanded="true" height="68" name="Multiply" width="90" x="246" y="238"/>
<operator activated="true" class="performance_classification" compatibility="8.0.001" expanded="true" height="82" name="Classification" width="90" x="179" y="238">
<operator activated="false" class="multiply" compatibility="8.0.001" expanded="true" height="68" name="Multiply (2)" width="90" x="45" y="187"/>
<operator activated="true" class="performance_classification" compatibility="8.0.001" expanded="true" height="82" name="Performance (3)" width="90" x="313" y="34">
<operator activated="false" class="multiply" compatibility="8.0.001" expanded="true" height="68" name="Multiply (3)" width="90" x="45" y="187"/>
<operator activated="true" class="performance_classification" compatibility="8.0.001" expanded="true" height="82" name="Performance (4)" width="90" x="313" y="34">
Answers
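The process looks fine to me. Using SMOTE inside the validation is actually recommended: the synthetic examples are then generated from the training data only, which avoids an overfitted, overly optimistic performance estimate. Are you facing any errors?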
Thanks,
Varun
https://www.varunmandalapu.com/
Be Safe. Follow precautions and Maintain Social Distancing
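For readers who want to see the "SMOTE inside the validation" idea outside RapidMiner, here is a minimal pure-Python sketch. It is not the RapidMiner SMOTE operator: the `smote` and `folds_with_smote` functions, the default `k=5`, and the toy data with a 48 vs 501 class split are all assumptions made for illustration. The point it tries to show is only that synthetic rows are created from the training part of each fold, while the test part stays untouched.

```python
import random

def smote(minority, n_new, k=5, rng=None):
    """Return n_new synthetic points interpolated between minority neighbours."""
    rng = rng or random.Random(0)
    if len(minority) < 2:
        raise ValueError("need at least two minority examples to interpolate")
    k = min(k, len(minority) - 1)
    synthetic = []
    for _ in range(n_new):
        i = rng.randrange(len(minority))
        base = minority[i]
        # Rank the other minority points by squared distance to the base point.
        others = [p for j, p in enumerate(minority) if j != i]
        others.sort(key=lambda p: sum((a - b) ** 2 for a, b in zip(p, base)))
        neighbour = rng.choice(others[:k])
        gap = rng.random()  # position along the segment base -> neighbour
        synthetic.append(tuple(a + gap * (b - a) for a, b in zip(base, neighbour)))
    return synthetic

def folds_with_smote(X, y, minority_label, n_folds=5, seed=0):
    """Yield (balanced_train, untouched_test); SMOTE only ever sees training rows."""
    rng = random.Random(seed)
    order = list(range(len(X)))
    rng.shuffle(order)
    for f in range(n_folds):
        test_idx = order[f::n_folds]
        held_out = set(test_idx)
        train = [(X[i], y[i]) for i in order if i not in held_out]
        minority = [x for x, label in train if label == minority_label]
        # Number of synthetic rows needed to match the majority class count.
        n_missing = max(0, len(train) - 2 * len(minority))
        train += [(x, minority_label) for x in smote(minority, n_missing, rng=rng)]
        test = [(X[i], y[i]) for i in test_idx]
        yield train, test

# Hypothetical toy data: 48 minority rows (label 0) and 501 majority rows (label 1).
data_rng = random.Random(1)
X = [(data_rng.gauss(0, 1), data_rng.gauss(0, 1)) for _ in range(48)]
X += [(data_rng.gauss(3, 1), data_rng.gauss(3, 1)) for _ in range(501)]
y = [0] * 48 + [1] * 501
results = list(folds_with_smote(X, y, minority_label=0))
```

Each training set in `results` is balanced, while every test fold contains only original rows, so the performance measured on it is not helped by synthetic copies of its own neighbours.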
Dear @varunm1,
I have not gotten any error, but I would like to be sure this model works as intended. Actually, my main concern is whether using AdaBoost with the SMOTE operator is an applicable approach or not.
Best Regards,
Özge
I got your point. I see that you were using SMOTE inside AdaBoost. In my view, it should be applied just before AdaBoost, because in your XML AdaBoost is boosting the unsampled data and SMOTE is applied only afterwards, which seems to minimize the impact of the SMOTEBoost technique.
@mschmitz any insights here?
Varun
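One way to read the ordering suggested above (oversample first, then boost the balanced data) is sketched below in plain Python. This is an illustration under assumptions, not the RapidMiner operators and not the published SMOTEBoost algorithm, which resamples inside each boosting round. The helper names (`smote`, `best_stump`, `adaboost`, `smote_then_boost`), the use of decision stumps as weak learners, and the toy clusters are all hypothetical.

```python
import math
import random

def smote(minority, n_new, k=5, rng=None):
    """Synthetic minority points interpolated between near neighbours."""
    rng = rng or random.Random(0)
    k = min(k, len(minority) - 1)
    out = []
    for _ in range(n_new):
        i = rng.randrange(len(minority))
        base = minority[i]
        others = [p for j, p in enumerate(minority) if j != i]
        others.sort(key=lambda p: sum((a - b) ** 2 for a, b in zip(p, base)))
        nb = rng.choice(others[:k])
        gap = rng.random()
        out.append(tuple(a + gap * (b - a) for a, b in zip(base, nb)))
    return out

def best_stump(X, y, w):
    """Threshold rule with the lowest weighted error: (error, feature, threshold, polarity)."""
    best = None
    for f in range(len(X[0])):
        for t in sorted({x[f] for x in X}):
            for pol in (1, -1):
                err = sum(wi for x, yi, wi in zip(X, y, w)
                          if (pol if x[f] <= t else -pol) != yi)
                if best is None or err < best[0]:
                    best = (err, f, t, pol)
    return best

def adaboost(X, y, rounds=5):
    """Discrete AdaBoost over stumps; labels must be -1 or +1."""
    w = [1.0 / len(X)] * len(X)
    ensemble = []
    for _ in range(rounds):
        err, f, t, pol = best_stump(X, y, w)
        err = min(max(err, 1e-10), 1 - 1e-10)  # avoid log(0) on perfect stumps
        alpha = 0.5 * math.log((1 - err) / err)
        ensemble.append((alpha, f, t, pol))
        # Raise the weight of misclassified rows, lower the rest, renormalise.
        w = [wi * math.exp(-alpha * yi * (pol if x[f] <= t else -pol))
             for x, yi, wi in zip(X, y, w)]
        total = sum(w)
        w = [wi / total for wi in w]
    return ensemble

def predict(ensemble, x):
    score = sum(a * (pol if x[f] <= t else -pol) for a, f, t, pol in ensemble)
    return 1 if score >= 0 else -1

def smote_then_boost(X, y, minority_label=-1, seed=0):
    """Balance the training data with SMOTE first, then fit AdaBoost on it."""
    minority = [x for x, label in zip(X, y) if label == minority_label]
    extra = smote(minority, max(0, len(X) - 2 * len(minority)), rng=random.Random(seed))
    X_bal = list(X) + extra
    y_bal = list(y) + [minority_label] * len(extra)
    return adaboost(X_bal, y_bal), X_bal, y_bal

# Hypothetical toy data: 15 minority rows (label -1), 60 majority rows (label +1).
rng = random.Random(7)
X = [(rng.gauss(0, 0.5), rng.gauss(0, 0.5)) for _ in range(15)]
X += [(rng.gauss(5, 0.5), rng.gauss(5, 0.5)) for _ in range(60)]
y = [-1] * 15 + [1] * 60
model, X_bal, y_bal = smote_then_boost(X, y)
```

In a validated setup, `smote_then_boost` would be called on the training part of each fold only, so that boosting starts from balanced weights without synthetic rows reaching the test data.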
@varunm1 I changed the model as you indicated above, and now I am getting this error.
Even when I change the number of neighbors parameter to 1, uncheck the equalize classes parameter, and define the upsampling size, I get this error.
What should I do about this?
I tried it with the Titanic data set and it worked fine in that case. I can take a look if I can get the data. To check the error, you can set a breakpoint before SMOTE and observe the class distribution coming into SMOTE.
Thanks
Varun
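The breakpoint check described above can also be mimicked in a few lines of Python. The error text is not shown in the thread, so this sketch only helps rule out one possible cause: a training fold with too few minority examples for the requested number of neighbours. The function names, the unstratified fold split, and the rule "minority count must be at least k + 1" are assumptions for illustration; the RapidMiner SMOTE operator may apply a different check.

```python
import random
from collections import Counter

def training_fold_distributions(labels, n_folds, seed=0):
    """Class counts of the training part of each cross-validation fold."""
    order = list(range(len(labels)))
    random.Random(seed).shuffle(order)
    distributions = []
    for f in range(n_folds):
        held_out = set(order[f::n_folds])
        distributions.append(
            Counter(label for i, label in enumerate(labels) if i not in held_out)
        )
    return distributions

def smote_is_feasible(distribution, minority_label, k_neighbours):
    """Each minority point needs k other minority points to pick neighbours from."""
    return distribution[minority_label] >= k_neighbours + 1

# Hypothetical labels: 6 minority rows (0) and 60 majority rows (1).
labels = [0] * 6 + [1] * 60
for fold, dist in enumerate(training_fold_distributions(labels, n_folds=3)):
    print(fold, dict(dist), smote_is_feasible(dist, minority_label=0, k_neighbours=5))
```

If some fold reports too few minority rows, lowering the number of neighbours or using fewer, stratified folds would be the first things to try.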
My minority class is the negative class. Could that be the reason?
48 for class 0 and 501 for class 1.
SMOTE is inside the Cross Validation operator, and this result appears before the SMOTE operator.
But I am also using Optimize Parameters, and I realized that without parameter optimization I do not get this error.
Best Regards,
Özge
That seems fine. Is there any way you can send the data here or in a private message on the community, so that we can take a look at what the issue is?
Thank you
Varun
If you still have issues, let us know here.
Thanks
Varun
Thanks for all your efforts.
I have one more question. I would like to see the combined result of a decision tree with AdaBoost, but I do not know whether this already exists or whether I need to do additional things to get the combined result.
Could you please help me?
Best Regards,
Özge Özyazar
Do you mean you want the AdaBoost + Decision Tree results and the Decision Tree alone results in the same process? If that is what you are asking, here is the XML. I set the seed to 1992 for Cross Validation, so the data splits are the same for both. You can find the attached .rmp file. Import it using (File --> Import Process), change the repository locations as needed, and run it to see two performances (one for AdaBoost + Decision Tree and the other for Decision Tree only).
Hope this helps
Varun
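The reason for fixing the cross-validation seed can be illustrated with a small Python sketch: with the same seed, both learners are scored on identical test folds, so their results are directly comparable. This is not the attached .rmp process. The two learners below are simple placeholders standing in for "AdaBoost + Decision Tree" and "Decision Tree only", and the function names and toy data are hypothetical; only the seed value 1992 comes from the post above.

```python
import random

def cv_folds(n_examples, n_folds, seed):
    """Shuffle row indices with a fixed seed and deal them into test folds."""
    order = list(range(n_examples))
    random.Random(seed).shuffle(order)
    return [order[f::n_folds] for f in range(n_folds)]

def cross_validated_accuracy(train_fn, X, y, n_folds=10, seed=1992):
    """Mean test accuracy of train_fn over folds that depend only on the seed."""
    accuracies = []
    for test_idx in cv_folds(len(X), n_folds, seed):
        held_out = set(test_idx)
        train_X = [x for i, x in enumerate(X) if i not in held_out]
        train_y = [label for i, label in enumerate(y) if i not in held_out]
        predict = train_fn(train_X, train_y)
        hits = sum(predict(X[i]) == y[i] for i in test_idx)
        accuracies.append(hits / len(test_idx))
    return sum(accuracies) / len(accuracies)

def train_majority(train_X, train_y):
    """Placeholder learner 1: always predict the most frequent training label."""
    majority = max(set(train_y), key=train_y.count)
    return lambda x: majority

def train_threshold(train_X, train_y):
    """Placeholder learner 2: cut halfway between the class means (class 1 assumed higher)."""
    zeros = [x for x, label in zip(train_X, train_y) if label == 0]
    ones = [x for x, label in zip(train_X, train_y) if label == 1]
    cut = (sum(zeros) / len(zeros) + sum(ones) / len(ones)) / 2
    return lambda x: 1 if x > cut else 0

# Hypothetical one-feature data with 48 rows of class 0 and 501 rows of class 1.
X = [0.01 * i for i in range(48)] + [5 + 0.001 * i for i in range(501)]
y = [0] * 48 + [1] * 501
acc_majority = cross_validated_accuracy(train_majority, X, y)
acc_threshold = cross_validated_accuracy(train_threshold, X, y)
print(acc_majority, acc_threshold)
```

Because `cv_folds` depends only on the seed, swapping in any other pair of learners keeps the comparison fair in the same way.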