Leave One Out results in AUC of 0.5
erik_van_ingen (Member, Posts: 8, Learner I)
My target label is binominal and the number of examples is 553. I am running supervised classification with Deep Learning and cross-validation:
- 10-fold results in AUC = 0.846 and Accuracy = 76%
- Leave 3 Out (180 folds) results in AUC = 0.5 and Accuracy = 42%
Best Answers
rfuentealba (Moderator, RapidMiner Certified Analyst, Member, University Professor, Posts: 568, Unicorn)
Hello @erik_van_ingen,
It may sound a bit mind-boggling, but I wouldn't trust any of these. Why? Because you are using supervised Deep Learning and your number of examples is not big enough to justify it.
First of all, I would check whether the classes are balanced enough to provide meaningful training, and then repeat the experiment.
Now, this raises another question: what kind of sampling are you using for your cross-validation? Try using stratified sampling and check how it performs. If you are using linear sampling, for example, and your data is ordered by your label or target variable, leaving one out will probably not work well. Do this before beginning to work with SMOTE sampling or your sampling technique du jour.
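A quick sketch of that point, in scikit-learn rather than RapidMiner and purely for illustration (the 300/253 class split is made up): on data sorted by the label, linear (unshuffled) folds can end up containing only one class, while stratified folds preserve the class ratio.

```python
import numpy as np
from sklearn.model_selection import KFold, StratifiedKFold

# 553 examples sorted by a binominal label: all 0s first, then all 1s
y = np.array([0] * 300 + [1] * 253)
X = y.reshape(-1, 1)  # dummy feature matrix; split() only needs its length

for name, cv in [("linear KFold", KFold(n_splits=5)),
                 ("StratifiedKFold", StratifiedKFold(n_splits=5))]:
    counts = [tuple(np.bincount(y[test], minlength=2))
              for _, test in cv.split(X, y)]
    print(name, counts)
# linear folds: the first folds contain only class 0, so AUC is undefined there;
# stratified folds keep roughly the same 300:253 class mix in every fold
```

On a single-class fold, AUC simply cannot be computed, which is one way a fold-averaged AUC collapses to a meaningless value.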
Hope this helps,
Rodrigo.
varunm1 (Moderator, Member, Posts: 1,207, Unicorn)
Hello @erik_van_ingen
I would like to point out two things. First, since the data set is small, a higher number of folds means less test data per fold, so you will see a lot of variance in your results. This is due to the inability of such a small test set to capture the underlying distributions in the data. I recommend going with 3 or 5 folds for this dataset.
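To make the fold sizes concrete (a trivial sketch, using the 553 examples from the original post):

```python
# Per-fold test-set size for 553 examples at different fold counts.
# With leave-3-out (~180 folds) each test set holds only 3 examples,
# so any per-fold metric is extremely noisy.
n = 553
for k in (3, 5, 10, 180):
    print(f"{k:>3} folds -> ~{n // k} test examples per fold")
```

The trade-off is that fewer folds also mean smaller training sets, which is why 5-fold or 10-fold is the usual compromise.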
Second, I am not sure whether you applied any feature selection techniques (forward, backward, etc.) to your data; you can do that to see which attributes are actually helpful for prediction. This might improve your performance and reduce computational complexity as well.
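For reference, forward/backward selection can be sketched like this in scikit-learn (the synthetic data and the choice of 5 retained features are assumptions, not taken from the thread):

```python
from sklearn.datasets import make_classification
from sklearn.feature_selection import SequentialFeatureSelector
from sklearn.linear_model import LogisticRegression

# Synthetic stand-in for the 553-example binominal dataset
X, y = make_classification(n_samples=553, n_features=20, n_informative=5,
                           random_state=0)

selector = SequentialFeatureSelector(
    LogisticRegression(max_iter=1000),
    n_features_to_select=5,   # assumed target size; tune for your data
    direction="forward",      # "backward" for backward elimination
    scoring="roc_auc",
    cv=5,
)
selector.fit(X, y)
print(selector.get_support(indices=True))  # indices of the kept attributes
```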
Thanks for your understanding.
Regards,
Varun
https://www.varunmandalapu.com/
Be Safe. Follow precautions and Maintain Social Distancing
Answers
Furthermore, I used the Generate Weight operator to compensate for the class imbalance. I tested this both outside and inside the cross-validation.
I tested other ML operators as well, such as Naive Bayes, Gradient Boosted Trees, and so forth. Deep Learning usually performed best.
Yes, I am aware that the sample size is relatively low. What ML operator would best fit, given the sample size?
I still don't quite get why AUC is close to chance whenever Leave-One-Out cross-validation is used.
I can see why the accuracy measure has a high standard deviation (in each fold you get either a 100% correct prediction or a 0% correct prediction), but how does that also affect AUC? Is it because of how AUC is actually calculated (can you elaborate on this)?
By the way, class imbalance, modeling technique, and data size seem to have no effect on this (I tried many variations of the above in RapidMiner) and the same AUC behavior is observed.
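One plausible explanation (a sketch, not a statement about RapidMiner's internals): AUC ranks positives against negatives, so it needs both classes in the test set and is undefined on a one-example fold; averaging per-fold values then degenerates toward 0.5 regardless of model quality. Pooling the held-out predictions and computing AUC once avoids this:

```python
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import roc_auc_score
from sklearn.model_selection import LeaveOneOut, cross_val_predict

X, y = make_classification(n_samples=100, random_state=0)

# Collect each example's held-out probability, then score once globally
# instead of averaging 100 undefined per-fold AUC values.
probs = cross_val_predict(LogisticRegression(max_iter=1000), X, y,
                          cv=LeaveOneOut(), method="predict_proba")[:, 1]
print(roc_auc_score(y, probs))
```

Accuracy does not have this problem because it is well defined on a single example, which is why it merely gets noisy while AUC collapses.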