How can we see the threshold chosen by the Auto Model classification model for the final confusion matrix?

unmunm Member Posts: 2 Learner I
edited June 2019 in Help
The auto model we created uses GBTree and produces a confusion matrix. We would like to see what threshold it had used for creating this matrix. Is there a way to view the threshold used?

Answers

  • unmunm Member Posts: 2 Learner I
    Thanks @kypexin and @Telcontar120. Really appreciate your time answering this. Yes, we guessed as much (0.5 as the threshold) but wanted to confirm it, in case Auto Model was doing anything more intelligent. That answers the question!
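For readers who want to see what a fixed 0.5 threshold means for the matrix, here is a minimal sketch in plain Python. The labels and confidences are made-up illustration data, not Auto Model output, and treating a confidence of exactly 0.5 as positive (`>=`) is an assumption of this sketch.

```python
def confusion_matrix(labels, confidences, threshold=0.5):
    """Count TP/FP/FN/TN for a binary problem.

    labels: 1 for the positive class, 0 for the negative class.
    confidences: model confidence for the positive class.
    """
    tp = fp = fn = tn = 0
    for y, c in zip(labels, confidences):
        # Assumption: a confidence equal to the threshold counts as positive.
        predicted = 1 if c >= threshold else 0
        if predicted == 1 and y == 1:
            tp += 1
        elif predicted == 1 and y == 0:
            fp += 1
        elif predicted == 0 and y == 1:
            fn += 1
        else:
            tn += 1
    return {"tp": tp, "fp": fp, "fn": fn, "tn": tn}


# Hypothetical data for illustration only.
labels = [1, 1, 1, 0, 0, 0, 0, 1]
confidences = [0.9, 0.6, 0.4, 0.3, 0.55, 0.1, 0.45, 0.8]

default_matrix = confusion_matrix(labels, confidences)           # threshold 0.5
stricter_matrix = confusion_matrix(labels, confidences, 0.7)     # threshold 0.7

print(default_matrix)
print(stricter_matrix)
```

The same confidences give a different matrix as soon as the threshold moves, which is why it matters which value was used.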
  • IngoRM Administrator, Moderator, Employee, RapidMiner Certified Analyst, RapidMiner Certified Expert, Community Manager, RMResearcher, Member, University Professor Posts: 1,751 RM Founder
    Hi,

    We have actually been discussing this a bit. It is hard to do this in a really intelligent way, for the reasons @kypexin has been mentioning. Without knowing the business context, one value is almost as good as any other :-)

    However, there are three ways from here to potentially improve this a bit:
    1. Offer a full-blown cost matrix based approach for Auto Model and perform a threshold optimization for optimizing profits / costs
    2. Optimize thresholds in a way that Accuracy (or F-Measure or...) is maximized
    3. Do nothing and leave it as it is
    I personally do not like No 1 since it would take away some of the simplicity of AM in the early prototyping phase. But I see the benefits, of course, and could imagine making this optional.
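To make No 1 concrete, here is a rough sketch of a cost-based threshold search in plain Python. It is not how Auto Model works; the costs, the data, and the 0.01 grid are all hypothetical choices for illustration.

```python
def total_cost(labels, confidences, threshold, cost_fp, cost_fn):
    """Sum misclassification costs at a given threshold (correct predictions cost 0)."""
    cost = 0.0
    for y, c in zip(labels, confidences):
        predicted = 1 if c >= threshold else 0
        if predicted == 1 and y == 0:
            cost += cost_fp   # false positive
        elif predicted == 0 and y == 1:
            cost += cost_fn   # false negative
    return cost


def cheapest_threshold(labels, confidences, cost_fp, cost_fn):
    """Scan thresholds 0.00, 0.01, ..., 1.00 and return (threshold, cost) with the lowest cost.

    Ties go to the lowest threshold, which is an arbitrary choice of this sketch.
    """
    candidates = [i / 100 for i in range(101)]
    best = min(candidates,
               key=lambda t: total_cost(labels, confidences, t, cost_fp, cost_fn))
    return best, total_cost(labels, confidences, best, cost_fp, cost_fn)


# Hypothetical data and costs: a missed positive is 5x as expensive as a false alarm.
labels = [1, 1, 1, 0, 0, 0, 0, 1]
confidences = [0.9, 0.6, 0.4, 0.3, 0.55, 0.1, 0.45, 0.8]

cost_at_default = total_cost(labels, confidences, 0.5, cost_fp=1, cost_fn=5)
best_threshold, best_cost = cheapest_threshold(labels, confidences, cost_fp=1, cost_fn=5)

print(cost_at_default, best_threshold, best_cost)
```

With false negatives priced higher, the search moves the threshold below 0.5. A proper version would tune the threshold on validation data rather than on the data used for the final matrix.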

    No 2 is at least avoiding problems with strongly imbalanced data sets and is what many internal people here at RM would love to see for AM.
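A small sketch of what No 2 could look like, assuming F-measure (F1) as the target and a simple 0.01 grid. The data set is invented to be imbalanced (2 positives out of 10), and none of this reflects Auto Model's actual implementation.

```python
def f_measure(labels, confidences, threshold):
    """F1 = 2*TP / (2*TP + FP + FN); returns 0.0 when there are no true positives."""
    tp = fp = fn = 0
    for y, c in zip(labels, confidences):
        predicted = 1 if c >= threshold else 0
        if predicted == 1 and y == 1:
            tp += 1
        elif predicted == 1 and y == 0:
            fp += 1
        elif predicted == 0 and y == 1:
            fn += 1
    if tp == 0:
        return 0.0
    return 2 * tp / (2 * tp + fp + fn)


def best_f_threshold(labels, confidences):
    """Scan thresholds 0.00, 0.01, ..., 1.00; ties go to the lowest threshold."""
    candidates = [i / 100 for i in range(101)]
    return max(candidates, key=lambda t: f_measure(labels, confidences, t))


# Hypothetical imbalanced data: the model never gets above 0.5 for the rare class.
labels =      [0,    0,   0,   0,    0,   0,   0,    1,    0,   1]
confidences = [0.05, 0.1, 0.1, 0.15, 0.2, 0.2, 0.25, 0.35, 0.4, 0.45]

f_at_default = f_measure(labels, confidences, 0.5)
tuned_threshold = best_f_threshold(labels, confidences)
f_at_tuned = f_measure(labels, confidences, tuned_threshold)

print(f_at_default, tuned_threshold, f_at_tuned)
```

At 0.5 nothing is predicted positive, so the F-measure is 0 even though accuracy looks fine at 80%. That is the imbalance problem a tuned threshold avoids.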

    No 3 is very efficient in terms of resources :-)

    I appreciate any opinion here (including additional ideas). We may be able to improve this for one of the future releases if we have a good plan which is widely preferred.

    Thanks,
    Ingo
  • Telcontar120 Moderator, RapidMiner Certified Analyst, RapidMiner Certified Expert, Member Posts: 1,635 Unicorn
    Personally, I think option #3 has the virtue of simplicity as well as efficiency, and thus is a good choice for Auto Model. Many users of Auto Model might not understand the nuances of threshold selection and modification, and I fear that building it in automatically (as in option #2) could lead to additional confusion and misunderstanding later. So my vote would be to keep option #3.

    Brian T.
    Lindon Ventures
    Data Science Consulting from Certified RapidMiner Experts