RM 9.4 feedback (official release) : Costs/Benefits calculation

lionelderkrikor Moderator, RapidMiner Certified Analyst, Member Posts: 1,195 Unicorn
Dear all,

First, thank you for implementing the costs/benefits calculation in this new release. I think a lot of users (including me) were waiting for this new feature.

Two months ago I asked several questions about the Costs/Benefits calculation in the thread below, and thanks to @IngoRM's answers, everything was clear:

https://community.www.turtlecreekpls.com/discussion/55904/questions-on-rapidminer-9-4-beta-new-releases

But in this official release, I see that "Total Cost/Benefit (expected)" and the associated average were abandoned. My first question is: why?

The "Total Cost/Benefit (expected)" and the associated average are replaced by :
- "Total for best option"
- "Gain"

My second question is: can you explain how these 2 numbers are calculated (despite my efforts I was not able to reproduce them) and why these 2 new numbers are more relevant than the "Total Cost/Benefit (expected)"?

Here is my attempt to reproduce these 2 numbers with the Titanic dataset, with all options left at their defaults in AutoModel, using the NB model:




Third question: in the new column called "cost", why is the cost not counted as negative when the prediction is wrong? (I assume the cost matrix is the following):








Thank you for listening,

Regards,

Lionel

Answers

  • lionelderkrikor Moderator, RapidMiner Certified Analyst, Member Posts: 1,195 Unicorn
    Hi Ingo,

    Yes, your long and detailed explanation helps me a lot to understand these new Benefits/Costs concepts. #noblackboxes :)
    Thank you for spending your time answering my questions.

    Now you'll think I'm picky about the details, but I will quote the German philosopher Friedrich Nietzsche: "The Devil is in the details" >:)
    I begin:
    The 3 money indicators (Total Cost/Benefits, Total for Best Option, Gain) are calculated on the whole validation set (i.e. 524 examples for the Titanic dataset [1309 examples x 40%]):



    But the displayed confusion matrix is NOT built on the whole validation set:



    Here we can see that the number of examples used to build this confusion matrix (still for the Titanic dataset) is
    219 + 135 + 7 + 14 = 375 examples, a priori due to the factor 5/7 introduced by the Performance Average (Robust) operator.

    My question is about the homogeneity of the results: should the 3 money indicators not be calculated from this displayed confusion matrix? In other words, the displayed money indicators currently don't correspond directly to the displayed confusion matrix...
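
The two example counts quoted above can be checked with a short sketch. It only reproduces the arithmetic from this post (1309 examples, a 40% validation share, the four confusion matrix cells, and the 5/7 factor); the variable names are illustrative, and the explanation of the 5/7 factor is my assumption, not confirmed behavior of the operator.

```python
# Figures quoted in the post; variable names are illustrative only.
TOTAL_EXAMPLES = 1309        # size of the Titanic dataset
VALIDATION_FRACTION = 0.40   # share held out for validation

# Size of the whole validation set: 1309 x 40% = 523.6, rounded to 524.
validation_size = round(TOTAL_EXAMPLES * VALIDATION_FRACTION)

# The four cells of the displayed confusion matrix.
confusion_cells = [219, 135, 7, 14]
displayed_total = sum(confusion_cells)

# Assumption: only 5 of 7 validation subsets feed the displayed matrix.
expected_kept = validation_size * 5 / 7

print(validation_size)           # 524
print(displayed_total)           # 375
print(round(expected_kept, 1))   # 374.3
```

The displayed matrix covers 375 examples, which is within one example of 524 x 5/7, so the 5/7 factor is at least consistent with the gap between the two counts.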

    Thank you for your patience and for listening...

    Regards,

    Lionel



  • lionelderkrikor Moderator, RapidMiner Certified Analyst, Member Posts: 1,195 Unicorn
    You got me there
    So Friedrich Nietzsche was right .....>:)

    More seriously, I agree with your point of view, Ingo, and once again, thanks for taking the time to answer me.

    Regards,

    Lionel
  • Telcontar120 Moderator, RapidMiner Certified Analyst, RapidMiner Certified Expert, Member Posts: 1,635 Unicorn
    This is a very interesting discussion. I haven't had a chance to dive into this new operator yet, but I had a couple of questions.
    @IngoRM how is the new operator different from the existing Performance(Costs) operator? Or is it?
    It appears that they require the same inputs (a class order and then a misclassification cost matrix). In this framework, are you still allowed to enter benefits as negative costs?
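
To make the "benefits as negative costs" convention concrete, here is a minimal sketch of how a total could be computed from a confusion matrix and a cost matrix. Everything in it is hypothetical: the class labels, the assignment of counts to cells, the cost values, and the `total_cost` helper are assumptions for illustration, not the internals of either operator.

```python
# Hypothetical confusion counts keyed by (true class, predicted class).
# Which count belongs to which cell is an assumption.
confusion = {
    ("no", "no"): 219,
    ("no", "yes"): 7,
    ("yes", "no"): 14,
    ("yes", "yes"): 135,
}

# Hypothetical cost matrix: positive values are costs,
# negative values are benefits entered as negative costs.
costs = {
    ("no", "no"): 0.0,
    ("no", "yes"): 1.0,
    ("yes", "no"): 2.0,
    ("yes", "yes"): -1.0,
}


def total_cost(confusion, costs):
    """Sum count x cost over every cell of the confusion matrix."""
    return sum(count * costs[cell] for cell, count in confusion.items())


total = total_cost(confusion, costs)
average = total / sum(confusion.values())

print(total)  # -100.0, i.e. a net benefit under these made-up costs
```

With this sign convention a negative total means the benefits outweigh the costs, which is why the question of whether negative entries are still accepted matters.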

    Brian T.
    Lindon Ventures
    Data Science Consulting from Certified RapidMiner Experts