Macros in Cost Matrix (Performance)

FBT Member Posts: 106 Unicorn
edited December 2018 in Help

Hello community,

I am trying to run an optimization of a model based on costs for wrong and correct classifications. However, instead of assigning fixed values in the cost matrix, I would like to use macros and loop over my example set to set the required cost values. To give a bit more color, imagine you are trying to make a classification on customer churn and want to assign different cost values for each customer, i.e. the customer's revenue in the past 6 months.

Building the logic is not a big problem (at least I believe it isn't); however, the operator "Performance (Costs)" apparently does not accept macros as input values. Is there anything I can do about it, or is there any other workaround?

This would be a short sample process based on the Titanic data:




<parameter key="repository_entry" value="//Samples/data/Titanic"/>

<parameter key="attribute_name" value="Survived"/>
<parameter key="target_role" value="label"/>

<parameter key="Scoring" value="(if(Age &lt; 5, 1.1, if(Age &gt; 5 &amp;&amp; Age &lt; 25, 1.2, 1.3)))-1"/>

<parameter key="sampling_type" value="stratified sampling"/>

<parameter key="macro" value="Scoring"/>
<parameter key="macro_type" value="data_value"/>
<parameter key="attribute_name" value="Scoring"/>
<parameter key="example_index" value="%{example}"/>

<parameter key="cost_matrix" value="[0.0 1.0;1.0 0.0]"/>

<parameter key="class_name" value="Yes"/>
<parameter key="class_name" value="No"/>

Alternatively, does anybody have an example process for the operator "Performance (User-Based)"? It does not have a tutorial process and I am having a hard time figuring out how exactly it works.

Best Answer

  • MartinLiebig Administrator, Moderator, Employee, RapidMiner Certified Analyst, RapidMiner Certified Expert, University Professor Posts: 3,404 RM Data Scientist
    Solution Accepted

    Hi,

    You can use Generate Attributes and Extract Performance to get a similar result. Just build a "cost" attribute which is %{churnChurn} for a churner who is predicted as churn, and so on for the other combinations. Afterwards, you extract the average of this attribute as the performance.

    With that you are of course already halfway to a customer-based performance (e.g. one based on each customer's Customer Lifetime Value).

    Cheers,

    Martin

    - Sr. Director Data Solutions, Altair RapidMiner -
    Dortmund, Germany
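
The idea in the accepted answer (one "cost" value per example, then the average as the performance) can be sketched outside RapidMiner as well. The following is only a minimal Python illustration, not the actual process: the cost rules, the `retention_offer` value, and the sample rows are assumptions made up for the example, and the function names are hypothetical.

```python
from statistics import mean


def example_cost(actual, predicted, revenue, retention_offer=10.0):
    """Cost of a single prediction (illustrative rules, not from the thread)."""
    if actual == "churn" and predicted == "no churn":
        # Missed churner: assume the customer's past revenue is lost.
        return revenue
    if actual == "no churn" and predicted == "churn":
        # False alarm: assume a fixed retention offer was wasted.
        return retention_offer
    # Correct prediction: no cost in this sketch.
    return 0.0


def average_cost(rows):
    """Average per-example cost, analogous to extracting the mean of a cost attribute."""
    return mean(example_cost(a, p, r) for a, p, r in rows)


# (actual, predicted, revenue in the past 6 months) -- made-up sample data
rows = [
    ("churn", "churn", 120.0),
    ("churn", "no churn", 300.0),
    ("no churn", "churn", 80.0),
    ("no churn", "no churn", 50.0),
]

print(average_cost(rows))  # 77.5
```

Because the cost is computed per row, each customer can carry a different value, which is what a fixed cost matrix cannot express.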

Answers

  • Telcontar120 Moderator, RapidMiner Certified Analyst, RapidMiner Certified Expert, Member Posts: 1,635 Unicorn

    I have been thinking about this question, and conceptually I don't believe the performance cost operator can utilize different values for different cases, unless you are literally building a separate model for each case (inside a Loop Examples, for instance). Since the thing that is being minimized is the misclassification cost across all observations based on different models, if it needed to have a different calculation for each observation, then there would potentially be a different model required.

    Having said that, I am not sure why it wouldn't accept a macro to set the values in the cost matrix; that's a question for the developers, I think.

    Brian T.
    Lindon Ventures
    Data Science Consulting from Certified RapidMiner Experts
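
To illustrate the point about a cost matrix using the same values for every case, here is a small Python sketch that evaluates the `[0.0 1.0;1.0 0.0]` matrix and the `Yes`/`No` classes from the sample process. It is an assumption-laden illustration, not RapidMiner's implementation: the `[predicted][actual]` indexing is assumed (it makes no difference for this symmetric matrix), and the labels and predictions are made up.

```python
CLASSES = ["Yes", "No"]

# Cost matrix from the sample process: 0 for correct, 1 for wrong predictions.
# Indexed as COST_MATRIX[predicted][actual]; this orientation is an assumption.
COST_MATRIX = [
    [0.0, 1.0],
    [1.0, 0.0],
]


def matrix_cost(actual, predicted):
    """Average misclassification cost under one fixed cost matrix."""
    total = 0.0
    for a, p in zip(actual, predicted):
        # The cost depends only on the (predicted, actual) class pair,
        # never on which individual example is being scored.
        total += COST_MATRIX[CLASSES.index(p)][CLASSES.index(a)]
    return total / len(actual)


# Made-up labels and predictions for illustration
actual = ["Yes", "No", "No", "Yes"]
predicted = ["Yes", "Yes", "No", "No"]

print(matrix_cost(actual, predicted))  # 0.5
```

Every misclassified example contributes the same cost for its class pair, so per-customer values have no place to enter this calculation.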
  • FBT Member Posts: 106 Unicorn

    Thanks @mschmitz! That is exactly what I was looking for, and I have officially found my new favorite performance operator. :-)

    Also thanks to @Telcontar120, you are probably right that my initial workaround proposal is flawed and would not work as intended. Luckily, RM apparently has a great solution for every possible problem.
