"MetaCost vs Performance(Costs) operator"
I'm wondering whether there are any differences in the implementation of the loss optimization function of the MetaCost operator vs. the Performance(Costs) operator. I would not expect there to be. However, I am seeing significant differences in outcomes when comparing a single DT learner evaluated with the Performance(Costs) operator and a cost matrix against the MetaCost operator with 1 iteration and an inner DT using the same cost matrix. The results diverge widely, not only in cost but also in other performance metrics such as accuracy and AUC, as well as in the resulting models. See the attached example process:
<parameter key="use_local_random_seed" value="true"/>
<parameter key="use_local_random_seed" value="true"/>
<parameter key="use_local_random_seed" value="true"/>
<connect from_op="Apply Model (2)" from_port="labelled data" to_op="Performance (Cost DT)" to_port="example set"/>
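For reference, here is a minimal sketch of what I understand a cost-based performance measure to do: it scores predictions that have already been made and never changes them. The function name `average_cost`, the matrix orientation (rows = predicted class, columns = true class), and the example numbers are my own assumptions for illustration, not taken from the operator's source.

```python
def average_cost(true_labels, predicted_labels, cost_matrix):
    """Mean cost per example.

    Assumption: cost_matrix[predicted][true] is the cost of that outcome.
    """
    if len(true_labels) != len(predicted_labels):
        raise ValueError("label lists must have the same length")
    if not true_labels:
        raise ValueError("at least one example is required")
    total = sum(cost_matrix[p][t] for t, p in zip(true_labels, predicted_labels))
    return total / len(true_labels)


# Hypothetical two-class example: class 0 = negative, class 1 = positive.
# Predicting 0 when the truth is 1 costs 5; predicting 1 when the truth is 0 costs 1.
cost_matrix = [[0, 5],
               [1, 0]]
true_labels = [0, 0, 1, 1]
predicted_labels = [0, 1, 0, 1]

print(average_cost(true_labels, predicted_labels, cost_matrix))
```

In this reading, the cost matrix only affects the reported score; the model and its predictions are whatever the learner produced without it.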
@mschmitz any ideas on the underlying algorithms that would be relevant here, or other reasons these might be so different?
Answers
Hi @Telcontar120,
There is a significant difference. Performance (Costs) is "just" a performance measure. MetaCost is an ensemble learner which, I think, tunes itself to work better on the cost metric.
From the documentation:
The MetaCost operator makes its base classifier cost-sensitive by using the cost matrix specified in the cost matrix parameter. The method used by this operator is similar to the MetaCost method described by Pedro Domingos (1999).
The code for it is available here: https://github.com/rapidminer/rapidminer-studio/blob/master/src/main/java/com/rapidminer/operator/learner/meta/MetaCost.java
Btw, @hhomburg is the author.
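To illustrate what "cost-sensitive" means here, below is a rough sketch of the minimum-expected-cost decision rule that Domingos (1999) uses to relabel examples. It is not a transcription of the linked MetaCost.java; the function names, the matrix orientation (rows = predicted class, columns = true class), and the numbers are assumptions chosen for illustration.

```python
def most_likely_class(probabilities):
    """Plain decision rule: pick the class with the highest estimated probability."""
    return max(range(len(probabilities)), key=probabilities.__getitem__)


def min_expected_cost_class(probabilities, cost_matrix):
    """Cost-sensitive rule: pick the class whose expected cost is lowest.

    probabilities[j] is the estimated P(true class = j).
    Assumption: cost_matrix[i][j] is the cost of predicting i when the truth is j.
    """
    expected = [
        sum(cost_matrix[i][j] * probabilities[j] for j in range(len(probabilities)))
        for i in range(len(cost_matrix))
    ]
    return min(range(len(expected)), key=expected.__getitem__)


# Hypothetical costs: predicting 0 when the truth is 1 costs 5, the reverse costs 1.
cost_matrix = [[0, 5],
               [1, 0]]
probabilities = [0.7, 0.3]  # the model thinks class 0 is more likely

print(most_likely_class(probabilities))
print(min_expected_cost_class(probabilities, cost_matrix))
```

With these numbers the expected cost of predicting class 0 is 5 × 0.3 = 1.5 and of predicting class 1 is 1 × 0.7 = 0.7, so the cost-sensitive rule predicts class 1 even though class 0 is more probable. A rule like this changes the predictions themselves, which would move accuracy and other metrics as well as cost.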
BR,
Martin
Dortmund, Germany
Yes, I understand that is the case and I know the difference between an ensemble and a base learner :-).
However, if you set the iterations of MetaCost to 1, then it should be using only one version of the inner learner, which in the example process I supplied is a DT with the same parameters as the second model which uses the same DT learner and the same cost matrix via Performance(Costs). In that case, why would the results be so different?
@hhomburg any ideas here?
Lindon Ventures
Data Science Consulting from Certified RapidMiner Experts
@mschmitz @hhomburg @sgenzer @Ingo Any ideas about this one? I'm still puzzling over why the differences are so great when the iterations for MetaCost = 1. Thanks for taking a look at it!