DT : different attribute weights with/without cross-validation

lionelderkrikor · December 2017

Hi,

I created two processes, each including a decision tree (DT) model built on the "Golf" dataset.

1. First, a classic DT model:

In this case, for the attribute weights, I get:

Here is the process:

<operator activated="true" class="process" compatibility="8.0.001" expanded="true" name="Process">

2. A DT model with cross-validation:

In this case, for the attribute weights, I get:

Here is the process:

<operator activated="true" class="process" compatibility="8.0.001" expanded="true" name="Process">

In both cases, as expected, the two DT models are strictly the same. Why are the attribute weights not equal?

NB: In the case of split validation, I get the same attribute weights as in case 1.

Thank you for your feedback,

Regards,

Lionel

Pavithra_Rao · December 2017

Hi Lionel,

Cross-validation builds k + 1 models (k = number of folds), and the attribute weights that are output are those of the last iteration. As you know, in each iteration the training and testing sets contain different subsets of the data. Hence the weights output by the classic DT model (where the entire dataset is consumed by the model at once) and by the CV DT model are not the same.

Also, it's always good to generate the attribute weights using the entire dataset (i.e. the classic model) rather than a subset of the data (i.e. via cross-validation or split validation).
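To make the point above concrete, here is a minimal sketch in plain Python rather than a RapidMiner process. It rests on several assumptions: the rows are a small nominal toy table in the spirit of "Golf" (not the actual RapidMiner sample set), information gain stands in for whatever weighting the Decision Tree operator really uses, and the helper names (`info_gain`, `weights`, `fold_training_sets`) and the fold count are made up for illustration. It only shows that weights computed on a fold's training subset need not match weights computed on the full data.

```python
from collections import Counter
from math import log2

# Toy, nominal stand-in for a Golf-like table (assumption, not the RapidMiner sample):
# columns are (Outlook, Humidity, Wind, Play).
ROWS = [
    ("sunny", "high", "weak", "no"),
    ("sunny", "high", "strong", "no"),
    ("overcast", "high", "weak", "yes"),
    ("rain", "high", "weak", "yes"),
    ("rain", "normal", "weak", "yes"),
    ("rain", "normal", "strong", "no"),
    ("overcast", "normal", "strong", "yes"),
    ("sunny", "high", "weak", "no"),
    ("sunny", "normal", "weak", "yes"),
    ("rain", "normal", "weak", "yes"),
    ("sunny", "normal", "strong", "yes"),
    ("overcast", "high", "strong", "yes"),
    ("overcast", "normal", "weak", "yes"),
    ("rain", "high", "strong", "no"),
]
ATTRIBUTES = ["Outlook", "Humidity", "Wind"]


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    return -sum((c / total) * log2(c / total) for c in Counter(labels).values())


def info_gain(rows, col):
    """Information gain of the attribute in column `col` with respect to the label."""
    labels = [r[-1] for r in rows]
    remainder = 0.0
    for value in set(r[col] for r in rows):
        subset = [r[-1] for r in rows if r[col] == value]
        remainder += len(subset) / len(rows) * entropy(subset)
    return entropy(labels) - remainder


def weights(rows):
    """Stand-in 'attribute weights': information gain per attribute, rounded."""
    return {name: round(info_gain(rows, i), 3) for i, name in enumerate(ATTRIBUTES)}


def fold_training_sets(rows, k):
    """Yield the training part of each of k folds (row i is held out in fold i % k)."""
    for fold in range(k):
        yield [r for i, r in enumerate(rows) if i % k != fold]


# Weights from the whole dataset (the "classic" case).
full_weights = weights(ROWS)
# Weights from each fold's training subset (what an iteration inside CV sees).
fold_weights = [weights(train) for train in fold_training_sets(ROWS, 7)]

print("full data:", full_weights)
for i, w in enumerate(fold_weights, 1):
    print("fold", i, w)
```

Comparing the printed dictionaries shows the per-fold values drifting away from the full-data values, which is the same effect as the mismatch between case 1 and case 2, under the assumptions stated above.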

Cheers,

lionelderkrikor · December 2017

Hi Pavithra,

Thank you for this clear explanation. I now better understand these differences in the results.

So, in practice, I have to duplicate my model outside the cross-validation operator to generate the "good weights".

Thank you,

Best regards,

Lionel

Altair RapidMiner Community