Model selection using cross validation

SmithSmith · Member · Posts: 2 · Contributor I
edited August 2019 in Help
Hello,

I am trying to do simple model selection using cross validation in RapidMiner.
The goal is to evaluate various classification methods using the same folds of cross validation for each method and to select the one with the best average performance over the folds. This should be done automatically within one process.

Below is the process I've created to accomplish this goal.

The potential problem I see with this approach is that the X-Validation operator is placed within the Loop operator. I am not sure whether every classification method will be evaluated on the same folds of X-Validation. Is this guaranteed when the local random seed parameter of X-Validation is set to true?
Is there perhaps an easier way to find the best classifier for the given classification task automatically?
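
To illustrate the pattern I am after in scikit-learn terms (not RapidMiner; the dataset and learners below are only placeholders), this is what "same folds for every method" would look like with a fixed seed:

[code]
# Fixing the shuffle seed makes KFold produce identical folds, so every
# classifier is evaluated on exactly the same train/test partitions;
# this is the analogue of "use local random seed" on X-Validation.
from sklearn.datasets import load_iris
from sklearn.model_selection import KFold, cross_val_score
from sklearn.naive_bayes import GaussianNB
from sklearn.neighbors import KNeighborsClassifier
from sklearn.tree import DecisionTreeClassifier

X, y = load_iris(return_X_y=True)
cv = KFold(n_splits=10, shuffle=True, random_state=1992)  # fixed seed

candidates = {
    "decision tree": DecisionTreeClassifier(random_state=0),
    "naive bayes": GaussianNB(),
    "k-nn": KNeighborsClassifier(),
}

# The same cv object (hence the same folds) is reused for each method;
# the method with the best mean score over the folds wins.
scores = {name: cross_val_score(clf, X, y, cv=cv).mean()
          for name, clf in candidates.items()}
print(max(scores, key=scores.get), scores)
[/code]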

Many thanks in advance.

[code]
<operator activated="true" class="x_validation" compatibility="5.3.013" expanded="true" height="112" name="Validation" width="90" x="179" y="30">
  ...
  <portSpacing port="sink_through 1" spacing="0"/>
  ...
</operator>
[/code]

Answers

  • frasfras · Member · Posts: 93 · Contributor II
    You are on the right track. Don't forget to optimize the parameters of the decision tree, or to replace it with a more suitable learner.
  • SmithSmith · Member · Posts: 2 · Contributor I
    Hi,

    thank you for your input.

    If I understood you correctly, you do think that the local random seed parameter of the X-Validation operator ensures the same division of the dataset when applying and comparing each learning method?

    Regarding the Optimize Parameters operator: indeed, it would be nice to select not just the optimal method for the given problem but also its optimal parameters within one process. Can you be more concrete and suggest an approach, or better yet post a process for doing this? I have read about nested cross-validation approaches that deal with this issue, but it seems there is no standard way of doing it... Doing it sequentially (first selecting the best learner with default parameters and then searching for potentially better parameter values) is surely not optimal. For concreteness, here is roughly the nested scheme I mean, sketched below.
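
    Again in scikit-learn rather than RapidMiner (learners, parameter grids, and dataset are only placeholders): the inner cross-validation tunes each method's parameters, the outer one scores the tuned method, and comparing the outer scores selects the method, all in one pass.

    [code]
    # Nested CV: GridSearchCV (inner loop) picks parameters, the outer CV
    # scores the tuned learner; the best outer mean selects the method.
    from sklearn.datasets import load_iris
    from sklearn.model_selection import GridSearchCV, KFold, cross_val_score
    from sklearn.svm import SVC
    from sklearn.tree import DecisionTreeClassifier

    X, y = load_iris(return_X_y=True)
    outer = KFold(n_splits=5, shuffle=True, random_state=1992)
    inner = KFold(n_splits=3, shuffle=True, random_state=1992)

    candidates = {
        "tree": (DecisionTreeClassifier(random_state=0),
                 {"max_depth": [2, 4, None]}),
        "svm": (SVC(), {"C": [0.1, 1, 10]}),
    }

    for name, (clf, grid) in candidates.items():
        tuned = GridSearchCV(clf, grid, cv=inner)              # parameter search
        score = cross_val_score(tuned, X, y, cv=outer).mean()  # unbiased estimate
        print(name, round(score, 3))
    [/code]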





