Cross validation

Yasmin · August 2018

Hello,
I have a question about the output of cross validation. If we use 90% of the data for training and 10% for testing, why does the result show the whole data set rather than just the 10% test part?
I'd be thankful if someone could answer my question.
Yasmin

lionelderkrikor · August 2018

Hi @Yasmin,

Legitimate question!

Here is a possible answer:

In practice, for a 10-fold cross validation, RapidMiner performs 11 iterations.

During the last iteration, RapidMiner applies the model to the whole training data set, so the training set and the test set have the same length.

Regards,

Lionel

NB: You can visualize this behaviour by setting a "Breakpoint After" on the Apply Model operator (inside the Cross Validation operator).

tftemme · August 2018

Hi @Yasmin,

While it is true that the Cross Validation operator builds the final model on the whole data set (and thus performs an 11th iteration of the Training subprocess, if the model port is connected), the Testing subprocess is only performed 10 times. That is also the reason you have all your input data at the test result port: in every iteration, a different 10% of your input data is used as the test set, so within the Cross Validation every example of your input data is used exactly once for testing.
At the outer result port, all test sets are appended together, so you get your whole input data set back. You can visualize this by adding a Generate Attributes operator in the Testing subprocess of the Cross Validation and generating an attribute iteration with the value eval(%{a}) (the macro %{a} contains the number of times the current operator was applied).
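To make the counting concrete, here is a minimal plain-Python sketch of the bookkeeping described above. It is not RapidMiner code: the function names, the contiguous fold split, and the 100-example data set are assumptions chosen for illustration (RapidMiner's own sampling may shuffle or stratify), and no real model is trained.

```python
def k_fold_indices(n_examples, k):
    """Split range(n_examples) into k contiguous, near-equal test folds.

    Hypothetical helper: contiguous folds are an assumption made for
    readability, not a claim about how RapidMiner samples.
    """
    base, extra = divmod(n_examples, k)
    folds, start = [], 0
    for i in range(k):
        size = base + (1 if i < extra else 0)
        folds.append(list(range(start, start + size)))
        start += size
    return folds


def cross_validate(n_examples, k=10, build_final_model=True):
    """Count training/testing runs and collect the appended test sets."""
    training_runs = 0
    testing_runs = 0
    appended_test_rows = []  # (example index, iteration that tested it)

    for iteration, test_idx in enumerate(k_fold_indices(n_examples, k), start=1):
        held_out = set(test_idx)
        train_idx = [i for i in range(n_examples) if i not in held_out]
        assert len(train_idx) + len(test_idx) == n_examples

        training_runs += 1  # train on the other k-1 folds (about 90% for k=10)
        testing_runs += 1   # apply the model to the held-out fold (about 10%)

        # Tag each test example with its iteration, similar in spirit to
        # the "iteration" attribute suggested above.
        appended_test_rows.extend((i, iteration) for i in test_idx)

    if build_final_model:
        training_runs += 1  # extra training run on all data, no testing

    return training_runs, testing_runs, appended_test_rows


training_runs, testing_runs, rows = cross_validate(n_examples=100, k=10)
print(training_runs, testing_runs, len(rows))  # 11 10 100
```

Each of the 100 examples appears exactly once in the appended test output, which is why the result port shows the whole data set even though every single iteration only tests on 10% of it.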

Best regards,
Fabian

Yasmin · August 2018

Thank you so much, @lionelderkrikor and @tftemme, for your complete responses.
Best regards.
