Optimization Problem

Legacy User · Member · Posts: 0 · Newbie
edited November 2018 in Help
Hi,

I would like to find the best classification model w.r.t. accuracy
for a given example set. To achieve the best results, my idea is to evaluate
different supervised learners and optimize their parameters. In addition,
different feature selection algorithms should be applied to provide the most
suitable input for the parameter optimization of each learner.

So, my idea is something like a nested model:
for each learner that should be evaluated:
    for each example set determined by a particular feature selection:
        perform parameter optimization for the given feature set and learner
return: the model with maximal accuracy

What do you think about this idea? Does it make sense to combine feature
selection with learner parameter optimization to find the most accurate
model? Or would you proceed differently in that case?
Are other approaches more common in practice?

I am of the opinion that the most accurate model can only be found
when different example sets are provided for the parameter optimization,
so that a high number of combinations is covered by the performance
evaluation. Correct me if I'm wrong. :-)

If my idea is OK, I would ask you to help me model this use case
in RapidMiner. It should be something like the sample
05_Features/10_ForwardSelection.xml, but using not just the NearestNeighbor
as learner but a parameter optimization scheme like in
07_Meta/01_ParameterOptimization.xml.

This is the code for the feature selection:

(the XML process listing was not preserved in this archived post)
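Since the original listing was lost, here is a minimal sketch of what a forward feature selection around a NearestNeighbors learner looks like in RapidMiner 4.x process XML, in the style of the sample 05_Features/10_ForwardSelection.xml (operator names here are illustrative, not the sample's originals):

```xml
<!-- Sketch only: a FeatureSelection wrapping a cross-validated
     NearestNeighbors learner; operator names are illustrative -->
<operator name="Root" class="Process">
    <operator name="ExampleSource" class="ExampleSource">
        <parameter key="attributes" value="../data/polynomial.aml"/>
    </operator>
    <operator name="FS" class="FeatureSelection" expanded="yes">
        <!-- the inner XValidation estimates the accuracy of each feature set -->
        <operator name="XValidation" class="XValidation" expanded="yes">
            <parameter key="sampling_type" value="shuffled sampling"/>
            <operator name="Learner" class="NearestNeighbors"/>
            <operator name="ApplierChain" class="OperatorChain" expanded="yes">
                <operator name="Applier" class="ModelApplier"/>
                <operator name="Performance" class="Performance"/>
            </operator>
        </operator>
    </operator>
</operator>
```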
but I don't manage to replace the NearestNeighbor with a parameter
optimization scheme. Could you help me?

Regards,
Martin

Answers

  • land · RapidMiner Certified Analyst, RapidMiner Certified Expert, Member · Posts: 2,531 · Unicorn
    Hi Martin,
    just replace it with any operator delivering a Model. You can do anything inside an OperatorChain, as long as a model is returned at the end.

    The general idea is good and could solve all your problems. One small problem: the computational costs will be rather high on real data sets, because of the doubly or triply exponential nature of the search space...

    Good luck:)

    Greetings,
    Sebastian
  • Legacy User · Member · Posts: 0 · Newbie
    Hi Sebastian et al.,

    thank you for your answer.

    However, I'm still not able to replace the simple NearestNeighbor
    model (used for the feature selection optimization) with an optimized
    learner model returned by a GridParameterOptimization. This
    is my current, non-working process:


    <operator name="Root" class="Process">
        <operator name="ExampleSource" class="ExampleSource">
            <parameter key="attributes" value="../data/polynomial.aml"/>
        </operator>
        <operator name="FS" class="FeatureSelection" expanded="yes">
            <operator name="XValidation" class="XValidation" expanded="yes">
                <parameter key="sampling_type" value="shuffled sampling"/>
                <operator name="GridParameterOptimization" class="GridParameterOptimization" expanded="yes">
                    <operator name="InnerXValidation" class="XValidation" expanded="yes">
                        <operator name="NearestNeighbors" class="NearestNeighbors"/>
                        <operator name="Evaluation" class="OperatorChain" expanded="yes">
                            <operator name="ModelApplier" class="ModelApplier"/>
                            <operator name="ClassificationPerformance" class="ClassificationPerformance"/>
                        </operator>
                    </operator>
                </operator>
                <operator name="ApplierChain" class="OperatorChain" expanded="yes">
                    <operator name="Applier" class="ModelApplier">
                        <list key="application_parameters"/>
                    </operator>
                    <operator name="Performance" class="Performance"/>
                </operator>
            </operator>
            <operator name="ProcessLog" class="ProcessLog">
                <list key="log">
                    <parameter key="generation" value="operator.FS.value.generation"/>
                    <parameter key="performance" value="operator.FS.value.performance"/>
                </list>
            </operator>
        </operator>
    </operator>

    The problem is that the first inner operator of XValidation
    expects a model, but gets a ParameterSet and a PerformanceVector.
    I have no idea how I can return the best model found by
    GridParameterOptimization and pass it to the next operator,
    ApplierChain. Could you please help me extend my process?

    Thank you.

    Regards,
    Martin
  • TobiasMalbrecht · Moderator, Employee, Member · Posts: 294 · RM Product Management
    Hi Martin,

    You simply have to put another learner into the process after the [tt]GridParameterOptimization[/tt] operator. For that learner you should set the optimized parameters via a [tt]ParameterSetter[/tt]. The following process gives you an idea of how to do that:

    (process XML not preserved in the archive)
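    Tobias's original listing did not survive the archive. The following is a minimal sketch of the structure he describes (a GridParameterOptimization followed by a ParameterSetter that copies the optimal parameters into a second learner, which then delivers the final Model), assuming RapidMiner 4.x syntax and illustrative operator names:

```xml
<!-- Sketch only: operator names and the parameter grid are illustrative -->
<operator name="Root" class="Process">
    <operator name="ExampleSource" class="ExampleSource">
        <parameter key="attributes" value="../data/polynomial.aml"/>
    </operator>
    <!-- delivers the best ParameterSet found via cross-validation -->
    <operator name="GridParameterOptimization" class="GridParameterOptimization" expanded="yes">
        <list key="parameters">
            <parameter key="Learner.k" value="1,3,5,7,9"/>
        </list>
        <operator name="XValidation" class="XValidation" expanded="yes">
            <operator name="Learner" class="NearestNeighbors"/>
            <operator name="Evaluation" class="OperatorChain" expanded="yes">
                <operator name="Applier" class="ModelApplier"/>
                <operator name="Performance" class="ClassificationPerformance"/>
            </operator>
        </operator>
    </operator>
    <!-- writes the optimized parameters of "Learner" into "FinalLearner" -->
    <operator name="ParameterSetter" class="ParameterSetter">
        <list key="name_map">
            <parameter key="Learner" value="FinalLearner"/>
        </list>
    </operator>
    <!-- trained once on the full data; this operator delivers the Model -->
    <operator name="FinalLearner" class="NearestNeighbors"/>
</operator>
```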
    Hope that helps,
    Tobias
  • Legacy UserLegacy User MemberPosts:0Newbie
    Hi Tobias,

    yes, that was exactly what I was looking for. Thank you.

    Regards,
    Martin