Problem with generalized linear model (lambda search)

scottchung64 Member Posts: 1 Learner I
edited December 2018 in Product Feedback - Resolved

Hi all,

I'm trying to do classification using a generalized linear model.

With the default settings, the lambda value is chosen by H2O (as described in the documentation).

However, I found that performance is much better when I use lambda search.

I don't understand the difference between these two methods.

Does the better performance with lambda search come from overfitting?

Thanks!

Best,

Scott

0 votes

Declined·Last Updated

Closing this idea - zero votes since March 2018. Please re-open if this is of interest. PROD-821

Comments

  • yyhuang Administrator, Employee, RapidMiner Certified Analyst, RapidMiner Certified Expert, Member Posts: 363 RM Data Scientist
    Solution Accepted

    Hi @scottchung64,

    You are correct. Lambda search controls the amount of regularization in order to avoid overfitting. Regularization adds penalties to the model-building process, and the GLM needs to find good values for the regularization parameters alpha and lambda. The lambda parameter controls the amount of regularization applied to the model.

    When you activate lambda search in the GLM operator, it takes longer to find the best parameter values.

    YY
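To make the idea of a lambda search concrete, here is a minimal Python sketch. It is not how H2O or the RapidMiner GLM operator implements the search: it assumes a toy one-feature ridge regression with a closed-form fit instead of a classification GLM, and the names `fit_ridge_1d`, `mse`, and `lambda_search` are made up for illustration. The point is only that one model is fitted per candidate lambda, which is why the search takes longer, and that the winner is picked on held-out data rather than on the training data.

```python
def fit_ridge_1d(xs, ys, lam):
    """Closed-form ridge fit for y ~ beta * x (one feature, no intercept)."""
    sxy = sum(x * y for x, y in zip(xs, ys))
    sxx = sum(x * x for x in xs)
    return sxy / (sxx + lam)  # a larger lam shrinks beta towards zero


def mse(xs, ys, beta):
    """Mean squared error of the predictions beta * x."""
    return sum((y - beta * x) ** 2 for x, y in zip(xs, ys)) / len(xs)


def lambda_search(train, valid, lambdas):
    """Fit one model per lambda and keep the one with the lowest validation error."""
    results = []
    for lam in lambdas:
        beta = fit_ridge_1d(train[0], train[1], lam)
        results.append((mse(valid[0], valid[1], beta), lam, beta))
    return min(results)  # tuple: (validation error, lambda, beta)


# Toy data (made up): the training targets are noisy, the validation targets follow y = x.
train = ([1.0, 2.0, 3.0], [1.5, 2.5, 4.5])
valid = ([1.0, 2.0, 3.0, 4.0], [1.0, 2.0, 3.0, 4.0])
lambdas = [100.0, 10.0, 1.0, 0.1, 0.01]  # candidate path, from strong to weak regularization

best_err, best_lam, best_beta = lambda_search(train, valid, lambdas)
print("best lambda:", best_lam, "beta:", round(best_beta, 3))
```

In this toy setup the nearly unregularized fits follow the noisy training targets too closely, so a moderate lambda wins on the validation data. A single fixed lambda, by contrast, costs only one fit but may be far from the best value.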

  • staskhalitov Member Posts: 3 Contributor I

    Is it possible to initiate an alpha search?

    I see this: "Providing multiple alpha values via the advanced parameters triggers a search."

    But how do I actually provide multiple values? What is the format?

  • yyhuang Administrator, Employee, RapidMiner Certified Analyst, RapidMiner Certified Expert, Member Posts: 363 RM Data Scientist

    Hi @staskhalitov,

    Good point. You will need to edit the "expert parameters" list.

    [Screenshots: alpha_search.PNG, alpha_list.PNG]

    Hope it helps.

    YY

  • staskhalitov Member Posts: 3 Contributor I

    So I tried your XML, but it seems the model just uses whatever alpha value is in the initial settings, 0.6 in your example.

    It doesn't look like it considered the additional alphas, 0.2 and 0.1, in the expert parameters.

    How do I actually initiate an alpha search as described here?

    alpha
    Description: The alpha parameter controls the distribution between the L1 (Lasso) and L2 (Ridge regression) penalties. A value of 1.0 for alpha represents Lasso, and an alpha value of 0.0 produces Ridge regression. Providing multiple alpha values via the advanced parameters triggers a search. Default is 0.0 for the L-BFGS solver, else 0.5.
    Range: real; 0.0-1.0
    Optional: true

    If I leave the initial alpha (0.6) blank and have additional alphas in the expert parameters, I get an error.
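As a side note on the alpha description quoted above, the following Python sketch shows how alpha splits the penalty between the L1 and L2 terms. It assumes the usual elastic-net form, lambda * (alpha * ||beta||_1 + (1 - alpha) / 2 * ||beta||_2^2), which to my knowledge matches the H2O GLM documentation; the function name `elastic_net_penalty` and the coefficient values are made up for illustration.

```python
def elastic_net_penalty(coefs, alpha, lam):
    """Elastic-net penalty: lam * (alpha * L1 + (1 - alpha) / 2 * squared L2)."""
    l1 = sum(abs(b) for b in coefs)   # Lasso part
    l2 = sum(b * b for b in coefs)    # Ridge part (squared L2 norm)
    return lam * (alpha * l1 + 0.5 * (1.0 - alpha) * l2)


coefs = [2.0, -1.0, 0.0]  # hypothetical model coefficients
lam = 0.1

# alpha = 1.0 leaves only the L1 term, alpha = 0.0 only the L2 term,
# and values in between mix the two.
for alpha in (1.0, 0.6, 0.0):
    print("alpha =", alpha, "penalty =", elastic_net_penalty(coefs, alpha, lam))
```

Because alpha only changes the mix and lambda only changes the overall strength, the two parameters have to be tuned separately, which is why an alpha search is a different thing from a lambda search.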

  • yyhuang Administrator, Employee, RapidMiner Certified Analyst, RapidMiner Certified Expert, Member Posts: 363 RM Data Scientist

    Hi @staskhalitov,

    Thanks for the follow-up! Great catch. I double-checked the model descriptions, and unfortunately the additional alpha values are not used for an alpha search. We are investigating the bug. @phellinger

    In the meantime, you can run a grid search manually with a loop. Here is an example:

    <operator activated="true" class="performance_to_data" compatibility="9.0.002" expanded="true" height="82" name="Performance to Data" width="90" x="313" y="136"/>

    <connect from_op="Cross Validation" from_port="model" to_port="output 1"/>

    Best,

    YY
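For readers without the process file, the loop-plus-cross-validation idea can be sketched outside RapidMiner in plain Python. This is only an illustration under several assumptions: it uses a small elastic-net linear regression fitted by coordinate descent rather than the H2O classification GLM, lambda is held fixed, the data is made up, and the names `soft_threshold`, `fit_elastic_net`, `cv_mse`, and `alpha_grid_search` are hypothetical helpers, not RapidMiner or H2O functions.

```python
def soft_threshold(z, t):
    """Shrink z towards zero by t; this is what lets the L1 part zero out coefficients."""
    if z > t:
        return z - t
    if z < -t:
        return z + t
    return 0.0


def fit_elastic_net(X, y, alpha, lam, n_iter=200):
    """Coordinate descent for (1/2n)*RSS + lam*(alpha*L1 + (1-alpha)/2*L2^2), no intercept."""
    n, p = len(X), len(X[0])
    beta = [0.0] * p
    for _ in range(n_iter):
        for j in range(p):
            rho = 0.0
            for i in range(n):
                # Prediction from all features except j (partial residual)
                pred_wo_j = sum(X[i][k] * beta[k] for k in range(p) if k != j)
                rho += X[i][j] * (y[i] - pred_wo_j)
            rho /= n
            zj = sum(X[i][j] ** 2 for i in range(n)) / n
            beta[j] = soft_threshold(rho, lam * alpha) / (zj + lam * (1.0 - alpha))
    return beta


def cv_mse(X, y, alpha, lam, k=3):
    """k-fold cross-validated mean squared error for one (alpha, lam) pair."""
    n = len(X)
    total = 0.0
    for fold in range(k):
        train_idx = [i for i in range(n) if i % k != fold]
        test_idx = [i for i in range(n) if i % k == fold]
        beta = fit_elastic_net([X[i] for i in train_idx],
                               [y[i] for i in train_idx], alpha, lam)
        for i in test_idx:
            pred = sum(b * x for b, x in zip(beta, X[i]))
            total += (y[i] - pred) ** 2
    return total / n


def alpha_grid_search(X, y, alphas, lam):
    """Loop over the alpha candidates and return the one with the lowest CV error."""
    scores = {a: cv_mse(X, y, a, lam) for a in alphas}
    best = min(scores, key=scores.get)
    return best, scores


# Made-up data: y depends mainly on the first feature.
X = [[1.0, 0.5], [2.0, -1.0], [3.0, 0.3], [4.0, -0.7], [5.0, 0.9],
     [6.0, -0.2], [7.0, 0.4], [8.0, -0.6], [9.0, 0.1]]
y = [2.1, 3.8, 6.2, 7.9, 10.1, 11.8, 14.2, 15.9, 18.1]

best_alpha, scores = alpha_grid_search(X, y, alphas=[0.0, 0.1, 0.2, 0.6, 1.0], lam=0.1)
print("best alpha:", best_alpha)
for a, s in scores.items():
    print("alpha =", a, "cv mse =", round(s, 4))
```

Each alpha value is evaluated with the same folds, so the scores are directly comparable. Extending the loop to a grid over both alpha and lambda is a matter of nesting a second loop.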

  • sgenzer Administrator, Moderator, Employee, RapidMiner Certified Analyst, Community Manager, Member, University Professor, PM Moderator Posts: 2,959 Community Manager