question marks in linear regression output
I ran a linear regression model with 18 independent variables and feature selection turned off. For some of the independent variables there were question marks for the standard error of the estimate, and therefore for the t-statistic and p-value of the coefficient. I ran the model again with feature selection turned on and got the same question marks. What do these question marks mean? They cannot have anything to do with missing values, as the regression would not have run to completion in that case. I am baffled about what these "?" symbols might mean. Help.
Best Answers
varunm1 Moderator, Member Posts: 1,207 Unicorn
Hello @sgenzer and @AD2019,
I looked for H2O documentation on linear regression but unfortunately found none. For GLM, however, H2O recommends a specific set of parameter settings that are required to get p-values, so that the values are not shown as "?" (unknown):
1. You should uncheck the "Use Regularization" option.
2. You should select "Add intercept".
3. You should select "compute p-values".
4. You should select "remove collinear columns".
If these are set, you will get the p-values, standard errors, etc. without question marks. With these settings, you will get question marks only when the coefficient is 0.
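To illustrate why "remove collinear columns" matters for the settings above, here is a minimal pure-Python sketch. It is not RapidMiner's or H2O's implementation, and this thread does not confirm that collinearity is the cause of the "?" values. It only shows that ordinary least squares standard errors come from the diagonal of sigma^2 * (X'X)^-1, so they cannot be computed when one column is an exact multiple of another. The data and all function names are made up for the example.

```python
# Sketch only: OLS standard errors on a tiny made-up data set.
# Function names and data are hypothetical, not part of any RapidMiner/H2O API.

def transpose(m):
    return [list(row) for row in zip(*m)]

def matmul(a, b):
    bt = transpose(b)
    return [[sum(x * y for x, y in zip(row, col)) for col in bt] for row in a]

def invert(m, tol=1e-9):
    """Gauss-Jordan inverse; returns None when the matrix is singular."""
    n = len(m)
    # Augment the matrix with the identity: [m | I]
    aug = [[float(v) for v in row] + [1.0 if i == j else 0.0 for j in range(n)]
           for i, row in enumerate(m)]
    for col in range(n):
        # Partial pivoting: pick the row with the largest entry in this column
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        if abs(aug[pivot][col]) < tol:
            return None  # no usable pivot, so the matrix has no inverse
        aug[col], aug[pivot] = aug[pivot], aug[col]
        p = aug[col][col]
        aug[col] = [v / p for v in aug[col]]
        for r in range(n):
            if r != col:
                f = aug[r][col]
                aug[r] = [v - f * w for v, w in zip(aug[r], aug[col])]
    return [row[n:] for row in aug]

def standard_errors(X, y):
    """Return the OLS standard errors, or None if X'X cannot be inverted."""
    Xt = transpose(X)
    xtx_inv = invert(matmul(Xt, X))
    if xtx_inv is None:
        return None  # this is the situation a tool might display as "?"
    beta = matmul(xtx_inv, matmul(Xt, [[v] for v in y]))
    fitted = matmul(X, beta)
    rss = sum((yi - fi[0]) ** 2 for yi, fi in zip(y, fitted))
    sigma2 = rss / (len(y) - len(X[0]))  # residual variance estimate
    return [(sigma2 * xtx_inv[j][j]) ** 0.5 for j in range(len(X[0]))]

x = [1, 2, 3, 4, 5, 6]
y = [2.1, 3.9, 6.2, 7.8, 10.1, 12.2]

X_ok = [[1, xi] for xi in x]                 # intercept + one predictor
X_collinear = [[1, xi, 2 * xi] for xi in x]  # third column duplicates the second

print(standard_errors(X_ok, y))
print(standard_errors(X_collinear, y))
```

In this sketch the second call returns `None` because the third column is exactly twice the second. A tool that drops one of the two columns first can still report standard errors for the remaining ones.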
I will see if I can find any information on linear regression.
Regards,
Varun
https://www.varunmandalapu.com/
Be Safe. Follow precautions and Maintain Social Distancing
Answers
Lindon Ventures
Data Science Consulting from Certified RapidMiner Experts
The simple answer is that ? marks are used in RapidMiner when values are missing. The better question is why they are missing. My educated guess here (please correct me, @varunm1 and @mschmitz, if my stats are wrong) is that there can be no std coefficient or tolerance for the intercept of a LinReg model, as it's a computed value. All of your actual data (the other attributes) have std coefficients, which makes sense. But my stats are a wee bit rusty, so I look to these other smart folks to correct me.
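As a rough illustration of the guess about the intercept, the sketch below uses the common textbook definition of a standardized coefficient, beta * sd(x) / sd(y). Whether RapidMiner computes it this way is an assumption. The function name is hypothetical, and returning `None` for a column with no spread is a choice made for this example only.

```python
from statistics import pstdev

def standardized_coefficient(beta, x, y):
    """Textbook standardized coefficient: beta * sd(x) / sd(y).

    Returns None when x (or y) has zero spread, as is the case for the
    constant column of ones behind the intercept. This is a choice made
    in this sketch, not documented RapidMiner behavior.
    """
    sd_x, sd_y = pstdev(x), pstdev(y)
    if sd_x == 0 or sd_y == 0:
        return None
    return beta * sd_x / sd_y

y = [2, 4, 6, 8]
print(standardized_coefficient(2.0, [1, 2, 3, 4], y))  # ordinary attribute
print(standardized_coefficient(3.0, [1, 1, 1, 1], y))  # intercept column
```

Note that this concerns the std coefficient column only. It does not by itself explain a missing standard error on an ordinary attribute.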
Scott
Ah I understand. Good point. It's been a while since I've played with all of this (we normally use the GLM modeler instead of LinReg as it is far more versatile and robust). Let me investigate.
Scott