Feature Selection in Multiple Linear Regression
I selected "t-test" as the feature selection method in a multiple regression model, and set alpha to 0.01. Why is the regression output including an independent variable with a p-value of 0.05? Only when I reduce the alpha to 0.001 does this variable go away. If I set alpha = 0.01, shouldn't the selected model only show me independent variables whose p-value is less than 0.01?
Thank you in advance for any answers.
AD
Answers
Do you have any attributes that were generated by recoding a numerical attribute to nominal? If so, I think RapidMiner might keep the values of such an attribute together as a group. In that case some of the individual values can have p-values greater than the threshold even though the attribute as a whole does not. It would be good for one of the RM staffers to confirm that.
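To make the idea concrete, here is a minimal Python sketch of what group-wise selection could look like. It is only an illustration of my guess, not RapidMiner's actual implementation: the function name select_by_group, the attribute names, and the p-values are all made up, and the smallest p-value in a group is used as a simple stand-in for a real joint test of the whole attribute.

```python
from collections import defaultdict


def select_by_group(p_values, group_of, alpha):
    """Keep or drop coefficients per source attribute (hypothetical rule).

    A group is kept when its smallest p-value is below alpha. This is a
    stand-in for a joint test and is NOT RapidMiner's documented behaviour.
    """
    best = defaultdict(lambda: 1.0)
    for name, p in p_values.items():
        group = group_of.get(name, name)  # ungrouped columns form their own group
        best[group] = min(best[group], p)
    return [name for name in p_values if best[group_of.get(name, name)] < alpha]


# Made-up p-values: "region" was recoded into two dummy columns.
p_values = {
    "age": 0.0004,
    "region=north": 0.004,
    "region=south": 0.05,
    "income": 0.2,
}
group_of = {"region=north": "region", "region=south": "region"}

print(select_by_group(p_values, group_of, alpha=0.01))
print(select_by_group(p_values, group_of, alpha=0.001))
```

With these made-up numbers, the column with p = 0.05 survives at alpha = 0.01 because another value of the same attribute has p = 0.004, and the whole group only disappears at alpha = 0.001. That matches the pattern described in the question, but whether RapidMiner really works this way still needs confirming.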
Lindon Ventures
Data Science Consulting from Certified RapidMiner Experts