"Feature selection with a roulette-wheel strategy"
Hi all,
Assume I have an example set with m samples and n features.
I have weighted these n features with a statistical weighting method (like Gini, Chi-squared, InfoGain, etc.).
Now I have n features with normalized weights.
How can I probabilistically choose p of these n features (p << n, i.e. p is much smaller than n)?
I want each feature's probability of being chosen among these p features to be its normalized weight.
Can anybody help me?
Please help me find the operator tree to solve this problem.
Thanks in advance.
-- Misagh.
Answers
There is already an operator, [tt]RandomSelection[/tt], which randomly selects attribute subsets. Unfortunately, this operator cannot yet use attribute weights as probabilities for drawing the attributes. We can put this on our to-do list. Completing the implementation might take a while, however: not because it is complicated to implement, but because our current schedule requires us to focus on our clients rather than on extending RapidMiner functionality.
Cheers,
Tobias
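
For reference, the weighted draw described in the question can be sketched outside RapidMiner. The Python snippet below is only an illustration, not a RapidMiner operator: the function name roulette_select, the dictionary of feature weights, and the example feature names are all hypothetical. It assumes the weights are non-negative and draws without replacement, so each single draw picks a feature with probability proportional to its weight among the features still available. Note that the overall chance of a feature ending up among the p selected ones is therefore related to, but not exactly equal to, its normalized weight.

```python
import random


def roulette_select(weights, p, rng=None):
    """Draw p distinct feature names by repeated roulette-wheel spins.

    weights: dict mapping feature name -> non-negative weight (assumed input format)
    p:       number of features to select
    rng:     optional random.Random instance, e.g. for reproducible runs
    """
    rng = rng or random.Random()
    # Features with zero weight can never be drawn, so drop them up front.
    # This also copies the data, leaving the caller's dict untouched.
    remaining = {name: w for name, w in weights.items() if w > 0}
    if p > len(remaining):
        raise ValueError("p exceeds the number of features with positive weight")

    chosen = []
    for _ in range(p):
        names = list(remaining)
        # One spin of the wheel: probability proportional to the weight
        # among the features that have not been selected yet.
        pick = rng.choices(names, weights=[remaining[n] for n in names], k=1)[0]
        chosen.append(pick)
        del remaining[pick]  # without replacement
    return chosen


# Hypothetical normalized weights for four features.
example_weights = {"age": 0.5, "income": 0.3, "zip": 0.15, "id": 0.05}
print(roulette_select(example_weights, p=2, rng=random.Random(0)))
```

The selected names could then be used to filter the example set to the chosen attributes. If the inclusion probabilities must match the weights exactly, a different sampling scheme would be needed; this sketch only covers the simple successive-draw variant.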