Grouping classes
Hi
Is it possible in RapidMiner to automatically collapse the classes of a learning set into a given number of classes, based on their cardinality, so that the variance in class size is reduced? The goal is to improve the precision of methods such as SVM and KNN.
I have a learning set of 20,000 elements divided into more than 100 classes, with high variance in the number of elements per class, and I need to reduce them to 20 classes.
For example:
Class A- 3 elements
Class B- 4 elements
Class C- 8 elements
It would be nice to be able to reduce them to a given number of classes, e.g. 2, this way:
Class 1- 7 elements (obtained from Class A and Class B)
Class 2- 8 elements (obtained from Class C)
Please help me! I'm trying operations research methods but have so little time...
Thank you!
Answers
I'm not quite sure I understood you correctly. You want to merge the most similar classes to improve the precision? That would not improve performance on the problem; it would simply change the problem...
If you want to do this manually, you could use the MergeNominalValues operator. Perhaps you should take a look at the parameterIteration operator and its examples in the meta directory of the example processes. It could save you a lot of typing.
Greetings,
Sebastian
Actually, I solved my problem using an Operations Research method: an implementation of the Assembly Line Balancing problem.
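For readers who want to try the balancing idea outside RapidMiner, here is a minimal Python sketch. It is not the Assembly Line Balancing implementation mentioned above; it swaps in a much simpler greedy heuristic (largest class first, always into the currently smallest group). The function name `balance_classes` and the "Class N" group names are made up for illustration.

```python
import heapq


def balance_classes(class_sizes, n_groups):
    """Greedily merge classes into n_groups groups of similar total size.

    class_sizes: dict mapping original class label -> number of elements.
    Returns (mapping, totals): original label -> new group name, and
    new group name -> total number of elements.
    Hypothetical helper, not a RapidMiner API.
    """
    if n_groups < 1:
        raise ValueError("n_groups must be at least 1")

    # Heap entries are (total size, group index, member labels).
    # The group index is unique, so the member lists are never compared.
    heap = [(0, g, []) for g in range(n_groups)]
    heapq.heapify(heap)

    # Largest classes first; each goes into the currently smallest group.
    by_size = sorted(class_sizes.items(), key=lambda kv: kv[1], reverse=True)
    for label, size in by_size:
        total, g, members = heapq.heappop(heap)
        members.append(label)
        heapq.heappush(heap, (total + size, g, members))

    mapping, totals = {}, {}
    for total, g, members in heap:
        name = f"Class {g + 1}"
        totals[name] = total
        for label in members:
            mapping[label] = name
    return mapping, totals


if __name__ == "__main__":
    # The three-class example from the question, reduced to 2 groups.
    mapping, totals = balance_classes({"A": 3, "B": 4, "C": 8}, 2)
    print(mapping)
    print(totals)
```

On the example from the question this puts A and B together (7 elements) and leaves C alone (8 elements), although the group numbering may differ from the one in the question. The resulting mapping could then be applied to the label attribute, for instance with the MergeNominalValues operator suggested above. The heuristic only balances group sizes; it does not look at how similar the merged classes are.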
Just a note: I tried the evolutionary parameter optimization from the example processes, but even with the examples it took many hours, so I decided to change approach.
Thanks for your help, and congratulations on the software you have built and on the choice to keep it open source: it is really great!