Getting started - very simple neural network training
Hi there,
I am working on the business intelligence part of a large project and am looking for an appropriate tool. RapidMiner looks very promising so far, though I am having trouble getting a simple neural net training to run. As a starting point for experimenting, I'd like RapidMiner to learn the body mass index classification.
I've got two files, "training" and "testing", each containing 100 examples. Both look like this:
The format is: weight (kg) - height (cm) - age (y) - classification (1 = superb, 0 = ok, -1 = bad). For example, the first line describes a person who weighs 69 kg, is 189 cm tall and is 38 years old, which is classified as "ok".
69 189 38 0
66 193 60 -1
63 161 59 1
36 74 187 1
68 182 37 0
63 169 46 1
75 158 30 -1
92 145 47 -1
52 160 50 0
...
Now I can import the file "training" and connect it to a neural net. I connect the neural net to the output. RapidMiner trains the net and delivers the weights.
Now how do I test the weights with the test set? Help would be very much appreciated.
Kind Regards
Theo
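For readers following along outside RapidMiner, here is a minimal Python sketch of the file format described above. It only parses the whitespace-separated columns and computes the standard body mass index (weight in kg divided by the square of height in m). The names Example, parse_line and bmi are made up for illustration, and the thresholds behind the 1 / 0 / -1 labels are not given in the post, so the sketch does not try to reproduce them.

```python
from typing import NamedTuple


class Example(NamedTuple):
    """One row of the "training" / "testing" files (hypothetical name)."""
    weight_kg: float
    height_cm: float
    age_years: int
    label: int  # 1 = superb, 0 = ok, -1 = bad


def parse_line(line: str) -> Example:
    """Split one whitespace-separated row into its four columns."""
    weight, height, age, label = line.split()
    return Example(float(weight), float(height), int(age), int(label))


def bmi(example: Example) -> float:
    """Standard BMI: weight in kg divided by squared height in metres."""
    height_m = example.height_cm / 100.0
    return example.weight_kg / (height_m ** 2)


# The first three rows quoted in the post.
sample = """69 189 38 0
66 193 60 -1
63 161 59 1"""

examples = [parse_line(line) for line in sample.splitlines() if line.strip()]
for ex in examples:
    print(f"{ex.weight_kg:.0f} kg, {ex.height_cm:.0f} cm: "
          f"BMI {bmi(ex):.1f}, label {ex.label}")
```

Note that age is part of the input but does not enter the BMI formula, so a learner has to find out on its own that this column is irrelevant.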
Answers
That's exactly what RapidMiner was originally designed for. I will post a (senseless) example process which should make it instantly clear how to use a test set for performance evaluation: It's a RapidMiner 5 process, and I would recommend switching to RapidMiner 5 if you are still using RapidMiner 4.x.
To get an impression of what can be done with RapidMiner, I would recommend going through the samples delivered with RapidMiner. Many important design patterns and most of the important operators are described there, together with sample processes.
Greetings,
Sebastian
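The general scheme (train on one set, apply the model to a separate set, then measure performance) can be sketched in plain Python. This is only an illustration of the idea, not the RapidMiner process mentioned above: a 1-nearest-neighbour classifier stands in for the neural net, the class and function names are invented here, and the eight rows quoted in the first post are split 6 / 2 purely for demonstration.

```python
import math
from typing import List, Sequence, Tuple

Row = Tuple[float, float, float, int]  # weight, height, age, label


def parse(text: str) -> List[Row]:
    """Parse whitespace-separated rows: weight height age label."""
    rows = []
    for line in text.splitlines():
        if line.strip():
            w, h, a, label = line.split()
            rows.append((float(w), float(h), float(a), int(label)))
    return rows


class NearestNeighbour:
    """Stand-in learner (assumption): predicts the label of the closest training row."""

    def fit(self, rows: Sequence[Row]) -> "NearestNeighbour":
        self._rows = list(rows)
        return self

    def predict(self, features: Sequence[float]) -> int:
        nearest = min(self._rows, key=lambda r: math.dist(r[:3], features))
        return nearest[3]


def accuracy(predicted: Sequence[int], actual: Sequence[int]) -> float:
    """Fraction of predictions that match the true labels."""
    if len(predicted) != len(actual):
        raise ValueError("predicted and actual must have the same length")
    return sum(p == a for p, a in zip(predicted, actual)) / len(actual)


# Rows quoted in the thread; the 6 / 2 split is arbitrary.
training = parse("""69 189 38 0
66 193 60 -1
63 161 59 1
68 182 37 0
63 169 46 1
75 158 30 -1""")
testing = parse("""92 145 47 -1
52 160 50 0""")

model = NearestNeighbour().fit(training)               # learn on training data only
predictions = [model.predict(row[:3]) for row in testing]  # apply to unseen rows
labels = [row[3] for row in testing]
print("accuracy on held-out rows:", accuracy(predictions, labels))
```

The important point is that the test rows are never shown to fit(), so the accuracy reflects how well the model generalizes rather than how well it memorized the training set.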
Thank you for the help. Your example followed very much the same scheme I had used, except, of course, for the data import, which seems to cause me the most difficulty now. Could you maybe take a look at my process: It does learn now, but it is still faulty, since the learning capability of the neural net seems to vary heavily with the amount of testing (!!) data.
Kind Regards
Theo
The capabilities do not depend on the amount of data, but on the type of data. I would suggest using the Repository to get a correct metadata transformation.
Without further information about what's going wrong, I cannot say why it happens.
Greetings,
Sebastian
I finally succeeded in building a process that learns the body mass index formula. The model learned by the neural net classifies softly between +1 (healthy) and -1 (overweight). The values are, e.g., "+1.037" or "-0.360".
I would like the values to be "crisp", i.e. separated into 3 classes "+1", "zero" and "-1", which symbolize the health of the patient. How do I do that?
I already tried mapping and threshold, but it didn't work out. The "map" operator would not touch the "prediction (BMI)" column, which is produced by the "apply model" operator. How do I set up a correct mapping?
Thanks in advance
Theo
I would recommend using the "discretize by user specification" operator. You will have to enter the upper bound for each class and select the prediction column as the only attribute: set the attribute filter type to "single attribute" and then select the attribute.
Greetings,
Sebastian
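The idea behind discretizing by upper bounds can be sketched in a few lines of Python. This is not RapidMiner's implementation; the function name and the bounds -0.5 and 0.5 are assumptions chosen for illustration, and the real cut points would have to be picked for the data at hand.

```python
import math
from bisect import bisect_left

# Hypothetical upper bounds, one per class, in ascending order.
# A value falls into the first class whose upper bound is >= the value.
CLASS_BOUNDS = [(-0.5, "-1"), (0.5, "0"), (math.inf, "+1")]


def discretize(value: float, bounds=CLASS_BOUNDS) -> str:
    """Map a soft prediction to the class whose upper bound it does not exceed."""
    upper_limits = [upper for upper, _ in bounds]
    index = bisect_left(upper_limits, value)
    return bounds[index][1]


# The two soft predictions quoted in the thread.
for prediction in (1.037, -0.360):
    print(prediction, "->", discretize(prediction))
```

The last bound is infinite so that every numeric prediction lands in some class, including values slightly above +1 such as 1.037.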
Thank you very much, "discretize" is exactly what was needed. Though I still do not understand why it is not possible to use "map" or "threshold".
Is there an easy and intuitive way to modify existing modules? Or combine existing modules to new ones?
Kind Regards
Theo
The map operator is meant for nominal attributes, where you map one nominal value to another. The threshold operators are applied to classification confidences to bias the classification result manually.
What do you mean by modules?
Greetings,
Sebastian
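To make the distinction concrete, here is a rough Python analogy (not RapidMiner code; both function names and the example values are invented for illustration). A nominal mapping looks up exact category values, so a numeric prediction such as 1.037 never matches a key and passes through unchanged, while a threshold turns a confidence into one of two classes.

```python
def map_nominal(values, mapping):
    """Replace each value that appears as a key in `mapping`; leave others untouched."""
    return [mapping.get(value, value) for value in values]


def apply_threshold(confidences, threshold, positive="+1", negative="-1"):
    """Assign the positive class when the confidence reaches the threshold."""
    return [positive if c >= threshold else negative for c in confidences]


# Nominal values are mapped one-to-one.
print(map_nominal(["ok", "bad", "superb"], {"ok": "0", "bad": "-1", "superb": "+1"}))

# Numeric predictions do not match any nominal key, so nothing changes.
print(map_nominal([1.037, -0.360], {"1": "+1", "-1": "-1"}))

# Raising the threshold biases the result towards the negative class.
print(apply_threshold([0.4, 0.6, 0.9], threshold=0.5))
print(apply_threshold([0.4, 0.6, 0.9], threshold=0.8))
```

Neither of the two produces three classes from a continuous value, which is why binning by upper bounds was the better fit for this task.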
I see. Sounds like it was also the right operator for the task I finally solved by "discretize". Sorry, I meant "operators".
Kind Regards
Theo
Operators are fixed units inside RapidMiner. You can extend RapidMiner relatively easily with your own operators, solving your own problems, by writing an Extension. This is done in Java, and you can inherit from the existing operators to change some of their behavior. A whitepaper for developers is on the way. (See the many other threads about this...)
Greetings,
Sebastian