Using SVM to predict a new row
Hi,
I would like to ask for help building another prediction model; using SVM or another learner is fine.
<operator activated="true" class="retrieve" compatibility="8.2.001" expanded="true" height="68" name="Retrieve Polynomial" width="90" x="45" y="34">
<operator activated="true" class="append" compatibility="8.2.001" expanded="true" height="82" name="Append" width="90" x="648" y="34"/>
Above is the process I am currently using. As I understand the SVM learner operator, it creates a model based on the row behaviour of the data set. What I need is a model based on the behaviour of the columns of the data set. I tried to transpose my data, but that makes me lose the label, which I need for prediction. So for my sample process above, I need the predicted result for att_201 based on the behaviour of the data set before the first transpose.
Answers
Hi @hung9022,
I'm not sure I understand, but I assume your problem is a "time-series" study, isn't it?
You want to predict the (N+1)-th value of your attribute "att_i" based on the "history" of that attribute, that is to say
the first to the N-th values of attribute "att_i"?
If that is the case, you can take a look at the "Time Series" extension (to be installed from the Marketplace if your RapidMiner version is < 9.0; built in directly in RM 9.0).
If I have misunderstood, can you explain more explicitly by sharing your dataset and giving an example of what you want to obtain?
I hope it helps,
Regards,
Lionel
Hi @lionelderkrikor,
I have looked at the Time Series extension you mentioned, but that is not what I wanted, although it is close. I have uploaded 3 pictures to describe what I intend to do. The first is a screenshot of the transposed Polynomial example data, which represents the data I have. If I feed this transposed data set to a predictive learner, e.g. SVM, the operator will build 7 models, based on the number of rows, to predict a new value, let's say att_10 since it is not in the screenshot, as shown in "Normal model Prediction". What I need is a process that predicts the new attribute based on the behaviour in each column, as shown in "What i want.png". There may be a setup for this in the Time Series extension you mentioned, but I am still new to RapidMiner, so I haven't figured out all of its functions yet.
Regards,
Hi @hung9022,
1. "What i need is a process that predict the new attribute based on the behaviour in each column..."
Based on your screenshot, you want to generate and predict the values of attribute att_10 based
on the values of att_1 to att_9? That's impossible as is.
First, you have to build a model (for example an SVM) based on a labeled dataset. This means that you need a dataset with the
values of attributes att_1 to att_9 and the associated values of att_10 (which is called the "label").
Once you have built the model, you can predict the att_10 values by applying the model to a new dataset which contains new values of att_1 to att_9.
2. That's why, if I may insist, your description makes me think that you want to perform a time-series study.
In that case, you need to have a timestamp (or maybe just an Id).
To help you better, can you share your original dataset?
I hope it helps,
Regards,
Lionel
Hi @lionelderkrikor,
This is the original data set; the first attribute is imported as the ID. For this data set, I will remove the first row and use it as my target for prediction. If I were to use this data as a time series, how would I set it up with the Time Series extension for prediction?
Regards,
Hi @hung9022,
I can't import your .csv into RapidMiner.
I think it is because you don't have attribute names in the first row (you currently have only values such as "9.326" and "118.691").
Can you correct it?
Moreover, the ID has to be numeric (Id = 1, 2, 3, 4, 5, etc.) for a time-series problem. (It can't be "Wxxxxx".)
Regards,
Lionel
Hi @lionelderkrikor,
How about this?
Hi @hung9022,
As I said previously, I treated your problem as a time-series problem:
- The Id (1, 2, 3, 4, etc.) is used as the timestamp. I chose arbitrarily that the Id corresponds to days.
- I selected the kNN model because it is much better suited to your data than the SVM model (performance measured by RMSE).
- I used a Loop Attributes operator to perform the forecasting for all your attributes.
Here is a screenshot of the forecast for your first six attributes (see row 1):
The process:
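The idea in the bullets above can be illustrated outside RapidMiner with a minimal Python sketch. This is not the process shared in this thread; it only assumes that each attribute is treated as its own series ordered by Id, that sliding windows of past values serve as the inputs with the following value as the label, and that a plain k-nearest-neighbour average produces the forecast. The names `make_windows` and `knn_forecast`, the window size, `k`, and the sample data are all hypothetical.

```python
import math


def make_windows(series, window):
    """Turn one series into (past-values, next-value) training examples."""
    examples = []
    for i in range(len(series) - window):
        examples.append((series[i:i + window], series[i + window]))
    return examples


def knn_forecast(series, window=3, k=2):
    """Forecast the value that follows `series` using k-NN regression."""
    examples = make_windows(series, window)
    if not examples:
        raise ValueError("series is too short for the chosen window")
    # The most recent window is the query we want a label for.
    query = series[-window:]
    # Rank historical windows by Euclidean distance to the query.
    ranked = sorted(examples, key=lambda ex: math.dist(ex[0], query))
    nearest = ranked[:k]
    # Average the values that followed the k most similar windows.
    return sum(label for _, label in nearest) / len(nearest)


# Hypothetical data: one list per attribute, ordered by Id (1, 2, 3, ...).
data = {
    "att_1": [1.0, 2.0, 3.0, 1.0, 2.0, 3.0, 1.0, 2.0, 3.0, 1.0, 2.0],
    "att_2": [5.0, 5.0, 5.0, 5.0, 5.0, 5.0, 5.0, 5.0, 5.0, 5.0, 5.0],
}

# Same spirit as looping over attributes: one independent forecast per column.
forecasts = {name: knn_forecast(values) for name, values in data.items()}
print(forecasts)
```

The windowing step is what turns a single column into a labeled dataset: each window of earlier values plays the role of the regular attributes, and the value that follows plays the role of the label, so no separate label column is needed.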
Hi @lionelderkrikor,
Thanks for your help; your solution comes quite close to what I wanted to do. I will try to figure out the rest. Sorry for the late reply; I did not have Internet access until now.
Regards,