I want to predict a value from other values

davidraul36 · February 2018

Hello, I'm very new to RapidMiner and to data science as well, so please bear with me.

I want to predict values from totally different values; it's like trying to find a model for the relationship between them.

For example:

I have an Excel spreadsheet with columns (A, B, C, D, F).

I want to use (A, B, C, D) to build a model that predicts the values in (F), then apply it to test data...

Thanks in advance,

Thomas_Ott · February 2018

@davidraul36 Here's what I would do: clean up the date and time attributes and use a different algorithm. That gets 74% trend accuracy, and you can most likely improve on it with Optimize Parameters.

<operator activated="true" class="process" compatibility="8.0.001" expanded="true" name="Process">
...
<parameter key="2" value="Open.true.real.attribute"/>
...
<operator activated="true" class="nominal_to_date" compatibility="8.0.001" expanded="true" height="82" name="Nominal to Date" width="90" x="447" y="34">
...
<parameter key="2" value="Open.true.real.attribute"/>
...
<list key="expert_parameters"/>
...
<operator activated="true" class="series:forecasting_performance" compatibility="7.4.000" expanded="true" height="82" name="Performance" width="90" x="179" y="34">
...

lionelderkrikor · February 2018

Hi @davidraul36,

Can you share your dataset(s), please?

Regards,

Lionel

davidraul36 · February 2018

Here is the data I use.

I want to find a model that predicts the values of column "Avg" from all the other columns.

Telcontar120 · February 2018

You should check out the "Getting Started" videos on the www.turtlecreekpls.com webpage; they are designed to help you get started with a basic predictive modeling project such as this one. You will need to define your "label" (the thing you are trying to predict) first.
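To make the idea of a "label" concrete, here is a minimal sketch in plain Python rather than RapidMiner. The rows are made up, and the nearest-neighbour lookup is only a stand-in learner chosen for brevity; it is not what any operator in this thread does. The point is that F is read only as the target during training and is never part of the input at prediction time.

```python
import math

# Hypothetical rows standing in for the spreadsheet; the values are made up.
rows = [
    {"A": 1.0, "B": 2.0, "C": 0.5, "D": 3.0, "F": 10.0},
    {"A": 2.0, "B": 1.0, "C": 1.5, "D": 2.0, "F": 20.0},
    {"A": 3.0, "B": 0.0, "C": 2.5, "D": 1.0, "F": 30.0},
]

LABEL = "F"                        # the column to predict
ATTRIBUTES = ["A", "B", "C", "D"]  # the columns the model may look at


def split_roles(row):
    """Return (attribute values, label value) for one row."""
    return [row[name] for name in ATTRIBUTES], row[LABEL]


def predict_1nn(training_rows, new_attributes):
    """Predict the label of the closest training row (1-nearest neighbour)."""
    best_label, best_dist = None, math.inf
    for row in training_rows:
        x, y = split_roles(row)
        dist = math.dist(x, new_attributes)  # Euclidean distance
        if dist < best_dist:
            best_label, best_dist = y, dist
    return best_label


# A new example has values for A..D only; F is what we want to estimate.
prediction = predict_1nn(rows, [2.1, 1.0, 1.4, 2.0])
print(prediction)
```

In RapidMiner the same split is expressed by giving the target column the label role and leaving the other columns as regular attributes.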

Thomas_Ott · February 2018

@davidraul36 I would do what @Telcontar120 suggests: review some videos and try out the tutorials that are built into Studio itself. Then build a process and, if you get stuck, post that XML to the community for help.

davidraul36 · February 2018

I already tried to build a model, but my model uses the previous values of "Avg" to predict the next one.

I don't know how to set up the design so that column "Avg" is only predicted, without the model getting any information from it or from its previous values.

<operator activated="true" class="process" compatibility="8.0.001" expanded="true" name="Process">
...
<parameter key="2" value="Open.true.real.attribute"/>
...
<parameter key="select_label_by_dimension" value="false"/>
...
<operator activated="true" class="series:forecasting_performance" compatibility="7.4.000" expanded="true" height="82" name="Performance" width="90" x="179" y="85">
...
<parameter key="2" value="Open.true.real.attribute"/>
...
<parameter key="select_label_by_dimension" value="false"/>

Thomas_Ott · February 2018

@davidraul36 I see that you set this up as a time series problem. Was there a particular reason to separate the time and date columns?

davidraul36 · February 2018

Since it's a direct time series problem, I have tried time series examples.

I was trying to predict the moving average values, instead of common lag.

I have tried another model, selecting "Avg" as the label and all other columns as attributes, then using a learner such as a neural network or SVM and applying the model to the test data...

So is that OK?

davidraul36 · February 2018

Sorry for my newbie behaviour.

Here is the XML:

<operator activated="true" class="process" compatibility="8.0.001" expanded="true" name="Process">
...
<parameter key="2" value="Open.true.real.attribute"/>
...
<list key="expert_parameters"/>
...
<parameter key="2" value="Open.true.real.attribute"/>

Thomas_Ott · February 2018

This should work but your trend accuracy sucks now. So what was screwing this up was how you transformed your AVG attribute into the label. I made some small modifications and dropped out the AVG column from the test set (cause that's what you want to test). If you want to compare the test set AVG with what's predicted, then set the AVG attribute as a 'dummy' role. See the next process below this one.

<operator activated="true" class="process" compatibility="8.0.001" expanded="true" name="Process">
...
<parameter key="2" value="Open.true.real.attribute"/>
...
<parameter key="2" value="Open.true.real.attribute"/>
...
<operator activated="true" class="series:forecasting_performance" compatibility="7.4.000" expanded="true" height="82" name="Performance" width="90" x="179" y="34">
...

With Dummy Role

<operator activated="true" class="process" compatibility="8.0.001" expanded="true" name="Process">
...
<parameter key="2" value="Open.true.real.attribute"/>
...
<parameter key="2" value="Open.true.real.attribute"/>
...
<operator activated="true" class="series:forecasting_performance" compatibility="7.4.000" expanded="true" height="82" name="Performance" width="90" x="179" y="34">
...
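The difference between dropping AVG from the test set and keeping it under a non-modeling role can be sketched outside RapidMiner as follows. This is an illustration only: the "Close" column, the row values, and `toy_model` are hypothetical stand-ins, not the data or the learner from this thread. The model never sees "Avg"; the column is carried along only so the prediction can be compared with it afterwards.

```python
# Hypothetical test rows; "Open" appears in the thread, "Close" is assumed.
test_rows = [
    {"Open": 10.0, "Close": 12.0, "Avg": 11.5},
    {"Open": 20.0, "Close": 24.0, "Avg": 21.0},
]

FEATURES = ["Open", "Close"]  # regular attributes the model may use
HELD_BACK = "Avg"             # kept for comparison, never fed to the model


def toy_model(features):
    """Stand-in for a trained model: midpoint of Open and Close."""
    return (features["Open"] + features["Close"]) / 2


comparison = []
for row in test_rows:
    # Build the model input from the feature columns only.
    features = {name: row[name] for name in FEATURES}
    comparison.append(
        {"actual": row[HELD_BACK], "predicted": toy_model(features)}
    )

for entry in comparison:
    print(entry)
```

Dropping the column entirely corresponds to deleting the `"actual"` entry; the predictions are the same either way.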

Thomas_Ott · February 2018

The more I look at this, the more I think you need to use a Sort operator to feed in the time series correctly. I wouldn't split the Date and Time into two units; RapidMiner can easily understand date-time together.
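As a rough illustration of that advice in plain Python (not a RapidMiner process), the sketch below joins separate date and time strings into one timestamp and sorts on it. The column names, the date format, and the values are assumptions made for the example.

```python
from datetime import datetime

# Hypothetical rows with date and time stored separately, out of order.
rows = [
    {"Date": "2018-02-02", "Time": "01:00", "Avg": 1.25},
    {"Date": "2018-02-01", "Time": "23:00", "Avg": 1.10},
    {"Date": "2018-02-01", "Time": "22:00", "Avg": 1.05},
]


def to_timestamp(row):
    """Combine the two text columns into a single datetime value."""
    return datetime.strptime(f"{row['Date']} {row['Time']}", "%Y-%m-%d %H:%M")


# Sorting on the combined timestamp gives the series in chronological order.
ordered = sorted(rows, key=to_timestamp)

for row in ordered:
    print(to_timestamp(row), row["Avg"])
```

Sorting on the date column alone would leave the two rows from 2018-02-01 in whatever order they arrived, which is why the combined value is the safer sort key.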

davidraul36 · February 2018

Thank you so much for spending so much time helping me, I really appreciate that.

Great Software and Great community!

I'm just curious about why the chart doesn't plot smoothly.

However,

Thank you so much,

Kindest regards,

Thomas_Ott · February 2018

@davidraul36 That's probably because you have AVG values for each hour in your date-time. Rolled up to daily values, you'd get the standard daily moving average. I would use an Aggregate operator for that.
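The roll-up from hourly to daily values amounts to grouping by calendar day and averaging. Here is a minimal sketch in plain Python, with made-up hourly values; it shows the grouping idea only and is not the Aggregate operator's implementation.

```python
from collections import defaultdict
from datetime import datetime
from statistics import mean

# Hypothetical hourly AVG readings as (timestamp, value) pairs.
hourly = [
    (datetime(2018, 2, 1, 9), 10.0),
    (datetime(2018, 2, 1, 10), 12.0),
    (datetime(2018, 2, 2, 9), 20.0),
    (datetime(2018, 2, 2, 10), 22.0),
]

# Group the values by calendar day, discarding the time of day.
by_day = defaultdict(list)
for timestamp, value in hourly:
    by_day[timestamp.date()].append(value)

# One averaged value per day, in chronological order.
daily = {day: mean(values) for day, values in sorted(by_day.items())}

for day, value in daily.items():
    print(day, value)
```

With one point per day instead of one per hour, the plotted line has far fewer intraday swings.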
