Walk-Forward Testing

Oprick (Member, Posts: 35, Contributor II)
Hello,
I built a multivariate regression forecast using a neural network (NN). Results seem to be OK so far.
However, since I'm forecasting the next value (+1) using all past values, I would like to test the model in a walk-forward way, i.e. using past values to predict the next one in a rolling fashion until the last example. I thought about using Sliding Window Validation.

Each sliding window iteration produces only one example, because it is the prediction of the next value. It seems to me that this is the right approach for this kind of validation.
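
For readers outside RapidMiner, the walk-forward idea described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: `walk_forward` and `mean_of_last_three` are hypothetical names, the series is made-up data, and the averaging "model" is a stand-in for the trained network, not the method used in the attached process.

```python
from statistics import mean


def walk_forward(series, min_train, predict):
    """Predict each value from all values before it, one step at a time."""
    results = []
    for t in range(min_train, len(series)):
        history = series[:t]          # only past values are visible to the model
        forecast = predict(history)   # one-step-ahead prediction
        results.append((t, series[t], forecast))  # one example per iteration
    return results


def mean_of_last_three(history):
    # Placeholder model (an assumption); swap in the real regressor here.
    return mean(history[-3:])


# Made-up example data, not the contents of mock.xlsx.
series = [10.0, 12.0, 11.0, 13.0, 14.0, 13.0, 15.0]
results = walk_forward(series, 3, mean_of_last_three)
for t, actual, forecast in results:
    print(t, actual, round(forecast, 2))
```

Each iteration contributes exactly one (index, actual, forecast) triple, so appending them gives a prediction series that can be laid over the actual values.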

However, I would like to visualize the results of all iterations appended together, to get something like this: https://3qeqpr26caki16dnhd19sv6by6v-wpengine.netdna-ssl.com/wp-content/uploads/2016/12/Sunspot-Dataset-Train-Test-Split.png

I thought about the Collect and Remember/Recall operators, but with zero success.

Enclosed are my mock process and data source.

Thanks for your help

mock.xlsx 24.3K
Example.rmp 27.2K

Best Answer

  • hughesfleming68 (Member, Posts: 323, Unicorn)
    edited April 2019 Solution Accepted
    Hi Oprick, you are correct to use the sliding window operator. Normally you would use the Log operator to collect details about what is happening within each window. The train/test image looks the way it does because it only shows where the data is split.

    Do you want to overlay a series of predictions over the actual data?

Answers

  • SGolbert (RapidMiner Certified Analyst, Member, Posts: 344, Unicorn)

    You can probably do what you want by combining the Windowing operator with a normal Cross Validation:

    [The process XML posted here did not survive extraction. The recoverable fragments are a split_on_batch_attribute parameter set to false and the operator annotations below.]

    Builds a model on the current training data set (90 % of the data by default, 10 times). Make sure that you only put numerical attributes into a linear regression!

    Applies the model built from the training data set on the current test set (10 % by default). The Performance operator calculates performance indicators and sends them to the operator result.

    A cross validation including a linear regression.

    I couldn't find the Sliding Window Validation operator; I think it got replaced by Forecast Validation.


    You can also try replacing the neural network with a recurrent neural network (RNN), provided by the Add LSTM Layer operator of the Deep Learning extension. I am a beginner on the topic, so I can't give you an example yet, but I'm convinced that RNNs are the way to go for time series.


    Regards,
    Sebastian
  • hughesfleming68 (Member, Posts: 323, Unicorn)
    edited April 2019
    @Oprick, looking at your process again, and as @SGolbert suggested, you should take a look at the Windowing operator. On the other hand, I do think that you should stick with the Value Series extension. There are use cases where both are necessary. The new time series operators don't completely replace the older extension.

    Kind regards,

    Alex
  • Oprick (Member, Posts: 35, Contributor II)
    Hi @hughesfleming68, I was looking in the wrong direction. You pointed me the right way :)
    I used the Log family of operators, which I was not very familiar with.
    Now I can overlay prediction_label and label.

    Enclosed is the corrected process. I hope it helps someone else.

    Many Thanks
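
As a rough illustration of what logging one row per iteration achieves, here is a minimal Python sketch. It is not the RapidMiner Log operator: `build_log`, `rmse`, and `to_csv` are hypothetical helpers and the numbers are made-up. Only the column names `label` and `prediction_label` come from the post above.

```python
import csv
import io
from math import sqrt

FIELDS = ["iteration", "label", "prediction_label"]


def build_log(actuals, predictions, first_index=0):
    """Return one row per iteration, pairing each label with its prediction."""
    if len(actuals) != len(predictions):
        raise ValueError("actuals and predictions must have the same length")
    return [
        {"iteration": first_index + i, "label": a, "prediction_label": p}
        for i, (a, p) in enumerate(zip(actuals, predictions))
    ]


def rmse(rows):
    """Root mean squared error over all logged iterations."""
    squared = [(r["label"] - r["prediction_label"]) ** 2 for r in rows]
    return sqrt(sum(squared) / len(squared))


def to_csv(rows):
    """Serialize the log so a plotting tool can overlay the two columns."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=FIELDS, lineterminator="\n")
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()


# Made-up values standing in for four walk-forward iterations.
rows = build_log([13.0, 14.0, 13.0, 15.0], [11.0, 12.0, 13.0, 14.0], first_index=3)
print(to_csv(rows))
print("RMSE:", rmse(rows))
```

Plotting `label` and `prediction_label` against `iteration` from such a table gives the overlay of predictions on actual data.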
  • hughesfleming68 (Member, Posts: 323, Unicorn)
    edited April 2019
    Hi @Oprick, I saw how you set up your data. Basically, at the beginning of your process you did manually what the Windowing operator does. Give it a try next time, as it could probably save you some steps. Larger windows bring other advantages as well, since you can use previous time steps as attributes.

    regards,

    Alex
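
The transformation that windowing performs, turning previous time steps into attributes, can be sketched in plain Python as follows. This is an assumption-laden illustration rather than the Windowing operator itself: `make_windows`, `window_size`, and `horizon` are names chosen for this example, and the series is made-up data.

```python
def make_windows(series, window_size, horizon=1):
    """Turn a series into (attributes, label) pairs built from lagged values."""
    examples = []
    last_start = len(series) - window_size - horizon
    for start in range(last_start + 1):
        attributes = series[start:start + window_size]      # previous time steps
        label = series[start + window_size + horizon - 1]   # value to predict
        examples.append((attributes, label))
    return examples


# Made-up example data.
series = [10.0, 12.0, 11.0, 13.0, 14.0, 13.0, 15.0]
examples = make_windows(series, window_size=3)
for attributes, label in examples:
    print(attributes, "->", label)
```

Each row is then an ordinary regression example. Note that a shuffled cross validation over such rows can put later examples into the training fold, so an order-preserving split is generally the safer choice for walk-forward testing.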
  • Oprick (Member, Posts: 35, Contributor II)
    Hi @hughesfleming68,
    Sure I will. Thanks for the tip ;)