Expert opinion requested on Time Series based Prediction

luc_bartkowski Member Posts: 46 Maven
edited August 2020 in Help

So I'm studying machine learning using RapidMiner and I'm now focusing on Time Series Prediction.

My son earns some pocket money by trading stocks, forex and futures. He does that with technical analyses of prices.

He looks for an asset that shows a clear trend in conformance with Selecting Forecasting Methods in Data Science.

Then my son zooms in on the M-curves of the latest period. Using support and trendlines he "predicts" the future price of the asset.

My thought was to give him a Machine Learning perspective on his analyses.

So I looked at Oil Futures and built a process model on it, based on the daily "Last" values. The model looks like this:

oilpredmod.jpg

In the upper left I have implemented 3 RapidMiner Macros:

  1. %{AnalysesDateFrom}: From where to pick up the "wave to surf" trend like my son is doing.
  2. %{PredictionDateFrom}: This is my "hold off" parameter. I train the model to this date. I let the model predict from this date.
  3. %{PredictionHorizon}: It sets the Horizon parameters in the Windowing operator, in the Sliding Window Validation operator and in the Forecasting Performance operator implemented in the subprocess of the Sliding Window Validation operator so all operators work with the same Horizon.
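As a rough illustration of what a windowing step with a horizon does, here is a minimal Python sketch. It is not RapidMiner's implementation; the function name `make_windows` and its parameters `window_size` and `horizon` are made up for this example, and the prices are invented.

```python
def make_windows(values, window_size, horizon):
    """Turn a series (oldest value first) into (features, label) rows.

    features: the last `window_size` values up to and including row t
    label:    the value `horizon` rows after row t
    """
    rows = []
    last_t = len(values) - horizon  # rows from here on have no known label yet
    for t in range(window_size - 1, last_t):
        features = values[t - window_size + 1 : t + 1]
        label = values[t + horizon]
        rows.append((features, label))
    return rows


prices = [50.0, 51.0, 52.0, 53.0, 54.0, 55.0, 56.0]  # invented values
rows = make_windows(prices, window_size=3, horizon=2)
for features, label in rows:
    print(features, "->", label)
```

The sketch assumes the series is ordered oldest first; with any other order, "horizon rows after" no longer means "in the future".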

When I run the model with %{AnalysesDateFrom} = "Feb 10, 2016", %{PredictionHorizon}=10 and %{PredictionDateFrom}="Aug 28, 2017" (last month) the model returns a prediction_trend_accuracy: 0.625 +/- 0.099 (mikro: 0.625). For what this accuracy figure is worth, I know that value prediction is "slippery ice", I'm therefore more interested in trends.
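For readers unfamiliar with trend-based scoring, the sketch below shows one simple way to score trend direction in Python. It is an assumption for illustration only: the thread does not state RapidMiner's exact formula for prediction_trend_accuracy, and the function name `trend_accuracy` and the numbers are invented.

```python
def trend_accuracy(current, actual_future, predicted_future):
    """Fraction of forecasts whose direction (up or down from `current`)
    matches the direction the real value actually took."""
    hits = 0
    for now, actual, predicted in zip(current, actual_future, predicted_future):
        actual_move = actual - now
        predicted_move = predicted - now
        # Same sign means the forecast got the direction right.
        if actual_move * predicted_move > 0:
            hits += 1
    return hits / len(current)


current = [10.0, 10.0, 10.0, 10.0]    # value on the day the forecast is made
actual = [11.0, 9.0, 12.0, 8.0]       # real value one horizon later
predicted = [10.5, 9.5, 9.0, 11.0]    # forecast for that later day
score = trend_accuracy(current, actual, predicted)
print(score)
```

A score near 0.5 would mean the direction calls are no better than a coin flip under this definition.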

My question is related to the next graph in which I have plotted the prediction together with the real "Last" values.

oilpredgraph.jpeg

This plot clearly shows that the trend of the prediction conforms to the trend of the real "Last" values.

What I don't understand is that the prediction and the real "Last" values are "in phase" with each other. I would expect a phase shift between both lines, a phase shift equivalent to the Prediction Horizon. That phase shift is not visible. What am I doing wrong here?

The only explanation I can think of for the absence of a phase shift is that the value of an asset in a moment in time is the best indication of the future value of this asset. In other words: the current value of an asset incorporates already future values of this asset. That would explain that the lines of real values and the prediction values are in sync with each other. But I am not sure so I would like to receive an expert opinion on this.

Best Answer

  • luc_bartkowski Member Posts: 46 Maven
    Solution Accepted

    I have found the answer to my question.

    My source data is sorted on dates because I use a SQL script to avoid loading more data than my RM license allows.

    I use the following SQL: "SELECT * FROM oil ORDER BY Date DESC LIMIT 9999".

    The example set fed into the Windowing operators is therefore sorted descending on Date.

    When I sort the example set on Date ascending then the model works as expected.

    See next pictures

    sort1.jpeg: Added Sort operator

    sort2.jpeg: New resulting example set

    sort3.jpeg: Prediction is almost equivalent with oilLast-0

    No phase shift. Of course not. Question answered.
    Watch out for sorting dates.

    Apparently RM does not use the value of the Date attribute during "Set Role to ID"; it establishes the ID on the basis of the input sort order.
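The effect of the sort order can be reproduced outside RapidMiner. The Python sketch below assumes, as described above, that the label is taken purely by row position and that the date values are never consulted; `label_source_days` is a hypothetical helper and the day numbers are invented.

```python
def label_source_days(days, horizon):
    """For each row, return (row_day, day_the_label_comes_from).

    The label is taken `horizon` rows further down the table; only the
    row order matters, the day values themselves are ignored.
    """
    return [(days[i], days[i + horizon]) for i in range(len(days) - horizon)]


days_ascending = list(range(1, 11))                      # day 1 ... day 10, oldest first
days_descending = sorted(days_ascending, reverse=True)   # like ORDER BY Date DESC

asc = label_source_days(days_ascending, horizon=3)
desc = label_source_days(days_descending, horizon=3)

print(asc[0])   # (1, 4): the label comes from 3 days later
print(desc[0])  # (10, 7): the label comes from 3 days earlier
```

With descending input, every "future" label is really a past value, which is why the model appeared to look backwards.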

    Greetings,

    Luc

Answers

  • MartinLiebig Administrator, Moderator, Employee, RapidMiner Certified Analyst, RapidMiner Certified Expert, University Professor Posts: 3,404 RM Data Scientist

    Hi Luc,

    I think the answer is simple. Your prediction(label) is the oil price tomorrow (or in x days), while your OilLast-0 is the oil price today (-0 indicates 0 days lookback).

    You most likely want to also generate a Label in the lower windowing and compare this to the prediction.
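The comparison suggested here can be sketched in Python: each forecast made at row t is paired with the real value at row t + horizon, rather than with the value on its own row. The helper `align_forecasts` and the data are hypothetical; the "model" is a naive stand-in that simply repeats today's value.

```python
def align_forecasts(actual, predictions, horizon):
    """Pair each forecast made at row t with the actual value at row t + horizon."""
    pairs = []
    for t, predicted in enumerate(predictions):
        target = t + horizon
        if target < len(actual):  # the newest forecasts have no actual value yet
            pairs.append((target, actual[target], predicted))
    return pairs


actual = [10, 11, 12, 13, 14, 15]   # invented, steadily rising prices
predictions = list(actual)          # naive stand-in: forecast = today's value
pairs = align_forecasts(actual, predictions, horizon=2)
errors = [real - predicted for _, real, predicted in pairs]
print(pairs)
print(errors)
```

Plotted against its own row, the naive forecast matches the series exactly; plotted against the day it is meant to predict, it trails the real values by the horizon.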

    Cheers,

    Martin

    - Sr. Director Data Solutions, Altair RapidMiner -
    Dortmund, Germany
  • luc_bartkowski Member Posts: 46 Maven

    Thank you @mschmitz for your fast reply,

    "Your prediction(label) is the oil price tomorrow (or in x days)".

    "While your OilLast-0 is the OilPrice today (-0 indicates 0 days lookback)."

    I understand both. But I don't see it in the graph and the exampleset:

    oilpredgraph.jpeg
    oilpredvalues.jpg

    I checked also the examplesets of the upper and lower Windowing operators using a "breakpoint after".

    My source data is stored in MySQL. I compared both to make sure that my process is working as expected.

    The value of the Label on August 25 is based upon the "Last" value of August 11 in the source data.

    August 11 is 10 days before August 25 so that is correct.

    The values of the "-0" attributes of August 25 are equivalent to the attributes of the source data on August 25.

    That is also correct.

    windowing.jpeg

    The results of the lower Windowing operator are also correct.

    The values of all "-0" attributes on September 28 are equivalent to the source data on September 28.

    windowing2.jpeg

    So I don't understand the graph. It looks like the prediction is following the real values of "Last" instead of the other way around.

    This is my process model:

    [Process XML only partially preserved; recoverable fragments:]

    <enumeration key="parameters"/>
    <operator activated="true" class="retrieve" compatibility="7.6.001" expanded="true" height="68" name="Retrieve (2)" width="90" x="45" y="136">
    <parameter key="attribute_name" value="oilLast"/>
    <parameter key="attribute_name" value="oilLast"/>
    <connect from_op="用" from_port="output 2" to_op="Set Role (2)" to_port="example set input"/>
    <connect from_op="Windowing (2)" from_port="example set output" to_op="Apply Model (2)" to_port="unlabelled data"/>

    Thanks for your support.

    Cheers,

    Luc

  • Thomas_Ott RapidMiner Certified Analyst, RapidMiner Certified Expert, Member Posts: 1,761 Unicorn

    I'll be posting my Historical Volatility process when I have a chance to write it up. In that process you take a t=0 time series and predict the t+1 value. From there you can see how it works.

  • luc_bartkowski Member Posts: 46 Maven

    I think I have found the answer to my question.
    But I don't know how to implement it.

    Looking at the problem again, I conclude the following:

    windowingas should.jpeg

    On August 11 the Label should look at the "Last" value of August 25 to learn/validate. See the blue markup.

    Instead, as I indicated before, the upper Windowing operator is looking backwards, it puts the last value of August 11 as Label on August 25.

    I tried to configure the upper Windowing operator to look forwards instead of backwards by entering a negative value, -10 or (%{PredictionHorizon})*-1, in the Horizon parameter. The Horizon parameter of the Windowing operator doesn't accept negative integers, only positive ones. So I don't know how to implement a forward-looking Label instead of a backward-looking one.

    I'm using v. 7.6.001
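Outside RapidMiner, a forward-looking label needs no negative horizon, only an ascending date order. The Python sketch below is an illustration under assumptions: `trading_days` and `forward_labels` are hypothetical helpers, weekends are skipped but holidays are ignored, and the prices are invented.

```python
from datetime import date, timedelta


def trading_days(start, count):
    """Return `count` consecutive weekdays starting at `start` (holidays ignored)."""
    days, current = [], start
    while len(days) < count:
        if current.weekday() < 5:  # Monday=0 ... Friday=4
            days.append(current)
        current += timedelta(days=1)
    return days


def forward_labels(dates, values, horizon):
    """Attach to each date the value observed `horizon` rows later.

    Rows are put in ascending date order first, so the result does not
    depend on how the input happened to be sorted.
    """
    order = sorted(range(len(dates)), key=lambda i: dates[i])
    labelled = []
    for pos in range(len(order) - horizon):
        here, ahead = order[pos], order[pos + horizon]
        labelled.append((dates[here], values[here], dates[ahead], values[ahead]))
    return labelled


dates = trading_days(date(2017, 8, 11), 12)
values = [float(i) for i in range(12)]  # invented prices, for illustration only
first = forward_labels(dates, values, horizon=10)[0]
print(first)  # the row for August 11 is labelled with the value of August 25
```

Feeding the same rows in descending order gives the same labels, because the helper sorts by date before pairing.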

    Greetings,

    Luc

  • online360 Member Posts: 34 Contributor I

    Hi!

    I tested your process and have to say that I really like it but have one question on it:

    How do you show any forecasts then (like for the following week)?

    Thanks!
