"Horizon value in Sliding Window Validation"

haddock Member Posts: 849 Maven
edited May 2019 in Help
I'm trying to predict currency values from indicators, on a daily basis. Each daily example has a Label expressed as the change in the currency over the next N days, a date as an ID, and numeric attributes.

Validation is done using the Sliding Window Validation operator. While most of the parameters are self-explanatory, I am having difficulty with the "horizon" parameter, whose definition is as follows:

"Number of examples which are between the training and testing examples (integer; 1-+Infinity; default: 1)"

At a trivial level, if I break before the learner and break before the applier operators in the validation, I can see that when horizon is set to 1 the validation starts on the next example after the training set, meaning that there are in fact 0 examples between the training and testing examples. No big deal, just thought I'd point it out.
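The off-by-one observation above can be illustrated with a small sketch. This is not RapidMiner code: the function name and the parameters `training_window`, `test_window` and `step` are hypothetical, and the sketch simply assumes that horizon is the increment from the last training index to the first testing index, which matches what was seen at the breakpoints.

```python
def sliding_windows(n_examples, training_window, test_window, step, horizon):
    """Yield (train_indices, test_indices) pairs over examples 0..n_examples-1.

    Assumption: horizon is the increment from the last training index to
    the first testing index, so horizon=1 means "the very next example".
    """
    start = 0
    while True:
        train_end = start + training_window - 1   # last training index
        test_start = train_end + horizon          # first testing index
        test_end = test_start + test_window - 1   # last testing index
        if test_end >= n_examples:
            break
        yield (list(range(start, train_end + 1)),
               list(range(test_start, test_end + 1)))
        start += step


for train, test in sliding_windows(10, training_window=4, test_window=2,
                                   step=2, horizon=1):
    # Count the examples that lie strictly between the two sets.
    examples_between = test[0] - train[-1] - 1
    print(train, test, "examples in between:", examples_between)
```

Under that assumption, the number of examples strictly between the two sets is horizon - 1, so the default of 1 leaves none in between.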

My question is whether I should use 1 as the parameter for horizon, or whether horizon should reflect the forecast period, such that if the prediction is for 10 days, horizon should be set to 9. I can see arguments on either side, so I appeal to wiser minds for guidance.






Answers

  • IngoRM Administrator, Moderator, Employee, RapidMiner Certified Analyst, RapidMiner Certified Expert, Community Manager, RMResearcher, Member, University Professor Posts: 1,751 RM Founder
    Hi,

    "...meaning that there are in fact 0 examples between the training and testing examples. No big deal, just thought I'd point it out."
    Hmm, that's right. Any idea what this behaviour is properly called in English?

    "My question is whether I should use 1 as the parameter for horizon, or whether horizon should reflect the forecast period, such that if the prediction is for 10 days, horizon should be set to 9. I can see arguments on either side, so I appeal to wiser minds for guidance."
    The horizon in the sliding window validation is actually independent of the horizon in the windowing used for learning. It just defines the gap between the training and testing examples and can be used, for example, if you want to predict the values for the next year based on the data from the last one (so the validation horizon is 365 days). The learner windowing, on the other hand, could be set to a "1-day horizon" if you want to predict the next day's value.

    Hope that helps. Cheers,
    Ingo
  • haddockhaddock MemberPosts:849Maven
    Nice one Ingo!

    I'd taken that approach; setting horizon to 1 when it should be, say, 20 gives only a rough proxy for training performance! On the definition front, perhaps this is clearer:

    "Increment from last training to first testing example (integer; 1-+Infinity; default: 1=next example)"

    Thanks again for your clarification.
  • IngoRM Administrator, Moderator, Employee, RapidMiner Certified Analyst, RapidMiner Certified Expert, Community Manager, RMResearcher, Member, University Professor Posts: 1,751 RM Founder
    Hi,

    thanks for the note. I changed the comment on this parameter.

    Cheers,
    Ingo