"Sliding Window Validation - What Model?"
Hi All,
I will admit I am perplexed by the sliding window validation process (what it does and the parameters). In trying to understand it, the first question is what model is actually fit at the end? Is it the one using the most recent records (with the number of said records depending on the settings in the operator)?
Answers
Do you mean: on which data is the model fitted that is delivered at the mod port?
With kind regards,
Sebastian Land
Yes, that is what I mean. What is that final model - is it fit using the last k records, where k is set in the parameters as the window?
I had the same question and couldn't find an answer.
What model is delivered at the mod port? If a model is returned, what is its value for future data?
My understanding is that a new model is created and tested for each window. What we are really validating is how well the process of learning a model works, right? Thus, no single model returned at the port will be of value.
I am clearly confused. Please help.
Thank you
Best, Marius
Stated differently, if I have 1000 rows with a training window of 50, each validated on the next row, I will have gone through 950 models (one per test row, rows 51 to 1000), each trained on 50 rows of data. The model returned, however, will be trained on all 1000 rows?
If the reason I am training on 50 rows to predict the next is that the process generating the rows is not stationary, does it not follow that the final model trained on 1000 rows will be of little value in predicting the 1001st row?
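To make the arithmetic in the question above concrete, here is a minimal Python sketch that enumerates the windows. It assumes a step size of 1 and a horizon of 1 (each window is tested on the single row that follows it). `sliding_windows` is a hypothetical helper written for this illustration, not a RapidMiner function, and the operator's own boundary conventions may differ.

```python
def sliding_windows(n_rows, train_size, horizon=1, step=1):
    """Yield (train_indices, test_index) pairs, using 0-based row indices.

    Assumption: each training window of `train_size` consecutive rows is
    tested on the row `horizon` steps after the window's last row.
    """
    start = 0
    while start + train_size + horizon - 1 < n_rows:
        train = range(start, start + train_size)
        test = start + train_size + horizon - 1
        yield train, test
        start += step


# The scenario from the post: 1000 rows, training window of 50, next-row test.
windows = list(sliding_windows(n_rows=1000, train_size=50))
n_models = len(windows)            # one model is fitted per window
last_train, last_test = windows[-1]
```

Under these assumptions `n_models` comes out at 950, and the last window trains on rows 950 to 999 (1-based) and is tested on row 1000. None of these windows covers all 1000 rows, so a model fitted on the full data set would be a different model from any of the ones that were validated.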
I remember having exactly this exchange with Ingo a year or so ago right here. I was using SVMs to make short-term forecasts in foreign exchange markets, and optimised the look-back and prediction horizon sizes in a sliding window validation. The performance figures were fine, as you would expect, but I had to store the model at every iteration within the validation, just to get the last one. Yes, wasteful of course; yes, easily fixable. That's the wonder of open source!
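The workaround described above, keeping hold of the model from the final window, can be sketched in plain Python. This is only an illustration: `MeanModel` is a hypothetical stand-in learner (it predicts the mean of its training window), not an SVM and not RapidMiner code, and the step size and horizon are both assumed to be 1.

```python
class MeanModel:
    """Hypothetical stand-in learner: predicts the mean of its training window."""

    def fit(self, values):
        self.mean_ = sum(values) / len(values)
        return self

    def predict(self):
        return self.mean_


def walk_forward(series, train_size):
    """Fit one model per window, score it on the next value, keep the last model.

    Returns (mean absolute error over all windows, model from the final window).
    Assumes step size 1 and horizon 1, and at least one complete window.
    """
    errors = []
    last_model = None
    for start in range(len(series) - train_size):
        window = series[start:start + train_size]
        actual = series[start + train_size]
        model = MeanModel().fit(window)
        errors.append(abs(model.predict() - actual))
        last_model = model  # overwritten each iteration; only the final one survives
    return sum(errors) / len(errors), last_model


# Toy drifting series: 0.0, 1.0, ..., 9.0
series = [float(i) for i in range(10)]
mae, final_model = walk_forward(series, train_size=3)
```

Here `final_model` has seen only the last complete window (the values 6, 7 and 8), not the whole series, and only one model is held in memory at a time rather than one being stored per iteration.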
What I, like you, never worked out was the correct scenario for using a model built on all the examples when the data exhibits concept drift.
Happy days!