Forecast Validation
Synopsis
This operator performs a validation of a forecast model, which predicts the future values of a time series.
Description
The operator creates sliding windows from the input time series, specified by the time series attribute parameter. In each validation step, the training window is provided at the inner training set port of the Training subprocess. Its size is defined by the parameter window size. The training window can be used to train a forecast model (e.g. an ARIMA model, by the ARIMA operator), which has to be provided to the model port of the Training subprocess.
The inner test set port of the Testing subprocess contains the values of the test window. Its size is defined by the parameter horizon size. The forecast model of the Training subprocess is used to predict these values. For the next validation fold, the training and the test windows are shifted by k values, defined by the parameter step size.
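The example based windowing described above can be sketched in a few lines of Python. This is only an illustration, not the operator's implementation: the function name sliding_windows is hypothetical, the horizon offset is assumed to be 0, and folds that do not fit completely into the series are assumed to be dropped.

```python
def sliding_windows(n_values, window_size, horizon_size, step_size):
    """Yield (training_range, test_range) index pairs, one per validation fold."""
    start = 0
    # Assumption: a fold is only created if training and test window fit completely.
    while start + window_size + horizon_size <= n_values:
        train = range(start, start + window_size)
        test = range(start + window_size, start + window_size + horizon_size)
        yield train, test
        start += step_size  # shift both windows by k = step size values


folds = list(sliding_windows(n_values=20, window_size=10, horizon_size=3, step_size=2))
for train, test in folds:
    print(f"train {train.start}..{train.stop - 1}, test {test.start}..{test.stop - 1}")
```

With these values the first fold trains on the values 0 to 9 and tests on 10 to 12; the second fold trains on 2 to 11 and tests on 12 to 14.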
Contrary to the Cross Validation operator, the number of values which have to be forecasted by the forecast model has to be equal to the horizon size. Thus, the forecasted values are already added to the ExampleSet provided at the test set port, and an additional Apply Forecast operator is not necessary. The attribute holding the test window values has the label role, while the attribute holding the forecasted values has the prediction role. Thus, a Performance operator (e.g. Performance (Regression)) can be used to calculate the performance of the forecast.
The described behavior is the default example based windowing. It can be changed to time based windowing or custom windowing by changing the unit parameter. For time based windowing, the windowing parameters are specified in time durations/periods. For custom windowing, an additional ExampleSet has to be provided to the "custom windows" input port. It holds the start values (and optionally the stop values) of the windows. For more details, see the unit parameter and the description of the corresponding parameters.
Expert settings (for example no overlapping windows, empty window handling, ...) can be enabled by selecting the corresponding expert settings parameter.
If the model port of the Forecast Validation operator is connected, a final window with the same size as the training windows, but ending at the last example of the input series, is used to train a final forecast model. This final model is provided at the model output port. It can be directly used by the Apply Forecast operator to predict the future values for the input time series. The operator also delivers all test set ExampleSets, appended to one ExampleSet, and the averaged Performance Vector.
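As a rough sketch of how the final training window relates to the series, assuming example based windowing: the helper name final_training_window is hypothetical, and clamping the start to 0 for series shorter than the window is an assumption of this sketch.

```python
def final_training_window(n_values, window_size):
    """Return the index range of a window of window_size values ending at the last example."""
    start = max(0, n_values - window_size)  # assumption: clamp for very short series
    return range(start, n_values)


final_window = final_training_window(n_values=20, window_size=10)
print(final_window.start, final_window.stop - 1)  # 10 19
```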
This operator works on all time series (numerical, nominal and time series with date time values).
Input
example set
The ExampleSet which contains the time series data as an attribute.
custom windows
The example set which contains the start (and stop) values of the custom windows. Only needs to be connected if the parameter unit is set to custom.
Output
model
If the model port of the Forecast Validation operator is connected, a final window with the same size as the training windows, but ending at the last example of the input series is used to train a final forecast model, which is delivered at this port. The final forecast model can be directly used by the Apply Forecast operator to predict the future values for the input time series.
example set
The ExampleSet that was given as input is passed through without changes.
test result set
All test set ExampleSets, appended to one ExampleSet.
performance
This is an expandable port. You can connect any performance vector (result of a Performance operator) to the result port of the inner Testing subprocess. The performance output port delivers the average of the performances over all folds of the validation.
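To illustrate what averaging over folds means, the following sketch computes a root mean squared error per fold and then takes the plain mean of the fold values. The fold data is made up for illustration, and the unweighted mean is an assumption; the text above does not state how the operator weights the folds.

```python
import math
from statistics import fmean


def rmse(actual, forecast):
    """Root mean squared error between test window values and forecasted values."""
    return math.sqrt(fmean([(a - f) ** 2 for a, f in zip(actual, forecast)]))


# Hypothetical (actual, forecast) pairs for two validation folds.
fold_results = [
    ([10.0, 12.0], [11.0, 12.0]),
    ([13.0, 15.0], [13.0, 13.0]),
]

per_fold = [rmse(actual, forecast) for actual, forecast in fold_results]
average_rmse = fmean(per_fold)  # assumption: unweighted mean over folds
print(per_fold, average_rmse)
```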
Parameters
Time series attribute
The time series attribute holding the time series values for which the forecast model shall be built. The required attribute can be selected from this option. The attribute name can be selected from the drop down box of the parameter if the meta data is known.
Has indices
This parameter indicates if there is an index attribute associated with the time series. If this parameter is set to true, the index attribute has to be selected.
Indices attribute
If the parameter has indices is set to true, this parameter defines the associated index attribute. It can be either a date, date_time or numeric value type attribute. The attribute name can be selected from the drop down box of the parameter if the meta data is known.
Expert settings
This parameter can be selected to show expert settings for a more detailed configuration of the operator. The expert settings are: windows defined, custom start point, custom end point, date format, no overlapping windows and empty window handling.
Unit
The mode in which windows are defined. It defines the unit of the window parameters (window size, step size, horizon size and horizon offset).
- example based: The window parameters are specified in number of examples. This is the default option.
- time based: The window parameters are specified in time durations/periods (units ranging from milliseconds to years).
- custom: An additional example set has to be provided to the "custom windows" input port. It holds the start values (and optionally the stop values) of the windows.
Windows defined
This parameter defines the point from which the windows are defined. It is an expert setting and hence it is only shown if the parameter expert settings is selected.
- from start: The first window will start at the first example of the input data set. The following windows are set up according to the window parameters.
- from end: The last window will end at the last example of the input data set. The previous windows are set up according to the window parameters.
- custom start: The first window will start at the custom start point provided by the parameter custom start point/custom start time. The following windows are set up according to the window parameters.
- custom end: The last window will end at the custom end point provided by the parameter custom end point/custom end time. The previous windows are set up according to the window parameters.
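The difference between the from start and from end options can be pictured with the following sketch for example based windowing. The function name window_starts is hypothetical, and the sketch assumes that "window" here covers the training window together with its horizon and that the horizon offset is 0.

```python
def window_starts(n_values, window_size, horizon_size, step_size, mode="from start"):
    """Return the start index of each training window for the given anchoring mode."""
    span = window_size + horizon_size  # assumption: training window plus horizon
    if span > n_values:
        return []
    n_folds = (n_values - span) // step_size + 1
    if mode == "from start":
        first = 0  # first window starts at the first example
    elif mode == "from end":
        # last window ends at the last example; earlier ones are shifted back
        first = (n_values - span) - (n_folds - 1) * step_size
    else:
        raise ValueError(f"unsupported mode: {mode}")
    return [first + i * step_size for i in range(n_folds)]


print(window_starts(20, 10, 3, 3, "from start"))  # [0, 3, 6]
print(window_starts(20, 10, 3, 3, "from end"))    # [1, 4, 7]
```

In this example, from start leaves the last value of the series unused, while from end leaves the first value unused.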
Custom start point
If the parameter windows defined is set to custom start and the unit is set to example based, this parameter defines the custom point from which the windows start. It is an expert setting and hence it is only shown if the parameter expert settings is selected.
Custom end point
If the parameter windows defined is set to custom end and the unit is set to example based, this parameter defines the custom point where the windows end. It is an expert setting and hence it is only shown if the parameter expert settings is selected.
Custom start time
If the parameter windows defined is set to custom start and the unit is set to time based, this parameter defines the custom date time point from which the windows start.
The date time format used to interpret the string provided in this parameter is defined by the parameter date format. It is an expert setting and hence it is only shown if the parameter expert settings is selected.
Custom end time
If the parameter windows defined is set to custom end and the unit is set to time based, this parameter defines the custom date time point where the windows end.
The date time format used to interpret the string provided in this parameter is defined by the parameter date format. It is an expert setting and hence it is only shown if the parameter expert settings is selected.
Date format
Date format used for the custom start time and custom end time parameters. It is an expert setting and hence it is only shown if the parameter expert settings is selected.
Window size
The number of values in the training window. The ExampleSet provided at the training set port of the Training subprocess will have window size number of examples. The window size has to be smaller than or equal to the length of the time series.
Window size time
The time duration/period of the training window.
The example set provided at the training set port of the Training subprocess will have all examples which are in the corresponding window.
The window size time has to be smaller than or equal to the time duration of the time series.
No overlapping windows
If this parameter is set to true, the parameter step size is determined automatically, so that all windows and horizons don't overlap. The step size is set to window size + horizon size. It is an expert setting and hence it is only shown if the parameter expert settings is selected.
Step size
The step size between the first values of two consecutive windows. E.g. with a window size of 10 and a step size of 2, the first window has the values from 0, ..., 9, the second window the values from 2, ..., 11 and so on. If no overlapping windows is set to true, the step size is automatically determined depending on window size and horizon size.
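The example above, and the automatic step size used when no overlapping windows is enabled, can be reproduced with a small sketch. The function names are hypothetical; the formula window size + horizon size is the one given for the no overlapping windows parameter.

```python
def effective_step_size(window_size, horizon_size, step_size, no_overlapping_windows=False):
    """Return the step size actually used between two consecutive windows."""
    if no_overlapping_windows:
        return window_size + horizon_size  # windows and horizons do not overlap
    return step_size


def training_window_bounds(window_size, step, n_windows):
    """Return (first, last) value positions of the first n_windows training windows."""
    return [(i * step, i * step + window_size - 1) for i in range(n_windows)]


step = effective_step_size(window_size=10, horizon_size=3, step_size=2)
bounds = training_window_bounds(10, step, 3)
print(bounds)  # [(0, 9), (2, 11), (4, 13)]

auto_step = effective_step_size(10, 3, 2, no_overlapping_windows=True)
print(auto_step)  # 13
```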
Step size time
The step size (in units of time) between the start points of two consecutive windows. E.g. with a window size of 1 week and a step size of 2 days, the first window has the days from 0, ..., 6, the second window the days from 2, ..., 8 and so on. If no overlapping windows is set to true, the step size time is automatically determined depending on window size time, horizon size time and horizon offset time.
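The time based example above (window size of 1 week, step size of 2 days) might look as follows in Python. The function name time_windows and the dates are made up for illustration, and treating windows as half-open intervals [start, end) is an assumption of this sketch.

```python
from datetime import datetime, timedelta


def time_windows(first, last, window, step):
    """Return half-open (start, end) training windows that fit between first and last."""
    windows = []
    start = first
    while start + window <= last:
        windows.append((start, start + window))
        start += step
    return windows


series_start = datetime(2024, 1, 1)
series_end = datetime(2024, 1, 15)
windows = time_windows(series_start, series_end, timedelta(weeks=1), timedelta(days=2))
for start, end in windows:
    print(start.date(), "to", end.date())
```

The first window covers day 0 up to (but not including) day 7, the second window day 2 up to day 9, matching the days 0 to 6 and 2 to 8 named above.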
Horizon size
The number of values in the test window. The ExampleSet provided at the test set port of the Testing subprocess will have horizon size number of examples. It will have an attribute holding the original time series values in the test window (attribute name is the name of the time series attribute parameter), and an attribute holding the values in the test window, forecasted by the forecast model from the Training subprocess (attribute name is forecast of <time series attribute>). In addition, the ExampleSet has an attribute with the forecast position, ranging from 1 to horizon size. If the parameter has indices is set to true, the ExampleSet also has an attribute holding the last index value of the training window.
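The layout of the test set ExampleSet can be pictured as a list of rows, as in the sketch below. The function build_test_set is hypothetical, and the column names "forecast position" and "last index in window" are assumptions chosen for illustration; only the "forecast of <time series attribute>" naming is taken from the description above.

```python
def build_test_set(series, indices, train_end, forecasts, attribute="value"):
    """Build one row per forecasted value for the test window following train_end."""
    rows = []
    for position, forecast in enumerate(forecasts, start=1):
        rows.append({
            attribute: series[train_end + position - 1],  # original test window value
            f"forecast of {attribute}": forecast,         # value predicted by the model
            "forecast position": position,                # 1 .. horizon size
            "last index in window": indices[train_end - 1],
        })
    return rows


series = list(range(100, 110))
indices = list(range(10))
rows = build_test_set(series, indices, train_end=7, forecasts=[106.5, 107.5])
for row in rows:
    print(row)
```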
Horizon size time
The time duration/period taken in the test window.
The ExampleSet provided at the test set port of the Testing subprocess will have the examples in the corresponding windows. It will have an attribute holding the original time series values in the test window (attribute name is the name of the time series attribute parameter), and an attribute holding the values in the test window, forecasted by the forecast model from the Training subprocess (attribute name is forecast of <time series attribute>). In addition, the ExampleSet has an attribute with the forecast position, ranging from 1 to maximum number of horizon values. If the parameter has indices is set to true, the ExampleSet also has an attribute holding the last index value of the training window.
Windows stop definition
Defines whether the end of the custom windows is defined by the start of the next window (windows span the whole index range) or by an additional attribute.
- from next window start: The end of each window is defined by the start of the next window (windows span the whole index range). Training windows end at the start of the next horizon window (or the next training window, if there aren't horizon windows). Horizon windows end at the start of the next training window. Be aware that the last value of the start definition values (the last value of the horizon start attribute or the last value of the window start attribute, if there aren't horizon windows) is only used as the end of the final window.
- from attribute: The end of each window is defined by additional attribute(s) in the custom window example set. The attribute names have to be provided by the parameters window stop attribute and horizon stop attribute.
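One possible reading of the from next window start option, with horizon windows present, is sketched below. The function name is hypothetical and the windows are treated as half-open (start, end) pairs, which is an assumption. In particular, the sketch assumes that the final training window has no horizon window, because the last horizon start value is only used as its end.

```python
def windows_from_next_start(window_starts, horizon_starts):
    """Derive (training, horizon) windows where each window ends at the next start value."""
    folds = []
    for i, (w_start, h_start) in enumerate(zip(window_starts, horizon_starts)):
        training = (w_start, h_start)  # ends at the start of the next horizon window
        if i + 1 < len(window_starts):
            horizon = (h_start, window_starts[i + 1])  # ends at the next training window
        else:
            horizon = None  # last horizon start only closes the final window
        folds.append((training, horizon))
    return folds


folds = windows_from_next_start([0, 10, 20], [7, 17, 27])
print(folds)
```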
Window start attribute
This parameter defines the attribute in the custom window example set (the example set provided at the custom windows input port) which contains the start values for the custom training windows.
The window start attribute, window stop attribute, horizon start attribute and horizon stop attribute have to be of the same data type. If the data type is integer, the windowing is example based (see parameter unit); otherwise the attributes need to be of the same data type as the indices attribute.
Window stop attribute
This parameter defines the attribute in the custom window example set (the example set provided at the custom windows input port) which contains the end values for the custom training windows.
The window start attribute, window stop attribute, horizon start attribute and horizon stop attribute have to be of the same data type. If the data type is integer, the windowing is example based (see parameter unit); otherwise the attributes need to be of the same data type as the indices attribute.
Horizon start attribute
This parameter defines the attribute in the custom window example set (the example set provided at the custom windows input port) which contains the start values for the custom horizon windows.
The window start attribute, window stop attribute, horizon start attribute and horizon stop attribute have to be of the same data type. If the data type is integer, the windowing is example based (see parameter unit); otherwise the attributes need to be of the same data type as the indices attribute.
Horizon stop attribute
This parameter defines the attribute in the custom window example set (the example set provided at the custom windows input port) which contains the stop values for the custom horizon windows.
The window start attribute, window stop attribute, horizon start attribute and horizon stop attribute have to be of the same data type. If the data type is integer, the windowing is example based (see parameter unit); otherwise the attributes need to be of the same data type as the indices attribute.
Empty window handling
This parameter defines how empty windows (windows which do not contain an Example) will be handled. It is an expert setting and hence it is only shown if the parameter expert settings is selected.
- add empty exampleset: Empty windows will be added as an empty ExampleSet, or a row with missing values.
- skip: Empty windows will be skipped completely in the processing. If horizon windows are created as well and either the training or the horizon window is empty, the processing for both windows is skipped.
- fail: A user error is thrown if an empty window occurs.
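The three options can be mimicked with a small filter over (training, horizon) window pairs. This is a sketch under assumptions: the function name is hypothetical, windows are plain Python lists, and a ValueError stands in for the user error.

```python
def handle_empty_windows(folds, mode="skip"):
    """Apply an empty window handling mode to a list of (training, horizon) windows."""
    result = []
    for training, horizon in folds:
        if training and horizon:
            result.append((training, horizon))
        elif mode == "add empty exampleset":
            result.append((training, horizon))  # keep the fold, empty window included
        elif mode == "skip":
            continue  # drop both windows if either one is empty
        elif mode == "fail":
            raise ValueError("empty window encountered")
        else:
            raise ValueError(f"unknown mode: {mode}")
    return result


folds = [([1, 2], [3]), ([], [4]), ([5], [6])]
print(handle_empty_windows(folds, "skip"))                  # two folds remain
print(handle_empty_windows(folds, "add empty exampleset"))  # all three folds remain
```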
Enable parallel execution
This parameter enables the parallel execution of the inner processes. Please disable the parallel execution if you run into memory problems.