Sliding Window Validation
Synopsis
This operator performs a sliding window validation for a machine learning model trained on time dependent input data.
Description
The operator creates sliding windows from the input data. In each validation step the training window is provided at the inner training set port of the Training subprocess. The size of the training window is defined by the parameter training window size. The training window can be used to train a machine learning model, which has to be provided to the model port of the Training subprocess.
The test window of the input data is provided at the inner test set port of the Testing subprocess. Its size is defined by the parameter test window size. The model trained in the Training subprocess is provided at the model port of the Testing subprocess. It can be applied to the test set. The performance of this prediction can be evaluated, and the performance vector has to be provided to the performance port of the Testing subprocess. For the next validation fold, the training and the test windows are shifted by k values, defined by the parameter step size.
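The fold construction described above can be sketched in a few lines of Python. This is only an illustration of example based windowing, not the operator's implementation: the function name `sliding_window_folds` is hypothetical, the test window is assumed to start directly after the training window (no test window offset), and only folds whose windows fit completely into the series are generated.

```python
def sliding_window_folds(n_examples, training_window_size, test_window_size, step_size):
    """Yield (training_indices, test_indices) for each validation fold.

    Assumption: the test window directly follows the training window and
    incomplete windows at the end of the series are dropped.
    """
    start = 0
    while start + training_window_size + test_window_size <= n_examples:
        train_end = start + training_window_size
        test_end = train_end + test_window_size
        yield range(start, train_end), range(train_end, test_end)
        start += step_size  # shift both windows by the step size


for train, test in sliding_window_folds(20, 10, 3, 2):
    print(f"train {train.start}..{train.stop - 1}, test {test.start}..{test.stop - 1}")
```

In every fold the test indices are strictly larger than the training indices, which is the property the validation relies on.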
The described behavior is the default example based windowing. It can be changed to time based windowing or custom windowing by changing the unit parameter. For time based windowing, the windowing parameters are specified in time durations/periods. For custom windowing, an additional ExampleSet has to be provided to the "custom windows" input port. It holds the start (and optionally the stop) values of the windows. For more details see the unit parameter and the description of the corresponding parameters.
Expert settings (for example no overlapping windows, empty window handling, ...) can be enabled by selecting the corresponding expert settings parameter.
The sliding window validation ensures that the machine learning model built in the Training subprocess is always evaluated on Examples which come after the training window.
If the model output port of the Sliding Window Validation operator is connected, a final window with the same size as the training windows, but ending at the last example of the input series, is used to train a final model. This final model is provided at the model output port.
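As a small illustration of the final window for example based windowing, the following sketch computes the indices used to train the final model. The helper name `final_training_window` is hypothetical and the error handling is an assumption.

```python
def final_training_window(n_examples, training_window_size):
    """Return the indices of the final window, which ends at the last example."""
    if training_window_size > n_examples:
        # Assumption: a window longer than the series is treated as an error.
        raise ValueError("training window size exceeds the length of the series")
    return range(n_examples - training_window_size, n_examples)


window = final_training_window(n_examples=20, training_window_size=10)
print(window.start, window.stop - 1)  # first and last index of the final window
```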
This operator works on all time series (numerical, nominal and time series with date time values).
Input
example set
This input port receives an ExampleSet to apply the sliding window validation.
custom windows
The example set which contains the start (and stop) values of the custom windows. Only needs to be connected if the parameter unit is set to custom.
Output
model
If the model output port of the Sliding Window Validation operator is connected, a final window with the same size as the training windows, but ending at the last example of the input series, is used to train a final model. This final model is provided at the model output port.
example set
The ExampleSet that was given as input is passed through without changes.
test result set
All test set ExampleSets, appended to one ExampleSet.
performance
This is an expandable port. You can connect any performance vector (result of a Performance operator) to the result port of the inner Testing subprocess. The performance output port delivers the average of the performances over all folds of the validation.
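The averaging over folds can be pictured as follows. This is a minimal sketch that assumes a plain arithmetic mean per criterion; the criterion names (`rmse`, `mae`) and the function `average_performance` are made up for the example and are not part of the operator.

```python
from statistics import fmean


def average_performance(fold_performances):
    """Average each performance criterion over all validation folds.

    Assumption: every fold reports the same criteria and all folds are
    weighted equally.
    """
    criteria = fold_performances[0].keys()
    return {name: fmean(p[name] for p in fold_performances) for name in criteria}


per_fold = [
    {"rmse": 1.0, "mae": 0.5},
    {"rmse": 2.0, "mae": 1.5},
    {"rmse": 3.0, "mae": 1.0},
]
print(average_performance(per_fold))
```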
Parameters
Has indices
This parameter indicates if there is an index attribute associated with the time series. If this parameter is set to true, the index attribute has to be selected.
Indices attribute
If the parameter has indices is set to true, this parameter defines the associated index attribute. It can be either a date, date_time or numeric value type attribute. The attribute name can be selected from the drop down box of the parameter if the meta data is known.
Expert settings
This parameter can be selected to show expert settings for a more detailed configuration of the operator. The expert settings are: windows defined, custom start point, custom end point, date format, no overlapping windows, and empty window handling.
Unit
The mode in which windows are defined. It defines the unit of the window parameters (training window size, step size, test window size and test window offset).
- example based: The window parameters are specified in number of examples. This is the default option.
- time based: The window parameters are specified in time durations/periods (units ranging from milliseconds to years).
- custom: An additional example set has to be provided to the "custom windows" input port. It holds the start (and optionally the stop) values of the windows.
Windows defined
This parameter defines the point from which the windows are defined. It is an expert setting and hence it is only shown if the parameter expert settings is selected.
- from start: The first window will start at the first example of the input data set. The following windows are set up according to the window parameters.
- from end: The last window will end at the last example of the input data set. The previous windows are set up according to the window parameters.
- custom start: The first window will start at the custom start point provided by the parameter custom start point / custom start time. The following windows are set up according to the window parameters.
- custom end: The last window will end at the custom end point provided by the parameter custom end point / custom end time. The previous windows are set up according to the window parameters.
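The four options differ only in which end of the series anchors the windows. The sketch below illustrates this for example based windowing. It is an assumption-laden illustration, not the operator's code: `window_starts` is a hypothetical helper, `window_length` stands for the combined length of one training and one test window, and a custom end point is treated as an exclusive end index.

```python
def window_starts(n_examples, window_length, step_size,
                  windows_defined="from start", custom_point=None):
    """Return the start index of every fold, in ascending order."""
    if windows_defined in ("from start", "custom start"):
        first = 0 if windows_defined == "from start" else custom_point
        # Walk forward from the anchor while a full window still fits.
        return list(range(first, n_examples - window_length + 1, step_size))
    if windows_defined in ("from end", "custom end"):
        end = n_examples if windows_defined == "from end" else custom_point
        last = end - window_length
        # Walk backward from the anchor while the start stays inside the series.
        return sorted(range(last, -1, -step_size))
    raise ValueError(f"unknown option: {windows_defined}")


print(window_starts(20, 13, 3, "from start"))
print(window_starts(20, 13, 3, "from end"))
```

With 20 examples, a combined window length of 13 and a step size of 3, the two calls produce different start positions: anchoring at the start leaves unused examples at the end of the series, anchoring at the end leaves them at the beginning.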
Custom start point
If the parameter windows defined is set to custom start and the unit is set to example based, this parameter defines the custom point from which the windows start. It is an expert setting and hence it is only shown if the parameter expert settings is selected.
Custom end point
If the parameter windows defined is set to custom end and the unit is set to example based, this parameter defines the custom point where the windows end. It is an expert setting and hence it is only shown if the parameter expert settings is selected.
Custom start time
If the parameter windows defined is set to custom start and the unit is set to time based, this parameter defines the custom date time point from which the windows start.
The date time format used to interpret the string provided in this parameter is defined by the parameter date format. It is an expert setting and hence it is only shown if the parameter expert settings is selected.
Custom end time
If the parameter windows defined is set to custom end and the unit is set to time based, this parameter defines the custom date time point where the windows end.
The date time format used to interpret the string provided in this parameter is defined by the parameter date format. It is an expert setting and hence it is only shown if the parameter expert settings is selected.
Date format
Date format used for the custom start time and custom end time parameters. It is an expert setting and hence it is only shown if the parameter expert settings is selected.
Training window size
The number of values in the training window. The ExampleSet provided at the training set port of the Training subprocess will have training window size number of examples. The training window size has to be smaller than or equal to the length of the time series.
Training window size time
The time duration/period of the training window.
The example set provided at the training set port of the Training subprocess will have all examples which are in the corresponding training window.
The training window size time has to be smaller than or equal to the time duration of the time series.
No overlapping windows
If this parameter is set to true, the parameter step size is determined automatically, so that training and test windows don't overlap. The step size is set to training window size + test window size. It is an expert setting and hence it is only shown if the parameter expert settings is selected.
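The rule for the automatic step size can be written down directly. The sketch below uses a hypothetical helper `effective_step_size` and assumes example based windowing without a test window offset.

```python
def effective_step_size(training_window_size, test_window_size,
                        step_size, no_overlapping_windows):
    """Return the step size that is actually used between consecutive folds."""
    if no_overlapping_windows:
        # Each fold starts right after the previous fold's test window.
        return training_window_size + test_window_size
    return step_size


step = effective_step_size(10, 3, step_size=2, no_overlapping_windows=True)
first_fold = set(range(0, 10 + 3))          # training + test indices of fold 1
second_fold = set(range(step, step + 13))   # training + test indices of fold 2
print(step, first_fold.isdisjoint(second_fold))
```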
Step size
The step size between the first values of two consecutive windows. E.g. with a training window size of 10 and a step size of 2, the first training window has the values 0, ..., 9, the second training window the values 2, ..., 11, and so on. If no overlapping windows is set to true, the step size is automatically determined depending on training window size and test window size.
Step size time
The step size (in units of time) between the start points of two consecutive windows. E.g. with a training window size of 1 week and a step size of 2 days, the first training window has the days 0, ..., 6, the second training window the days 2, ..., 8, and so on. If no overlapping windows is set to true, the step size time is automatically determined depending on training window size time, test window size time and test window offset time.
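The example with a training window of 1 week and a step size of 2 days can be reproduced with Python's `datetime` module. This is a sketch under several assumptions: windows are half-open time intervals, there is no test window offset, the end of the series is passed explicitly as an exclusive bound, and the names `time_based_folds` and `series_end` are hypothetical.

```python
from datetime import datetime, timedelta


def time_based_folds(timestamps, training_window, test_window, step, series_end):
    """Return (training_timestamps, test_timestamps) for each fold.

    Windows are half-open intervals [start, end); series_end is exclusive.
    """
    folds = []
    start = min(timestamps)
    while start + training_window + test_window <= series_end:
        train_end = start + training_window
        test_end = train_end + test_window
        train = [t for t in timestamps if start <= t < train_end]
        test = [t for t in timestamps if train_end <= t < test_end]
        folds.append((train, test))
        start += step  # shift by the step size time
    return folds


day0 = datetime(2024, 1, 1)
timestamps = [day0 + timedelta(days=d) for d in range(14)]  # one value per day
folds = time_based_folds(
    timestamps,
    training_window=timedelta(weeks=1),
    test_window=timedelta(days=2),
    step=timedelta(days=2),
    series_end=day0 + timedelta(days=14),
)
for train, test in folds:
    print([(t - day0).days for t in train], [(t - day0).days for t in test])
```

Because the windows are defined by time and not by count, a window over irregularly sampled data can contain fewer examples than its neighbours, or none at all.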
Test window size
The number of values in the test window. The ExampleSet provided at the test set port of the Testing subprocess will have test window size number of examples. The test window size has to be smaller than or equal to the length of the time series.
Test window size time
The time duration/period taken in the test window.
The ExampleSet provided at the test set port of the Testing subprocess will have the examples in the corresponding test windows. It will have an attribute holding the original time series values in the test window (attribute name is the name of the time series attribute parameter), and an attribute holding the values in the test window, forecasted by the forecast model from the Training subprocess (attribute name is forecast of <time series attribute>). In addition, the ExampleSet has an attribute with the forecast position, ranging from 1 to maximum number of test values. If the parameter has indices is set to true, the ExampleSet also has an attribute holding the last index value of the training window.
Windows stop definition
Defines whether the end of the custom windows is defined by the start of the next window (windows are spanning over the whole index range) or by an additional attribute.
- from next window start: The end of each window is defined by the start of the next window (windows are spanning over the whole index range). Training windows end at the start of the next test window. Test windows end at the start of the next training window. Be aware that the last value of the start definition values (the last value of the test window start attribute) is only used as the end of the final window.
- from attribute: The end of each window is defined by additional attribute(s) in the custom window example set. The attribute names have to be provided by the parameters training window stop attribute and test window stop attribute.
Training window start attribute
This parameter defines the attribute in the custom window example set (the example set provided at the custom windows input port) which contains the start values for the custom training windows.
The training window start attribute, training window stop attribute, test window start attribute and test window stop attribute have to be of the same data type. If the data type is integer, the windowing is example based (see parameter unit); otherwise the attributes need to have the same data type as the indices attribute.
Training window stop attribute
This parameter defines the attribute in the custom window example set (the example set provided at the custom windows input port) which contains the end values for the custom training windows.
The training window start attribute, training window stop attribute, test window start attribute and test window stop attribute have to be of the same data type. If the data type is integer, the windowing is example based (see parameter unit); otherwise the attributes need to have the same data type as the indices attribute.
Test window start attribute
This parameter defines the attribute in the custom window example set (the example set provided at the custom windows input port) which contains the start values for the custom test windows.
The training window start attribute, training window stop attribute, test window start attribute and test window stop attribute have to be of the same data type. If the data type is integer, the windowing is example based (see parameter unit); otherwise the attributes need to have the same data type as the indices attribute.
Test window stop attribute
This parameter defines the attribute in the custom window example set (the example set provided at the custom windows input port) which contains the stop values for the custom test windows.
The training window start attribute, training window stop attribute, test window start attribute and test window stop attribute have to be of the same data type. If the data type is integer, the windowing is example based (see parameter unit); otherwise the attributes need to have the same data type as the indices attribute.
Empty window handling
This parameter defines how empty windows (windows which do not contain an Example) will be handled. It is an expert setting and hence it is only shown if the parameter expert settings is selected.
- add empty exampleset: Empty windows will be added as an empty ExampleSet, or a row with missing values.
- skip: Empty windows will be skipped completely in the processing. If either the training or the test window is empty, the processing for both windows is skipped.
- fail: A user error is thrown, if an empty window occurs.
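The three options can be illustrated with a small filter over already computed folds. This is a sketch only: the function `handle_empty_windows`, the exception `EmptyWindowError` and the representation of a window as a Python list are assumptions made for the example.

```python
class EmptyWindowError(Exception):
    """Hypothetical stand-in for the user error raised by the 'fail' option."""


def handle_empty_windows(folds, empty_window_handling="skip"):
    """Apply the empty window handling to a list of (training, test) windows."""
    kept = []
    for index, (train, test) in enumerate(folds):
        if not train or not test:
            if empty_window_handling == "skip":
                continue  # drop the fold if either window is empty
            if empty_window_handling == "fail":
                raise EmptyWindowError(f"fold {index} contains an empty window")
            # "add empty exampleset": keep the fold, the empty window stays empty
        kept.append((train, test))
    return kept


folds = [([1, 2], [3]), ([], [4]), ([5], [6])]
print(len(handle_empty_windows(folds, "skip")))
print(len(handle_empty_windows(folds, "add empty exampleset")))
```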
Enable parallel execution
This parameter enables the parallel execution of the inner processes. Please disable the parallel execution if you run into memory problems.