Highest Peak Transformation
Synopsis
This operator performs a Highest Peak Transformation for one or more time series attributes.
Description
A peak transformation detects peaks in the time series and outputs an indicator series (and optionally a peaked series) as the result. The meaning of the indicator series and the actual peak detection algorithm are described below.
The maximum number n of peaks to be extracted is defined by the parameter number of peaks; the type of peaks to be detected is defined by the parameter peak types.
The indicator time series consists of the following flag values:
- (0) no peak
- (1) maximum
- (-1) minimum
The operator provides the original time series, the indicator time series and (if the parameter add peaked series is selected) the peaked time series at the peak transformed example set output port. The peaked time series has all values set to missing where there is no peak (indicator series is 0).
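The relation between the original, indicator and peaked series can be illustrated with a minimal Python sketch. The function name peaked_series and the use of NaN to represent a missing value are assumptions made for illustration; they are not part of the operator.

```python
import math

def peaked_series(values, indicator):
    """Keep a value where the indicator flags a peak (1 or -1); use NaN elsewhere."""
    return [v if flag != 0 else math.nan for v, flag in zip(values, indicator)]

values = [1.0, 5.0, 2.0, -4.0, 0.5]
indicator = [0, 1, 0, -1, 0]   # 1 = maximum, -1 = minimum, 0 = no peak
peaked = peaked_series(values, indicator)
```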
The highest peak detection algorithm checks for extrema in the time series and adds to the peak all values around an extremum for which the relative change between two consecutive values is larger than the minimum change. The parameter sloppy values defines how many values are allowed that do not fulfill this condition. Note that only values above/below (for maximum/minimum) the average are considered to be peak values. A heuristic (see parameter use heuristics) can be used to determine values for the parameters sloppy values and minimum change.
The exact peak detection procedure is as follows:
1. Find the global extremum in the current Area (the start Area is the whole time series). The method only considers values above/below the average as candidates for an extremum. The current Area is skipped if no value above/below the average exists in the Area.
2. Find the left and right end of the peak.
3. Add the current peak to the result.
4. Repeat steps 1.-3. for the Areas left and right of the current Area.
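The recursion over Areas can be sketched in Python for maximum peaks. This is an illustrative sketch, not the operator's implementation. It assumes that the average is taken over the whole series and that the Areas to the left and right are the parts of the current Area on either side of the detected peak. The names find_maximum_peaks and find_bounds are hypothetical; find_bounds stands in for the search for the left and right end of the peak and, by default, treats the extremum alone as the peak.

```python
def _extremum_only(values, index, lo, hi, average):
    """Placeholder bounds search: the peak consists of the extremum only."""
    return index, index

def find_maximum_peaks(values, find_bounds=_extremum_only):
    """Return inclusive (start, end) index pairs of maximum peaks, sorted by position."""
    if not values:
        return []
    average = sum(values) / len(values)   # assumption: average of the whole series
    peaks = []

    def search(lo, hi):
        # Step 1: only values above the average are candidates for the extremum.
        candidates = [i for i in range(lo, hi + 1) if values[i] > average]
        if not candidates:
            return                        # the Area is skipped
        top = max(candidates, key=lambda i: values[i])
        # Step 2: find the left and right end of the peak.
        start, end = find_bounds(values, top, lo, hi, average)
        # Step 3: add the current peak to the result.
        peaks.append((start, end))
        # Step 4: repeat for the Areas left and right of the peak.
        search(lo, start - 1)
        search(end + 1, hi)

    search(0, len(values) - 1)
    return sorted(peaks)

example_peaks = find_maximum_peaks([1, 5, 1, 1, 4, 1])
```

For minimum peaks the same structure applies with the comparisons reversed.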
The procedure to find the left and right end of the peak is as follows:
1. Check if the next value left/right of the last value fulfills the peak condition:
   - The next value has to be above/below (maximum/minimum) the average. If it is not, the search for peak values is stopped.
   - The relative change (decrease/increase for maximum/minimum) between last value and next value has to be larger than the minimum change per step (see the description of the parameter minimum change for more details).
   - A number of values, given by the parameter sloppy values, is allowed for which the relative change is not larger than the minimum change.
2. If the peak condition is fulfilled, update last value to the next value.
3. Repeat steps 1.-2. until the peak condition is not fulfilled.
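The search for one end of a maximum peak can be sketched as follows. The description leaves some details open, so this sketch makes assumptions that may differ from the operator: a tolerated sloppy value still becomes the new last value, the sloppy counter is never reset, and the peak ends at the last value that actually fulfilled the minimum change condition. The function name extend_peak is hypothetical.

```python
def extend_peak(values, start, step, average, minimum_change, sloppy_values):
    """Walk from index start in direction step (-1 = left, +1 = right).

    Returns the index of the outermost value that belongs to the maximum peak.
    Precondition: values[start] is above the average.
    """
    last = start          # index of the last accepted or tolerated value
    end = start           # outermost index that fulfilled the peak condition
    sloppy_used = 0
    i = start + step
    while 0 <= i < len(values):
        if values[i] <= average:
            break         # next value is not above the average: stop the search
        change = (values[last] - values[i]) / (values[last] - average)
        if change > minimum_change:
            end = i
        else:
            sloppy_used += 1
            if sloppy_used > sloppy_values:
                break     # too many values without sufficient change
        last = i
        i += step
    return end

series = [1.0, 2.0, 6.0, 10.0, 6.0, 2.0, 1.0]
average = sum(series) / len(series)            # 4.0
left = extend_peak(series, 3, -1, average, 0.1, 0)
right = extend_peak(series, 3, +1, average, 0.1, 0)
```

Here the peak around the maximum at index 3 spans the indices left to right. For a minimum peak the comparison with the average is reversed.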
If a peak is detected, the high-low amplitude of the peak is calculated. To do so, the minimum and maximum values in the whole peak area (extended by 1 slice to the left and right of the peak area) are determined. The high-low amplitude is the difference between this maximum and minimum. The operator only returns the n highest peaks in terms of the high-low amplitude of the peaks.
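A minimal Python sketch of the amplitude calculation and the selection of the n highest peaks is shown below. The function names are hypothetical, peaks are assumed to be given as inclusive (start, end) index pairs, and the extension by one slice is clipped at the borders of the series.

```python
def high_low_amplitude(values, start, end):
    """Max minus min over the peak area plus one slice on each side."""
    lo = max(start - 1, 0)
    hi = min(end + 1, len(values) - 1)
    window = values[lo:hi + 1]
    return max(window) - min(window)

def keep_highest(values, peaks, n):
    """Return the n peaks with the largest high-low amplitude."""
    ranked = sorted(peaks, key=lambda p: high_low_amplitude(values, *p), reverse=True)
    return ranked[:n]

series = [1, 2, 6, 10, 6, 2, 1, 3, 5, 3]
peaks = [(2, 4), (8, 8)]
best = keep_highest(series, peaks, 1)
```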
This operator works only on numerical time series.
Input
example set
The ExampleSet which contains the time series data as attributes.
Output
peak transformed example set
The ExampleSet containing the results of the peak transformation. It contains the original time series, the peak indicator time series (with the flag values -1, 0 and +1) for the selected attributes and optionally the peaked time series.
original
The ExampleSet that was given as input is passed through without changes.
Parameters
Attribute filter type
This parameter allows you to select the filter type for the time series attribute selection, i.e. the method used to select the attributes that hold the time series values. Only numeric attributes can be selected as time series attributes. The different filter types are:
- all: This option selects all attributes of the ExampleSet to be time series attributes. This is the default option.
- single: This option allows the selection of a single time series attribute. The required attribute is selected by the attribute parameter.
- subset: This option allows the selection of multiple time series attributes through a list (see parameter attributes). If the meta data of the ExampleSet is known, all attributes are present in the list and the required ones can easily be selected.
- regular_expression: This option allows you to specify a regular expression for the time series attribute selection. The regular expression filter is configured by the parameters regular expression, use except expression and except regular expression.
- value_type: This option allows selection of all the attributes of a particular type to be time series attributes. It should be noted that types are hierarchical. For example, real and integer types both belong to the numeric type. The value type filter is configured by the parameters value type, use value type exception and except value type.
- block_type: This option allows the selection of all the attributes of a particular block type to be time series attributes. It should be noted that block types may be hierarchical. For example, value_series_start and value_series_end block types both belong to the value_series block type. The block type filter is configured by the parameters block type, use block type exception and except block type.
- no_missing_values: This option selects all attributes of the ExampleSet as time series attributes which do not contain a missing value in any example. Attributes that have even a single missing value are not selected.
- numeric_value_filter: All numeric attributes whose examples all match a given numeric condition are selected as time series attributes. The condition is specified by the numeric condition parameter.
Attribute
The required attribute can be selected from this option. The attribute name can be selected from the drop down box of the parameter if the meta data is known.
Attributes
The required attributes can be selected from this option. This opens a new window with two lists. All attributes are present in the left list. They can be shifted to the right list, which is the list of selected time series attributes.
Regular expression
Attributes whose names match this expression will be selected. The expression can be specified through the edit and preview regular expression menu. This menu gives a good idea of regular expressions and it also allows you to try different expressions and preview the results simultaneously.
Use except expression
If enabled, an exception to the first regular expression can be specified. This exception is specified by the except regular expression parameter.
Except regular expression
This option allows you to specify a regular expression. Attributes matching this expression will be filtered out even if they match the first expression (the expression that was specified in the regular expression parameter).
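The interplay of the regular expression and the except regular expression can be sketched in Python. This sketch assumes that an expression has to match the whole attribute name, and it uses Python's re syntax, which may differ from the regular expression dialect of the operator. The function name and the attribute names are made up for illustration.

```python
import re

def select_by_regex(names, expression, except_expression=None):
    """Keep names matching expression unless they also match except_expression."""
    selected = []
    for name in names:
        if re.fullmatch(expression, name) is None:
            continue      # does not match the first expression
        if except_expression is not None and re.fullmatch(except_expression, name):
            continue      # filtered out by the exception
        selected.append(name)
    return selected

names = ["temp_in", "temp_out", "pressure", "temp_raw"]
chosen = select_by_regex(names, r"temp_.*", r".*_raw")
```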
Value type
This option allows you to select a type of attribute. One of the following types can be chosen: numeric, integer, real.
Use value type exception
If enabled, an exception to the selected type can be specified. This exception is specified by the except value type parameter.
Except value type
The attributes matching this type will be removed from the final output even if they matched the previously selected type, specified by the value type parameter. One of the following types can be selected here: numeric, integer, real.
Block type
This option allows you to select a block type of attribute. One of the following types can be chosen: value_series, value_series_start, value_series_end.
Use block type exception
If enabled, an exception to the selected block type can be specified. This exception is specified by the except block type parameter.
Except block type
The attributes matching this block type will be removed from the final output even if they matched the previously selected type, specified by the block type parameter. One of the following block types can be selected here: value_series, value_series_start, value_series_end.
Numeric condition
The numeric condition used by the numeric condition filter type. A numeric attribute is selected if all examples match the specified condition for this attribute. For example, the numeric condition '> 6' will keep all numeric attributes having a value greater than 6 in every example. A combination of conditions is possible: '> 6 && < 11' or '<= 5 || < 0'. However, && and || cannot be used together in one numeric condition. Conditions like '(> 0 && < 2) || (> 10 && < 12)' are not allowed because they use both && and ||.
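How such a condition selects attributes can be sketched in Python. This is an illustrative parser, not the operator's own: the set of accepted comparison operators (>, >=, <, <=) is an assumption, and the names parse_condition and select_attributes are hypothetical.

```python
import operator
import re

# Comparison operators accepted by this sketch (an assumption).
_OPS = {">": operator.gt, ">=": operator.ge, "<": operator.lt, "<=": operator.le}
_TERM = re.compile(r"\s*(>=|<=|>|<)\s*(-?\d+(?:\.\d+)?)\s*")

def parse_condition(text):
    """Turn a condition such as '> 6 && < 11' into a predicate on one value."""
    if "&&" in text and "||" in text:
        raise ValueError("&& and || cannot be used together in one condition")
    if "||" in text:
        combine, parts = any, text.split("||")
    else:
        combine, parts = all, text.split("&&")
    terms = []
    for part in parts:
        match = _TERM.fullmatch(part)
        if match is None:
            raise ValueError(f"cannot parse term: {part!r}")
        terms.append((_OPS[match.group(1)], float(match.group(2))))
    return lambda value: combine(op(value, bound) for op, bound in terms)

def select_attributes(columns, condition):
    """Keep the attributes for which every example matches the condition."""
    matches = parse_condition(condition)
    return [name for name, values in columns.items()
            if all(matches(v) for v in values)]

columns = {"a": [7, 8, 10], "b": [7, 12], "c": [-1, 3]}
selected = select_attributes(columns, "> 6 && < 11")
```

In this example only attribute a is kept, because b contains the value 12 and c contains values that are not greater than 6.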
Invert selection
If this parameter is set to true, the selection is reversed. In that case all attributes not matching the specified condition are selected as time series attributes. Special attributes are not selected, independent of the invert selection parameter, as long as the include special attributes parameter is not set to true. If it is set to true, the condition is also applied to the special attributes and the selection is reversed if this parameter is checked.
Include special attributes
Special attributes are attributes with special roles. These are: id, label, prediction, cluster, weight and batch. Custom roles can also be assigned to attributes. By default, special attributes are not selected as time series attributes irrespective of the filter conditions. If this parameter is set to true, special attributes are also tested against the specified conditions and those attributes that match the conditions are selected.
Has indices
This parameter indicates if there is an index attribute associated with the time series. If this parameter is set to true, the index attribute has to be selected.
Indices attribute
If the parameter has indices is set to true, this parameter defines the associated index attribute. It can be either a date, date_time or numeric value type attribute. The attribute name can be selected from the drop down box of the parameter if the meta data is known.
Number of peaks
Maximum number of peaks to be detected. If the highest peak detection algorithm detects more peaks, only the largest (in terms of high-low amplitude of the peaks) are kept. Be aware that this maximum number applies either to each peak type separately or to both peak types combined (see parameter peak types).
Peak types
This parameter defines the types (maximum/minimum) of peaks to be detected by the peak detection algorithm. n is the value of the number of peaks parameter.
- only maxima: Only maximum peaks are detected. (maximal number of peaks is n)
- only minima: Only minimum peaks are detected. (maximal number of peaks is n)
- maxima and minima separately: Both maximum and minimum peaks are detected. The number of peaks is counted for each type separately (so that the maximal number of peaks is 2n)
- maxima and minima combined: Both maximum and minimum peaks are detected. The number of peaks is counted for both types combined (so that the maximal number of peaks is n)
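The effect of the four options on the number of returned peaks can be sketched in Python. The representation of a peak as a (kind, amplitude) pair and the function name select_peaks are assumptions made for illustration.

```python
def select_peaks(peaks, n, peak_types):
    """peaks: list of (kind, amplitude) pairs with kind 'max' or 'min'."""
    def ranked(items):
        return sorted(items, key=lambda p: p[1], reverse=True)

    maxima = ranked(p for p in peaks if p[0] == "max")
    minima = ranked(p for p in peaks if p[0] == "min")
    if peak_types == "only maxima":
        return maxima[:n]
    if peak_types == "only minima":
        return minima[:n]
    if peak_types == "maxima and minima separately":
        return maxima[:n] + minima[:n]          # up to 2n peaks
    if peak_types == "maxima and minima combined":
        return ranked(maxima + minima)[:n]      # up to n peaks
    raise ValueError(f"unknown peak types option: {peak_types!r}")

found = [("max", 8.0), ("max", 2.0), ("min", 5.0), ("min", 1.0), ("max", 3.0)]
separately = select_peaks(found, 2, "maxima and minima separately")
combined = select_peaks(found, 2, "maxima and minima combined")
```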
Use heuristics
If selected, the parameters sloppy values and minimum change are determined by a heuristic (n = <length of time series>):
- sloppy values is set to sqrt(n / 2.0).
- minimum change is set to the average over all selected time series (at most 0.5) of one of the following terms, depending on the peak types:
   - only maximum: (percentile(90) - mean) / (std x 0.1 x n)
   - only minimum: (mean - percentile(10)) / (std x 0.1 x n)
   - both peak types: (percentile(90) - percentile(10)) / (std x 0.1 x n)
Be aware that this is only a rough heuristic; for optimized results the parameters have to be adapted to your data.
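The heuristic can be sketched in Python with the standard library. Several details are not specified in the description and are assumptions here: the population standard deviation is used, percentiles are computed with the default method of statistics.quantiles, the limit of 0.5 is applied to the final average, and sloppy values is returned without rounding. Constant series (standard deviation 0) are not handled. The function names are hypothetical.

```python
import math
import statistics

def heuristic_sloppy_values(n):
    """sqrt(n / 2.0) for a series of length n (rounding is left to the caller)."""
    return math.sqrt(n / 2.0)

def heuristic_minimum_change(series_list, peak_types="both"):
    """Average the per-series ratio over all series and cap the result at 0.5."""
    ratios = []
    for series in series_list:
        n = len(series)
        mean = statistics.fmean(series)
        std = statistics.pstdev(series)               # assumption: population std
        deciles = statistics.quantiles(series, n=10)  # 9 cut points: 10% ... 90%
        p10, p90 = deciles[0], deciles[-1]
        if peak_types == "maximum":
            spread = p90 - mean
        elif peak_types == "minimum":
            spread = mean - p10
        else:
            spread = p90 - p10
        ratios.append(spread / (std * 0.1 * n))
    return min(statistics.fmean(ratios), 0.5)

series = [float(i % 10) for i in range(200)]
sloppy = heuristic_sloppy_values(len(series))
min_change = heuristic_minimum_change([series])
```

Because n appears in the denominator, longer series lead to a smaller minimum change, which makes the detection more tolerant of wide peaks.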
Sloppy values
Allowed number of sloppy values (values for which the relative change is smaller than the minimum change) until the end of the current peak is reached. The number of sloppy values should be increased for noisy data.
Minimum change
Threshold on the relative change between last and next value to count the next value as a peak value.
The relative change is calculated as the decrease / increase (maximum / minimum) between next value and last value divided by the distance of last value to the average:
relative change = (lastValue - nextValue) / (lastValue - average)
If the relative change is larger than the minimum change, the next value is counted as a peak value. As the minimum change is the threshold per slice, the parameter defines how sharp a peak has to be in order to be detected. If you expect wide peaks in your data, decrease the minimum change.
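A short worked example in Python shows how the threshold acts. The function names are hypothetical; the formula is the one given above and works for maxima and minima alike, because numerator and denominator change sign together.

```python
def relative_change(last_value, next_value, average):
    """(lastValue - nextValue) / (lastValue - average)."""
    return (last_value - next_value) / (last_value - average)

def counts_as_peak_value(last_value, next_value, average, minimum_change):
    return relative_change(last_value, next_value, average) > minimum_change

# Sharp maximum: a drop from 10 to 6 with an average of 4 gives 4 / 6.
sharp = relative_change(10.0, 6.0, 4.0)
# Flat maximum: a drop from 10 to 9.5 gives 0.5 / 6, about 0.083.
flat = relative_change(10.0, 9.5, 4.0)
# Minimum: a rise from -10 to -6 with an average of 0 gives 0.4.
minimum = relative_change(-10.0, -6.0, 0.0)
```

With a minimum change of 0.1 the flat step is not counted as a peak value, while with 0.05 it is, which is why a lower minimum change suits wide peaks.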
Add peaked series
If selected, the peaked series is added. It contains the actual values for the detected peaks and missing values for non-peak areas.
Ignore invalid values
If selected, invalid values (missing, positive and negative infinity) are ignored by the peak detection algorithm.