Replace Missing Values
Synopsis
This Operator replaces missing values in Examples of selected Attributes by a specified replacement.
Description
Missing values can be replaced by the minimum, maximum or average value of that Attribute. Zero can also be used to replace missing values. Alternatively, any user-specified replenishment value can be used as the replacement.
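The replacement options described above can be illustrated outside of RapidMiner with a minimal Python sketch. It assumes a column is a plain list in which None marks a missing value; the function name replace_missing and its arguments are hypothetical and are not part of the Operator.

```python
from statistics import mean

def replace_missing(column, strategy="average", replenishment_value=None):
    """Return a copy of `column` with None entries replaced.

    Assumption: None marks a missing value, and the strategy names mirror
    the replacement options described in the text.
    """
    present = [v for v in column if v is not None]
    if strategy == "none":
        return list(column)
    if strategy in ("minimum", "maximum", "average") and not present:
        # Nothing to compute a statistic from, so leave the column unchanged.
        return list(column)
    if strategy == "minimum":
        fill = min(present)
    elif strategy == "maximum":
        fill = max(present)
    elif strategy == "average":
        fill = mean(present)
    elif strategy == "zero":
        fill = 0
    elif strategy == "value":
        fill = replenishment_value
    else:
        raise ValueError(f"unknown strategy: {strategy}")
    return [fill if v is None else v for v in column]

ages = [20, None, 40, None, 30]
print(replace_missing(ages, "minimum"))  # [20, 20, 40, 20, 30]
print(replace_missing(ages, "average"))
print(replace_missing(ages, "value", replenishment_value=-1))  # [20, -1, 40, -1, 30]
```

The statistic is computed only from the values that are present, so the missing entries themselves never influence the minimum, maximum or average.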
Differentiation
Impute Missing Values
This Operator estimates values for the missing values by applying a model learned for missing values.
Replace Infinite Values
This Operator replaces infinite values by specified replacements.
Declare Missing Value
In contrast to the Replace Missing Values Operator, this Operator sets specific values of selected Attributes to missing values.
Input
example set
This input port expects an ExampleSet.
Output
example set
The ExampleSet with missing values replaced.
original
The ExampleSet that was given as input is passed through without changes.
preprocessing model
This port delivers the preprocessing model. It can be used by the Apply Model Operator to perform the specified replacement of missing values on another ExampleSet. This is helpful for example if the Replace Missing Values Operator is used during training and the same replacement has to be applied on test or actual data.
The preprocessing model can also be grouped together with other preprocessing models and learning models by the Group Models Operator.
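The idea behind the preprocessing model (learn the replacement values once, then reuse them on other data) can be sketched in Python. This is only an illustration under stated assumptions: rows are dictionaries, None marks a missing value, and the names ReplaceMissingModel and fit_average_model are hypothetical; they do not describe how RapidMiner implements the model internally.

```python
from statistics import mean

class ReplaceMissingModel:
    """Hypothetical stand-in for the preprocessing model: it stores the
    replacement value learned for each attribute."""

    def __init__(self, fill_values):
        self.fill_values = fill_values

    def apply(self, rows):
        # Replace None with the stored value; attributes without a stored
        # value are left unchanged.
        return [
            {name: self.fill_values.get(name, value) if value is None else value
             for name, value in row.items()}
            for row in rows
        ]

def fit_average_model(rows):
    """Learn the per-attribute average from the training rows."""
    fill = {}
    for name in rows[0].keys():
        present = [row[name] for row in rows if row[name] is not None]
        if present:
            fill[name] = mean(present)
    return ReplaceMissingModel(fill)

train = [{"a": 1.0, "b": 10.0}, {"a": None, "b": 30.0}, {"a": 3.0, "b": None}]
test = [{"a": None, "b": None}]

model = fit_average_model(train)
print(model.apply(test))  # [{'a': 2.0, 'b': 20.0}]
```

The test rows are filled with averages computed on the training rows only, which is the reason for applying a stored model rather than recomputing the statistics on the new data.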
Parameters
Create view
Create a View instead of changing the underlying data. If this option is checked, the replacement is delayed until the transformations are needed. This parameter can be considered a legacy option.
Attribute filter type
This parameter allows you to select the Attribute selection filter; the method you want to use for selecting Attributes. It has the following options:
- all: This option selects all the Attributes of the ExampleSet; no Attributes are removed. This is the default option.
- single: This option allows the selection of a single Attribute. The required Attribute is selected by the attribute parameter.
- subset: This option allows the selection of multiple Attributes through a list (see parameter attributes). If the meta data of the ExampleSet is known, all Attributes are present in the list and the required ones can easily be selected.
- regular_expression: This option allows you to specify a regular expression for the Attribute selection. The regular expression filter is configured by the parameters regular expression, use except expression and except regular expression.
- value_type: This option allows selection of all the Attributes of a particular type. It should be noted that types are hierarchical. For example, real and integer types both belong to the numeric type. The value type filter is configured by the parameters value type, use value type exception and except value type.
- block_type: This option allows the selection of all the Attributes of a particular block type. It should be noted that block types may be hierarchical. For example, value_series_start and value_series_end block types both belong to the value_series block type. The block type filter is configured by the parameters block type, use block type exception and except block type.
- no_missing_values: This option selects all Attributes of the ExampleSet which do not contain a missing value in any Example. Attributes that have even a single missing value are removed.
- numeric_value_filter: All numeric Attributes whose Examples all match a given numeric condition are selected. The condition is specified by the numeric condition parameter. Please note that all nominal Attributes are also selected irrespective of the given numerical condition.
Attribute
The required Attribute can be selected from this option. The Attribute name can be selected from the drop down box of the parameter if the meta data is known.
Attributes
The required Attributes can be selected from this option. This opens a new window with two lists. All Attributes are present in the left list. They can be shifted to the right list, which is the list of selected Attributes that will make it to the output port.
Regular expression
Attributes whose names match this expression will be selected. The expression can be specified through the edit and preview regular expression menu. This menu gives a good idea of regular expressions and it also allows you to try different expressions and preview the results simultaneously.
Use except expression
If enabled, an exception to the first regular expression can be specified. This exception is specified by the except regular expression parameter.
Except regular expression
This option allows you to specify a regular expression. Attributes matching this expression will be filtered out even if they match the first expression (the expression that was specified in the regular expression parameter).
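The interplay of the regular expression and the except regular expression can be sketched as follows. The function select_by_regex is hypothetical, and the sketch assumes that an expression must match the whole Attribute name; whether the Operator matches whole names or substrings is not stated in this text.

```python
import re

def select_by_regex(names, expression, except_expression=None):
    """Keep names matching `expression`, unless they also match
    `except_expression`. Assumption: full-name matching."""
    selected = []
    for name in names:
        if not re.fullmatch(expression, name):
            continue
        if except_expression is not None and re.fullmatch(except_expression, name):
            continue
        selected.append(name)
    return selected

names = ["att1", "att2", "att10", "label"]
print(select_by_regex(names, r"att\d+"))                               # ['att1', 'att2', 'att10']
print(select_by_regex(names, r"att\d+", except_expression=r"att1\d*"))  # ['att2']
```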
Value type
This option allows you to select a type of Attribute. One of the following types can be chosen: nominal, numeric, integer, real, text, binominal, polynominal, file_path, date_time, date, time.
Use value type exception
If enabled, an exception to the selected type can be specified. This exception is specified by the except value type parameter.
Except value type
The Attributes matching this type will be removed from the final output even if they matched the previously selected type, specified by the value type parameter. One of the following types can be selected here: nominal, numeric, integer, real, text, binominal, polynominal, file_path, date_time, date, time.
Block type
This option allows you to select a block type of Attribute. One of the following types can be chosen: single_value, value_series, value_series_start, value_series_end, value_matrix, value_matrix_start, value_matrix_end, value_matrix_row_start.
Use block type exception
If enabled, an exception to the selected block type can be specified. This exception is specified by the except block type parameter.
Except block type
The Attributes matching this block type will be removed from the final output even if they matched the previously selected type, specified by the block type parameter. One of the following block types can be selected here: single_value, value_series, value_series_start, value_series_end, value_matrix, value_matrix_start, value_matrix_end, value_matrix_row_start.
Numeric condition
This parameter specifies the numeric condition used by the numeric_value_filter type. A numeric Attribute is kept if all Examples match the specified condition for this Attribute. For example, the numeric condition '> 6' will keep all numeric Attributes having a value greater than 6 in every Example. A combination of conditions is possible: '> 6 && < 11' or '<= 5 || < 0'. However, && and || cannot be used together in one numeric condition. Conditions like '(> 0 && < 2) || (> 10 && < 12)' are not allowed because they use both && and ||. Nominal Attributes are always kept, regardless of the specified numeric condition.
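The rules for numeric conditions can be made concrete with a small Python sketch. It is an assumption-laden illustration, not the Operator's parser: the names parse_condition and keep_attribute are hypothetical, only the comparison operators >, >=, < and <= are supported, and missing values are not handled.

```python
import operator
import re

# Comparison operators supported by this sketch (an assumption; the text
# only shows >, < and <=).
_OPS = {">": operator.gt, ">=": operator.ge, "<": operator.lt, "<=": operator.le}
_TERM = re.compile(r"\s*(>=|<=|>|<)\s*(-?\d+(?:\.\d+)?)\s*")

def parse_condition(text):
    """Turn a condition such as '> 6 && < 11' into a predicate on one value."""
    if "&&" in text and "||" in text:
        raise ValueError("&& and || cannot be combined in one condition")
    if "||" in text:
        parts, combine = text.split("||"), any
    else:
        parts, combine = text.split("&&"), all
    tests = []
    for part in parts:
        match = _TERM.fullmatch(part)
        if match is None:
            raise ValueError(f"cannot parse term: {part!r}")
        tests.append((_OPS[match.group(1)], float(match.group(2))))
    return lambda value: combine(op(value, bound) for op, bound in tests)

def keep_attribute(values, condition):
    """A numeric attribute is kept only if every value satisfies the condition."""
    predicate = parse_condition(condition)
    return all(predicate(v) for v in values)

print(keep_attribute([7, 8, 10], "> 6 && < 11"))   # True
print(keep_attribute([7, 12], "> 6 && < 11"))      # False
print(keep_attribute([-1, 3, 5], "<= 5 || < 0"))   # True
```

A single value outside the allowed range is enough to drop the whole Attribute, because the condition must hold for every Example.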
Include special attributes
Special Attributes are Attributes with special roles. These are: id, label, prediction, cluster, weight and batch. Custom roles can also be assigned to Attributes. By default, all special Attributes are delivered to the output port irrespective of the conditions in the Select Attribute Operator. If this parameter is set to true, special Attributes are also tested against the conditions specified in the Select Attribute Operator, and only those Attributes that match the conditions are selected.
Invert selection
If this parameter is set to true, the selection is reversed. In that case all Attributes matching the specified condition are removed and the other Attributes remain in the output ExampleSet. Special Attributes are kept independent of the invert selection parameter as long as the include special attributes parameter is not set to true. If it is set to true, the condition is also applied to the special Attributes and the selection is reversed if this parameter is checked.
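How the include special attributes and invert selection parameters interact can be sketched as follows. The function apply_selection and its data layout (a dictionary mapping each Attribute name to its role, with None for regular Attributes) are hypothetical and only mirror the behaviour described in the text.

```python
def apply_selection(attributes, matches, invert=False, include_special=False):
    """Return the names of the attributes that remain selected.

    attributes: dict mapping name -> role (None for regular attributes).
    matches:    function deciding whether a name satisfies the filter condition.
    """
    kept = []
    for name, role in attributes.items():
        if role is not None and not include_special:
            # Special attributes bypass the filter (and the inversion) entirely.
            kept.append(name)
            continue
        selected = matches(name)
        if invert:
            selected = not selected
        if selected:
            kept.append(name)
    return kept

attributes = {"id": "id", "age": None, "income": None}
is_age = lambda name: name == "age"

print(apply_selection(attributes, is_age))                        # ['id', 'age']
print(apply_selection(attributes, is_age, invert=True))           # ['id', 'income']
print(apply_selection(attributes, is_age, include_special=True))  # ['age']
```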
Default
This parameter specifies how missing values are replaced by default. This default option is used for all Attributes which are not specified by the columns parameter.
- none: Missing values are not replaced by default.
- minimum: Missing values are replaced by the minimum value of that Attribute.
- maximum: Missing values are replaced by the maximum value of that Attribute.
- average: Missing values are replaced by the average value of that Attribute.
- zero: Missing values are replaced by zero.
- value: Missing values are replaced by the value specified in the replenishment value parameter.
Columns
Different Attributes can be given different types of replacement through this parameter. The default function selected by the default parameter is applied to Attributes that are not explicitly mentioned in the columns parameter.
Replenishment value
If the default parameter is set to value, this parameter specifies the value which is used to replace missing values.
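The relationship between the default, columns and replenishment value parameters can be sketched in Python. The function replace_missing_values, the FILL_RULES table and the column-oriented data layout are hypothetical, and the sketch assumes that a column set to value uses the same replenishment value as the default.

```python
from statistics import mean

# Hypothetical mapping from replacement function name to a fill-value rule.
FILL_RULES = {
    "minimum": lambda present, value: min(present),
    "maximum": lambda present, value: max(present),
    "average": lambda present, value: mean(present),
    "zero": lambda present, value: 0,
    "value": lambda present, value: value,
}

def replace_missing_values(table, default="average", columns=None,
                           replenishment_value=None):
    """table: dict mapping attribute name -> list of values (None = missing).
    columns: optional dict mapping attribute name -> replacement function;
    attributes not listed there fall back to `default`."""
    columns = columns or {}
    result = {}
    for name, values in table.items():
        function = columns.get(name, default)
        present = [v for v in values if v is not None]
        needs_data = function in ("minimum", "maximum", "average")
        if function == "none" or (needs_data and not present):
            result[name] = list(values)
            continue
        fill = FILL_RULES[function](present, replenishment_value)
        result[name] = [fill if v is None else v for v in values]
    return result

table = {"age": [20, None, 40], "income": [None, 500, 700], "score": [None, 2, 4]}
filled = replace_missing_values(
    table, default="zero", columns={"age": "maximum", "score": "none"})
print(filled["age"])     # [20, 40, 40]
print(filled["income"])  # [0, 500, 700]
print(filled["score"])   # [None, 2, 4]
```

Here age and score are handled by their per-column settings, while income is not listed and therefore falls back to the default function.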