Equalize Time Stamps
Synopsis
This operator computes an equalized time series from an input time series with date time indices.
Description
The output time series will have new equidistant index values. The configuration of the new index values is defined by the parameter 'equalize method'. Each method determines the number of examples, start date, stop date and step size of the new index values in a different way. For details see the description of the parameter 'equalize method'.
Note that the time domain (see parameter 'time domain') is an important distinction for equalizing time stamps. Calendar entries for example are not equidistant on a time duration scale (e.g. months have different length). Nevertheless for many use cases (e.g. sales time series) it is important to have monthly 'equidistant' time stamps. In other use cases (e.g. sensor data) it is important to have equidistant time stamps on a microsecond scale.
The new values of the equalized time series attributes are computed using the same functionality as the Replace Missing Values (Series) operator (note that this functionality is configured to ensure finite values). The three parameters 'replace type numerical', 'replace type nominal' and 'replace type date time' define how the new values are computed.
This operator works on all time series (numerical, nominal, date-time) which have date time indices.
Input
example set
The ExampleSet which contains the time series data as attributes.
Output
equalized example set
The ExampleSet contains the equalized time series.
original
The ExampleSet that was given as input is passed through without changes.
Parameters
Indices attribute
The attribute holding the indices values of the time series. It has to be date-time. The attribute name can be selected from the drop down box of the parameter if the meta data is known.
Equalize method
This parameter defines which equalize method is used. The configuration also depends on the parameters 'time domain', 'round start and stop date' and 'fit number of examples to range'.
- same range and number of examples as original data: The same range ('start date' and 'stop date') and the same 'number of examples' as the original data are used. The calculation of the 'step size' depends on the parameter 'time domain'. It is either the exact duration (on a millisecond scale) between <start date> and <stop date> divided by (<number of examples> - 1) ('time domain' is 'time') or the period (on a number of days scale) divided by (<number of examples> - 1) ('time domain' is 'calendar'). For the latter, the number of examples can also be adapted to fit the range again (see parameter 'fit number of examples to range').
- number of examples, start value and step size: The 'number of examples', the 'start date' and the 'step size' are provided. The 'number of examples' and the 'start date' can be retrieved from the original data or provided as custom values (see the parameters 'number of examples', 'custom number of examples', 'start date', 'custom start date'). The 'step size' has to be provided by the parameter 'step size (duration)' or 'step size (period)', depending on the parameter 'time domain'. The stop date is calculated as <start date> + (<number of examples> - 1) x <step size>.
- number of examples and range(start,stop): The 'number of examples', the 'start date' and the 'stop date' are provided. All three can be retrieved from the original data or provided as custom values (see the parameters 'number of examples', 'custom number of examples', 'start date', 'custom start date', 'stop date', 'custom stop date'). The calculation of the 'step size' depends on the parameter 'time domain'. It is either the exact duration (on a millisecond scale) between <start date> and <stop date> divided by (<number of examples> - 1) ('time domain' is 'time') or the period (on a number of days scale) divided by (<number of examples> - 1) ('time domain' is 'calendar'). For the latter, the number of examples can also be adapted to fit the range again (see parameter 'fit number of examples to range').
- range(start,stop) and step size: The 'start date', the 'stop date' and the 'step size' are provided. The 'start date' and the 'stop date' can be retrieved from the original data or provided as custom values (see the parameters 'start date', 'custom start date', 'stop date', 'custom stop date'). The 'step size' has to be provided by the parameter 'step size (duration)' or 'step size (period)', depending on the parameter 'time domain'. The 'number of examples' is calculated so that <start date> + (<number of examples> - 1) x <step size> is after the <stop date> and <start date> + (<number of examples> - 2) x <step size> is before it (thus the last index value is the first of the index values which is after the <stop date>).
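The arithmetic behind three of these methods can be sketched in Python for the 'time' domain. This is only an illustration, not the operator's implementation: the function names are hypothetical, and treating a stamp that lands exactly on the stop date as the last one is an assumption.

```python
from datetime import datetime, timedelta

def index_from_count_start_step(start, n, step):
    """'number of examples, start value and step size':
    the stop date follows as start + (n - 1) * step."""
    return [start + i * step for i in range(n)]

def index_from_count_and_range(start, stop, n):
    """'number of examples and range(start,stop)':
    step = (stop - start) / (n - 1); requires n >= 2."""
    step = (stop - start) / (n - 1)
    return [start + i * step for i in range(n)]

def index_from_range_and_step(start, stop, step):
    """'range(start,stop) and step size': keep adding stamps until one
    reaches or passes the stop date (exact hit ends the index: assumption)."""
    stamps = [start]
    while stamps[-1] < stop:
        stamps.append(start + len(stamps) * step)
    return stamps

start = datetime(2020, 1, 1, 0, 0)
stop = datetime(2020, 1, 1, 1, 0)

by_count = index_from_count_and_range(start, stop, 5)                    # 15 minute steps
by_step = index_from_range_and_step(start, stop, timedelta(minutes=25))  # overshoots the stop date
```

With a 25 minute step the last stamp is 01:15, the first index value after the 01:00 stop date, which matches the rule described for 'range(start,stop) and step size'.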
Number of examples
Specify how the number of examples is retrieved.
- same as original data: Same value as the original data.
- custom: The value is specified by the parameter 'custom number of examples'.
Custom number of examples
New number of examples for the equalized time series.
Start value
Specify how the start date is retrieved.
- same as original data: Same value as the original data.
- custom: The value is specified by the parameter 'custom start date'.
Custom start date
New start date of the index values for the equalized time series.
Stop value
Specify how the stop date is retrieved.
- same as original data: Same value as the original data.
- custom: The value is specified by the parameter 'custom stop date'.
Custom stop date
New stop date of the index values for the equalized time series.
Date format
Date format used for the 'custom start date' and 'custom stop date' parameters.
Time domain
Time domain for which the time series shall be equalized. Note that this is an important distinction for equalizing time stamps. Calendar entries for example are not equidistant on a time duration scale (e.g. months have different length). Nevertheless for many use cases (e.g. sales time series) it is important to have monthly 'equidistant' time stamps. In other use cases (e.g. sensor data) it is important to have equidistant time stamps on a microsecond scale.
- time: Time differences and step size are handled exactly. This means they are handled as durations with microsecond precision.
- calendar: Time differences and step size are handled as periods in multiples of days, weeks, months and years.
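The difference between the two time domains can be illustrated with a small Python sketch that compares a one month calendar step with a fixed 30 day duration. The helper `add_months` is hypothetical, and clamping the day to the end of shorter months is an assumption, not documented behavior of the operator.

```python
import calendar
from datetime import datetime, timedelta

def add_months(stamp, months):
    """Shift a time stamp by whole calendar months.
    The day is clamped to the last day of the target month (assumption)."""
    month_index = stamp.month - 1 + months
    year = stamp.year + month_index // 12
    month = month_index % 12 + 1
    day = min(stamp.day, calendar.monthrange(year, month)[1])
    return stamp.replace(year=year, month=month, day=day)

start = datetime(2021, 1, 1)

# 'calendar' domain: a step of one month, always landing on the first of a month
calendar_index = [add_months(start, i) for i in range(4)]

# 'time' domain: a step of exactly 30 days, drifting against the calendar
duration_index = [start + i * timedelta(days=30) for i in range(4)]

# Length in days of each calendar step: not equidistant as a duration
calendar_gaps = [(b - a).days for a, b in zip(calendar_index, calendar_index[1:])]
```

The calendar index hits the first of each month although the steps are 31, 28 and 31 days long, while the 30 day duration index drifts to dates such as 2 March.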
Round start and stop date
If selected, the start and stop date values (either retrieved from the original data or specified by the corresponding parameters) are rounded to the previous/next exact day, truncating the hour, minute and second parts of the time stamps. This is done before the non-provided configuration parameters of the 'equalize method' are computed.
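A minimal Python sketch of this rounding, assuming that the start date is rounded down to the previous midnight and the stop date up to the next midnight (the text only says "previous/next"). The function names are hypothetical.

```python
from datetime import datetime, timedelta

def round_start_down(stamp):
    """Truncate the time of day, giving the previous exact day."""
    return stamp.replace(hour=0, minute=0, second=0, microsecond=0)

def round_stop_up(stamp):
    """Move to the next exact day unless the stamp already is at midnight."""
    floored = round_start_down(stamp)
    return floored if floored == stamp else floored + timedelta(days=1)

rounded_start = round_start_down(datetime(2020, 3, 5, 14, 30))
rounded_stop = round_stop_up(datetime(2020, 3, 9, 8, 15))
```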
Fit number of examples to range
This parameter is only enabled if 'time domain' is 'calendar' and for the equalize methods 'same range and number of examples as original data' and 'number of examples and range(start,stop)'.
If selected, the number of examples is fitted to the provided range after the range is determined. This is needed because the step size is a multiple of one day, so the actual stop date can be well after the provided one.
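The effect can be sketched in Python under two explicit assumptions that the text does not confirm: the step is the range in days divided by (number of examples - 1), rounded up to whole days, and fitting shrinks the number of examples until the last stamp is the first one that reaches the stop date. The function name and its arguments are hypothetical.

```python
import math
from datetime import date, timedelta

def calendar_day_index(start, stop, n, fit_to_range=False):
    """Equidistant index in whole-day steps (rounding up is an assumption)."""
    total_days = (stop - start).days
    step_days = max(1, math.ceil(total_days / (n - 1)))
    if fit_to_range:
        # Recompute the number of examples so the index ends at the range
        n = math.ceil(total_days / step_days) + 1
    return [start + timedelta(days=i * step_days) for i in range(n)]

start, stop = date(2022, 1, 1), date(2022, 1, 11)   # a range of 10 days

unfitted = calendar_day_index(start, stop, 8)                    # step of 2 days, 8 stamps
fitted = calendar_day_index(start, stop, 8, fit_to_range=True)   # step of 2 days, 6 stamps
```

In this example the unfitted index ends on 15 January, four days after the provided stop date, while the fitted index ends on 11 January with six examples instead of eight.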
Step size (time duration)
Step size (as a duration with microsecond precision) between the new index values of the equalized time series. Used if the parameter 'time domain' is 'time'.
Step size (time period)
Step size (as a period in multiples of days, weeks, months or years) between the new index values of the equalized time series. Used if the parameter 'time domain' is 'calendar'.
Replace type numerical
The kind of replacement which is used to compute the new numerical values of the equalized time series.
- previous value: The previous value in the series is used as a replacement. Neighboring missing values are all replaced by the first previous valid value. Missing values at the start of a series are replaced by the next valid value.
- next value: The next value in the series is used as a replacement. Neighboring missing values are all replaced by the next valid value. Missing values at the end of a series are replaced by the first previous valid value.
- average: The average of the neighboring values in the series is used as a replacement. Neighboring missing values are all replaced by the average of the neighboring valid values. Missing values at the start and end of a series are replaced by the next, respectively previous valid value.
- linear interpolation: A linear interpolation (using the old and new index values) between the two neighboring values in the series is used to calculate the replacement value. The next valid neighboring values are used to perform a linear interpolation and all missing values are replaced by the replacement values calculated by the linear interpolation. Missing values at the start and end of a series are replaced by the next, respectively previous valid value.
- value: All missing values are replaced by a constant value, specified by the 'replace value numerical' parameter.
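The 'linear interpolation' option, which uses both the old and the new index values, can be sketched in Python as follows. This is an illustration only: the function name is hypothetical, the old index is assumed to be sorted and free of missing values, and new stamps outside the old range take the nearest valid value as described above.

```python
from bisect import bisect_left
from datetime import datetime

def interpolate_linear(old_index, old_values, new_index):
    """Compute values at new_index by linear interpolation over old_index."""
    result = []
    for stamp in new_index:
        pos = bisect_left(old_index, stamp)
        if pos == 0:
            # At or before the first old stamp: use the next valid value
            result.append(old_values[0])
        elif pos == len(old_index):
            # After the last old stamp: use the previous valid value
            result.append(old_values[-1])
        elif old_index[pos] == stamp:
            # New stamp coincides with an old one: keep its value
            result.append(old_values[pos])
        else:
            left, right = old_index[pos - 1], old_index[pos]
            weight = (stamp - left) / (right - left)  # timedelta ratio, a float
            result.append(old_values[pos - 1]
                          + weight * (old_values[pos] - old_values[pos - 1]))
    return result

old_index = [datetime(2020, 1, 1, 0, 0), datetime(2020, 1, 1, 0, 10), datetime(2020, 1, 1, 0, 30)]
old_values = [0.0, 10.0, 30.0]
new_index = [datetime(2020, 1, 1, 0, 0), datetime(2020, 1, 1, 0, 15), datetime(2020, 1, 1, 0, 30)]

equalized = interpolate_linear(old_index, old_values, new_index)
```

Here the irregular stamps at 00:00, 00:10 and 00:30 are mapped onto an equidistant 15 minute index, and the value at 00:15 is interpolated between the observations at 00:10 and 00:30.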
Replace type nominal
The kind of replacement which is used to compute the new nominal values of the equalized time series.
- previous value: The previous value in the series is used as a replacement. Neighboring missing values are all replaced by the first previous valid value. Missing values at the start of a series are replaced by the next valid value.
- next value: The next value in the series is used as a replacement. Neighboring missing values are all replaced by the next valid value. Missing values at the end of a series are replaced by the first previous valid value.
- value: All missing values are replaced by a constant value, specified by the 'replace value nominal' parameter.
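For nominal series, the 'previous value' option can be sketched in Python like this. The function name is hypothetical and the old index is assumed to be sorted. A new stamp that lies before the first old stamp takes the next valid value, as described above.

```python
from bisect import bisect_right
from datetime import datetime

def fill_previous(old_index, old_values, new_index):
    """For each new stamp, take the value of the latest old stamp at or before it."""
    result = []
    for stamp in new_index:
        pos = bisect_right(old_index, stamp)
        # pos == 0 means no previous value exists: fall back to the next valid one
        result.append(old_values[pos - 1] if pos > 0 else old_values[0])
    return result

old_index = [datetime(2020, 1, 1, 8, 0), datetime(2020, 1, 1, 9, 40), datetime(2020, 1, 1, 12, 0)]
old_states = ["idle", "running", "stopped"]
new_index = [datetime(2020, 1, 1, h, 0) for h in (7, 8, 9, 10, 11, 12)]

equalized_states = fill_previous(old_index, old_states, new_index)
```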
Replace type date time
The kind of replacement which is used to compute the new date time values of the equalized time series.
- previous value: The previous value in the series is used as a replacement. Neighboring missing values are all replaced by the first previous valid value. Missing values at the start of a series are replaced by the next valid value.
- next value: The next value in the series is used as a replacement. Neighboring missing values are all replaced by the next valid value. Missing values at the end of a series are replaced by the first previous valid value.
- average: The average of the neighboring values in the series is used as a replacement. Neighboring missing values are all replaced by the average of the neighboring valid values. Missing values at the start and end of a series are replaced by the next, respectively previous valid value.
- linear interpolation: A linear interpolation (using the old and new index values) between the two neighboring values in the series is used to calculate the replacement value. The next valid neighboring values are used to perform a linear interpolation and all missing values are replaced by the replacement values calculated by the linear interpolation. Missing values at the start and end of a series are replaced by the next, respectively previous valid value.
- value: All missing values are replaced by a constant value, specified by the 'replace value date time' parameter.
Replace value numerical
If 'replace type numerical' is set to 'value', this parameter specifies the replacement value for all missing values of numerical time series.
Replace value nominal
If 'replace type nominal' is set to 'value', this parameter specifies the replacement value for all missing values of nominal time series.
Replace value date time
If 'replace type date time' is set to 'value', this parameter specifies the replacement value for all missing values of time series with date time values.