
Optimize by Generation (YAGGA2)

Synopsis

This operator may select some attributes from the original attribute set and it may also generate new attributes from the original attribute set. YAGGA2 (Yet Another Generating Genetic Algorithm 2) does not change the original number of attributes unless adding or removing (or both) attributes proves to have a better fitness. This algorithm is an improved version of YAGGA.

Description

Sometimes the selection of features alone is not sufficient. In these cases other transformations of the feature space must be performed. Generating new attributes from the given attributes extends the feature space, and a hypothesis may be easier to find in the extended feature space. This operator can be considered a blend of attribute selection and attribute generation procedures. It may select some attributes from the original set of attributes and it may also generate new attributes from the original attributes. The (generating) mutation can do one of the following things with different probabilities:

  • Probability p/4: Add a newly generated attribute to the feature vector.
  • Probability p/4: Add a randomly chosen original attribute to the feature vector.
  • Probability p/2: Remove a randomly chosen attribute from the feature vector.
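The three cases above can be sketched in Python. This is only an illustration under stated assumptions, not the operator's implementation: an individual is represented as a list of attribute names, one random draw is made per individual, and `generating_mutation` and `product_of_two` are hypothetical helpers. With probability 1 - p the individual is left unchanged.

```python
import random

def generating_mutation(individual, original_attributes, generate_attribute, p, rng):
    """Apply one generating-mutation step to a list of attribute names.

    p/4: add a newly generated attribute
    p/4: add a randomly chosen original attribute
    p/2: remove a randomly chosen attribute
    1-p: leave the individual unchanged
    """
    mutated = list(individual)  # work on a copy, the input is not modified
    r = rng.random()
    if r < p / 4:
        mutated.append(generate_attribute(original_attributes, rng))
    elif r < p / 2:
        mutated.append(rng.choice(original_attributes))
    elif r < p and mutated:
        mutated.remove(rng.choice(mutated))
    return mutated

def product_of_two(attributes, rng):
    # Hypothetical generator: names the product of two random attributes.
    a, b = rng.choice(attributes), rng.choice(attributes)
    return f"({a} * {b})"

rng = random.Random(42)
originals = ["a1", "a2", "a3"]
result = generating_mutation(originals, originals, product_of_two, p=0.5, rng=rng)
```

Because removal is as likely as both kinds of addition combined, the expected number of attributes stays roughly constant unless fitness favors growing or shrinking the feature vector.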

In addition to the usual YAGGA operator, this operator allows more feature generators and provides several techniques for redundancy prevention. This leads to smaller ExampleSets containing less redundant features.

A genetic algorithm (GA) is a search heuristic that mimics the process of natural evolution. This heuristic is routinely used to generate useful solutions to optimization and search problems. Genetic algorithms belong to the larger class of evolutionary algorithms (EA), which generate solutions to optimization problems using techniques inspired by natural evolution, such as inheritance, mutation, selection, and crossover. For the basic algorithm of a genetic algorithm, please study the description of the Optimize Selection (Evolutionary) operator.

This operator is a nested operator, i.e. it has a subprocess. The subprocess must return a performance vector. You need a basic understanding of subprocesses in order to apply this operator. Please study the documentation of the Subprocess operator for a basic understanding of subprocesses.

Differentiation

Optimize by Generation (YAGGA)

The YAGGA2 operator is an improved version of the usual YAGGA operator. It allows more feature generators and provides several techniques for redundancy prevention. This leads to smaller ExampleSets containing less redundant features.

Input

example set in

This input port expects an ExampleSet. This ExampleSet is available at the first port of the nested chain (inside the subprocess) for processing in the subprocess.

Output

example set out

The genetic algorithm is applied on the input ExampleSet. The resultant ExampleSet is delivered through this port.

attribute weights out

The attribute weights are delivered through this port.

performance out

This port delivers the Performance Vector for the selected attributes. A Performance Vector is a list of performance criteria values.

Parameters

Limit max total number of attributes

This parameter indicates if the total number of attributes in all generations should be limited. If set to true, the maximum number is specified by the max total number of attributes parameter.

Max total number of attributes

This parameter is only available when the limit max total number of attributes parameter is set to true. This parameter specifies the maximum total number of attributes in all generations.

Use local random seed

This parameter indicates if a local random seed should be used for randomization. Using the same value of local random seed will produce the same randomization.

Local random seed

This parameter specifies the local random seed. This parameter is only available if the use local random seed parameter is set to true.
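The effect of a fixed seed can be illustrated with a small Python sketch. It assumes an individual is a list of on/off flags, one per attribute, drawn with the probability given by the p initialize parameter; `initial_population` is a hypothetical helper, not part of the operator.

```python
import random

def initial_population(n_individuals, n_attributes, p_initialize, seed=None):
    """Draw a population of on/off attribute masks.

    A fixed seed gives the same population on every call; seed=None
    lets Python pick a different seed each time.
    """
    rng = random.Random(seed)
    return [
        [rng.random() < p_initialize for _ in range(n_attributes)]
        for _ in range(n_individuals)
    ]

first = initial_population(4, 6, p_initialize=0.5, seed=1992)
second = initial_population(4, 6, p_initialize=0.5, seed=1992)
same_run = first == second  # True: the same seed reproduces the same draws
```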

Show stop dialog

This parameter determines if a dialog with a stop button should be displayed, which stops the search for the best feature space. If the search is stopped, the best individual found so far is returned.

Maximal fitness

This parameter specifies the maximal fitness. The optimization will stop if the fitness reaches this value.

Population size

This parameter specifies the population size i.e. the number of individuals per generation.

Maximum number of generations

This parameter specifies the number of generations after which the algorithm should be terminated.

Use plus

This parameter indicates if the summation function should be applied for a generation of new attributes.

Use diff

This parameter indicates if the difference function should be applied for a generation of new attributes.

Use mult

This parameter indicates if the multiplication function should be applied for a generation of new attributes.

Use div

This parameter indicates if the division function should be applied for a generation of new attributes.
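As an illustration of what the four arithmetic generators produce, the following Python sketch builds new columns from pairs of existing ones. It is a simplification under stated assumptions: columns are plain lists of floats, every pair is enumerated (the operator itself generates attributes randomly during mutation), division by zero is not handled, and `BINARY_GENERATORS` and `generate_attributes` are hypothetical names.

```python
import itertools
import operator

# Maps a parameter name from this page to a display symbol and a function.
BINARY_GENERATORS = {
    "use plus": ("+", operator.add),
    "use diff": ("-", operator.sub),
    "use mult": ("*", operator.mul),
    "use div": ("/", operator.truediv),
}

def generate_attributes(columns, enabled):
    """Return new columns built from every pair of existing columns."""
    generated = {}
    for (name_a, a), (name_b, b) in itertools.combinations(columns.items(), 2):
        for flag in enabled:
            symbol, fn = BINARY_GENERATORS[flag]
            new_name = f"({name_a} {symbol} {name_b})"
            generated[new_name] = [fn(x, y) for x, y in zip(a, b)]
    return generated

columns = {"a1": [1.0, 2.0], "a2": [4.0, 5.0]}
new_columns = generate_attributes(columns, ["use plus", "use mult"])
# new_columns == {"(a1 + a2)": [5.0, 7.0], "(a1 * a2)": [4.0, 10.0]}
```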

Reciprocal value

This parameter indicates if the reciprocal function should be applied for a generation of new attributes.

Use early stopping

This parameter enables early stopping. If it is not set to true, the maximum number of generations is always performed.

Generations without improval

This parameter is only available when the use early stopping parameter is set to true. This parameter specifies the stop criterion for early stopping, i.e. the optimization stops after n generations without improvement in the performance, where n is the value of this parameter.
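How the three stop criteria on this page (maximum number of generations, maximal fitness, generations without improval) might interact can be sketched as follows. This is an assumption-laden illustration: the best fitness of each generation is supplied as a precomputed list, and `run_generations` is a hypothetical function rather than the operator's code.

```python
def run_generations(best_fitness_per_generation, max_generations,
                    generations_without_improval=None,
                    maximal_fitness=float("inf")):
    """Return (best fitness, number of generations actually run).

    generations_without_improval=None means early stopping is disabled.
    """
    best = float("-inf")
    stale = 0  # consecutive generations without improvement
    generations_run = 0
    for fitness in best_fitness_per_generation[:max_generations]:
        generations_run += 1
        if fitness > best:
            best = fitness
            stale = 0
        else:
            stale += 1
        if best >= maximal_fitness:
            break  # the target fitness has been reached
        if generations_without_improval is not None and stale >= generations_without_improval:
            break  # early stopping
    return best, generations_run

history = [0.5, 0.6, 0.6, 0.6, 0.9]
stopped_early = run_generations(history, 10, generations_without_improval=2)
# stopped_early == (0.6, 4): generations 3 and 4 brought no improvement
```

The example also shows the trade-off: with early stopping disabled, the run would have reached the later value 0.9.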

Tournament size

This parameter specifies the fraction of the current population which should be used as tournament members.
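A minimal tournament selection sketch, assuming the fraction is converted to a member count by rounding (at least one member) and that the fittest member wins. `tournament_select` is a hypothetical helper, and the rounding rule is an assumption not stated on this page.

```python
import random

def tournament_select(population, fitness, tournament_size, rng):
    """Pick the fittest individual among a random fraction of the population."""
    k = round(tournament_size * len(population))
    k = min(len(population), max(1, k))  # keep the member count in a valid range
    contenders = rng.sample(population, k)
    return max(contenders, key=fitness)

population = [[1, 0, 0], [1, 1, 0], [1, 1, 1], [0, 0, 0]]
# Toy fitness for illustration only: the number of selected attributes.
winner = tournament_select(population, sum, tournament_size=1.0, rng=random.Random(3))
# With a fraction of 1.0 every individual competes, so the global best wins.
```

Smaller fractions make the selection less greedy, since weaker individuals win whenever the best ones are not drawn into the tournament.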

Start temperature

This parameter specifies the scaling temperature.

Dynamic selection pressure

If this parameter is set to true, the selection pressure is increased to maximum during the complete optimization run.

Keep best individual

If set to true, the best individual of each generation is guaranteed to be selected for the next generation.

P initialize

The initial probability for an attribute to be switched on is specified by this parameter.

P crossover

The probability for an individual to be selected for crossover is specified by this parameter.

Crossover type

The type of the crossover can be selected by this parameter.

Use heuristic mutation probability

If this parameter is set to true, the probability for mutations will be chosen as 1/n where n is the number of attributes. Otherwise the probability for mutations should be specified through the p mutation parameter.

P mutation

The probability for an attribute to be changed is specified by this parameter. If set to -1, the probability will be set to 1/n where n is the total number of attributes.
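The relationship between the use heuristic mutation probability and p mutation parameters can be summarized in a short Python sketch. `mutation_probability` is a hypothetical function that only restates the two rules described on this page.

```python
def mutation_probability(n_attributes, p_mutation=-1.0, use_heuristic=False):
    """Resolve the effective mutation probability.

    The heuristic, or a p_mutation of -1, gives 1/n, where n is the
    number of attributes. Otherwise p_mutation is used as given.
    """
    if use_heuristic or p_mutation == -1:
        return 1.0 / n_attributes
    return p_mutation

heuristic = mutation_probability(10, use_heuristic=True)  # 1/10
explicit = mutation_probability(10, p_mutation=0.3)       # 0.3
```

With 1/n, one attribute per individual is changed on average, regardless of how many attributes the individual has.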

Use square roots

This parameter indicates if the square root function should be applied for a generation of new attributes.

Use power functions

This parameter indicates if the power (of one attribute to another attribute) function should be applied for a generation of new attributes.

Use sin

This parameter indicates if the sine function should be applied for a generation of new attributes.

Use cos

This parameter indicates if the cosine function should be applied for a generation of new attributes.

Use tan

This parameter indicates if the tangent function should be applied for a generation of new attributes.

Use atan

This parameter indicates if the arc tangent function should be applied for a generation of new attributes.

Use exp

This parameter indicates if the exponential function should be applied for a generation of new attributes.

Use log

This parameter indicates if the logarithmic function should be applied for a generation of new attributes.

Use absolute values

This parameter indicates if the absolute function should be applied for a generation of new attributes.

Use min

This parameter indicates if the minimum function should be applied for a generation of new attributes.

Use max

This parameter indicates if the maximum function should be applied for a generation of new attributes.

Use sgn

This parameter indicates if the signum function should be applied for a generation of new attributes.

Use floor ceil functions

This parameter indicates if the floor and ceiling functions should be applied for a generation of new attributes.

Restrictive selection

This parameter indicates if the restrictive generator selection should be used. Execution is usually faster if this parameter is set to true.

Remove useless

This parameter indicates if useless attributes should be removed.

Remove equivalent

This parameter indicates if equivalent attributes should be removed.

Equivalence samples

n samples are checked to prove equivalence, where n is the value of this parameter.

Equivalence epsilon

Two attributes are considered equivalent if their difference is not bigger than epsilon.
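A sampling-based equivalence check along these lines could look like the Python sketch below. The sampling range of -10 to 10, the representation of an attribute as a function of an input vector, and the name `are_equivalent` are all assumptions made for illustration. How the operator actually draws its samples is not described on this page.

```python
import random

def are_equivalent(f, g, n_inputs, equivalence_samples, equivalence_epsilon, rng):
    """Compare two generated attributes on random sample points.

    f and g map a list of input values to the attribute value. They are
    treated as equivalent if no sampled difference exceeds epsilon.
    """
    for _ in range(equivalence_samples):
        x = [rng.uniform(-10.0, 10.0) for _ in range(n_inputs)]
        if abs(f(x) - g(x)) > equivalence_epsilon:
            return False
    return True

rng = random.Random(7)
# a1 * a2 and a2 * a1 are the same attribute written in two ways.
redundant = are_equivalent(lambda x: x[0] * x[1], lambda x: x[1] * x[0], 2, 5, 1e-9, rng)
# a1 + a2 and a1 * a2 differ on almost every sample point.
distinct = are_equivalent(lambda x: x[0] + x[1], lambda x: x[0] * x[1], 2, 5, 1e-9, rng)
```

More samples lower the chance of declaring two different attributes equivalent, at the cost of more evaluations per check.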

Equivalence use statistics

If this parameter is set to true, attribute statistics are recalculated before equivalence check.

Unused functions

This parameter specifies the space-separated list of functions which are not allowed in arguments for the attribute construction.

Constant generation prob

This parameter specifies the probability for a generation of random constant attributes.

Associative attribute merging

This parameter specifies if post processing should be performed after the crossover. It is only possible for runs with only one generator.
