Optimize Weights (Evolutionary)

Synopsis

This operator calculates the relevance of the attributes of the given ExampleSet by using an evolutionary approach. The weights of the attributes are calculated using a Genetic Algorithm.

Description

The Optimize Weights (Evolutionary) operator is a nested operator, i.e. it has a subprocess. This subprocess must always return a performance vector. For more information regarding subprocesses, please study the Subprocess operator. The Optimize Weights (Evolutionary) operator calculates the weights of the attributes of the given ExampleSet by using a Genetic Algorithm. The higher the weight of an attribute, the more relevant it is considered.

A genetic algorithm (GA) is a search heuristic that mimics the process of natural evolution. This heuristic is routinely used to generate useful solutions to optimization and search problems. Genetic algorithms belong to the larger class of evolutionary algorithms (EA), which generate solutions to optimization problems using techniques inspired by natural evolution, such as inheritance, mutation, selection, and crossover.

In a genetic algorithm, 'mutation' means switching features on and off, and 'crossover' means interchanging used features. Selection is performed according to the scheme chosen by the selection scheme parameter. A genetic algorithm works as follows:

  1. Generate an initial population consisting of p individuals. The number p can be adjusted by the population size parameter.
  2. For all individuals in the population:
     a. Perform mutation, i.e. set used attributes to unused with probability p_m and vice versa. The probability p_m can be adjusted by the corresponding parameters.
     b. Choose two individuals from the population and perform crossover with probability p_c. The probability p_c can be adjusted by the p crossover parameter. The type of crossover can be selected by the crossover type parameter.
  3. Perform selection: map all individuals according to their fitness and draw p individuals at random according to their probability, where p is the population size set by the population size parameter.
  4. As long as the fitness improves, go to step 2.
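
The loop described above can be illustrated with a minimal Python sketch. It is not the operator's implementation: the function name `evolve`, the one-point crossover, the roulette-wheel selection over parents plus children, and the toy fitness function are all assumptions made for illustration. In the operator itself, the fitness comes from the performance vector returned by the subprocess.

```python
import random

def evolve(fitness, n_attrs, pop_size=6, p_m=0.1, p_c=0.5, seed=0):
    """Toy genetic algorithm over on/off attribute masks (illustrative only)."""
    rng = random.Random(seed)
    # Initial population: each attribute is switched on or off at random.
    pop = [[rng.random() < 0.5 for _ in range(n_attrs)] for _ in range(pop_size)]
    best = max(pop, key=fitness)
    improved = True
    while improved:
        # Mutation: flip each attribute with probability p_m.
        children = [[(not g) if rng.random() < p_m else g for g in ind]
                    for ind in pop]
        # Crossover (assumed one-point): swap tails of two random individuals.
        if pop_size > 1 and n_attrs > 1 and rng.random() < p_c:
            i, j = rng.sample(range(pop_size), 2)
            cut = rng.randrange(1, n_attrs)
            a, b = children[i], children[j]
            children[i], children[j] = a[:cut] + b[cut:], b[:cut] + a[cut:]
        # Selection (assumed roulette wheel): draw pop_size individuals with
        # probability proportional to their shifted fitness.
        candidates = pop + children
        scores = [fitness(c) for c in candidates]
        lowest = min(scores)
        weights = [s - lowest + 1e-9 for s in scores]
        pop = rng.choices(candidates, weights=weights, k=pop_size)
        # Continue only as long as the best fitness improves.
        challenger = max(candidates, key=fitness)
        improved = fitness(challenger) > fitness(best)
        if improved:
            best = challenger
    return best

# Toy fitness: the number of attributes switched on.
best_mask = evolve(lambda mask: sum(mask), n_attrs=8, seed=1)
```

With a fixed seed the sketch is deterministic, which mirrors the purpose of the local random seed parameter.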

If the ExampleSet contains value series attributes with block numbers, the whole block will be switched on and off. Exact, minimum or maximum number of attributes in combinations to be tested can be specified by the appropriate parameters. Many other options are also available for this operator. Please study the parameters section for more information.

Input

example set in

This input port expects an ExampleSet. This ExampleSet is available at the first port of the nested chain (inside the subprocess) for processing in the subprocess.

attribute weights in

This port expects attribute weights. It is not compulsory to use this port.

through

This operator can have multiple through ports. When one input is connected to the through port, another through port becomes available which is ready to accept another input (if any). The order of inputs remains the same. The Object supplied at the first through port of this operator is available at the first through port of the nested chain (inside the subprocess). Do not forget to connect all inputs in the correct order. Make sure that you have connected the right number of ports at the subprocess level.

Output

example set out

The genetic algorithm is applied to the input ExampleSet. The resultant ExampleSet with reduced attributes is delivered through this port.

weights

The attribute weights are delivered through this port.

performance

This port delivers the Performance Vector for the selected attributes. A Performance Vector is a list of performance criteria values.

Parameters

Population size

This parameter specifies the population size, i.e. the number of individuals per generation.

Maximum number of generations

This parameter specifies the number of generations after which the algorithm should be terminated.

Use early stopping

This parameter enables early stopping. If it is not set to true, the maximum number of generations is always performed.

Generations without improval

This parameter is only available when the use early stopping parameter is set to true. It specifies the stop criterion for early stopping: the optimization stops after n generations without improvement in the performance, where n is the value of this parameter.
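
As a rough illustration of this stop criterion, the following Python sketch counts how many generations would run for a given sequence of best performance values. The function name `generations_run` and its arguments are hypothetical, and the sketch assumes that "improvement" means a strictly higher performance than the best seen so far.

```python
def generations_run(best_per_generation, max_generations, patience):
    """Return the number of generations performed before stopping.

    best_per_generation: best performance observed in each generation.
    patience: hypothetical stand-in for 'generations without improval'.
    """
    best = float("-inf")
    stale = 0  # consecutive generations without improvement
    for gen, perf in enumerate(best_per_generation[:max_generations], start=1):
        if perf > best:
            best, stale = perf, 0
        else:
            stale += 1
        if stale >= patience:
            return gen  # early stop
    return min(len(best_per_generation), max_generations)

history = [0.5, 0.6, 0.6, 0.6, 0.7]
stopped_at = generations_run(history, max_generations=10, patience=2)
```

In this example the run stops after the fourth generation, because generations three and four bring no improvement over the second.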

Normalize weights

This parameter indicates if the final weights should be normalized. If set to true, the final weights are normalized such that the maximum weight is 1 and the minimum weight is 0.
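
A normalization with maximum 1 and minimum 0 corresponds to min-max scaling. The sketch below shows the idea in Python; the function name `normalize_weights` and the attribute names are hypothetical, and the handling of the case where all weights are equal is an assumption, since the description above does not cover it.

```python
def normalize_weights(weights):
    """Min-max scale a mapping of attribute name to weight into [0, 1]."""
    lo, hi = min(weights.values()), max(weights.values())
    if hi == lo:
        # Assumption: identical weights all map to 1.0.
        return {name: 1.0 for name in weights}
    return {name: (w - lo) / (hi - lo) for name, w in weights.items()}

normalized = normalize_weights({"a": 2.0, "b": 4.0, "c": 3.0})
```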

Use local random seed

This parameter indicates if a local random seed should be used for randomization. Using the same value of local random seed will produce the same randomization.

Local random seed

This parameter specifies the local random seed. It is only available if the use local random seed parameter is set to true.

Show stop dialog

This parameter determines if a dialog with a stop button should be displayed which stops the search for the best feature space. If the search is stopped, the best individual found so far will be returned.

User result individual selection

If this parameter is set to true, it allows the user to select the final result individual from the last population.

Show population plotter

This parameter determines if the current population should be displayed in the performance space.

Population criteria data file

This parameter specifies the path to the file in which the criteria data of the final population should be saved.

Maximal fitness

This parameter specifies the maximal fitness. The optimization will stop if the fitness reaches this value.

Selection scheme

This parameter specifies the selection scheme of this evolutionary algorithm.

Tournament size

This parameter is only available when the selection scheme parameter is set to 'tournament'. It specifies the fraction of the current population which should be used as tournament members.
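
To make the role of the fraction concrete, here is a minimal Python sketch of a single tournament. It follows the textbook form of tournament selection and is not taken from the operator's source: the function name `tournament_select`, sampling without replacement, and the rounding of the tournament member count are assumptions.

```python
import random

def tournament_select(population, fitness, tournament_size=0.25, rng=None):
    """Pick one individual: the fittest among a random subset of the population.

    tournament_size is a fraction of the population, as in the parameter above.
    """
    rng = rng or random.Random()
    # Assumption: at least one member, fraction rounded to the nearest integer.
    k = max(1, round(tournament_size * len(population)))
    contestants = rng.sample(population, k)
    return max(contestants, key=fitness)

population = [3, 1, 4, 1, 5]
winner = tournament_select(population, fitness=lambda x: x,
                           tournament_size=0.4, rng=random.Random(0))
```

A larger fraction makes it more likely that the overall best individual takes part in each tournament, so selection pressure rises with the tournament size.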

Start temperature

This parameter is only available when the selection scheme parameter is set to 'Boltzmann'. It specifies the scaling temperature.
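
The effect of a scaling temperature can be sketched with the textbook form of Boltzmann selection, in which an individual with fitness f is chosen with probability proportional to exp(f / T). Whether the operator uses exactly this formula is an assumption; the function name `boltzmann_probabilities` is hypothetical.

```python
import math

def boltzmann_probabilities(fitnesses, temperature):
    """Selection probabilities proportional to exp(fitness / temperature)."""
    top = max(fitnesses)
    # Subtracting the maximum keeps exp() numerically stable
    # and does not change the resulting probabilities.
    scaled = [math.exp((f - top) / temperature) for f in fitnesses]
    total = sum(scaled)
    return [s / total for s in scaled]

hot = boltzmann_probabilities([1.0, 2.0, 3.0], temperature=10.0)
cold = boltzmann_probabilities([1.0, 2.0, 3.0], temperature=1.0)
```

Under this form, a high temperature gives nearly uniform probabilities, while a low temperature concentrates the probability on the fittest individuals.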

Dynamic selection pressure

This parameter is only available when the selection scheme parameter is set to 'Boltzmann' or 'tournament'. If set to true, the selection pressure is increased to maximum during the complete optimization run.

Keep best individual

If set to true, the best individual of each generation is guaranteed to be selected for the next generation.

Save intermediate weights

This parameter determines if the intermediate best results should be saved.

Intermediate weights generations

This parameter is only available when the save intermediate weights parameter is set to true. The intermediate best results are saved every k generations, where k is specified by this parameter.

Intermediate weights file

This parameter specifies the file into which the intermediate weights should be saved.

Mutation variance

This parameter specifies the (initial) variance for each mutation.

1 5 rule

This parameter determines if the 1/5 rule for variance adaptation should be used.
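
The 1/5 rule is commonly stated as follows: if more than one fifth of recent mutations improved the fitness, increase the mutation variance; if fewer did, decrease it. The Python sketch below follows that common formulation. The function name `adapt_variance`, the adaptation factor of 0.85, and the way successes are counted are assumptions and may differ from what the operator does.

```python
def adapt_variance(variance, successes, trials, factor=0.85):
    """Adjust the mutation variance according to the 1/5 success rule.

    successes: number of mutations that improved the fitness.
    trials: number of mutations observed.
    factor: assumed adaptation factor between 0 and 1.
    """
    rate = successes / trials
    if rate > 0.2:
        return variance / factor  # many successes: search more widely
    if rate < 0.2:
        return variance * factor  # few successes: search more locally
    return variance

widened = adapt_variance(1.0, successes=5, trials=10)
narrowed = adapt_variance(1.0, successes=1, trials=10)
```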

Bounded mutation

If this parameter is set to true, the weights are bounded between 0 and 1.
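
One way to picture the interplay of mutation variance and bounded mutation is the Python sketch below. It assumes that a mutation adds Gaussian noise with the given variance to each weight and that bounding is done by clipping; neither detail is stated above, and the function name `mutate_weights` is hypothetical.

```python
import random

def mutate_weights(weights, variance, bounded=True, rng=None):
    """Add zero-mean Gaussian noise to each weight, optionally clipping to [0, 1]."""
    rng = rng or random.Random()
    sigma = variance ** 0.5  # standard deviation from variance
    mutated = [w + rng.gauss(0.0, sigma) for w in weights]
    if bounded:
        # Assumption: out-of-range weights are clipped to the nearest bound.
        mutated = [min(1.0, max(0.0, w)) for w in mutated]
    return mutated

new_weights = mutate_weights([0.0, 0.5, 1.0], variance=4.0,
                             bounded=True, rng=random.Random(42))
```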

P crossover

The probability for an individual to be selected for crossover is specified by this parameter.

Crossover type

The type of crossover can be selected by this parameter.

Use default mutation rate

This parameter determines if the default mutation rate should be used for nominal attributes.

Nominal mutation rate

This parameter specifies the probability of switching nominal attributes between 0 and 1.

Initialize with input weights

This parameter indicates if this operator should look for attribute weights in the given input and use them as a starting point for the optimization.