Logistic Regression (Evolutionary)
Synopsis
This operator is a kernel logistic regression learner for binary classification tasks.
Description
Logistic regression is a type of regression analysis used for predicting the outcome of a categorical criterion variable (a variable that can take on a limited number of categories) based on one or more predictor variables. The probabilities describing the possible outcomes of a single trial are modeled, as a function of the explanatory variables, using a logistic function. Logistic regression measures the relationship between a categorical dependent variable and one or more independent variables, which are usually continuous, by converting the dependent variable to probability scores.
This operator supports various kernel types including dot, radial, polynomial, sigmoid, anova, epachnenikov, gaussian combination and multiquadric. An explanation of these kernel types is given in the parameters section.
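As a rough illustration of how a logistic function turns a score into a probability, the following Python sketch uses a plain linear score. The names logistic and predict_proba, and the weights and bias, are hypothetical and chosen for illustration only; the operator itself works with kernels and is not implemented this way.

```python
import math


def logistic(z: float) -> float:
    """Map a real-valued score z to a probability between 0 and 1."""
    # Two branches keep math.exp from overflowing for large |z|.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    ez = math.exp(z)
    return ez / (1.0 + ez)


def predict_proba(weights, bias, x):
    """Probability of the positive class for one example (linear score, assumed)."""
    score = sum(w * xi for w, xi in zip(weights, x)) + bias
    return logistic(score)


# Hypothetical weights and example values, for illustration only.
p = predict_proba([1.0, -2.0], 0.5, [3.0, 1.0])
print(p)
```

A score of zero maps to a probability of 0.5, and larger positive scores move the probability towards 1.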
Input
training set
This input port expects an ExampleSet. This operator cannot handle nominal attributes; it can only be applied to data sets with numeric attributes. You may therefore often need to apply the Nominal to Numerical operator before this operator.
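To make the idea of converting a nominal attribute to numeric attributes concrete, here is a minimal Python sketch of one possible encoding (one column per category). The function one_hot is hypothetical and is not the Nominal to Numerical operator; it only illustrates the kind of transformation involved.

```python
def one_hot(values):
    """Encode a list of nominal values as 0/1 columns, one per category."""
    categories = sorted(set(values))
    rows = [[1.0 if v == c else 0.0 for c in categories] for v in values]
    return categories, rows


# Hypothetical nominal attribute with two categories.
categories, rows = one_hot(["red", "green", "red"])
print(categories)
print(rows)
```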
Output
model
The Logistic Regression model is delivered from this output port. This model can now be applied on unseen data sets.
example set
The ExampleSet that was given as input is passed without changing to the output through this port. This is usually used to reuse the same ExampleSet in further operators or to view the ExampleSet in the Results Workspace.
Parameters
Kernel type
The type of the kernel function is selected through this parameter. The following kernel types are supported: dot, radial, polynomial, sigmoid, anova, epachnenikov, gaussian combination, multiquadric
- dot: The dot kernel is defined by k(x,y) = x*y, i.e. it is the inner product of x and y.
- radial: The radial kernel is defined by exp(-g ||x-y||^2) where g is the gamma; it is specified by the kernel gamma parameter. The adjustable parameter gamma plays a major role in the performance of the kernel, and should be carefully tuned to the problem at hand.
- polynomial: The polynomial kernel is defined by k(x,y) = (x*y+1)^d where d is the degree of the polynomial; it is specified by the kernel degree parameter. Polynomial kernels are well suited for problems where all the training data is normalized.
- sigmoid: The sigmoid kernel is defined by a two-layered neural net tanh(a x*y + b) where a is alpha and b is the intercept constant. These parameters can be adjusted using the kernel a and kernel b parameters. A common value for alpha is 1/N, where N is the data dimension. Note that not all choices of a and b lead to a valid kernel function.
- anova: The anova kernel is defined by the summation of exp(-g (x-y)) raised to the power d, where g is gamma and d is degree. gamma and degree are adjusted by the kernel gamma and kernel degree parameters respectively.
- epachnenikov: The epachnenikov kernel is the function (3/4)(1-u^2) for u between -1 and 1, and zero for u outside that range. It has two adjustable parameters, kernel sigma1 and kernel degree.
- gaussian_combination: This is the gaussian combination kernel. It has the adjustable parameters kernel sigma1, kernel sigma2 and kernel sigma3.
- multiquadric: The multiquadric kernel is defined by the square root of ||x-y||^2 + c^2. It has the adjustable parameters kernel sigma1 and kernel shift.
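The kernel formulas listed above can be written out directly. The Python sketch below covers only those kernels whose formula is fully stated in the list; the function names are hypothetical, and how the operator maps sigma1, shift and degree onto these formulas is not specified here, so the arguments simply follow the symbols used in the text (g, d, a, b, c, u).

```python
import math


def dot(x, y):
    """Inner product of two equal-length vectors."""
    return sum(a * b for a, b in zip(x, y))


def squared_distance(x, y):
    """Squared Euclidean distance ||x-y||^2."""
    return sum((a - b) ** 2 for a, b in zip(x, y))


def radial(x, y, g):
    return math.exp(-g * squared_distance(x, y))


def polynomial(x, y, d):
    return (dot(x, y) + 1.0) ** d


def sigmoid(x, y, a, b):
    return math.tanh(a * dot(x, y) + b)


def epachnenikov_profile(u):
    """(3/4)(1-u^2) inside [-1, 1], zero outside."""
    return 0.75 * (1.0 - u * u) if -1.0 <= u <= 1.0 else 0.0


def multiquadric(x, y, c):
    return math.sqrt(squared_distance(x, y) + c * c)


x, y = [1.0, 2.0], [3.0, 4.0]
print(dot(x, y), radial(x, y, 0.5), polynomial(x, y, 2), multiquadric(x, y, 1.0))
```

Note that the radial kernel of a point with itself is always 1, whatever the value of gamma, because the distance term vanishes.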
Kernel gamma
This is the kernel parameter gamma. This is only available when the kernel type parameter is set to radial or anova.
Kernel sigma1
This is the kernel parameter sigma1. This is only available when the kernel type parameter is set to epachnenikov, gaussian combination or multiquadric.
Kernel sigma2
This is the kernel parameter sigma2. This is only available when the kernel type parameter is set to gaussian combination.
Kernel sigma3
This is the kernel parameter sigma3. This is only available when the kernel type parameter is set to gaussian combination.
Kernel shift
This is the kernel parameter shift. This is only available when the kernel type parameter is set to multiquadric.
Kernel degree
This is the kernel parameter degree. This is only available when the kernel type parameter is set to polynomial, anova or epachnenikov.
Kernel a
This is the kernel parameter a. This is only available when the kernel type parameter is set to sigmoid.
Kernel b
This is the kernel parameter b. This is only available when the kernel type parameter is set to sigmoid.
C
This is the complexity constant which sets the tolerance for misclassification, where higher C values allow for 'softer' boundaries and lower values create 'harder' boundaries. A complexity constant that is too large can lead to over-fitting, while values that are too small may result in over-generalization.
Start population type
This parameter specifies the type of start population initialization.
Max generations
This parameter specifies the number of generations after which the algorithm should be terminated.
Generations without improval
This parameter specifies the stop criterion for early stopping, i.e. the algorithm stops after n generations without improvement in the performance, where n is specified by this parameter.
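The interplay between a maximum number of generations and early stopping can be sketched as a simple loop. This is a generic illustration in Python, not the operator's implementation; run_generations and the step callback are hypothetical, and treating "improvement" as a strictly higher fitness is an assumption.

```python
def run_generations(step, max_generations, generations_without_improval):
    """Run step(generation) until either stopping criterion is met.

    step is a hypothetical callback returning the best fitness of a generation.
    Returns the number of generations run and the best fitness seen.
    """
    best = float("-inf")
    stale = 0
    generation = 0
    for generation in range(1, max_generations + 1):
        fitness = step(generation)
        if fitness > best:
            best, stale = fitness, 0  # improvement resets the counter
        else:
            stale += 1
        if stale >= generations_without_improval:
            break  # early stopping
    return generation, best


# Hypothetical fitness trace that stops improving after the third generation.
trace = [1, 2, 3, 3, 3, 3, 3]
print(run_generations(lambda g: trace[g - 1], 7, 2))
```

In this trace the run ends at generation 5, after two consecutive generations without improvement, even though up to 7 generations were allowed.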
Population size
This parameter specifies the population size i.e. the number of individuals per generation. If set to -1, all examples are selected.
Tournament fraction
This parameter specifies the fraction of the current population which should be used as tournament members.
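One common way to use a tournament fraction is to draw that share of the population at random and keep the fittest member. The Python sketch below follows that reading; tournament_select is a hypothetical name, and the rounding of the tournament size is an assumption rather than documented operator behavior.

```python
import random


def tournament_select(population, fitness, tournament_fraction, rng):
    """Pick the fittest individual among a random subset of the population."""
    # Assumed rounding rule: at least one contestant takes part.
    size = max(1, round(tournament_fraction * len(population)))
    contestants = rng.sample(population, size)
    return max(contestants, key=fitness)


rng = random.Random(0)
population = list(range(10))  # hypothetical individuals; fitness is the value itself
print(tournament_select(population, lambda v: v, 0.3, rng))
```

A larger fraction increases selection pressure: with a fraction of 1.0 the fittest individual always wins.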
Keep best
This parameter specifies whether the best individual should survive. Retaining the best individual of a generation unchanged in the next generation is called elitism or elitist selection.
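Elitist selection can be illustrated with a few lines of Python. In this sketch the best individual of the current generation replaces the weakest offspring; that replacement rule and the name next_generation are assumptions made for illustration, not a description of the operator's internals.

```python
def next_generation(population, offspring, fitness, keep_best):
    """Build the next generation, optionally carrying over the best individual."""
    if not keep_best:
        return list(offspring)
    elite = max(population, key=fitness)
    # Assumed rule: drop the weakest offspring to make room for the elite.
    survivors = sorted(offspring, key=fitness, reverse=True)[: len(offspring) - 1]
    return [elite] + survivors


# Hypothetical individuals whose fitness is their own value.
print(next_generation([1, 5, 3], [2, 4, 0], lambda v: v, True))
```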
Mutation type
This parameter specifies the type of the mutation operator.
Selection type
This parameter specifies the selection scheme of the evolutionary algorithm.
Crossover prob
The probability for an individual to be selected for crossover is specified by this parameter.
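The effect of a crossover probability can be sketched as follows: with the given probability two parents exchange genes, otherwise they are copied unchanged. The Python code uses uniform crossover purely as an example; the crossover variant and both function names are assumptions.

```python
import random


def uniform_crossover(parent_a, parent_b, rng):
    """Swap each gene between the two parents with probability 0.5."""
    child_a, child_b = [], []
    for gene_a, gene_b in zip(parent_a, parent_b):
        if rng.random() < 0.5:
            gene_a, gene_b = gene_b, gene_a
        child_a.append(gene_a)
        child_b.append(gene_b)
    return child_a, child_b


def maybe_crossover(parent_a, parent_b, crossover_prob, rng):
    """Apply crossover with probability crossover_prob, else copy the parents."""
    if rng.random() < crossover_prob:
        return uniform_crossover(parent_a, parent_b, rng)
    return list(parent_a), list(parent_b)


rng = random.Random(7)
print(maybe_crossover([1, 2, 3], [4, 5, 6], 0.9, rng))
```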
Use local random seed
This parameter indicates if a local random seed should be used for randomization. Using the same value of local random seed will produce the same randomization.
Local random seed
This parameter specifies the local random seed. This parameter is only available if the use local random seed parameter is set to true.
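The reproducibility that a fixed seed provides can be demonstrated with Python's standard random module. The helper make_rng and the seed value are hypothetical; the sketch only shows the general principle that two generators created with the same seed yield the same sequence.

```python
import random


def make_rng(use_local_random_seed, local_random_seed):
    """Return a seeded generator if requested, otherwise an unseeded one."""
    if use_local_random_seed:
        return random.Random(local_random_seed)
    return random.Random()


# Two generators with the same (hypothetical) seed produce identical draws.
first = make_rng(True, 1992)
second = make_rng(True, 1992)
print([first.random() for _ in range(3)] == [second.random() for _ in range(3)])
```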
Show convergence plot
This parameter indicates if a dialog with a convergence plot should be drawn.