Principal Component Analysis (Kernel)
Synopsis
This operator performs Kernel Principal Component Analysis (PCA) which is a non-linear extension of PCA.
Description
Kernel principal component analysis (kernel PCA) is an extension of principal component analysis (PCA) using techniques of kernel methods. Using a kernel, the originally linear operations of PCA are done in a reproducing kernel Hilbert space with a non-linear mapping. By the use of integral operator kernel functions, one can efficiently compute principal components in high-dimensional feature spaces, related to input space by some nonlinear map. The result will be the set of data points in a non-linearly transformed space. Please note that in contrast to the usual linear PCA the kernel variant also works for large numbers of attributes but will become slow for large numbers of examples.
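As a rough sketch of the technique (not RapidMiner's implementation), kernel PCA with a radial kernel can be computed by building the kernel matrix, centering it in feature space, and taking its top eigenvectors. The function and parameter names below are illustrative only; note that the n x n kernel matrix is what makes the method slow for large numbers of examples.

```python
import numpy as np

def kernel_pca(X, n_components=2, gamma=1.0):
    """Illustrative kernel PCA with a radial (RBF) kernel."""
    # Pairwise squared Euclidean distances between all examples
    sq = np.sum(X**2, axis=1)
    sq_dists = sq[:, None] + sq[None, :] - 2.0 * (X @ X.T)
    K = np.exp(-gamma * sq_dists)              # n x n radial kernel matrix
    # Center the kernel matrix in the (implicit) feature space
    n = K.shape[0]
    one_n = np.ones((n, n)) / n
    K_c = K - one_n @ K - K @ one_n + one_n @ K @ one_n
    # Eigendecomposition; np.linalg.eigh returns ascending eigenvalues
    eigvals, eigvecs = np.linalg.eigh(K_c)
    idx = np.argsort(eigvals)[::-1][:n_components]
    alphas = eigvecs[:, idx]
    lambdas = np.maximum(eigvals[idx], 0.0)
    # Projected coordinates: eigenvectors scaled by sqrt of eigenvalues
    return alphas * np.sqrt(lambdas)

X = np.array([[0.0, 0.0], [1.0, 1.0], [2.0, 0.5], [0.5, 2.0]])
Z = kernel_pca(X, n_components=2, gamma=0.5)
print(Z.shape)  # (4, 2)
```

Because the eigendecomposition is performed on an n x n matrix (n = number of examples) rather than on the covariance of the attributes, the cost grows with the number of examples, which matches the remark above.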
RapidMiner provides the Principal Component Analysis operator for applying linear PCA. Principal Component Analysis is a mathematical procedure that uses an orthogonal transformation to convert a set of observations of possibly correlated attributes into a set of values of uncorrelated attributes called principal components. This transformation is defined in such a way that the first principal component's variance is as high as possible (accounts for as much of the variability in the data as possible), and each succeeding component in turn has the highest variance possible under the constraint that it should be orthogonal to (uncorrelated with) the preceding components.
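For contrast, linear PCA can be sketched as an eigendecomposition of the covariance matrix of the attributes; this is an illustrative sketch, not RapidMiner's implementation, and the names are assumptions:

```python
import numpy as np

def linear_pca(X, n_components=2):
    """Illustrative linear PCA via the covariance matrix of the attributes."""
    Xc = X - X.mean(axis=0)                   # center each attribute
    cov = Xc.T @ Xc / (len(X) - 1)            # attribute covariance matrix
    eigvals, eigvecs = np.linalg.eigh(cov)    # ascending eigenvalues
    order = np.argsort(eigvals)[::-1]         # first PC has maximal variance
    components = eigvecs[:, order[:n_components]]
    return Xc @ components                    # project onto principal components

rng = np.random.default_rng(0)
X = rng.normal(size=(100, 3))
Z = linear_pca(X, n_components=2)
print(Z.shape)  # (100, 2)
```

Sorting the eigenvalues in descending order realizes the constraint described above: each component captures the highest remaining variance while staying orthogonal to the preceding components.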
Differentiation
Principal Component Analysis
Kernel principal component analysis (kernel PCA) is an extension of principal component analysis (PCA) using techniques of kernel methods. In contrast to the usual linear PCA the kernel variant also works for large numbers of attributes but will become slow for large numbers of examples.
Input
example set input
This input port expects an ExampleSet. It is the output of the Retrieve operator in the attached Example Process. The output of other operators can also be used as input. It is essential that meta data is attached to the data for the input, because attributes are specified in their meta data. The Retrieve operator provides meta data along with the data. Please note that this operator cannot handle nominal attributes; it works on numerical attributes.
Output
example set output
The kernel-based Principal Component Analysis is performed on the input ExampleSet and the resultant ExampleSet is delivered through this port.
original
The ExampleSet that was given as input is passed without change to the output through this port. This is usually used to reuse the same ExampleSet in further operators or to view the ExampleSet in the Results Workspace.
preprocessing model
This port delivers the preprocessing model, which has the information regarding the parameters of this operator in the current process.
Parameters
Kernel type
The type of the kernel function is selected through this parameter. The following kernel types are supported: dot, radial, polynomial, neural, anova, epachnenikov, gaussian combination, multiquadric
- dot: The dot kernel is defined by k(x, y) = x * y, i.e. it is the inner product of x and y.
- radial: The radial kernel is defined by exp(-g ||x - y||^2) where g is the gamma, specified by the kernel gamma parameter. The adjustable parameter gamma plays a major role in the performance of the kernel, and should be carefully tuned to the problem at hand.
- polynomial: The polynomial kernel is defined by k(x, y) = (x * y + 1)^d where d is the degree of the polynomial, specified by the kernel degree parameter. The polynomial kernels are well suited for problems where all the training data is normalized.
- neural: The neural kernel is defined by a two-layered neural net tanh(a x * y + b) where a is alpha and b is the intercept constant. These parameters can be adjusted using the kernel a and kernel b parameters. A common value for alpha is 1/N, where N is the data dimension. Note that not all choices of a and b lead to a valid kernel function.
- anova: The anova kernel is defined by the summation of exp(-g (x - y)) raised to the power d, where g is gamma and d is degree. They are adjusted by the kernel gamma and kernel degree parameters respectively.
- epachnenikov: The epachnenikov kernel is the function (3/4)(1 - u^2) for u between -1 and 1, and zero for u outside that range. It has two adjustable parameters, kernel sigma1 and kernel degree.
- gaussian_combination: This is the gaussian combination kernel. It has the adjustable parameters kernel sigma1, kernel sigma2 and kernel sigma3.
- multiquadric: The multiquadric kernel is defined by the square root of ||x - y||^2 + c^2. It has the adjustable parameters kernel sigma1 and kernel sigma shift.
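A few of the kernels above can be written directly as small functions. This is an illustrative sketch, not RapidMiner's code; the parameter names (gamma, degree, a, b, c) mirror the operator parameters described above:

```python
import numpy as np

def dot_kernel(x, y):
    # k(x, y) = x * y, the inner product
    return np.dot(x, y)

def radial_kernel(x, y, gamma=1.0):
    # exp(-g ||x - y||^2)
    return np.exp(-gamma * np.sum((x - y) ** 2))

def polynomial_kernel(x, y, degree=2):
    # (x * y + 1)^d
    return (np.dot(x, y) + 1.0) ** degree

def neural_kernel(x, y, a=1.0, b=0.0):
    # tanh(a x * y + b); not every choice of a and b yields a valid kernel
    return np.tanh(a * np.dot(x, y) + b)

def multiquadric_kernel(x, y, c=1.0):
    # sqrt(||x - y||^2 + c^2)
    return np.sqrt(np.sum((x - y) ** 2) + c ** 2)

x = np.array([1.0, 2.0])
y = np.array([3.0, 4.0])
print(dot_kernel(x, y))          # 11.0
print(polynomial_kernel(x, y))   # (11 + 1)^2 = 144.0
```

Each function maps a pair of examples to a single similarity value; assembling these values for all pairs of examples yields the kernel matrix on which kernel PCA operates.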
Kernel gamma
This is the kernel parameter gamma. This is only available when the kernel type parameter is set to radial or anova.
Kernel sigma1
This is the kernel parameter sigma1. This is only available when the kernel type parameter is set to epachnenikov, gaussian combination or multiquadric.
Kernel sigma2
This is the kernel parameter sigma2. This is only available when the kernel type parameter is set to gaussian combination.
Kernel sigma3
This is the kernel parameter sigma3. This is only available when the kernel type parameter is set to gaussian combination.
Kernel shift
This is the kernel parameter shift. This is only available when the kernel type parameter is set to multiquadric.
Kernel degree
This is the kernel parameter degree. This is only available when the kernel type parameter is set to polynomial, anova or epachnenikov.
Kernel a
This is the kernel parameter a. This is only available when the kernel type parameter is set to neural.
Kernel b
This is the kernel parameter b. This is only available when the kernel type parameter is set to neural.