Detect Outlier (Clustering)
Synopsis
This operator allows you to use cluster-based methods for anomaly detection. It currently supports CBLOF, CMGOS, and LDCOF.
Description
Computes a robust PCA-based anomaly score. For robustness, the original data set is first trimmed based on the Mahalanobis distance. PCA is then computed, and a score is determined from the major (upper) and/or minor (lower) principal components (PCs). This operator follows the papers "A Novel Anomaly Detection Scheme Based on Principal Component Classifier" by Shyu et al. (2003) and "Robust Methods for Unsupervised PCA-based Anomaly Detection" by Kwitt et al. (2006). In contrast to the original publications, this operator computes a normalized score instead of classifying instances as normal or anomalous.
Note that this method can handle only a single large cluster of normal data. It will probably fail if the normal data consists of multiple clusters (non-linear dependencies), and it is not suited to local anomaly detection tasks.
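The steps described above (trimming, PCA, scoring) can be illustrated with a minimal NumPy sketch. This is not the operator's implementation. The function name `robust_pca_scores`, the parameter `trim_fraction`, the choice to sum the major and minor parts into one value, and the division by the maximum as normalization are all assumptions made for illustration.

```python
import numpy as np

def robust_pca_scores(X, trim_fraction=0.05, n_major=2, n_minor=1):
    """Illustrative robust PCA-based anomaly score (hypothetical helper)."""
    X = np.asarray(X, dtype=float)

    # Step 1: trim the instances with the largest Mahalanobis distance.
    mean = X.mean(axis=0)
    inv_cov = np.linalg.pinv(np.cov(X, rowvar=False))
    centered = X - mean
    d2 = np.einsum("ij,jk,ik->i", centered, inv_cov, centered)
    keep = d2 <= np.quantile(d2, 1.0 - trim_fraction)
    trimmed = X[keep]

    # Step 2: PCA on the trimmed data (eigen-decomposition of its covariance).
    mu = trimmed.mean(axis=0)
    eigvals, eigvecs = np.linalg.eigh(np.cov(trimmed, rowvar=False))
    order = np.argsort(eigvals)[::-1]          # largest eigenvalue first
    eigvals, eigvecs = eigvals[order], eigvecs[:, order]

    # Step 3: project ALL instances and weight each squared projection
    # by the inverse of its eigenvalue.
    proj = (X - mu) @ eigvecs
    weighted = proj ** 2 / np.maximum(eigvals, 1e-12)

    major = weighted[:, :n_major].sum(axis=1)
    if n_minor > 0:
        minor = weighted[:, -n_minor:].sum(axis=1)
    else:
        minor = np.zeros(len(X))

    # Assumption: combine both parts and scale to [0, 1] by the maximum.
    raw = major + minor
    return raw / raw.max()
```

Calling `robust_pca_scores(X)` on a numeric array with one row per instance returns one score per row, and larger values indicate more anomalous instances. Because trimming removes extreme points before the covariance is estimated, a few strong outliers have less influence on the principal components.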
Input
exa
The example set you want to run the algorithm on.
Output
exa
The scored example set.
mod
An anomaly model that can be applied to new data.
Parameters
Probability for normal class
This is the expected probability of normal data instances. Usually it should be between 0.95 and 1.0.
Cumulative variance
Cumulative variance threshold for selecting major components.
Number of major pcs
Number of major components to keep.
Number of minor pcs
Number of minor components to keep.
Eigenvalue threshold max
The maximum allowed eigenvalue for minor components taken into account.
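To make the component-selection parameters more concrete, the following sketch shows one plausible way a cumulative variance threshold and a maximum eigenvalue threshold could pick major and minor components from a list of eigenvalues. The helper names `count_major` and `count_minor` are hypothetical, and the exact comparison rules (reaching the threshold, "less than or equal") are assumptions. The operator's actual selection logic is not specified here.

```python
import numpy as np

def count_major(eigvals, cumulative_variance):
    """Number of leading components needed to reach the variance threshold."""
    eigvals = np.sort(np.asarray(eigvals, dtype=float))[::-1]
    ratios = np.cumsum(eigvals) / eigvals.sum()
    # First position where the cumulative ratio reaches the threshold.
    count = int(np.searchsorted(ratios, cumulative_variance)) + 1
    return min(count, eigvals.size)

def count_minor(eigvals, eigenvalue_threshold_max):
    """Number of components whose eigenvalue does not exceed the threshold."""
    eigvals = np.asarray(eigvals, dtype=float)
    return int((eigvals <= eigenvalue_threshold_max).sum())

# Example eigenvalues; cumulative ratios are 0.5, 0.75, 0.875, 0.9375, 1.0.
eigvals = [4.0, 2.0, 1.0, 0.5, 0.5]
n_major = count_major(eigvals, 0.8)   # three components reach 0.875 >= 0.8
n_minor = count_minor(eigvals, 0.6)   # two eigenvalues are <= 0.6
```

Setting "Number of major pcs" or "Number of minor pcs" directly would replace the corresponding threshold-based count with a fixed number.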