Detect Outlier (Clustering)

Synopsis

This operator allows you to use cluster-based methods for anomaly detection. It currently supports CBLOF, CMGOS and LDCOF.

Description

Computes a robust PCA-based anomaly score. For robustness, the original data set is first trimmed based on the Mahalanobis distance. PCA is then computed, and a score is determined from the major (top) and/or minor (bottom) principal components (PCs). This operator follows the papers "A Novel Anomaly Detection Scheme Based on Principal Component Classifier" by Shyu et al. (2003) and "Robust Methods for Unsupervised PCA-based Anomaly Detection" by Kwitt et al. (2006). In contrast to the original publications, this operator computes a normalized score instead of classifying instances as normal or anomalous.

Note that this method can handle only a single large cluster of normal data. It will probably fail if the normal data consists of multiple clusters (non-linear dependencies), and it is not suited to local anomaly detection tasks.
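The steps described above (trim by Mahalanobis distance, compute PCA, score on the major and minor components) can be sketched roughly as follows. This is a minimal illustration, not the operator's actual implementation. The function name robust_pca_scores and its arguments are hypothetical, the sketch uses the covariance matrix rather than standardized data, and dividing by the maximum is only one assumed way to normalize the score.

```python
import numpy as np

def robust_pca_scores(X, normal_fraction=0.975, n_major=1, n_minor=1):
    """Hypothetical sketch of a trimmed, PCA-based anomaly score."""
    X = np.asarray(X, dtype=float)

    # Step 1: squared Mahalanobis distance of every row to the overall mean.
    diff = X - X.mean(axis=0)
    inv_cov = np.linalg.pinv(np.cov(X, rowvar=False))
    d2 = np.einsum("ij,jk,ik->i", diff, inv_cov, diff)

    # Step 2: trimming. Keep only the rows closest to the mean.
    trimmed = X[d2 <= np.quantile(d2, normal_fraction)]

    # Step 3: PCA on the trimmed data, eigenvalues sorted in descending order.
    center = trimmed.mean(axis=0)
    eigvals, eigvecs = np.linalg.eigh(np.cov(trimmed, rowvar=False))
    order = np.argsort(eigvals)[::-1]
    eigvals, eigvecs = eigvals[order], eigvecs[:, order]

    # Step 4: project ALL rows and weight each squared projection by 1/eigenvalue.
    proj = (X - center) @ eigvecs
    weighted = proj ** 2 / np.maximum(eigvals, 1e-12)

    # Step 5: sum over the selected major (first) and minor (last) components.
    dims = X.shape[1]
    selected = sorted(set(range(n_major)) | set(range(dims - n_minor, dims)))
    raw = weighted[:, selected].sum(axis=1)

    # Assumed normalization: scale so that the largest score is 1.
    return raw / raw.max() if raw.max() > 0 else raw

# Usage: strongly correlated 2-D data plus one row that breaks the correlation.
rng = np.random.default_rng(0)
x1 = rng.normal(size=200)
X = np.column_stack([x1, x1 + 0.1 * rng.normal(size=200)])
X[0] = [5.0, -5.0]
scores = robust_pca_scores(X)
print(int(np.argmax(scores)))  # index of the row with the highest score
```

In this sketch the minor components matter most for rows that violate the correlation structure of the normal data, while the major components pick up rows that are extreme along the main directions of variation.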

Input

exa

The example set you want to run the algorithm on.

Output

exa

The scored example set.

mod

An anomaly model that can be applied to new data.

Parameters

Probability for normal class

This is the expected probability of normal data instances. Usually it should be between 0.95 and 1.0.

Cumulative variance

Cumulative variance threshold for selecting major components.
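As an illustration of how a cumulative variance threshold might select major components, the sketch below counts how many of the largest eigenvalues are needed to reach a given share of the total variance. The function name count_major_components and the "at least the threshold" rule are assumptions made for this example, not a description of the operator's internals.

```python
def count_major_components(eigenvalues, cumulative_variance=0.5):
    """Hypothetical helper: smallest number of top components whose
    eigenvalues sum to at least `cumulative_variance` of the total."""
    ordered = sorted(eigenvalues, reverse=True)
    total = sum(ordered)
    running = 0.0
    for count, value in enumerate(ordered, start=1):
        running += value
        if running / total >= cumulative_variance:
            return count
    return len(ordered)

# Eigenvalues 4, 3, 2, 1 explain 40%, 30%, 20% and 10% of the variance.
print(count_major_components([4.0, 3.0, 2.0, 1.0], cumulative_variance=0.5))
```

With these example eigenvalues, the first component alone explains 40% of the variance, so a threshold of 0.5 requires the first two components.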

Number of major pcs

Number of major components to keep.

Number of minor pcs

Number of minor components to keep.

Eigenvalue threshold max

The maximum allowed eigenvalue for minor components taken into account.
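A minimal sketch of how such an eigenvalue threshold could pick the minor components is shown below. The function name minor_component_indices, the default value, and the inclusive comparison are assumptions for illustration only.

```python
def minor_component_indices(eigenvalues, eigenvalue_threshold_max=0.2):
    """Hypothetical helper: indices of components whose eigenvalue does not
    exceed the threshold, i.e. the candidates for minor components."""
    return [
        index
        for index, value in enumerate(eigenvalues)
        if value <= eigenvalue_threshold_max
    ]

# Only the last two example eigenvalues fall below the threshold of 0.2.
print(minor_component_indices([2.5, 1.1, 0.3, 0.15, 0.05]))
```

Components with very small eigenvalues capture directions in which normal data barely varies, so a large projection onto them suggests that an instance breaks the usual correlation structure.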