Multiclass classification

Performance metrics

We have already discussed binary classification at some length, and multiclass classification is similar, so here we will concentrate on the differences. The primary difference is that there are not just two possible outcomes, but three or more.

Assuming that all of the possible outcomes are equally interesting to you, with no special emphasis on any of them, it's reasonable to choose Accuracy as your performance metric, and to prefer the model with the highest Accuracy.

When there are N possible outcomes, the confusion matrix has N x N elements. The correct predictions are on the diagonal, and incorrect predictions are off-diagonal. Although the summary performance table in the Model Comparison only mentions Accuracy and Classification Error, you can still find the analogues of Precision and Recall by looking at the Performance tab for each individual model. Precision is computed for each row (each predicted class); Recall is computed for each column (each actual class).

| Predicted vs Actual | Actually A          | Actually B          | Actually C          | Precision            |
| Predicted A         | True A              |                     |                     | True A / Predicted A |
| Predicted B         |                     | True B              |                     | True B / Predicted B |
| Predicted C         |                     |                     | True C              | True C / Predicted C |
| Recall              | True A / Actually A | True B / Actually B | True C / Actually C |                      |
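
The row and column rules in the table can be sketched in a few lines of Python. This is only an illustration, not the product's own implementation: the function name `per_class_metrics` and the sample counts are made up for the example, and the matrix is assumed to follow the same layout as the table (rows are predictions, columns are actual classes).

```python
def per_class_metrics(matrix):
    """Return (accuracy, precision, recall) for a square confusion matrix.

    Assumed layout: matrix[i][j] is the number of examples predicted as
    class i that actually belong to class j (rows = predicted, columns = actual).
    """
    n = len(matrix)
    total = sum(sum(row) for row in matrix)
    correct = sum(matrix[i][i] for i in range(n))  # diagonal = correct predictions
    accuracy = correct / total

    precision, recall = [], []
    for i in range(n):
        predicted_i = sum(matrix[i])                    # row total: everything predicted as class i
        actual_i = sum(matrix[j][i] for j in range(n))  # column total: everything actually class i
        precision.append(matrix[i][i] / predicted_i if predicted_i else float("nan"))
        recall.append(matrix[i][i] / actual_i if actual_i else float("nan"))
    return accuracy, precision, recall


# Hypothetical counts for three classes A, B, C (illustration only).
sample = [
    [5, 1, 0],
    [0, 4, 1],
    [0, 0, 4],
]
accuracy, precision, recall = per_class_metrics(sample)
print(f"Accuracy: {accuracy:.3f}")
for name, p, r in zip("ABC", precision, recall):
    print(f"Class {name}: precision={p:.3f}, recall={r:.3f}")
```

Classes with an empty row or column get `nan` rather than raising a division error; that is a choice made for this sketch, and other tools may report such cases differently.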

In the example below, a multiclass classification problem with three possible outcomes, the model made two wrong predictions, indicated by a red mark in the confusion matrix, when applied to the test set. Consequently, the Recall for the third column (12/14) and the Precision for the second row (15/17) are less than 100%.

multiclass-confusion-matrix.png
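
The two ratios quoted for this example can be checked with a short Python sketch. The text only gives the figures for the second row and the third column, so the count for the first class (10) is a made-up placeholder, and the labels A, B, C are assumed names; neither affects the two results shown.

```python
# Rows are predictions, columns are actual classes, in the order A, B, C.
# The 15, 12 and 2 come from the ratios 15/17 and 12/14 in the text.
# The 10 for class A is a hypothetical placeholder, not a value from the source.
confusion = [
    [10, 0, 0],
    [0, 15, 2],   # two examples predicted as B that are actually C
    [0, 0, 12],
]

# Precision for the second row: correct B predictions / all B predictions.
precision_b = confusion[1][1] / sum(confusion[1])

# Recall for the third column: correct C predictions / all actual C examples.
recall_c = confusion[2][2] / sum(row[2] for row in confusion)

print(f"Precision (row 2):    {precision_b:.1%}")
print(f"Recall    (column 3): {recall_c:.1%}")
```

Both misclassified examples sit in the same cell (second row, third column), which is why exactly one Precision value and one Recall value drop below 100%.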