
# Metrics for multi-class classification: an overview

Margherita Grandini, Enrico Bagli, Giorgio Visani

## ABSTRACT of Metrics for multi-class classification: an overview

Classification tasks in machine learning that involve more than two classes are known as "multi-class classification". Performance indicators are very useful when the aim is to evaluate and compare different classification models or machine learning techniques. Many metrics come in handy to test the ability of a multi-class classifier. These metrics prove useful at different stages of the development process, e.g. when comparing the performance of two different models or analysing the behaviour of the same model while tuning different parameters. In this white paper we review a list of the most promising multi-class metrics, highlight their advantages and disadvantages, and show their possible uses during the development of a classification model.

## Introduction to Metrics for multi-class classification: an overview

In the vast field of Machine Learning, the general focus is to predict an outcome using the available data. The prediction task is called a "classification problem" when the outcome represents different classes, and a "regression problem" when the outcome is a numeric measurement. In classification, the most common setting involves only two classes, although there may be more than two. In the latter case the problem is called "multi-class classification".

From an algorithmic standpoint, the prediction task is addressed using state-of-the-art mathematical techniques. There are many different solutions, but they all share a common factor: they use the available data (the X variables) to obtain the best prediction $\hat{Y}$ of the outcome variable $Y$. In multi-class classification, we may regard the response variable $Y$ and the prediction $\hat{Y}$ as two discrete random variables: they take values in $\{1, \dots, K\}$, and each number represents a different class.

A classification model gives the probability that each unit belongs to each possible class; a classification rule is then employed to assign a single class to each unit. The rule is generally very simple. In the two-class problem, a threshold is usually applied to the probability assigned by the model to decide which class is predicted for each unit. In the multi-class case there are various possibilities; among them, assigning the class with the highest probability value and using the softmax are the most commonly employed techniques.
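The classification rules described above can be illustrated with a minimal Python sketch. The function names (`softmax`, `predict_class`, `predict_binary`), the example scores, and the default threshold of 0.5 are assumptions made for illustration; they do not come from the paper.

```python
import math

def softmax(scores):
    """Turn raw model scores into probabilities that sum to 1."""
    shift = max(scores)  # subtract the maximum for numerical stability
    exps = [math.exp(s - shift) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def predict_class(probabilities):
    """Multi-class rule: assign the class with the highest probability.

    Classes are labelled 1..K, matching the notation in the text.
    """
    best_index = max(range(len(probabilities)), key=lambda k: probabilities[k])
    return best_index + 1

def predict_binary(prob_positive, threshold=0.5):
    """Two-class rule: predict 1 if the probability reaches the threshold.

    The 0.5 default is an illustrative assumption, not a recommendation.
    """
    return 1 if prob_positive >= threshold else 0

# Hypothetical raw scores for a unit in a K = 3 problem.
probabilities = softmax([2.0, 1.0, 0.1])
predicted = predict_class(probabilities)
```

Here the first class has the largest score, so it also has the largest softmax probability and `predict_class` assigns the unit to class 1. Softmax preserves the ordering of the scores, so the highest-probability rule gives the same answer whether it is applied to the scores or to the probabilities.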

Performance indicators are very useful when the aim is to evaluate and compare different classification models or machine learning techniques.

Many metrics come in handy to test the ability of any multi-class classifier, and they prove useful for: i) comparing the performance of two different models, and ii) analysing the behaviour of the same model while tuning different parameters.

Many metrics are based on the Confusion Matrix, since it contains all the relevant information about the performance of the algorithm and of the classification rule.
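As a rough illustration of how a Confusion Matrix is assembled, the sketch below counts (true class, predicted class) pairs for a small made-up example. The helper name `confusion_matrix`, the sample labels, and the convention that rows are true classes and columns are predicted classes are assumptions for this example; some references use the transposed layout.

```python
def confusion_matrix(y_true, y_pred, num_classes):
    """Count how often each true class (row) receives each predicted class (column).

    Classes are assumed to be labelled 1..num_classes.
    """
    matrix = [[0] * num_classes for _ in range(num_classes)]
    for true_label, predicted_label in zip(y_true, y_pred):
        matrix[true_label - 1][predicted_label - 1] += 1
    return matrix

# Hypothetical labels for six units in a three-class problem.
y_true = [1, 2, 3, 3, 2, 1]
y_pred = [1, 2, 3, 2, 2, 3]

cm = confusion_matrix(y_true, y_pred, num_classes=3)

# Correct predictions lie on the diagonal, so accuracy is the trace over the total.
correct = sum(cm[k][k] for k in range(3))
accuracy = correct / len(y_true)
```

With these labels, `cm` is `[[1, 0, 1], [0, 2, 0], [0, 1, 1]]`: four of the six units fall on the diagonal, and the two off-diagonal counts show which classes were confused with which.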
