Confusion matrix

Keith Tan, CFA

A confusion matrix is a table that is used to evaluate the performance of a classification algorithm. It is often used in machine learning and data analysis to measure the accuracy of a model and to identify sources of error.

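To create a confusion matrix, the predicted values of the classification algorithm are compared to the true values of the data. For a binary classifier, both the true values and the predictions fall into two categories: positive and negative. Comparing them yields four possible outcomes: true positive (a positive observation correctly classified as positive), true negative (a negative observation correctly classified as negative), false positive (a negative observation incorrectly classified as positive), and false negative (a positive observation incorrectly classified as negative).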

The confusion matrix is then constructed as a table, with the true values as the rows and the predicted values as the columns. Each cell holds the number of observations that fall into that combination of true and predicted values. For example, the cell in the “positive” row and “positive” column holds the number of observations that were correctly classified as positive, while the cell in the “positive” row and “negative” column holds the number of positive observations that were wrongly classified as negative.
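
The construction can be sketched in a few lines of Python. This is a minimal illustration, assuming binary labels encoded as 1 (positive) and 0 (negative); the function name `build_confusion_matrix` and the sample data are hypothetical and not taken from any particular library.

```python
from collections import Counter


def build_confusion_matrix(y_true, y_pred, labels=(1, 0)):
    """Count observations for each (true, predicted) pair.

    Rows follow the true labels and columns follow the predicted labels,
    both in the order given by `labels` (positive first by assumption).
    """
    counts = Counter(zip(y_true, y_pred))
    # A Counter returns 0 for pairs that never occur, so empty cells are 0.
    return [[counts[(t, p)] for p in labels] for t in labels]


# Illustrative data: 1 = positive, 0 = negative.
y_true = [1, 1, 1, 0, 0, 0, 0, 1]
y_pred = [1, 0, 1, 0, 1, 0, 0, 1]

matrix = build_confusion_matrix(y_true, y_pred)

print("                 pred. positive  pred. negative")
for name, row in zip(("actual positive", "actual negative"), matrix):
    print(f"{name:<17}{row[0]:>14}{row[1]:>16}")
```

With this layout, `matrix[0][0]` is the count of correctly classified positives and `matrix[1][1]` the count of correctly classified negatives; the off-diagonal cells hold the two kinds of misclassification.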

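Confusion matrices are used to calculate various performance metrics that help to understand the strengths and weaknesses of the classification algorithm. Accuracy is the share of all observations that were classified correctly, precision is the share of predicted positives that are actually positive, and recall is the share of actual positives that were predicted as positive. The matrix also shows which kind of mistake the algorithm makes more often, such as labelling negatives as positive or missing positives, which points to where it can be improved.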
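
As a rough sketch of how these metrics follow from the four cell counts, the Python example below uses made-up counts, where `tp` and `tn` are the correctly classified positives and negatives, `fp` is negatives classified as positive, and `fn` is positives classified as negative. The function name `classification_metrics` is hypothetical, and returning 0.0 when a denominator is zero is an assumption made here for simplicity, not a universal convention.

```python
def classification_metrics(tp, fn, fp, tn):
    """Compute accuracy, precision, and recall from confusion matrix counts."""
    total = tp + fn + fp + tn
    predicted_positive = tp + fp
    actual_positive = tp + fn
    return {
        # Share of all observations that were classified correctly.
        "accuracy": (tp + tn) / total if total else 0.0,
        # Share of predicted positives that are actually positive.
        "precision": tp / predicted_positive if predicted_positive else 0.0,
        # Share of actual positives that were predicted as positive.
        "recall": tp / actual_positive if actual_positive else 0.0,
    }


# Illustrative counts only, not results from a real model.
metrics = classification_metrics(tp=40, fn=10, fp=5, tn=45)
for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
```

In this example precision is higher than recall, which suggests that the hypothetical classifier misses positives more often than it raises false alarms.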

See also: Contingency table, Heat map, Tree map
