https://scikit-learn.org/stable/modules/generated/sklearn.metrics.confusion_matrix.html
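confusion [kən'fjuːʒ(ə)n]: n. mixing up; disorder; bewilderment
A confusion matrix is an error matrix used to visualize and evaluate the performance of a supervised learning algorithm. It is a square matrix of shape (n_classes, n_classes), where n_classes is the number of classes. The matrix makes it easy to see whether the system confuses two classes, which is where the name comes from.
A confusion matrix is a special kind of contingency table, also called a cross tabulation or crosstab. It has two dimensions (actual values and predicted values), and both dimensions share the same set of classes.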
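To make the contingency-table view concrete, here is a minimal sketch that tallies (actual, predicted) pairs in plain Python. The helper name `contingency_table` is hypothetical, and taking the sorted union of the observed labels as the shared class set is an assumption made for illustration.

```python
from collections import Counter

def contingency_table(y_true, y_pred):
    """Cross-tabulate actual vs. predicted labels (hypothetical helper)."""
    # Both dimensions share one class set: the sorted union of observed labels.
    classes = sorted(set(y_true) | set(y_pred))
    # Count how often each (actual, predicted) pair occurs.
    pair_counts = Counter(zip(y_true, y_pred))
    # Rows are actual classes, columns are predicted classes.
    table = [[pair_counts[(t, p)] for p in classes] for t in classes]
    return classes, table

classes, table = contingency_table([2, 0, 2, 2, 0, 1], [0, 0, 2, 2, 0, 2])
print(classes)
for row in table:
    print(row)
```

The diagonal holds the correct predictions; any off-diagonal entry shows which pair of classes was mixed up.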
Both sklearn and TensorFlow provide an API for computing it.
sklearn.metrics.confusion_matrix(y_true, y_pred, labels=None, sample_weight=None)
Compute confusion matrix to evaluate the accuracy of a classification.
By definition, a confusion matrix $C$ is such that $C_{i,j}$ is equal to the number of observations known to be in group $i$ but predicted to be in group $j$.
Thus in binary classification, the count of true negatives is $C_{0,0}$, false negatives is $C_{1,0}$, true positives is $C_{1,1}$ and false positives is $C_{0,1}$.
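The definition can be sketched directly with NumPy. This is an illustration rather than how sklearn computes the matrix: the helper `confusion_from_definition` is hypothetical and assumes the labels are integers in `range(n_classes)`.

```python
import numpy as np

def confusion_from_definition(y_true, y_pred, n_classes):
    """Build C so that C[i, j] counts samples of true class i predicted as j.

    Hypothetical helper; assumes labels are integers in range(n_classes).
    """
    C = np.zeros((n_classes, n_classes), dtype=int)
    # np.add.at accumulates correctly even when an (i, j) pair repeats.
    np.add.at(C, (np.asarray(y_true), np.asarray(y_pred)), 1)
    return C

C = confusion_from_definition([0, 1, 0, 1], [1, 1, 1, 0], n_classes=2)
# Binary case: rows are true labels, columns are predicted labels.
tn, fp, fn, tp = C[0, 0], C[0, 1], C[1, 0], C[1, 1]
print(C)
print(tn, fp, fn, tp)
```

Reading the four cells by index this way matches the row-major order that `.ravel()` returns on a 2x2 matrix.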
y_true : array, shape = [n_samples]
Ground truth (correct) target values.
y_pred : array, shape = [n_samples]
Estimated targets as returned by a classifier.
labels : array, shape = [n_classes], optional
List of labels to index the matrix. This may be used to reorder or select a subset of labels. If None is given, those that appear at least once in y_true or y_pred are used in sorted order.
sample_weight : array-like of shape = [n_samples], optional
Sample weights.
C : array, shape = [n_classes, n_classes]
Confusion matrix.
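As a small sketch of the `sample_weight` parameter: with weights, each cell should hold the sum of the weights of the samples that fall into it instead of a plain count. The weight values below are made up for illustration.

```python
import numpy as np
from sklearn.metrics import confusion_matrix

y_true = [0, 1, 0, 1]
y_pred = [1, 1, 1, 0]
# Illustrative weights (made up for this sketch), one per sample.
weights = [0.5, 1.0, 1.5, 2.0]

unweighted = confusion_matrix(y_true, y_pred)
weighted = confusion_matrix(y_true, y_pred, sample_weight=weights)

print(unweighted)  # plain counts per (true, predicted) cell
print(weighted)    # summed weights per (true, predicted) cell
```

Here the two samples with true label 0 and predicted label 1 carry weights 0.5 and 1.5, so that cell sums to 2.0, while the single false negative carries weight 2.0.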
>>> from sklearn.metrics import confusion_matrix
>>> y_true = [2, 0, 2, 2, 0, 1]
>>> y_pred = [0, 0, 2, 2, 0, 2]
>>> confusion_matrix(y_true, y_pred)
array([[2, 0, 0],
       [0, 0, 1],
       [1, 0, 2]])
>>> y_true = ["cat", "ant", "cat", "cat", "ant", "bird"]
>>> y_pred = ["ant", "ant", "cat", "cat", "ant", "cat"]
>>> confusion_matrix(y_true, y_pred, labels=["ant", "bird", "cat"])
array([[2, 0, 0],
       [0, 0, 1],
       [1, 0, 2]])
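The `labels` argument can also reorder the rows and columns or restrict the matrix to a subset of classes. The sketch below reuses the same data; the particular label orders are chosen only for illustration.

```python
from sklearn.metrics import confusion_matrix

y_true = ["cat", "ant", "cat", "cat", "ant", "bird"]
y_pred = ["ant", "ant", "cat", "cat", "ant", "cat"]

# Reorder: rows and columns follow the order given in `labels`.
reordered = confusion_matrix(y_true, y_pred, labels=["cat", "bird", "ant"])
# Subset: samples whose true or predicted label is not listed are left out.
subset = confusion_matrix(y_true, y_pred, labels=["cat", "ant"])

print(reordered)
print(subset)
```

In the subset case the single "bird" sample does not appear anywhere in the 2x2 result, so the entries sum to 5 rather than 6.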
In the binary case, we can extract true positives, etc. as follows:
>>> tn, fp, fn, tp = confusion_matrix([0, 1, 0, 1], [1, 1, 1, 0]).ravel()
>>> (tn, fp, fn, tp)
(0, 2, 1, 1)
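Once the four counts are unpacked, common metrics follow from their standard definitions. A minimal sketch, assuming the denominators are non-zero (which holds for this data):

```python
from sklearn.metrics import confusion_matrix

tn, fp, fn, tp = confusion_matrix([0, 1, 0, 1], [1, 1, 1, 0]).ravel()

# Standard definitions; each assumes its denominator is non-zero.
precision = tp / (tp + fp)                   # share of predicted positives that are correct
recall = tp / (tp + fn)                      # share of actual positives that are found
accuracy = (tp + tn) / (tp + tn + fp + fn)   # share of all samples that are correct

print(precision, recall, accuracy)
```

For data where a denominator could be zero (for example, no predicted positives), a guard or sklearn's own `precision_score` and `recall_score` would be the safer route.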
https://en.wikipedia.org/wiki/Confusion_matrix