Common evaluation metrics for machine learning in Python

1. What TP, FP, FN and TN mean

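A predicted positive is written P (Positive).
A predicted negative is written N (Negative).
A prediction that matches the ground truth is written T (True).
A prediction that contradicts the ground truth is written F (False).
TP -- prediction and ground truth agree, and the prediction is positive (the true label is positive)
TN -- prediction and ground truth agree, and the prediction is negative (the true label is negative)
FP -- prediction and ground truth disagree, and the prediction is positive (the true label is negative)
FN -- prediction and ground truth disagree, and the prediction is negative (the true label is positive)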
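As a quick illustration of the naming rule above (first letter: does the prediction match the truth; second letter: what was predicted), here is a minimal sketch. The two label lists are made-up toy data, not taken from the images used later.

```python
# Hypothetical toy labels: 1 = positive, 0 = negative.
y_true = [1, 1, 0, 0, 1, 0]
y_pred = [1, 0, 0, 1, 1, 0]

counts = {"TP": 0, "TN": 0, "FP": 0, "FN": 0}
for actual, predicted in zip(y_true, y_pred):
    first = "T" if actual == predicted else "F"   # does the prediction match the truth?
    second = "P" if predicted == 1 else "N"       # what did the model predict?
    counts[first + second] += 1

print(counts)
```

The four counts always add up to the number of samples, which is a handy sanity check.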

2. Computing the four counts from two images (prediction and label)

import numpy as np
import math

def compute_pos_neg(y_actual, y_hat):
    # Count TP, FP, TN and FN for binary labels (1 = positive, 0 = negative).
    TP = 0; FP = 0; TN = 0; FN = 0
    for i in range(len(y_hat)):
        if y_actual[i] == y_hat[i] == 1:
            TP += 1
        if y_hat[i] == 1 and y_actual[i] != y_hat[i]:
            FP += 1
        if y_actual[i] == y_hat[i] == 0:
            TN += 1
        if y_hat[i] == 0 and y_actual[i] != y_hat[i]:
            FN += 1
    return TP, FP, TN, FN

def metrics(TP, FP, TN, FN):
    # Compute precision; the commented-out lines give MCC, F1 and recall.
    # The small constant in each denominator avoids division by zero.
    a = TP + FP
    b = TP + FN
    c = TN + FP
    d = TN + FN
    #mcc=((TP*TN)-(FP*FN))/(math.sqrt(float(a*b*c*d)+0.0001))
    #F1=(2*TP)/float(2*TP+FP+FN+.0000001)
    precision = TP / float(TP + FP + .0000001)
    #recall=TP/float(TP+FN+.0000001)
    return precision
    

import cv2
import numpy as np

# Predicted mask and ground-truth label for the same image.
y_t = cv2.imread("C:/Users/11549/Desktop/LSTM_for_iamge_forgeries/forgery_localization_HLED-master/test_data/test_result/pre/7.tif")
y_actual = cv2.imread('C:/Users/11549/Desktop/LSTM_for_iamge_forgeries/forgery_localization_HLED-master/test_data/test_result/label/7.tif')
cv2.imshow('_', y_t)
cv2.waitKey(0)

# cv2.imread returns three channels by default; keep the first one.
tmp_actual = y_actual[:, :, 0]
tmp_t = y_t[:, :, 0]

# Flatten and scale to 0/1 (this assumes strictly binary 0/255 masks).
y_true = np.reshape(tmp_actual, [-1]) / 255
y_pred = np.reshape(tmp_t, [-1]) / 255

tp, fp, tn, fn = compute_pos_neg(y_true, y_pred)

true_p_rate = tp / (tp + fn)    # TPR (recall)
false_p_rate = fp / (fp + tn)   # FPR
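The per-pixel Python loop above can be slow on full-size images. A vectorized variant might look like the sketch below. `count_pos_neg` is a hypothetical helper name, the two 4x4 masks are synthetic stand-ins for the real image files, and thresholding at 127 is an assumption for masks that are not exactly 0/255.

```python
import numpy as np

def count_pos_neg(y_true, y_pred):
    """Return (TP, FP, TN, FN) for two boolean arrays of equal shape."""
    y_true = np.asarray(y_true, dtype=bool).ravel()
    y_pred = np.asarray(y_pred, dtype=bool).ravel()
    tp = int(np.count_nonzero(y_pred & y_true))
    fp = int(np.count_nonzero(y_pred & ~y_true))
    tn = int(np.count_nonzero(~y_pred & ~y_true))
    fn = int(np.count_nonzero(~y_pred & y_true))
    return tp, fp, tn, fn

# Synthetic 4x4 masks (assumption: 255 marks a positive pixel).
label = np.zeros((4, 4), dtype=np.uint8)
label[1:3, 1:3] = 255
pred = np.zeros((4, 4), dtype=np.uint8)
pred[1:3, 2:4] = 255

# Threshold at mid-gray so near-binary masks are handled as well.
tp, fp, tn, fn = count_pos_neg(label > 127, pred > 127)
print(tp, fp, tn, fn)
```

The return order matches `compute_pos_neg`, so the two can be swapped without touching the calling code.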

3. Plotting the ROC curve

# coding=UTF-8
from sklearn import metrics
import matplotlib.pyplot as plt
import numpy as np
import cv2

# Ground-truth mask: scale to [0, 1], then floor so that only pixels equal to 255 become 1.
img = cv2.imread('test_roc/0.tif')
img1=img[:,:,0]/255
img11=img1.reshape(1, -1)
img3=np.floor(img11)
img2=img3[0].tolist()

# cv2.imshow("img_win_name", pre1)
# cv2.waitKey(0)   

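# Predicted probabilities; the indexing assumes shape (batch, height, width, classes)
# with channel 1 holding the positive-class probability.
pre=np.load('test_roc/final_probabilities.npy')
pre1=pre[0,:,:,]
pre1=pre1[:,:,1]
pre11=pre1.reshape(1, -1)
pre2=pre11[0].tolist()


# Ground truth
GTlist = img2
# Model predictions
Problist = pre2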
 
fpr, tpr, thresholds = metrics.roc_curve(GTlist, Problist, pos_label=1)
roc_auc = metrics.auc(fpr, tpr)  # AUC is the area under the ROC curve
print(roc_auc)
 
 
plt.plot(fpr, tpr, 'b',label='AUC = %0.2f'% roc_auc)
plt.legend(loc='lower right')
# plt.plot([0, 1], [0, 1], 'r--')
plt.xlim([-0.1, 1.1])
plt.ylim([-0.1, 1.1])
plt.xlabel('False Positive Rate')  # x-axis: FPR
plt.ylabel('True Positive Rate')   # y-axis: TPR
plt.title('Receiver operating characteristic example')
plt.show()
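To make the threshold sweep behind an ROC curve concrete, here is a minimal NumPy sketch that builds the curve by hand. `manual_roc` and `trapezoid_auc` are hypothetical helpers, not part of scikit-learn, and the sketch assumes there are no tied scores and that both classes are present. The four labels and scores are toy data.

```python
import numpy as np

def manual_roc(y_true, scores):
    """Lower the threshold one sample at a time (highest score first)
    and record the resulting (FPR, TPR) points."""
    y_true = np.asarray(y_true)
    order = np.argsort(-np.asarray(scores, dtype=float))
    y_sorted = y_true[order]
    tps = np.cumsum(y_sorted == 1)   # true positives accepted so far
    fps = np.cumsum(y_sorted == 0)   # false positives accepted so far
    tpr = np.concatenate(([0.0], tps / tps[-1]))
    fpr = np.concatenate(([0.0], fps / fps[-1]))
    return fpr, tpr

def trapezoid_auc(fpr, tpr):
    """Area under the curve by the trapezoidal rule."""
    return float(np.sum(np.diff(fpr) * (tpr[1:] + tpr[:-1]) / 2.0))

y_true = [0, 0, 1, 1]
scores = [0.1, 0.4, 0.35, 0.8]
fpr, tpr = manual_roc(y_true, scores)
roc_auc = trapezoid_auc(fpr, tpr)
print(fpr, tpr, roc_auc)
```

Each step to the right on the curve is a negative sample crossing the threshold, and each step up is a positive sample crossing it.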

Averaged ROC curve (FROC)

# coding=UTF-8
from sklearn import metrics
import matplotlib.pyplot as plt
import numpy as np
 
 
GTlist = [1.0, 1.0, 0.0, 1.0, 0.0, 1.0, 0.0, 1.0, 0.0, 1.0,
          0.0, 1.0, 0.0, 0.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0,
          0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0]
 
Problist = [0.99, 0.98, 0.97, 0.93, 0.85, 0.80, 0.79, 0.75, 0.70, 0.65,
           0.64, 0.63, 0.55, 0.54, 0.51, 0.49, 0.30, 0.2, 0.1, 0.09,
            0.1, 0.5, 0.6, 0.7, 0.8, 0.5, 0.2, 0.3, 0.2, 0.5]
 
# Number of images
totalNumberOfImages = 2
# Number of positive samples in the ground truth
numberOfDetectedLesions = sum(GTlist)
totalNumberOfCandidates = len(Problist)
fpr, tpr, thresholds = metrics.roc_curve(GTlist, Problist, pos_label=1)
# FROC: rescale FPR to the average number of false positives per image
fps = fpr * (totalNumberOfCandidates - numberOfDetectedLesions) / totalNumberOfImages
sens = tpr
 
# Sample the sensitivity at 0.125, 0.25, 0.5, 1, 2, 4 and 8 false positives per image
fps_itp = np.linspace(0.125, 8, num=10001)
sens_itp = np.interp(fps_itp, fps, sens)
frvvlu = 0
nxth = 0.125
for fp, ss in zip(fps_itp, sens_itp):
    if abs(fp - nxth) < 3e-4:
        print(ss)
        frvvlu += ss
        nxth *= 2
    if abs(nxth - 16) < 1e-5: break
# Mean sensitivity over the 7 sampled points
print(frvvlu / 7, nxth)
 
# Plot
plt.plot(fps, sens, color='b', lw=2, label='FROC')
plt.legend(loc='lower right')
# plt.plot([0, 1], [0, 1], 'r--')
plt.xlim([0.125, 8])
plt.ylim([0, 1.1])
plt.xlabel('Average number of false positives per scan')  # x-axis: average false positives per image
plt.ylabel('True Positive Rate')  # y-axis: TPR
plt.title('FROC performance')
plt.show()
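The sampling loop above searches a dense grid for the seven operating points. Since `np.interp` can evaluate those points directly, a shorter variant is possible. The sketch below is one way to do it; `froc_score` is a hypothetical helper and the `fps`/`sens` arrays are toy values, not the output of the script above.

```python
import numpy as np

def froc_score(fps, sens, fp_points=(0.125, 0.25, 0.5, 1, 2, 4, 8)):
    """Mean sensitivity at the given false-positives-per-image points.

    fps must be sorted in increasing order, as np.interp requires.
    """
    sens_at = np.interp(fp_points, fps, sens)
    return float(sens_at.mean()), sens_at

# Toy FROC curve (assumption: already sorted by false positives per image).
fps = np.array([0.0, 0.5, 2.0, 11.0])
sens = np.array([0.0, 0.4, 0.8, 1.0])

score, sens_at = froc_score(fps, sens)
print(sens_at, score)
```

Evaluating the interpolation at the exact points avoids depending on the grid spacing and the `3e-4` tolerance.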

https://blog.csdn.net/pursuit_zhangyu/article/details/89073567

https://blog.csdn.net/weixin_44022515/article/details/105434949?utm_medium=distribute.pc_relevant_t0.none-task-blog-BlogCommendFromMachineLearnPai2-1.add_param_isCf&depth_1-utm_source=distribute.pc_relevant_t0.none-task-blog-BlogCommendFromMachineLearnPai2-1.add_param_isCf 

4. Accuracy, F1, TPR, FPR and related metrics

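Precision is measured from the point of view of the predictions: of all samples predicted positive, what fraction are truly positive, i.e. how many of the hits are correct:

precision = TP / (TP + FP)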

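Recall and TPR (true positive rate, also called sensitivity) are the same quantity. Both are measured against the ground truth: of all actual positives, what fraction did the model recover, i.e. how complete the hits are:

recall = TPR = TP / (TP + FN)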
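A small worked example of the precision and recall formulas, written as a sketch. The counts are made up, and `safe_div` is a hypothetical helper that returns 0.0 for an empty denominator, which is one common convention rather than the only one.

```python
def safe_div(num, den):
    """Return num / den, or 0.0 when the denominator is zero."""
    return num / den if den else 0.0

def precision_recall(tp, fp, fn):
    precision = safe_div(tp, tp + fp)   # share of predicted positives that are correct
    recall = safe_div(tp, tp + fn)      # share of actual positives that were found
    return precision, recall

# Hypothetical counts: 10 predicted positives, 12 actual positives, 8 in common.
precision, recall = precision_recall(tp=8, fp=2, fn=4)
print(precision, recall)
```

Here the model is right on 8 of its 10 positive predictions, but finds only 8 of the 12 actual positives, so precision is higher than recall.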

FPR (false positive rate) is the fraction of actual negatives that are wrongly predicted positive. Lower is usually better:

FPR = FP / (FP + TN)
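A minimal sketch of the FPR formula with made-up counts. It also prints the complement, the true negative rate (specificity), which equals 1 - FPR. `false_positive_rate` is a hypothetical helper name.

```python
def false_positive_rate(fp, tn):
    """FP / (FP + TN); returns 0.0 when there are no actual negatives."""
    negatives = fp + tn
    return fp / negatives if negatives else 0.0

# Hypothetical counts: 30 actual negatives, 3 of them flagged as positive.
fpr = false_positive_rate(fp=3, tn=27)
specificity = 1.0 - fpr   # true negative rate, TN / (FP + TN)
print(fpr, specificity)
```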

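The F1 score is a metric for classification problems that treats recall and precision as equally important. Machine learning competitions on multi-class problems often use it as the final evaluation metric. It is the harmonic mean of precision and recall, with a maximum of 1 and a minimum of 0:

F1 = 2TP / (2TP + FP + FN)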
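The count form of F1 and the harmonic-mean form are the same number, which the sketch below checks on made-up counts. Accuracy, (TP + TN) / (TP + FP + TN + FN), is included because the section title mentions it; it is the standard definition and is not spelled out elsewhere in this post.

```python
# Hypothetical confusion-matrix counts.
tp, fp, tn, fn = 8, 2, 86, 4

precision = tp / (tp + fp)
recall = tp / (tp + fn)

# F1 as the harmonic mean of precision and recall.
f1_harmonic = 2 * precision * recall / (precision + recall)
# F1 written directly in terms of the counts.
f1_counts = 2 * tp / (2 * tp + fp + fn)

# Accuracy: share of all samples that are classified correctly.
accuracy = (tp + tn) / (tp + fp + tn + fn)

print(f1_harmonic, f1_counts, accuracy)
```

With 90 negatives out of 100 samples, accuracy comes out much higher than F1, which is why F1 is often preferred on imbalanced data.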


 

 

Reference: https://www.jianshu.com/p/57956d7dd55a

Reference 2: https://blog.csdn.net/qq_41092190/article/details/106872020
