The difference between r2_score and accuracy_score

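r2_score and accuracy_score are both evaluation functions in sklearn.metrics. r2_score is for regression problems and returns the coefficient of determination (R²) rather than an accuracy; accuracy_score is for classification problems. r2_score accepts continuous floating-point values, while accuracy_score expects discrete class labels, typically integers. Below are the labels and the predictions I got from a regression model: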

y_test>>
>>array([37.6, 27.9, 22.6, 13.8, 35.2, 10.4, 23.9, 29. , 22.8, 23.2, 33.2,
       19. , 20.3, 36.1, 24.4, 17.2, 17.9, 19.6, 19.7, 15. ,  8.1, 23. ,
       44.8, 23.1, 32.2, 10.8, 23.1, 21.2, 22.2, 24.1, 17.3,  7. , 12.7,
       17.8, 26.4, 19.6, 25.1,  8.3, 48.8, 34.9, 13.8, 14.4, 30.1, 12.7,
       27.1, 24.8,  7. , 20.5, 21.5, 14. , 20.4, 22.2, 21.4, 13.5, 19.4,
       24.7, 43.8, 14.1, 28.6, 19.7, 16.8, 23.2, 16.2, 41.3, 22.7,  8.3,
       18.4, 24.7, 21.7, 20.6, 16.7, 22.1, 19.4, 27.5, 27.9, 30.1, 17.4,
       15.4, 31. , 14.2, 19.6, 50. , 21.7, 11.7, 19.4, 13. , 17.5,  9.7,
       20.3, 18.6, 50. , 19.6, 21.4, 18.4, 22.6, 25. , 15.6, 26.6, 22.4,
       13.1, 23. , 24.5, 13.1, 50. ,  8.8, 20.6, 12.1, 50. , 24.1, 16.1,
       23.9, 24.3, 13.1, 30.3, 15.2, 13.8, 26.4, 16.6, 18.9, 17.6, 18.7,
       33.4, 20.7, 17.1, 23.4, 26.5, 21.4, 21.5, 19.2, 50. , 50. , 23. ,
       10.5, 17.8, 10.9, 21. , 13.8, 10.5, 22.2, 30.5, 19.4, 15.6, 20.2,
       19.3, 34.6, 50. , 24. , 18.7, 19.8, 22.5, 13.3, 50. , 11.8, 11. ,
       23.7, 35.4, 15.2, 24.4, 33.4, 31.6, 13.4, 34.9, 14.4, 35.4, 25.3,
       18.3, 16.6])
y_pred>>
>>array([42.8935   , 29.481709 , 22.72152  , 11.088902 , 36.746834 ,
        7.662948 , 25.651855 , 26.15493  , 22.378397 , 20.48603  ,
       31.907364 , 21.468454 , 19.578026 , 33.49744  , 24.67272  ,
       19.555578 , 11.753345 , 19.214586 , 18.709223 , 31.103544 ,
       15.263955 , 18.056026 , 46.764435 , 20.962772 , 28.798655 ,
       12.729027 , 24.205221 , 19.359875 , 23.571661 , 23.366013 ,
       14.890141 , 14.506024 , 12.925098 , 18.63654  , 23.8615   ,
       18.018988 , 24.455381 ,  9.127385 , 44.779827 , 30.45871  ,
       14.542292 , 12.877257 , 24.352776 , 19.731707 , 24.305807 ,
       29.125399 ,  6.7823567, 18.452114 , 19.873116 , 15.277843 ,
       20.32614  , 20.995363 , 23.444897 , 13.011057 , 15.695927 ,
       22.84157  , 40.82488  , 16.551765 , 28.262928 , 22.071127 ,
       20.19384  , 23.047424 , 15.8135805, 33.986725 , 20.49732  ,
        9.303558 , 17.742878 , 23.760605 , 22.339514 , 20.652113 ,
       17.204014 , 23.734058 , 19.895483 , 16.18355  , 27.209406 ,
       30.087381 , 23.134361 , 16.869543 , 27.65251  , 15.05091  ,
       20.549377 , 48.79272  , 19.200642 , 14.059941 , 19.002571 ,
       15.965145 , 19.86551  , 12.318311 , 18.471226 , 18.855583 ,
       45.140923 , 17.670074 , 19.20084  , 15.104687 , 21.808868 ,
       28.353735 , 15.893917 , 27.171299 , 23.067072 , 16.823427 ,
       20.375757 , 23.710732 , 14.467891 , 45.121735 ,  6.5880733,
       15.1293745, 11.600112 , 35.43435  , 20.81486  , 20.350136 ,
       26.358343 , 22.667938 , 12.507106 , 31.27376  , 17.66091  ,
       14.144321 , 27.607632 , 20.588734 , 19.055277 , 19.021236 ,
       21.626543 , 32.219128 , 23.16837  , 20.517244 , 22.340687 ,
       25.62456  , 18.89042  , 21.882483 , 19.37657  , 45.726486 ,
       45.302307 , 22.80424  , 10.673209 , 15.832766 , 11.838937 ,
       21.147217 , 15.90029  ,  7.2176995, 22.684954 , 32.22356  ,
       18.45632  , 16.48581  , 18.660404 , 19.969202 , 31.537455 ,
       46.797897 , 27.281012 , 17.965696 , 20.106264 , 25.843502 ,
       13.627041 , 36.923767 , 11.221506 , 11.055304 , 23.62714  ,
       33.77936  , 15.634852 , 23.593822 , 31.300879 , 31.799702 ,
       13.880334 , 29.288174 , 18.79593  , 34.537792 , 24.082203 ,
       19.168953 , 17.133886 ], dtype=float32)
print( r2_score(y_test, y_pred))>>
>>0.8818477865859173

Replacing r2_score directly with accuracy_score raises an error:

ValueError                                Traceback (most recent call last)
 in 
      8 y_pred = model.predict(X_test)
      9 
---> 10 print( accuracy_score(y_test, y_pred))

D:\Program Files\ANACONDA\lib\site-packages\sklearn\metrics\classification.py in accuracy_score(y_true, y_pred, normalize, sample_weight)
    174 
    175     # Compute accuracy for each possible representation
--> 176     y_type, y_true, y_pred = _check_targets(y_true, y_pred)
    177     check_consistent_length(y_true, y_pred, sample_weight)
    178     if y_type.startswith('multilabel'):

D:\Program Files\ANACONDA\lib\site-packages\sklearn\metrics\classification.py in _check_targets(y_true, y_pred)
     86     # No metrics support "multiclass-multioutput" format
     87     if (y_type not in ["binary", "multiclass", "multilabel-indicator"]):
---> 88         raise ValueError("{0} is not supported".format(y_type))
     89 
     90     if y_type in ["binary", "multiclass"]:

ValueError: continuous is not supported

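This is because accuracy_score only applies to classification problems: it expects discrete class labels, which are usually represented as integers. The non-integer float values here are detected as "continuous" targets and rejected.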
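The word "continuous" in the error message is the target type that scikit-learn inferred from the arrays. Here is a minimal sketch of that check, assuming a scikit-learn version where type_of_target can be imported from sklearn.utils.multiclass. The small array reuses the first five values of y_test above; the variable names are my own.

```python
import numpy as np
from sklearn.metrics import accuracy_score
from sklearn.utils.multiclass import type_of_target

# First five labels from y_test above, before and after rounding.
raw = np.array([37.6, 27.9, 22.6, 13.8, 35.2])
rounded = np.rint(raw)  # dtype is still float, but the values are whole numbers

raw_type = type_of_target(raw)          # non-integer floats
rounded_type = type_of_target(rounded)  # whole-number floats, more than two distinct values
print(raw_type, rounded_type)

# accuracy_score refuses continuous targets with a ValueError.
try:
    accuracy_score(raw, raw)
    raised = False
except ValueError as err:
    raised = True
    print("ValueError:", err)
```

Whole-number floats pass the check, which is why rounding is enough to make the call run even though the values are still stored as floats.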
To get past the error, round both arrays first by adding the following code:

y_tests=[round(value) for value in y_test]
predictions=[round(value) for value in y_pred]
print( accuracy_score(y_tests, predictions))

Result:

>>0.20359281437125748

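Why are the two results so different? Because y_tests and predictions have both been rounded to whole numbers, and accuracy_score only counts the positions where the two match exactly: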

y_tests>>
>>[38.0,
 28.0,
 23.0,
 14.0,
 35.0,
 10.0,
 24.0,
 29.0,
 23.0,
 23.0,
 33.0,
 19.0,
 20.0,
 36.0,
 24.0,
 17.0,
 18.0,
 20.0,
 20.0,
 15.0,
 8.0,
 23.0,
 45.0,
 23.0,
 32.0,
 11.0,
 23.0,
 21.0,
 22.0,
 24.0,
 17.0,
 7.0,
 13.0,
 18.0,
 26.0,
 20.0,
 25.0,
 8.0,
 49.0,
 35.0,
 14.0,
 14.0,
 30.0,
 13.0,
 27.0,
 25.0,
 7.0,
 20.0,
 22.0,
 14.0,
 20.0,
 22.0,
 21.0,
 14.0,
 19.0,
 25.0,
 44.0,
 14.0,
 29.0,
 20.0,
 17.0,
 23.0,
 16.0,
 41.0,
 23.0,
 8.0,
 18.0,
 25.0,
 22.0,
 21.0,
 17.0,
 22.0,
 19.0,
 28.0,
 28.0,
 30.0,
 17.0,
 15.0,
 31.0,
 14.0,
 20.0,
 50.0,
 22.0,
 12.0,
 19.0,
 13.0,
 18.0,
 10.0,
 20.0,
 19.0,
 50.0,
 20.0,
 21.0,
 18.0,
 23.0,
 25.0,
 16.0,
 27.0,
 22.0,
 13.0,
 23.0,
 24.0,
 13.0,
 50.0,
 9.0,
 21.0,
 12.0,
 50.0,
 24.0,
 16.0,
 24.0,
 24.0,
 13.0,
 30.0,
 15.0,
 14.0,
 26.0,
 17.0,
 19.0,
 18.0,
 19.0,
 33.0,
 21.0,
 17.0,
 23.0,
 26.0,
 21.0,
 22.0,
 19.0,
 50.0,
 50.0,
 23.0,
 10.0,
 18.0,
 11.0,
 21.0,
 14.0,
 10.0,
 22.0,
 30.0,
 19.0,
 16.0,
 20.0,
 19.0,
 35.0,
 50.0,
 24.0,
 19.0,
 20.0,
 22.0,
 13.0,
 50.0,
 12.0,
 11.0,
 24.0,
 35.0,
 15.0,
 24.0,
 33.0,
 32.0,
 13.0,
 35.0,
 14.0,
 35.0,
 25.0,
 18.0,
 17.0]
predictions>>
>>[43.0,
 29.0,
 23.0,
 11.0,
 37.0,
 8.0,
 26.0,
 26.0,
 22.0,
 20.0,
 32.0,
 21.0,
 20.0,
 33.0,
 25.0,
 20.0,
 12.0,
 19.0,
 19.0,
 31.0,
 15.0,
 18.0,
 47.0,
 21.0,
 29.0,
 13.0,
 24.0,
 19.0,
 24.0,
 23.0,
 15.0,
 15.0,
 13.0,
 19.0,
 24.0,
 18.0,
 24.0,
 9.0,
 45.0,
 30.0,
 15.0,
 13.0,
 24.0,
 20.0,
 24.0,
 29.0,
 7.0,
 18.0,
 20.0,
 15.0,
 20.0,
 21.0,
 23.0,
 13.0,
 16.0,
 23.0,
 41.0,
 17.0,
 28.0,
 22.0,
 20.0,
 23.0,
 16.0,
 34.0,
 20.0,
 9.0,
 18.0,
 24.0,
 22.0,
 21.0,
 17.0,
 24.0,
 20.0,
 16.0,
 27.0,
 30.0,
 23.0,
 17.0,
 28.0,
 15.0,
 21.0,
 49.0,
 19.0,
 14.0,
 19.0,
 16.0,
 20.0,
 12.0,
 18.0,
 19.0,
 45.0,
 18.0,
 19.0,
 15.0,
 22.0,
 28.0,
 16.0,
 27.0,
 23.0,
 17.0,
 20.0,
 24.0,
 14.0,
 45.0,
 7.0,
 15.0,
 12.0,
 35.0,
 21.0,
 20.0,
 26.0,
 23.0,
 13.0,
 31.0,
 18.0,
 14.0,
 28.0,
 21.0,
 19.0,
 19.0,
 22.0,
 32.0,
 23.0,
 21.0,
 22.0,
 26.0,
 19.0,
 22.0,
 19.0,
 46.0,
 45.0,
 23.0,
 11.0,
 16.0,
 12.0,
 21.0,
 16.0,
 7.0,
 23.0,
 32.0,
 18.0,
 16.0,
 19.0,
 20.0,
 32.0,
 47.0,
 27.0,
 18.0,
 20.0,
 26.0,
 14.0,
 37.0,
 11.0,
 11.0,
 24.0,
 34.0,
 16.0,
 24.0,
 31.0,
 32.0,
 14.0,
 29.0,
 19.0,
 35.0,
 24.0,
 19.0,
 17.0]
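
To make the gap concrete, here is a small sketch that uses only the first five values of y_test and y_pred shown earlier (the variable names are mine). In this sample only one of the five rounded pairs matches exactly, so the accuracy is 0.2, even though the predictions are numerically close to the labels.

```python
import numpy as np
from sklearn.metrics import accuracy_score, r2_score

# First five entries of y_test and y_pred from the output above.
y_true_small = np.array([37.6, 27.9, 22.6, 13.8, 35.2])
y_pred_small = np.array([42.8935, 29.481709, 22.72152, 11.088902, 36.746834])

# Round to the nearest whole number and cast to int so they act as class labels.
true_labels = np.rint(y_true_small).astype(int)
pred_labels = np.rint(y_pred_small).astype(int)

# accuracy_score only counts exact matches between the rounded values.
acc_small = accuracy_score(true_labels, pred_labels)
# r2_score rewards predictions that are close, even when they are not identical.
r2_small = r2_score(y_true_small, y_pred_small)

print(true_labels, pred_labels)
print("accuracy:", acc_small)
print("r2:", r2_small)
```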

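In classification it is common to convert predictions to integer labels, and each prediction is simply right or wrong. accuracy_score is the classification accuracy, that is, the fraction of samples classified correctly. A regression task instead needs to measure how far the predictions are from the true values, and that is what r2_score is for.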
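
Both definitions can be written out by hand. The sketch below uses the standard formulas (accuracy as the fraction of exact matches, R² as 1 minus the ratio of the residual sum of squares to the total sum of squares) and compares them with scikit-learn. The helper names and the toy arrays are hypothetical, chosen only for illustration.

```python
import numpy as np
from sklearn.metrics import accuracy_score, r2_score

def manual_accuracy(y_true, y_pred):
    """Fraction of positions where the predicted label equals the true label."""
    y_true, y_pred = np.asarray(y_true), np.asarray(y_pred)
    return float(np.mean(y_true == y_pred))

def manual_r2(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    ss_res = np.sum((y_true - y_pred) ** 2)         # squared prediction errors
    ss_tot = np.sum((y_true - y_true.mean()) ** 2)  # spread around the mean label
    return float(1.0 - ss_res / ss_tot)

# Toy classification labels: 4 of 5 predictions are correct.
labels_true = [0, 1, 1, 2, 0]
labels_pred = [0, 1, 2, 2, 0]
# Toy regression targets: no prediction is exact, but all are close.
values_true = [3.0, 5.0, 7.5, 10.0]
values_pred = [2.5, 5.5, 7.0, 11.0]

print(manual_accuracy(labels_true, labels_pred), accuracy_score(labels_true, labels_pred))
print(manual_r2(values_true, values_pred), r2_score(values_true, values_pred))
```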
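f1_score can also be used to compute the accuracy of a classification result: with the keyword argument average='micro' it gives the same value as accuracy_score for ordinary single-label classification. Here is what I got when I ran both on the output of another classification model (not shown here, to keep the post short):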

print( f1_score(y_test, y_pred, average='micro') )
print(accuracy_score(y_test, y_pred))
>>0.9473684210526315
0.9473684210526315
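
Why the two numbers agree: in single-label classification every misclassified sample counts once as a false positive (for the predicted class) and once as a false negative (for the true class), so micro-averaged precision, recall and F1 all reduce to the fraction of correct predictions. A small sketch with made-up labels, not the model output above:

```python
from sklearn.metrics import accuracy_score, f1_score

# Made-up three-class labels: 6 of the 8 predictions are correct.
y_true_cls = [0, 1, 2, 2, 1, 0, 2, 1]
y_pred_cls = [0, 2, 2, 2, 1, 0, 1, 1]

acc = accuracy_score(y_true_cls, y_pred_cls)
micro_f1 = f1_score(y_true_cls, y_pred_cls, average="micro")
# Macro averaging weights each class equally and in general gives a different number.
macro_f1 = f1_score(y_true_cls, y_pred_cls, average="macro")

print("accuracy:", acc)
print("micro F1:", micro_f1)
print("macro F1:", macro_f1)
```

The equality holds only for average='micro'; other averaging modes such as 'macro' or 'weighted' are separate metrics and should not be read as accuracy.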
