Python (R) Root Mean Square Error and Mean Absolute Error Mind Map

Key points

  1. Regression model evaluation metrics
  2. Evaluating a salary prediction model
  3. Evaluating an employee burnout rate model
  4. Evaluating a generative adversarial model for atmospheric analysis
  5. Tracking model error metrics with performance estimation algorithms when target values are missing
  6. Assessing the accuracy of downscaled atmospheric simulation models
  7. Evaluating protein-chromatin interaction models
[Figure: Python (R) RMSE/MAE mind map]

Python Regression Error Metrics

Mean absolute error (MAE) is the average of the absolute differences between the actual and predicted values in a dataset. It measures the average magnitude of the residuals.

MAE = \frac{1}{N} \sum_{i=1}^{N} \left| y_i - \hat{y}_i \right|

Method 1: Using the formula

actual = [2, 3, 5, 5, 9]
calculated = [3, 3, 8, 7, 6]

n = len(actual)
total_error = 0

# Accumulate the absolute differences between actual and predicted values
for i in range(n):
    total_error += abs(actual[i] - calculated[i])

error = total_error / n

# Display the result
print("Mean absolute error : " + str(error))

Method 2: Using sklearn

from sklearn.metrics import mean_absolute_error as mae 

actual = [2, 3, 5, 5, 9] 
calculated = [3, 3, 8, 7, 6] 

error = mae(actual, calculated) 

print("Mean absolute error : " + str(error)) 

Mean squared error (MSE) is the average of the squared differences between the original and predicted values in a dataset. It measures the variance of the residuals.

MSE = \frac{1}{N} \sum_{i=1}^{N} \left( y_i - \hat{y}_i \right)^2

Example: given the data points (1, 1), (2, 1), (3, 2), (4, 2), (5, 4) and the regression line Y = 0.7X - 0.1:
\begin{array}{|c|c|c|}
\hline X & Y & \hat{Y}_i \\
\hline 1 & 1 & 0.6 \\
\hline 2 & 1 & 1.29 \\
\hline 3 & 2 & 1.99 \\
\hline 4 & 2 & 2.69 \\
\hline 5 & 4 & 3.4 \\
\hline
\end{array}
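Substituting the table values into the MSE formula gives:

MSE = \frac{(1-0.6)^2 + (1-1.29)^2 + (2-1.99)^2 + (2-2.69)^2 + (4-3.4)^2}{5} = \frac{1.0803}{5} = 0.21606

which matches the output of both snippets below.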
Method 1:

from sklearn.metrics import mean_squared_error 

Y_true = [1,1,2,2,4] 

Y_pred = [0.6,1.29,1.99,2.69,3.4] 

mean_squared_error(Y_true,Y_pred) 

Output: 0.21606
Method 2:

import numpy as np

Y_true = [1, 1, 2, 2, 4]
Y_pred = [0.6, 1.29, 1.99, 2.69, 3.4]
MSE = np.square(np.subtract(Y_true, Y_pred)).mean()
print(MSE)

Output: 0.21606

Root mean square error (RMSE) is the square root of the mean squared error. It measures the standard deviation of the residuals.

RMSE = \sqrt{MSE} = \sqrt{\frac{1}{N} \sum_{i=1}^{N} \left( y_i - \hat{y}_i \right)^2}
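Before turning to Scikit-learn, the formula also translates directly into plain NumPy. This is a minimal sketch using the same sample arrays as the Scikit-learn example below.

import numpy as np

# Sample data; replace with your own actual and predicted values
y_true = np.array([3, -0.5, 2, 7])
y_pred = np.array([2.5, 0.0, 2, 8])

# RMSE straight from the definition: square root of the mean squared residual
rmse = np.sqrt(np.mean((y_true - y_pred) ** 2))
print(rmse)  # 0.6123724356957945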

Using Scikit-learn

from sklearn.metrics import mean_squared_error
import numpy as np
# Example arrays (replace with your data)
y_true = np.array([3, -0.5, 2, 7])
y_pred = np.array([2.5, 0.0, 2, 8])
mse = mean_squared_error(y_true, y_pred)
rmse = np.sqrt(mse)
print(f"Root Mean Square Error (RMSE): {rmse}")

Output: 0.6123724356957945
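As a version-dependent note: recent Scikit-learn releases (1.4 and later) also provide a dedicated root_mean_squared_error helper, which skips the manual square root. Check your installed version before relying on it.

from sklearn.metrics import root_mean_squared_error  # requires scikit-learn >= 1.4
import numpy as np

y_true = np.array([3, -0.5, 2, 7])
y_pred = np.array([2.5, 0.0, 2, 8])

# Equivalent to np.sqrt(mean_squared_error(y_true, y_pred))
rmse = root_mean_squared_error(y_true, y_pred)
print(rmse)  # 0.6123724356957945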

Evaluating a Regression Model

import numpy as np
import pandas as pd
from sklearn.datasets import fetch_openml
from sklearn.model_selection import train_test_split
from sklearn.linear_model import LinearRegression
from sklearn.metrics import mean_squared_error

boston = fetch_openml(data_id=531)
data = pd.DataFrame(boston.data, columns=boston.feature_names)
data['PRICE'] = boston.target

X = data.drop('PRICE', axis=1).values  # Convert to NumPy array
y = data['PRICE'].values  # Convert to NumPy array
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)

# Initialize linear regression model
model = LinearRegression()
model.fit(X_train, y_train)
y_pred = model.predict(X_test)

# Calculate RMSE (Root Mean Squared Error)
rmse = np.sqrt(mean_squared_error(y_test, y_pred))
print(f"Root Mean Squared Error: {rmse}")

Output: 4.928602182665333
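The other metrics covered in this section can be computed on the same test split. This is a sketch that assumes y_test and y_pred from the block above are still in scope.

from sklearn.metrics import mean_absolute_error, r2_score

# Assumes y_test and y_pred from the regression example above
mae = mean_absolute_error(y_test, y_pred)
r2 = r2_score(y_test, y_pred)
print(f"Mean Absolute Error: {mae}")
print(f"R squared: {r2}")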

The coefficient of determination, or R squared, is the proportion of the variance in the dependent variable that is explained by the linear regression model. It is a scale-free score: regardless of how small or large the values are, R squared never exceeds one.

R^2 = 1 - \frac{\sum \left( y_i - \hat{y}_i \right)^2}{\sum \left( y_i - \bar{y} \right)^2}
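As a minimal sketch of the formula itself, R squared can also be computed by hand from the residual and total sums of squares; the arrays here are the "worse model" case from the examples below.

import numpy as np

y = np.array([10, 20, 30])   # observed values
f = np.array([30, 10, 20])   # predicted values

# Residual sum of squares and total sum of squares
ss_res = np.sum((y - f) ** 2)
ss_tot = np.sum((y - y.mean()) ** 2)

r2 = 1 - ss_res / ss_tot
print(r2)  # -2.0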

from sklearn.metrics import r2_score

y = [10, 20, 30]
f = [10, 20, 30]
r2 = r2_score(y, f)
print('r2 score for perfect model is', r2)

Output: r2 score for perfect model is 1.0

y = [10, 20, 30]
f = [20, 20, 20]
r2 = r2_score(y, f)
print('r2 score for a model which predicts mean value always is', r2)

Output: r2 score for a model which predicts mean value always is 0.0

y = [10, 20, 30]
f = [30, 10, 20]
r2 = r2_score(y, f)
print('r2 score for a worse model is', r2)

Output: r2 score for a worse model is -2.0

A negative R squared means the model fits the data worse than simply predicting the mean of the observed values.

Updated by: 亚图跨际
