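The wine origin prediction task: given a wine's measured attributes, identify where it was produced.
Data: the wine dataset bundled with scikit-learn (loaded with load_wine).
There are 178 samples, each representing one bottle of wine, with 13 features such as color intensity and alcohol content.
Class labels: 3.
Load the data, predict the wine's origin with the Softmax regression algorithm, report the accuracy, and plot the ROC curve.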
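# Import the data
import numpy as np
from sklearn.datasets import load_wine
from sklearn.model_selection import train_test_split

# NOTE: the post does not show these two helpers; the versions below are assumptions.
def convert_to_vectors(c):
    # One-hot encode integer class labels into an (m, k) matrix
    m = len(c)
    k = np.max(c) + 1
    y = np.zeros((m, k))
    y[np.arange(m), c] = 1
    return y

def process_features(X):
    # Standardize each column, then prepend a bias column of ones
    X = (X - X.mean(axis=0)) / X.std(axis=0)
    return np.c_[np.ones(len(X)), X]

rwine = load_wine()  # load the wine dataset
# The data is a 178 x 13 matrix (m = 178 samples, n = 13 features)
X = rwine.data
c = rwine.target.astype(int)  # integer class labels 0, 1, 2 (np.int was removed from NumPy)
y = convert_to_vectors(c)     # one-hot labels, shape (178, 3)
print(rwine.feature_names)
print(rwine.target_names)
print(X.shape)
print(y.shape)
X_train, X_test, y_train, y_test, c_train, c_test = train_test_split(X, y, c, test_size=0.2)
X_train = process_features(X_train)
X_test = process_features(X_test)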
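1. Softmax regression algorithm
Logistic regression model: binary (2-class) classification.
Softmax regression model: k-class classification.
Softmax regression is an empirical loss minimization algorithm that uses the Softmax function as its model hypothesis and the k-class cross-entropy as its objective function.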
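For reference, the hypothesis and objective named above can be written out as follows. The notation is an assumption chosen to match the code in this post rather than anything the original states: W is the n x k weight matrix with columns w_j, y_{ij} are the one-hot label entries, and m is the number of samples. The last line is the per-sample gradient that a stochastic update would use.

```latex
h_W(x)_j = \frac{\exp(w_j^{\top} x)}{\sum_{l=1}^{k} \exp(w_l^{\top} x)}, \qquad j = 1, \dots, k

F(W) = -\frac{1}{m} \sum_{i=1}^{m} \sum_{j=1}^{k} y_{ij} \log h_W(x_i)_j

\nabla_W \Bigl( -\sum_{j=1}^{k} y_{ij} \log h_W(x_i)_j \Bigr) = x_i \bigl( h_W(x_i) - y_i \bigr)^{\top}
```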
import numpy as np

def softmax(scores):
    # Row-wise softmax: exponentiate, then normalize each row to sum to 1
    e = np.exp(scores)
    return e / e.sum(axis=1, keepdims=True)

class SoftmaxRegression:
    def fit(self, X, y, eta_0=50, eta_1=100, N=1000):
        # X: (m, n) features; y: (m, k) one-hot labels
        m, n = X.shape
        m, k = y.shape
        w = np.zeros((n, k))
        self.w = np.zeros((n, k))
        for t in range(N):
            # Stochastic gradient step on one randomly chosen sample
            i = np.random.randint(m)
            x = X[i].reshape(1, -1)
            proba = softmax(x.dot(w))
            g = x.T.dot(proba - y[i])
            # Step size decays as eta_0 / (t + eta_1)
            w = w - eta_0 / (t + eta_1) * g
            self.w += w
        # Use the average of the iterates as the final weights
        self.w /= N

    def predict_proba(self, X):
        return softmax(X.dot(self.w))

    def predict(self, X):
        proba = self.predict_proba(X)
        return np.argmax(proba, axis=1)
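The softmax function above exponentiates the raw scores directly, which can overflow when scores are large. A common variant, sketched here under the assumption that scores is a 2-D array with one row per sample, subtracts each row's maximum first. The name stable_softmax is hypothetical and is not part of the original post.

```python
import numpy as np

def stable_softmax(scores):
    # Subtract each row's maximum before exponentiating; softmax is unchanged
    # by a per-row shift, and the largest exponent becomes exp(0) = 1.
    shifted = scores - scores.max(axis=1, keepdims=True)
    e = np.exp(shifted)
    return e / e.sum(axis=1, keepdims=True)

# The first row would overflow np.exp without the shift
scores = np.array([[1000.0, 1001.0, 1002.0],
                   [1.0, 2.0, 3.0]])
proba = stable_softmax(scores)
print(proba.sum(axis=1))  # each row sums to 1
```

Because the two rows differ only by a constant shift, they produce the same probabilities, which is a quick way to check the function.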
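2. LogisticRegression algorithm
Logistic regression is solved here by gradient descent; its objective function (the cross-entropy) is convex. Working through the algebra, the gradient of the mean cross-entropy in matrix form is $\nabla F(w) = \frac{1}{m} X^{\top} (\sigma(Xw) - y)$, where $\sigma$ is the sigmoid function. The code that follows applies this gradient one randomly chosen sample at a time (stochastic gradient descent).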
import numpy as np

def sigmoid(scores):
    return 1 / (1 + np.exp(-scores))

class LogisticRegression:
    def fit(self, X, y, eta_0=10, eta_1=50, N=1000):
        # X: (m, n) features; y: (m, 1) labels in {0, 1}
        m, n = X.shape
        w = np.zeros((n, 1))
        self.w = np.zeros((n, 1))
        for t in range(N):
            # Stochastic gradient step on one randomly chosen sample
            i = np.random.randint(m)
            x = X[i].reshape(1, -1)
            pred = sigmoid(x.dot(w))
            g = x.T * (pred - y[i])
            # Step size decays as eta_0 / (t + eta_1)
            w = w - eta_0 / (t + eta_1) * g
            self.w += w
        # Use the average of the iterates as the final weights
        self.w /= N

    def predict_proba(self, X):
        return sigmoid(X.dot(self.w))

    def predict(self, X):
        proba = self.predict_proba(X)
        return (proba >= 0.5).astype(int)  # np.int was removed from NumPy
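The class above takes stochastic steps, one sample at a time. For comparison, the matrix-form gradient of the cross-entropy can also drive plain full-batch gradient descent. The following is a minimal sketch under stated assumptions: the function name fit_logistic_batch, the fixed step size eta, and the toy data are all hypothetical and do not come from the original post.

```python
import numpy as np

def sigmoid(scores):
    return 1 / (1 + np.exp(-scores))

def fit_logistic_batch(X, y, eta=0.1, N=1000):
    # X: (m, n) features, y: (m, 1) labels in {0, 1}
    m, n = X.shape
    w = np.zeros((n, 1))
    for _ in range(N):
        # Matrix-form gradient of the mean cross-entropy over all m samples
        grad = X.T.dot(sigmoid(X.dot(w)) - y) / m
        w = w - eta * grad
    return w

# Toy data: the label is 1 when the single feature is positive
rng = np.random.default_rng(0)
x = rng.normal(size=(200, 1))
X_demo = np.c_[np.ones(200), x]   # bias column plus the feature
y_demo = (x > 0).astype(int)      # shape (200, 1)
w_demo = fit_logistic_batch(X_demo, y_demo, eta=0.5, N=2000)
pred = (sigmoid(X_demo.dot(w_demo)) >= 0.5).astype(int)
print("training accuracy:", (pred == y_demo).mean())
```

Each full-batch step costs a pass over all samples, whereas the stochastic version in the post touches one sample per step and averages the iterates.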
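1. Call the custom Softmax regression algorithm to solve the wine prediction problem.
from sklearn.metrics import accuracy_score

model = SoftmaxRegression()
model.fit(X_train, y_train, eta_0=50, eta_1=100, N=5000)
c_pred = model.predict(X_test)  # predicted class indices
accuracy = accuracy_score(c_test, c_pred)
print("accuracy = {}".format(accuracy))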
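2. Call the LogisticRegression algorithm to solve the wine prediction problem. (The eta_0, eta_1 and N arguments belong to the custom LogisticRegression class defined above, not to sklearn's.)
# Call the LogisticRegression algorithm
# It is a binary classifier, so it is trained here on class 0 versus the rest;
# the choice of class 0 is an example, not something the original post specifies
model1 = LogisticRegression()
model1.fit(X_train, y_train[:, [0]], eta_0=10, eta_1=50, N=500)
proba = model1.predict_proba(X_test)
# roc is the author's plotting helper module; it is not shown in this post
roc.plot_roc_curve(proba, y_test[:, [0]])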
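The roc helper used above is not shown in the post. As one possible stand-in for a three-class problem, the ROC curve can be computed one class at a time (one-vs-rest) with sklearn.metrics.roc_curve and auc. This is a sketch, not a reproduction of the author's helper: the function name one_vs_rest_roc and the small hand-made arrays are assumptions for illustration.

```python
import numpy as np
from sklearn.metrics import roc_curve, auc

def one_vs_rest_roc(y_onehot, proba):
    # Returns {class index: (fpr, tpr, auc)}, treating each class as positive in turn
    curves = {}
    for j in range(y_onehot.shape[1]):
        fpr, tpr, _ = roc_curve(y_onehot[:, j], proba[:, j])
        curves[j] = (fpr, tpr, auc(fpr, tpr))
    return curves

# Tiny hand-made example: 4 samples, 3 classes
y_demo = np.array([[1, 0, 0], [0, 1, 0], [0, 0, 1], [1, 0, 0]])
p_demo = np.array([[0.8, 0.1, 0.1],
                   [0.2, 0.6, 0.2],
                   [0.1, 0.2, 0.7],
                   [0.5, 0.3, 0.2]])
curves = one_vs_rest_roc(y_demo, p_demo)
for j, (fpr, tpr, area) in curves.items():
    print("class {}: AUC = {:.2f}".format(j, area))
```

With the variables from this post, the call would be one_vs_rest_roc(y_test, model.predict_proba(X_test)), and each (fpr, tpr) pair could then be drawn with matplotlib's plt.plot.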
If you need the lab report and resources:
Think of it as buying me a popsicle!
https://download.csdn.net/download/m0_61504367/85153794