LDA and QDA

简单的R实现:

library(MASS)
Iris <- data.frame(rbind(iris3[,,1], iris3[,,2], iris3[,,3]), Sp = rep(c("s", "c", "v"), rep(50, 3)))
y_hat <- predict(lda(Sp~.,Iris, prior = c(1, 1, 1)/3),Iris)$class
sum(y_hat == Iris$Sp)/150

一般numpy中对于ndarray的处理对python内置list也是可以作用的。(构造函数实现了转型)
下面对sklearn中线性判别及二次判别进行了调用,并与DIY二次判别进行比较,
这里使用了python字符串执行及实例化函数exec及eval提高了DIY函数的可重用性:

from __future__ import division
import numpy as np 
from sklearn.datasets import load_iris 
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis, QuadraticDiscriminantAnalysis
from functools import partial

iris = load_iris()
data, target = iris.data, iris.target

X, y = data, target
clf = LinearDiscriminantAnalysis()
clf.fit(X, y)
y_hat = clf.predict(X)

print "the sklearn lda accuracy :"
print np.sum(y_hat == y) / y.shape[0]

clf = QuadraticDiscriminantAnalysis()
clf.fit(X, y)
y_hat = clf.predict(X)

print "the sklearn lda accuracy :"
print np.sum(y_hat == y) / y.shape[0]


def discrtminant_func(x ,X, prior_probability):
 cov_matrix = np.cov(X.T)
 mean_vector = np.mean(X, axis = 0)

 return np.dot(np.dot((x - mean_vector).T , np.linalg.inv(cov_matrix)), x - mean_vector) * (-0.5) + \
   np.log(2 * np.pi) * (-0.5 * X.shape[0]) + np.log(np.linalg.det(cov_matrix)) * (-0.5) + np.log(prior_probability)

factors = np.unique(y)
prior_probability = 1 / len(factors)
for i in range(len(factors)):
 exec("X" + str(factors[i]) + "= X[y == " + str(factors[i]) + ",]")
 exec("discrtminant_func" + str(factors[i]) + "=partial(discrtminant_func, X = X" + str(factors[i]) + ", prior_probability = " + str(prior_probability) + ")")

def y_predict_qda(x):
 return np.argmax([eval("discrtminant_func" + str(factor) + "(x)") for factor in factors])

print "the diy qda accuracy :"
print np.sum(np.array(map(y_predict_qda, X)) == y) / y.shape[0]



与分类相对应的是降维,其仅仅是组间与组内方差阵的相对特征值。


















你可能感兴趣的:(机器学习,Sklearn)