Notes on BPR (Bayesian Personalized Ranking)

I studied Bayesian Personalized Ranking (BPR) a couple of days ago, and this post is a summary of the key points. To be honest, "summary" mostly means writing down the formulas, explaining the ideas behind them, and walking through a Python implementation.
First, BPR is built on matrix factorization: like other factorization models, it decomposes the user-item matrix into the product of a user × k matrix and an item × k matrix. What sets BPR apart is that it frames the learning problem with Bayes' theorem:
$$P(\theta \mid >_u) = \frac{P(>_u \mid \theta)\,P(\theta)}{P(>_u)}$$
For each user, the observed preference data $>_u$ tells us how plausible each parameter setting $\theta$ is; that plausibility is the posterior $P(\theta \mid >_u)$, and our goal is to make it as large as possible, i.e. to find the maximum a posteriori estimate of $\theta$.
Two independence assumptions are made along the way: users are independent of one another, so a user's preferences are not affected by other users; and user $u$'s preference between items $i$ and $j$ is not affected by any other item.
In the formula, $>_u$ denotes user $u$'s preference ordering over the items (how much the user likes each item), and $\theta$ denotes the two parameter matrices of the factorization.
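As a quick illustration of what $\theta$ contains (a minimal sketch; the names W, H and the sizes below are made up for the example), the predicted score matrix is simply the product of the two factor matrices:

import numpy as np

rng = np.random.default_rng(0)
n_users, n_items, k = 4, 5, 3
W = rng.normal(scale=0.1, size=(n_users, k))   # user latent factors (one half of theta)
H = rng.normal(scale=0.1, size=(n_items, k))   # item latent factors (the other half)

X_hat = W @ H.T                                # predicted user-item score matrix
print(X_hat.shape)                             # (4, 5): one predicted score per (user, item) pair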
Next comes the simplification of this expression.
First, the denominator on the right-hand side does not depend on $\theta$, so it is the same for every candidate parameter setting and can be dropped for the purpose of maximization. The expression becomes:
$$P(>_u \mid \theta)\,P(\theta)$$
Next, the likelihood $P(>_u \mid \theta)$. Over all users it expands to
$$\prod_{u \in U} P(>_u \mid \theta) = \prod_{(u,i,j) \in U \times I \times I} P(i >_u j \mid \theta)^{\delta((u,i,j) \in D)} \left(1 - P(i >_u j \mid \theta)\right)^{\delta((u,i,j) \notin D)}$$

The indicator function $\delta(\cdot)$ returns 1 when the condition in the brackets holds and 0 otherwise. Since the training set $D$ contains exactly the triples in which $u$ prefers $i$ over $j$, the product simplifies to
$$\prod_{u \in U} P(>_u \mid \theta) = \prod_{(u,i,j) \in D} P(i >_u j \mid \theta)$$
Each factor $P(i >_u j \mid \theta)$ is then replaced by $\sigma(\bar{X}_{uij}(\theta))$. Why a sigmoid? Don't ask; the short answer is convenience: it maps any real-valued score difference into $(0, 1)$ and is easy to differentiate.
Here $\bar{X}_{uij}(\theta)$ is just $\bar{X}_{ui} - \bar{X}_{uj}$, the difference between user $u$'s predicted scores for items $i$ and $j$.
So the likelihood becomes
$$\prod_{u \in U} P(>_u \mid \theta) = \prod_{(u,i,j) \in D} \sigma(\bar{X}_{ui} - \bar{X}_{uj})$$
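As a sanity check of this modeling step (a toy sketch; the vectors below are made-up latent factors), the probability that $u$ prefers $i$ over $j$ is just the sigmoid of the score difference:

import numpy as np

def pair_probability(w_u, h_i, h_j):
    # P(i >_u j | theta) = sigma(x_ui - x_uj), with x_ui = w_u . h_i
    x_uij = np.dot(w_u, h_i) - np.dot(w_u, h_j)
    return 1.0 / (1.0 + np.exp(-x_uij))

w_u = np.array([0.2, 0.5, 0.1])   # user factors (k = 3)
h_i = np.array([0.4, 0.3, 0.6])   # factors of the preferred item i
h_j = np.array([0.1, 0.1, 0.2])   # factors of the other item j
print(pair_probability(w_u, h_i, h_j))   # about 0.55 > 0.5, since u scores i higher than j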
Next up is the prior $P(\theta)$.
To make the optimization convenient, the prior is chosen to be a zero-mean Gaussian with covariance matrix $\lambda_\theta I$, so that $\ln P(\theta)$ is, up to an additive constant, proportional to $-\lambda \lVert\theta\rVert^2$.
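This step can be verified numerically (a small demo; lam_theta and theta below are arbitrary values, and the usual regularization weight $\lambda$ simply absorbs the $1/(2\lambda_\theta)$ factor):

import numpy as np
from scipy.stats import multivariate_normal

lam_theta = 0.1
theta = np.array([0.3, -0.2, 0.5])   # a toy parameter vector
d = theta.size

# log-density of N(0, lam_theta * I) is a constant minus ||theta||^2 / (2 * lam_theta)
log_prior = multivariate_normal.logpdf(theta, mean=np.zeros(d), cov=lam_theta * np.eye(d))
const = -0.5 * d * np.log(2 * np.pi * lam_theta)
print(np.isclose(log_prior, const - theta @ theta / (2 * lam_theta)))   # True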
Putting everything together, the final objective is:
$$\ln P(\theta \mid >_u) \propto \ln\!\big(P(>_u \mid \theta)\,P(\theta)\big) = \ln \prod_{(u,i,j) \in D} \sigma(\bar{X}_{ui} - \bar{X}_{uj}) + \ln P(\theta) = \sum_{(u,i,j) \in D} \ln \sigma(\bar{X}_{ui} - \bar{X}_{uj}) - \lambda \lVert\theta\rVert^2$$
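Written as code, the quantity to maximize looks like this (a minimal sketch; W, H, triples and lam are hypothetical names, with W and H together playing the role of $\theta$):

import numpy as np

def bpr_objective(W, H, triples, lam):
    # sum of ln sigma(x_ui - x_uj) over the training triples, minus the L2 penalty
    total = 0.0
    for u, i, j in triples:                  # (user, preferred item i, other item j), 0-indexed
        x_uij = W[u] @ H[i] - W[u] @ H[j]
        total += -np.log1p(np.exp(-x_uij))   # ln sigma(x_uij), written in a numerically stable form
    return total - lam * (np.sum(W ** 2) + np.sum(H ** 2))

rng = np.random.default_rng(0)
W = rng.normal(scale=0.1, size=(4, 3))       # 4 users, k = 3
H = rng.normal(scale=0.1, size=(5, 3))       # 5 items
print(bpr_objective(W, H, [(0, 1, 3), (2, 0, 4)], lam=0.01))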
Maximizing it by stochastic gradient ascent (equivalently, gradient descent on the negated objective), we differentiate with respect to $\theta$:
$$\frac{\partial \ln P(\theta \mid >_u)}{\partial \theta} \propto \sum_{(u,i,j) \in D} \frac{1}{1 + e^{\bar{X}_{ui} - \bar{X}_{uj}}}\,\frac{\partial (\bar{X}_{ui} - \bar{X}_{uj})}{\partial \theta} - \lambda \theta$$
And since the scores come from the factorization,

$$\bar{X}_{ui} - \bar{X}_{uj} = \sum_{f=1}^{k} W_{uf} h_{if} - \sum_{f=1}^{k} W_{uf} h_{jf}$$

we get:
$$\frac{\partial (\bar{X}_{ui} - \bar{X}_{uj})}{\partial \theta} = \begin{cases} h_{if} - h_{jf} & \text{if } \theta = W_{uf} \\ W_{uf} & \text{if } \theta = h_{if} \\ -W_{uf} & \text{if } \theta = h_{jf} \end{cases}$$
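These three cases translate directly into one stochastic ascent step per sampled triple $(u, i, j)$. A minimal sketch (W holds the user factors, H the item factors; lr and reg are hypothetical hyperparameter names):

import numpy as np

def bpr_sgd_step(W, H, u, i, j, lr=0.01, reg=0.01):
    # one ascent step on ln sigma(x_ui - x_uj) - reg * ||theta||^2 for a single triple
    x_uij = W[u] @ H[i] - W[u] @ H[j]
    coeff = 1.0 / (1.0 + np.exp(x_uij))                 # the 1 / (1 + e^{x_uij}) factor above
    w_u = W[u].copy()                                   # cache before the in-place update
    W[u] += lr * (coeff * (H[i] - H[j]) - reg * W[u])   # theta = W_uf
    H[i] += lr * (coeff * w_u - reg * H[i])             # theta = h_if
    H[j] += lr * (-coeff * w_u - reg * H[j])            # theta = h_jf

The class below performs essentially the same update, with the sign folded into a single loss_func factor and an extra item bias term.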
Enough talk; here is the code:

# Implement BPR.
# Steffen Rendle, et al. BPR: Bayesian personalized ranking from implicit feedback.
# Proceedings of the twenty-fifth conference on uncertainty in artificial intelligence. AUAI, 2009. 
# @author Runlong Yu, Mingyue Cheng, Weibo Gao

import random
from collections import defaultdict
import numpy as np
from sklearn.metrics import roc_auc_score
import scores  # companion scores.py from the original repository (provides topK_scores)

class BPR:
    user_count = 943
    item_count = 1682
    latent_factors = 20
    lr = 0.01
    reg = 0.01
    train_count = 1000
    train_data_path = 'train.txt'
    test_data_path = 'test.txt'
    size_u_i = user_count * item_count
    # latent_factors of U & V
    U = np.random.rand(user_count, latent_factors) * 0.01
    V = np.random.rand(item_count, latent_factors) * 0.01
    biasV = np.random.rand(item_count) * 0.01
    test_data = np.zeros((user_count, item_count))
    test = np.zeros(size_u_i)
    predict_ = np.zeros(size_u_i)

    def load_data(self, path):
        user_ratings = defaultdict(set)  # defaultdict(set) returns an empty set for a missing key, so new users can be added without raising KeyError
        max_u_id = -1
        max_i_id = -1
        with open(path, 'r') as f:
            for line in f.readlines():
                u, i = line.split(" ")
                u = int(u)
                i = int(i)
                user_ratings[u].add(i)  # record that user u has interacted with item i (sets are unordered and ignore duplicates)
                max_u_id = max(u, max_u_id)
                max_i_id = max(i, max_i_id)
        return user_ratings

    def load_test_data(self, path):
        file = open(path, 'r')
        for line in file:
            line = line.split(' ')
            user = int(line[0])
            item = int(line[1])
            self.test_data[user - 1][item - 1] = 1

    def train(self, user_ratings_train):
        for user in range(self.user_count):
            # sample a user
            u = random.randint(1, self.user_count)
            if u not in user_ratings_train.keys():
                continue
            # sample a positive item from the observed items
            i = random.sample(list(user_ratings_train[u]), 1)[0]  # pick one observed item uniformly at random; list() keeps this working on Python 3.11+, where sampling from a set was removed
            # sample a negative item from the unobserved items
            j = random.randint(1, self.item_count)
            while j in user_ratings_train[u]:
                j = random.randint(1, self.item_count)
            u -= 1
            i -= 1
            j -= 1
            r_ui = np.dot(self.U[u], self.V[i].T) + self.biasV[i]
            r_uj = np.dot(self.U[u], self.V[j].T) + self.biasV[j]
            r_uij = r_ui - r_uj
            loss_func = -1.0 / (1 + np.exp(r_uij))  # = sigma(r_uij) - 1, the derivative of -ln(sigma(r_uij)) w.r.t. r_uij
            # update U and V
            self.U[u] += -self.lr * (loss_func * (self.V[i] - self.V[j]) + self.reg * self.U[u])
            self.V[i] += -self.lr * (loss_func * self.U[u] + self.reg * self.V[i])
            self.V[j] += -self.lr * (loss_func * (-self.U[u]) + self.reg * self.V[j])
            # update biasV
            self.biasV[i] += -self.lr * (loss_func + self.reg * self.biasV[i])
            self.biasV[j] += -self.lr * (-loss_func + self.reg * self.biasV[j])

    def predict(self, user, item):
        predict = np.mat(user) * np.mat(item.T)
        return predict

    def main(self):
        user_ratings_train = self.load_data(self.train_data_path)
        self.load_test_data(self.test_data_path)
        for u in range(self.user_count):
            for item in range(self.item_count):
                if int(self.test_data[u][item]) == 1:
                    self.test[u * self.item_count + item] = 1
                else:
                    self.test[u * self.item_count + item] = 0
        # training
        for i in range(self.train_count):
            self.train(user_ratings_train)
        predict_matrix = self.predict(self.U, self.V)
        # prediction
        # getA() converts the np.matrix result back to a plain ndarray, and reshape(-1) flattens it to one dimension so it lines up with self.test
        self.predict_ = predict_matrix.getA().reshape(-1)
        self.predict_ = pre_handel(user_ratings_train, self.predict_, self.item_count)
        auc_score = roc_auc_score(self.test, self.predict_)
        print('AUC:', auc_score)  # AUC = area under the ROC curve
        # Top-K evaluation
        str(scores.topK_scores(self.test, self.predict_, 5, self.user_count, self.item_count))

def pre_handel(user_ratings, predict, item_count):
    # Ensure the recommendation cannot be positive items in the training set.
    for u in user_ratings.keys():
        for j in user_ratings[u]:
            predict[(u - 1) * item_count + j - 1] = 0
    return predict

if __name__ == '__main__':
    bpr = BPR()
    bpr.main()
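A note on the data: the hard-coded user_count = 943 and item_count = 1682 match the MovieLens 100K dataset, and load_data / load_test_data expect one space-separated, 1-indexed "user item" pair per line. A hypothetical smoke test (the pairs below are made up, and the companion scores.py from the original repository must be importable):

# Write a few toy interactions in the expected "user item" format (1-indexed).
train_pairs = [(1, 10), (1, 25), (2, 10), (3, 7)]
test_pairs = [(1, 7), (2, 25)]

with open('train.txt', 'w') as f:
    f.writelines(f"{u} {i}\n" for u, i in train_pairs)
with open('test.txt', 'w') as f:
    f.writelines(f"{u} {i}\n" for u, i in test_pairs)

# Then run the listing above (saved, say, as bpr.py):  python bpr.py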
