Paper Reading and Model Walkthrough: UltraGCN: Ultra Simplification of Graph Convolutional Networks for Recommendation

  1. Paper reading: UltraGCN: Ultra Simplification of Graph Convolutional Networks for Recommendation
  2. Annotated walkthrough of the core code

 

In my view, the paper's main contribution is approximating GCN with a few constraint losses: a main model loss, a constraint loss derived from the user-item graph, and an item-item constraint loss.
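Concretely, the overall training objective assembled in forward() below is

$$\mathcal{L} = \mathcal{L}_{O+C} + \gamma \cdot \frac{1}{2}\|\Theta\|_2^2 + \lambda \cdot \mathcal{L}_I$$

where $\mathcal{L}_{O+C}$ is the weighted BCE on the user-item graph (the main loss $\mathcal{L}_O$ and the constraint loss $\mathcal{L}_C$ merged into one term), $\mathcal{L}_I$ is the item-item constraint loss, $\gamma$ is the L2 regularization strength, and $\lambda$ weights the item-item loss.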

Annotated walkthrough of the UltraGCN model code:

import torch
import torch.nn as nn
import torch.nn.functional as F


class UltraGCN(nn.Module):
    def __init__(self, params, constraint_mat, ii_constraint_mat, ii_neighbor_mat):
        super(UltraGCN, self).__init__()
        self.user_num = params['user_num']
        self.item_num = params['item_num']
        self.embedding_dim = params['embedding_dim']
        # The four weights w1..w4 are fixed hyperparameters, not learnable;
        # the author exposes them here to make tuning easier
        self.w1 = params['w1']
        self.w2 = params['w2']
        self.w3 = params['w3']
        self.w4 = params['w4']
        self.negative_weight = params['negative_weight']
        self.gamma = params['gamma']
        self.lambda_ = params['lambda']
        self.user_embeds = nn.Embedding(self.user_num, self.embedding_dim)
        self.item_embeds = nn.Embedding(self.item_num, self.embedding_dim)
        self.constraint_mat = constraint_mat
        self.ii_constraint_mat = ii_constraint_mat
        self.ii_neighbor_mat = ii_neighbor_mat
        self.initial_weight = params['initial_weight']
        self.initial_weights()
    def initial_weights(self):
        nn.init.normal_(self.user_embeds.weight, std=self.initial_weight)
        nn.init.normal_(self.item_embeds.weight, std=self.initial_weight)
    # Combine the precomputed beta_ui coefficients with the preset w1..w4 into the
    # per-sample weights (omegas) used in the loss; see cal_loss_L for how they are consumed
    def get_omegas(self, users, pos_items, neg_items):
        device = self.get_device()
        if self.w2 > 0:
            # multiplying the precomputed per-user and per-item degree terms gives beta_ui
            pos_weight = torch.mul(self.constraint_mat['beta_uD'][users], self.constraint_mat['beta_iD'][pos_items]).to(
                device)
            pos_weight = self.w1 + self.w2 * pos_weight
        else:
            pos_weight = self.w1 * torch.ones(len(pos_items)).to(device)
        if self.w4 > 0:
            neg_weight = torch.mul(torch.repeat_interleave(self.constraint_mat['beta_uD'][users], neg_items.size(1)),
                                   self.constraint_mat['beta_iD'][neg_items.flatten()]).to(device)
            neg_weight = self.w3 + self.w4 * neg_weight
        else:
            neg_weight = self.w3 * torch.ones(neg_items.size(0) * neg_items.size(1)).to(device)

        # shape: (batch_size + batch_size * negative_num,)
        weight = torch.cat((pos_weight, neg_weight))
        return weight
    # Loss on the user-item graph: the main loss L_O and constraint loss L_C as one weighted BCE
    def cal_loss_L(self, users, pos_items, neg_items, omega_weight):
        device = self.get_device()
        # Look up user and item embeddings via nn.Embedding
        user_embeds = self.user_embeds(users)
        pos_embeds = self.item_embeds(pos_items)
        neg_embeds = self.item_embeds(neg_items)
        # Dot product between each user and its positive item, as in the loss
        pos_scores = (user_embeds * pos_embeds).sum(dim=-1)  # batch_size
        user_embeds = user_embeds.unsqueeze(1)
        # Dot product between each user and its sampled negative items
        neg_scores = (user_embeds * neg_embeds).sum(dim=-1)  # batch_size * negative_num
        # The BCE below covers both L_C and L_O, since the two losses are computed on the
        # same (user, item) pairs. By Eqs. (12)-(14) in the paper, L = L_O + lambda * L_C
        # folds into a single weighted BCE with weights (1 + lambda*beta), which up to a
        # constant scale equals (1/lambda + beta); the code generalizes this to (w1 + w2*beta).
        # lambda controls the relative importance of the two losses.
        # omega_weight holds these per-sample BCE weights; see get_omegas.
        # L = -(w1 + w2*beta_ui) * log(sigmoid(e_u . e_i))
        #     - sum_{i' in N^-} (w3 + w4*beta_ui') * log(sigmoid(-e_u . e_i'))
        neg_labels = torch.zeros(neg_scores.size()).to(device)
        neg_loss = F.binary_cross_entropy_with_logits(neg_scores, neg_labels,
                                                      weight=omega_weight[len(pos_scores):].view(neg_scores.size()),
                                                      reduction='none').mean(dim=-1)
        pos_labels = torch.ones(pos_scores.size()).to(device)
        pos_loss = F.binary_cross_entropy_with_logits(pos_scores, pos_labels, weight=omega_weight[:len(pos_scores)],
                                                      reduction='none')
        # Combine the two parts; negative_weight rescales the averaged negative term
        loss = pos_loss + neg_loss * self.negative_weight
        return loss.sum()
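    # Worked shape example for the two methods above (illustrative numbers): with
    # batch_size = 32 and 5 sampled negatives per positive, pos_scores is (32,),
    # neg_scores is (32, 5), and omega_weight is (192,) -- the first 32 entries are
    # the positive weights, the remaining 160 the flattened negative weights.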
    # Item-item constraint loss L_I, computed on the item-item graph
    def cal_loss_I(self, users, pos_items):
        device = self.get_device()
        # Embeddings of each positive item's item neighbors, plus the user embeddings
        neighbor_embeds = self.item_embeds(
            self.ii_neighbor_mat[pos_items].to(device))  # len(pos_items) * num_neighbors * dim
        sim_scores = self.ii_constraint_mat[pos_items].to(device)  # len(pos_items) * num_neighbors
        user_embeds = self.user_embeds(users).unsqueeze(1)
        # L_I = -sum_j omega_ij * log(sigmoid(e_u . e_j)) over the neighbors j of each positive item
        loss = -sim_scores * (user_embeds * neighbor_embeds).sum(dim=-1).sigmoid().log()
        return loss.sum()
    # L2 regularization term over all parameters
    def norm_loss(self):
        loss = 0.0
        for parameter in self.parameters():
            loss += torch.sum(parameter ** 2)
        return loss / 2
    # Training: compute all the component losses and sum them up
    def forward(self, users, pos_items, neg_items):
        omega_weight = self.get_omegas(users, pos_items, neg_items)
        loss = self.cal_loss_L(users, pos_items, neg_items, omega_weight)
        loss += self.gamma * self.norm_loss()
        loss += self.lambda_ * self.cal_loss_I(users, pos_items)
        return loss
    # Inference: embed the user and all items, then score every item by dot product
    def test_foward(self, users):
        items = torch.arange(self.item_num).to(users.device)
        user_embeds = self.user_embeds(users)
        item_embeds = self.item_embeds(items)
        return user_embeds.mm(item_embeds.t())
    def get_device(self):
        return self.user_embeds.weight.device
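
A minimal smoke test of the class above (a sketch only: the params values, batch sizes, and the randomly generated constraint matrices are illustrative assumptions, not the paper's settings):

num_users, num_items, num_neighbors = 100, 200, 10
params = {
    'user_num': num_users, 'item_num': num_items, 'embedding_dim': 64,
    'w1': 1e-6, 'w2': 1.0, 'w3': 1e-6, 'w4': 1.0,
    'negative_weight': 10, 'gamma': 1e-4, 'lambda': 1.0, 'initial_weight': 1e-3,
}
# beta_uD / beta_iD would normally be derived from node degrees; random here
constraint_mat = {'beta_uD': torch.rand(num_users), 'beta_iD': torch.rand(num_items)}
ii_constraint_mat = torch.rand(num_items, num_neighbors)                   # weights of top item neighbors
ii_neighbor_mat = torch.randint(0, num_items, (num_items, num_neighbors))  # indices of top item neighbors

model = UltraGCN(params, constraint_mat, ii_constraint_mat, ii_neighbor_mat)
users = torch.randint(0, num_users, (32,))
pos_items = torch.randint(0, num_items, (32,))
neg_items = torch.randint(0, num_items, (32, 5))  # 5 sampled negatives per positive
print(model(users, pos_items, neg_items))         # scalar training loss
print(model.test_foward(users).shape)             # torch.Size([32, 200]) score matrix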

Results from reproducing the paper's code:

Amazon dataset:

[Figure: reproduced results on the Amazon dataset]

 

Gowalla dataset:

[Figure: reproduced results on the Gowalla dataset]

Movielens-1M dataset:

[Figure: reproduced results on the Movielens-1M dataset]

 

Summary and reflections:

UltraGCN uses two auxiliary losses to inject graph-structure information into MF, which to me feels like a model fusion of MF and item-based CF. After reproducing it, though, the improvement is only really large on the Amazon dataset; on the other datasets the gains are small. Why does this happen? I suspect it is related to its constraint loss functions.

[Figure: the paper's constraint loss]

Question: when analyzing the problems with LightGCN, the paper introduces the following formula:
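For reference, this should be the graph convolution with self-loops that the UltraGCN paper writes down in this analysis (its Eq. (2); I reproduce it here from my reading of the paper, so treat it as an approximation):

$$e_u^{(l+1)} = \frac{1}{d_u + 1}\, e_u^{(l)} + \sum_{k \in \mathcal{N}(u)} \frac{1}{\sqrt{(d_u + 1)(d_k + 1)}}\, e_k^{(l)}$$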

I do not think the paper explains where this formula comes from. The definition given in the LightGCN paper does not look like this, and even adding self-connections would not yield this form, so using this formula to motivate UltraGCN's fix for LightGCN's problems is, in my view, not well justified. For now, I see the contribution as approximating GCN with constraint losses, and nothing more. Given that the gains on the other datasets are modest, it is worth asking whether the proposed constraint losses can be improved further, and whether the overfitting caused by stacking many layers could be addressed with cross-layer connections, or by combining the original (layer-0) information with the previous layer's output via an inner product before feeding it to the next GCN layer, as sketched below.
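One way to read the cross-layer suggestion above, as a rough sketch (propagate stands for a single hypothetical graph-convolution step; none of this is part of UltraGCN):

def forward_with_residual(e0, propagate, num_layers):
    # e0: (num_nodes, dim) layer-0 embeddings; propagate: one graph-convolution step
    e = e0
    outputs = [e0]
    for _ in range(num_layers):
        e = propagate(e) + e0            # cross-layer connection back to the raw layer-0 signal
        # the inner-product variant would instead mix e0 in multiplicatively,
        # e.g. e = propagate(e) * e0 (element-wise), before the next layer
        outputs.append(e)
    return sum(outputs) / len(outputs)   # average the layer outputs, as LightGCN does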

Plan for next week:

Study applications of multi-task learning in GCN-based recommender systems.

Read the paper: Revisiting Graph based Collaborative Filtering: A Linear Residual Graph Convolutional Network Approach
