Paper: https://doi.org/10.1145/3488560.3498527
Venue: WSDM
Published: 2022
Authors and affiliations:
Datasets: described in the main text
Code:
Other:
Articles by others
Key contributions in brief: (1) Contrastive Meta Learning
- We propose a new multi-behavior learning paradigm, CML, for recommendation by emphasizing the importance of diverse and multiplex user-item relationships, as well as tackling the label scarcity problem for target behaviors.
- In our CML framework, we design a multi-behavior contrastive learning paradigm to capture the transferable user-item relationships from multi-typed user behavior data, which incorporates auxiliary supervision signals into the sparse target behavior modeling.
- Furthermore, our proposed meta contrastive encoding scheme allows CML to preserve the personalized multi-behavior characteristics, so that it reflects the diverse behavior-aware user preferences under a customized self-supervised framework.
• Information systems → Recommender systems.
Collaborative filtering, Self-Supervised Learning, Multi-Behavior Recommendation, Meta Learning, Graph Neural Network
(1) Recommender systems have emerged as critical components to alleviate information overload for users in various online applications, e.g., e-commerce [40], online video platforms [46] and social media [30]. The goal is to learn user preferences and forecast the items that he or she will consume based on observed user behaviors.
(2) Among various recommendation techniques,
(3) However, the majority of existing recommendation models assume that only a single type of interaction exists between user and item, whereas interactions in practical recommendation scenarios are multiplex in nature [12, 41].
(4) Despite the effectiveness of existing methods, these studies share two common limitations:
(5) Contributions.
(6) In a nutshell, this work makes the following contributions:
The studied task is formally stated as:
(1) To inject the high-order connectivity into the multiplex relation learning across users/items, we first develop a graph-based message passing framework with the awareness of behavior context.
Motivated by graph-based information propagation neural architecture [55] and the findings in the state-of-the-art model LightGCN [15, 22], our behavior-aware message passing scheme is built over a lightweight graph architecture, which can be represented as:
(2) After encoding the behavior-specific interaction patterns of users, we propose to perform the embedding aggregation across different types of behavior patterns with the following operation for user representations (similar aggregation is applied for the item side):
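The two steps described above (behavior-specific propagation, then aggregation across behavior types) can be sketched in plain Python. This is only an illustration under stated assumptions, not the paper's exact formulation: the symmetric degree normalization is borrowed from LightGCN, a plain mean stands in for the aggregation operator, and the names `propagate_users` and `aggregate_behaviors` are hypothetical.

```python
import math


def propagate_users(interactions, item_emb, num_users):
    """One propagation step for ONE behavior type (user side only).

    interactions: list of (user, item) pairs observed under this behavior.
    Assumption: LightGCN-style weights 1 / sqrt(deg(u) * deg(i)).
    """
    dim = len(item_emb[0])
    user_deg = [0] * num_users
    item_deg = [0] * len(item_emb)
    for u, i in interactions:
        user_deg[u] += 1
        item_deg[i] += 1

    out = [[0.0] * dim for _ in range(num_users)]
    for u, i in interactions:
        w = 1.0 / math.sqrt(user_deg[u] * item_deg[i])
        for d in range(dim):
            out[u][d] += w * item_emb[i][d]
    return out


def aggregate_behaviors(per_behavior_emb):
    """Fuse behavior-specific user embeddings.

    Assumption: an unweighted mean over behavior types.
    """
    num_behaviors = len(per_behavior_emb)
    num_users = len(per_behavior_emb[0])
    dim = len(per_behavior_emb[0][0])
    return [
        [sum(emb[u][d] for emb in per_behavior_emb) / num_behaviors
         for d in range(dim)]
        for u in range(num_users)
    ]


# Toy data: 2 users, 2 items, 2 behavior types (all values hypothetical).
item_emb = [[1.0, 0.0], [0.0, 1.0]]
behaviors = {
    "view": [(0, 0), (0, 1), (1, 1)],
    "purchase": [(0, 0)],
}
per_behavior = [propagate_users(pairs, item_emb, num_users=2)
                for pairs in behaviors.values()]
fused = aggregate_behaviors(per_behavior)
```

Keeping one embedding table per behavior before fusing is what later makes it possible to contrast the target-behavior view against each auxiliary view.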
(1) After establishing contrastive views from multi-behavior context, we further devise a behavior-wise contrastive learning paradigm between the target behaviors and auxiliary behaviors.
In particular,
Given the encoded target behavior representation $e^k_u$ from our graph neural architecture, the generated positive and negative pairs are $\{e^k_u, e^{k'}_u \mid u \in \mathcal{U}\}$ and $\{e^k_u, e^{k'}_{u'} \mid u, u' \in \mathcal{U}, u \neq u'\}$, respectively.
The incorporated auxiliary supervision enables our model to still recognize user $u$ from different behavior views (i.e., $k$ and $k'$; $k, k' \in K$) and to capture the latent relationships between the auxiliary behaviors and target behaviors.
Meanwhile, for different users $u$ and $u'$, the contrastive loss aims to discriminate their behavior embeddings after data augmentation.
(2) Following prior work [48, 58], we utilize the InfoNCE [29] loss in our multi-view contrastive learning framework to measure the distance between embeddings.
We define our self-supervised learning loss with the objective of maximizing the Mutual Information (MI) between user representations through contrasting positive pairs with the sampled negative pair counterparts. The InfoNCE-based contrastive loss is calculated as below:
(3) To sum up, we perform the contrastive learning by maximizing the agreement between two behavior views based on the above-defined contrastive loss, and by enforcing the divergence among different users.
We obtain the contrastive loss $\mathcal{L}^{k, k'}_{cl}$ for each pair of target behavior ($k$) and auxiliary behavior ($k'$).
Therefore, we generate a list of contrastive loss functions as:
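A minimal sketch of the behavior-wise InfoNCE loss described above, computed once per auxiliary behavior to produce the list of losses. Several details are assumptions made for illustration: cosine similarity as the similarity function, a temperature of 0.5, a denominator that includes the positive pair, and the helper names `cosine` and `info_nce`.

```python
import math


def cosine(a, b):
    """Cosine similarity with a small guard against zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / max(norm, 1e-12)


def info_nce(target_view, aux_view, temperature=0.5):
    """Contrast behavior k (target) against behavior k' (auxiliary).

    target_view[u] and aux_view[u] are the embeddings of the same user u
    under the two behaviors (positive pair); aux_view[v] with v != u
    supplies the negatives.
    """
    num_users = len(target_view)
    loss = 0.0
    for u in range(num_users):
        scores = [math.exp(cosine(target_view[u], aux_view[v]) / temperature)
                  for v in range(num_users)]
        # Assumption: the positive score also appears in the denominator.
        loss += -math.log(scores[u] / sum(scores))
    return loss


# Toy embeddings for 2 users (hypothetical values).
target = [[1.0, 0.0], [0.0, 1.0]]
auxiliary = {
    "view": [[1.0, 0.0], [0.0, 1.0]],   # agrees with the target view
    "cart": [[0.0, 1.0], [1.0, 0.0]],   # users swapped: disagrees
}
# One contrastive loss per (target, auxiliary) behavior pair.
losses = {name: info_nce(target, view) for name, view in auxiliary.items()}
```

In this toy case the view that agrees with the target behavior yields the smaller loss, which is the behavior the objective is meant to reward.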
To evaluate CML's performance, we conduct experiments on several real-world datasets by answering the following research questions:
We further compare our CML with two state-of-the-art heterogeneous graph neural networks, by applying them to capture the heterogeneous behavior relations in recommendation.
(1) We implement our CML with PyTorch.
The embedding initialization is performed with Xavier [14] and the model is optimized by adopting the AdamW optimizer [26] and the Cyclical Learning Rate (CyclicLR) strategy [35].
Specifically, the base and maximum learning rates are searched from $\{0.6e^{-4}, 1e^{-4}, 1e^{-3}\}$ and $\{0.6e^{-3}, 1e^{-3}, 2e^{-3}, 5e^{-3}\}$, respectively.
For all graph-based baselines, the number of graph-based message propagation layers is tuned from {1,2,3,4}.
We apply L2 regularization to the learned embeddings, with the weight tuned from $\{1e^{-3}, 5e^{-3}, 1e^{-2}\}$.
Additionally, to alleviate the overfitting issue, dropout is used in our designed meta network.
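To make the role of the base and maximum learning rates concrete, here is a sketch of the triangular cyclical learning rate policy. The policy variant and the `step_size` value are assumptions (the notes do not state them), and `triangular_lr` is a hypothetical helper; in PyTorch the same schedule is available through `torch.optim.lr_scheduler.CyclicLR`.

```python
import math


def triangular_lr(step, base_lr, max_lr, step_size):
    """Triangular cyclical learning rate.

    The rate climbs linearly from base_lr to max_lr over `step_size`
    iterations, then descends back over the next `step_size` iterations.
    """
    cycle = math.floor(1 + step / (2 * step_size))
    x = abs(step / step_size - 2 * cycle + 1)
    return base_lr + (max_lr - base_lr) * max(0.0, 1.0 - x)


# One value from each search grid above; step_size is a made-up example.
BASE_LR, MAX_LR, STEP_SIZE = 1e-4, 1e-3, 5
schedule = [triangular_lr(s, BASE_LR, MAX_LR, STEP_SIZE) for s in range(11)]
```

The schedule starts at the base rate, peaks at the maximum rate after `STEP_SIZE` iterations, and returns to the base rate after `2 * STEP_SIZE`.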
(2) We adopt the widely used leave-one-out strategy by generating the test set from users' last interacted items under the target behavior type (i.e., purchase/transaction).
Two representative evaluation metrics are used for performance comparison:
We also run our CML model and the best-performing baseline method 10 times to calculate p-values for significance analysis.
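The leave-one-out protocol described above can be sketched as follows. The record layout `(user, item, behavior, timestamp)`, the use of timestamps to decide which interaction is "last", and the function name `leave_one_out` are all assumptions for illustration.

```python
def leave_one_out(interactions, target_behavior):
    """Hold out each user's most recent target-behavior interaction.

    interactions: list of (user, item, behavior, timestamp) tuples.
    Returns (train, test) where test maps user -> held-out item.
    """
    last_index = {}
    for idx, (user, _item, behavior, ts) in enumerate(interactions):
        if behavior != target_behavior:
            continue
        prev = last_index.get(user)
        if prev is None or ts > interactions[prev][3]:
            last_index[user] = idx

    held_out = set(last_index.values())
    train = [rec for idx, rec in enumerate(interactions)
             if idx not in held_out]
    test = {interactions[idx][0]: interactions[idx][1] for idx in held_out}
    return train, test


# Hypothetical interaction log.
log = [
    (0, 10, "view", 1),
    (0, 10, "purchase", 2),
    (0, 11, "purchase", 5),
    (1, 12, "view", 3),
    (1, 12, "purchase", 4),
]
train, test = leave_one_out(log, target_behavior="purchase")
```

Auxiliary-behavior records (here, views) stay in the training set, since only the target behavior is evaluated.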
We present the detailed evaluation results of all methods on different datasets in Table 2, where the results of our CML and the best-performing baselines are highlighted in bold and underlined, respectively. Key observations are as follows:
To shed light on the performance improvement, we further conduct an ablation study of our CML to justify the rationality of the designed key components. Analysis details are summarized as:
(1) Interaction Data Sparsity (RQ3). In this section, we aim to show the rationality of bringing contrastive learning into multi-behavior recommendation, so as to alleviate the data sparsity issue.
(2) We have the following findings:
(1) In this paper, we develop a novel multi-behavior contrastive meta learning framework for recommendation.
(2) In this paper, we take the initial step to capture the diverse multi-behavior patterns of users for recommendation under the self-supervised learning paradigm.