Paper link: https://doi.org/10.1145/3488560.3498371
Venue: WSDM
Published: 2022
Authors and affiliations:
Dataset: described in the paper
Code:
Others:
Write-ups by others
Key innovations in brief: internal short-term graph + global long-term graph + temporal meta-learning based on MAML + DeepFM + multi-hop neighbor-similarity based loss
- The contributions of this work are summarized as follows:
- We first systematically address the practical challenges of jointly considering users' internal/external behaviors and short-/long-term preferences in recommendation via our newly proposed LSTTM framework. LSTTM is effective and easy to deploy in practical systems.
- We build two graphs focusing on different aspects to make full use of all internal/external behaviors. Moreover, we set customized GAT aggregators and training strategies to better learn user short-/long-term preferences.
- We design a novel temporal meta-learning method based on MAML, which enables fast adaptations to users' real-time preferences. To the best of our knowledge, we are the first to adopt temporal MAML in online recommendation.
CCS Concepts: • Information systems → Recommender systems.
Keywords: recommendation, temporal meta-learning, online recommendation
(1) Real-world industry-level recommendation systems usually need to interact with complicated practical scenarios.
Large-scale online recommendation systems on super-platforms such as Google and Amazon usually have the following two complexities:
(2) In this work, we attempt to design an effective and efficient online recommendation framework, which jointly considers both users' internal/external behaviors and users' short-/long-term preferences.
Such online recommendation mainly faces the following three challenges:
(3) To address these issues, we propose a novel Long Short-Term Temporal Meta-learning (LSTTM) framework for practical online recommendations.
(4) In experiments, we conduct an offline temporal CTR prediction with competitive baselines on a real-world recommendation system, and also conduct an online A/B test. The significant offline and online improvements show the effectiveness of LSTTM. Moreover, we also conduct an ablation study to better understand the effectiveness of different components.
(5) The contributions of this work are concluded in four points as follows:
(1) In real-world recommendation, Factorization machine (FM) [19], NFM [8], DeepFM [7], AutoInt [20], and DFN [29] are widely used to model feature interactions. User behaviors are one of the most essential features for learning user preferences. Lots of models [21, 28, 31, 34, 37] regard user behaviors as sequences to model user preferences via attention mechanisms and Transformers. Besides sequence-based models, graph-based models such as SR-GNN [26] and GCE-GNN [25] apply graph neural networks (GNNs) to user behavior graphs built from sessions.
(2) Both long-term and short-term preferences are essential in recommendation.
(1) Meta-learning aims to transfer meta knowledge so as to rapidly adapt to new tasks with only a few examples, which is regarded as "learning to learn" [24].
(2) In recommendation, meta-learning has also been verified in various cold-start scenarios, including cold-start users [10, 40], items [22, 39], cross-domain recommendation [4, 38], and model selection [15].
Besides MAML-based methods, Pan et al. [17] propose meta-embeddings for warm-up scenarios.
Different from these models, LSTTM designs a temporal MAML to accelerate model adaptation to users' short-term preferences.
To the best of our knowledge, we are the first to apply temporal MAML in recommendation.
(1) The internal short-term graph module models user short-term internal behaviors.
(2) Inspired by Veličković et al. [23], we build an enhanced GAT layer for short-term oriented node aggregation.
(3) Similarly, we also generate the sampled neighbor set of items as $N_{d_i} = \{u_{i, m'-K+1}, \dots, u_{i, m'}\}$. With the temporal neighbor set $N_{u_i}$, we build the user representation $u^k_i$ at the $k$-th layer via item embeddings at the $(k-1)$-th layer as follows:
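The aggregation equation itself is omitted in these notes; a plausible reconstruction, assuming the standard GAT-style attention-weighted sum with a nonlinearity $\sigma$ (the paper's exact form may differ):

$$u^k_i = \sigma\Big(\sum_{d_{i,j} \in N_{u_i}} \alpha^k_{ij} \, W^k_d \, d^{k-1}_{i,j}\Big)$$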
(4) $W^k_d$ is the weighting matrix, and $\alpha^k_{ij}$ represents the attention between $u_i$ and $d_{i,j}$ in this layer, which is formalized as:
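The attention equation is also omitted; since the layer is built on Veličković et al. [23], the standard GAT attention is the likely form (treat this as a reconstruction, not a quote):

$$\alpha^k_{ij} = \frac{\exp\big(\mathrm{LeakyReLU}(a^\top [W^k_d u^{k-1}_i \,\|\, W^k_d d^{k-1}_{i,j}])\big)}{\sum_{d_{i,t} \in N_{u_i}} \exp\big(\mathrm{LeakyReLU}(a^\top [W^k_d u^{k-1}_i \,\|\, W^k_d d^{k-1}_{i,t}])\big)}$$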
Note that the temporal neighbor set $N_{u_i}$ changes over time, since the internal short-term graph is a dynamic graph that is updated with users' new behaviors.
Finally, we conduct a two-layer temporal GAT to generate the user short-term representation $u^s_i = u^2_i$, which is fed into the next gating fusion module.
The aggregation of items is similar to that of users.
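To make the aggregation concrete, below is a minimal PyTorch sketch of one temporal GAT step over a user's $K$ most recent item neighbors. This is an illustration under the assumptions above, not the paper's released code; the names `TemporalGATLayer` and `sample_recent_neighbors` are hypothetical.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

def sample_recent_neighbors(clicked_items, K):
    """Temporal neighbor sampling: keep the K most recently clicked items
    (hypothetical helper; `clicked_items` is a time-ordered embedding tensor)."""
    return clicked_items[-K:]

class TemporalGATLayer(nn.Module):
    """One GAT aggregation step in the style of Velickovic et al. [23]."""
    def __init__(self, dim):
        super().__init__()
        self.W = nn.Linear(dim, dim, bias=False)    # weighting matrix W_d^k
        self.a = nn.Linear(2 * dim, 1, bias=False)  # attention vector a

    def forward(self, user_emb, neighbor_embs):
        # user_emb: (dim,); neighbor_embs: (K, dim), embeddings from layer k-1.
        h_u = self.W(user_emb)
        h_n = self.W(neighbor_embs)
        pair = torch.cat([h_u.expand_as(h_n), h_n], dim=-1)        # (K, 2*dim)
        alpha = F.softmax(F.leaky_relu(self.a(pair)).squeeze(-1), dim=0)
        return F.elu((alpha.unsqueeze(-1) * h_n).sum(dim=0))       # user repr u_i^k
```

Stacking two such layers and reading out $u^2_i$ yields the short-term representation $u^s_i$ described above.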
We use GAT since it is effective, efficient, and easy to deploy on billion-scale graphs. Other enhanced GNN models can also be conveniently adopted in this module.
(5) In the internal short-term graph, the temporal neighbor sampling highlights individual-level short-term preferences. We also propose a temporal meta-learning method to update this module, attempting to capture short-term preferences at the global level, which will be introduced in Sec. 3.5.
(1) The global long-term graph module aims to take advantage of all users' diverse preferences across multiple applications.
(2) We also conduct a two-layer GAT for neighbor aggregation, similar to Eq. (2) and Eq. (3), where the neighbor set $\bar{N}_{u_i}$ is randomly sampled or selected via certain importance scores.
The user long-term representation $\bar{u}^l_i = \bar{u}^2_i$ is also utilized in the gating fusion module. Since the overall behaviors are too enormous to be fully retrained online, and external behaviors are usually delayed and uncontrollable, we conduct an enhanced neighbor-similarity based loss to train this module asynchronously, as introduced in Sec. 3.5.
(3) Comparing the internal short-term graph modeling with the global long-term graph modeling, we can find three main differences:
(1) This module attempts to combine both the user short-term and long-term representations $u^s_i$ and $\bar{u}^l_i$ to generate the ranking score. We conduct a gating-based fusion to generate the final user representation $u_i$ via $u^s_i$ and $\bar{u}^l_i$ as follows:
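The fusion equation is omitted in these notes. A plausible reconstruction under the common gating convention (the paper may use a different exact form):

$$u_i = g(\cdot) \odot u^s_i + \big(1 - g(\cdot)\big) \odot \bar{u}^l_i$$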
$g(\cdot)$ indicates the gating function, which is measured via the corresponding user embeddings and the target item embedding $d^s_j$ ($d^s_j$ is a randomly initialized trainable item ID embedding), as:
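Again, the gate's equation is omitted; a hedged sketch, assuming a sigmoid over the concatenated embeddings with a hypothetical weight matrix $W_g$:

$$g(\cdot) = \sigma\big(W_g [u^s_i \,\|\, \bar{u}^l_i \,\|\, d^s_j]\big)$$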
(2) With this gating-based fusion, users get personalized weights on long-/short-term preferences for different items, which helps to improve performance.
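A minimal PyTorch sketch of the gating fusion under the same assumed equations (`GatingFusion` and `W_g` are hypothetical names, not from the paper):

```python
import torch
import torch.nn as nn

class GatingFusion(nn.Module):
    """Personalized gate between short-term and long-term user representations."""
    def __init__(self, dim):
        super().__init__()
        self.W_g = nn.Linear(3 * dim, dim)  # assumed gate parameters

    def forward(self, u_short, u_long, item_emb):
        # Gate conditioned on both user representations and the target item.
        g = torch.sigmoid(self.W_g(torch.cat([u_short, u_long, item_emb], dim=-1)))
        return g * u_short + (1 - g) * u_long  # final user representation u_i
```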
(3) After gating fusion, the final user representation $u_i$ is aggregated with the recommendation contexts $c$ and the target item embedding $d^s_j$, and then fed into the downstream neural ranking models.
We conduct a widely-used DeepFM [7] to model the feature field interactions between user, item and contexts as follows:
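The ranking formula is omitted in these notes; the standard DeepFM form, offered here as a reconstruction, combines the FM and DNN components over the concatenated features $x = [u_i \,\|\, d^s_j \,\|\, c]$:

$$p(i,j) = \sigma\big(y_{FM}(x) + y_{DNN}(x)\big)$$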
$p(i, j)$ is the click probability for $u_i$ and $d_j$. It is also easy to adopt other feature interaction models here.
The asynchronous optimization with temporal MAML is the key contribution of LSTTM. In practice, timely model updating is crucial in online recommendation, yet there are two challenges in real-world systems.
(1) To enhance LSTTM with the capability of fast adaptation to user short-term interests, we propose a novel temporal MAML training strategy based on [6] (Chelsea Finn, Pieter Abbeel, and Sergey Levine. 2017. Model-agnostic meta-learning for fast adaptation of deep networks. In Proceedings of ICML).
Different from conventional meta-learning based recommendations that usually consider each user or domain as a task, our temporal MAML regards recommendation in each time period as a task.
(2) Specifically, we first divide all training instances into different sets according to their time periods (e.g., we view each hour as one time period for practical demands).
(3) In training, we sample different temporal tasks containing training instances in different time periods to form a batch.
(4) Under the temporal MAML training framework, we conduct a classical cross-entropy loss $L_T$ with the click probability $p(i, j)$ of user $u_i$ and item $d_j$ on the positive set (clicked user-item instances) $S_p$ and negative set (unclicked user-item instances) $S_n$ as follows:
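The loss equation is omitted in these notes; the classical (pointwise) cross-entropy it describes reads:

$$L_T = -\sum_{(i,j) \in S_p} \log p(i,j) \;-\; \sum_{(i,j) \in S_n} \log\big(1 - p(i,j)\big)$$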
(5) Note that $L_T$ is only used for updating the internal short-term graph and the long-/short-term preference fusion modules via the temporal MAML, as in Fig. 2.
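As an illustration of the training loop, here is a simplified first-order sketch of one temporal-MAML update, where each task is one time period's data. Batch structure, helper names, and step sizes (`alpha`, `beta`) are assumptions; the production procedure surely differs in detail.

```python
import copy
import torch

def temporal_maml_step(model, tasks, loss_fn, alpha=0.01, beta=0.001):
    """One outer update of temporal MAML (after Finn et al. [6]).

    tasks: list of (support, query) batches, each pair drawn from one time
    period; loss_fn computes the cross-entropy L_T. Each batch is assumed
    to expose .x (features) and .y (click labels).
    """
    meta_grads = [torch.zeros_like(p) for p in model.parameters()]
    for support, query in tasks:
        fast = copy.deepcopy(model)  # task-specific copy of the shared init
        # Inner loop: one gradient step on this period's support set.
        inner_loss = loss_fn(fast(support.x), support.y)
        grads = torch.autograd.grad(inner_loss, fast.parameters())
        with torch.no_grad():
            for p, g in zip(fast.parameters(), grads):
                p -= alpha * g
        # Outer loss on the same period's query set (first-order approximation).
        outer_loss = loss_fn(fast(query.x), query.y)
        for mg, g in zip(meta_grads, torch.autograd.grad(outer_loss, fast.parameters())):
            mg += g
    with torch.no_grad():  # meta update of the shared initialization
        for p, mg in zip(model.parameters(), meta_grads):
            p -= beta * mg / len(tasks)
```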
(6) Motivations and advantages of temporal MAML. The motivations and advantages of the temporal MAML are summarized as follows:
(1) Differing from the internal short-term graph, the global long-term graph is too enormous to be fully retrained online, and its external behaviors are usually delayed and uncontrollable.
To make a compromise between efficiency, effectiveness, and robustness, we conduct a multi-hop neighbor-similarity based loss instead of the online temporal MAML.
(2) We assume that both users' and items' long-term representations $\bar{u}^l_i$ and $\bar{d}^l_j$ learned in Sec. 3.3 should be similar to their $k$-hop neighbors on the global long-term graph, an idea enhanced from [11] and [30].
(3) The multi-hop neighbor-similarity based loss focuses more on the global view of user and item representations learned from all long-term internal/external behaviors.
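A hedged sketch of one plausible instantiation of such a loss, pulling each node toward sampled $k$-hop neighbors and away from random negatives; the paper's exact sampling and weighting are not reproduced in these notes.

```python
import torch
import torch.nn.functional as F

def neighbor_similarity_loss(node_emb, pos_pairs, neg_pairs, margin=1.0):
    """Margin-based multi-hop neighbor-similarity loss L_N (illustrative).

    node_emb:  (N, dim) long-term embeddings of users and items.
    pos_pairs: (P, 2) indices of (node, sampled k-hop neighbor).
    neg_pairs: (P, 2) indices of (node, sampled negative node).
    """
    pos = F.cosine_similarity(node_emb[pos_pairs[:, 0]], node_emb[pos_pairs[:, 1]])
    neg = F.cosine_similarity(node_emb[neg_pairs[:, 0]], node_emb[neg_pairs[:, 1]])
    # Neighbors should score higher than negatives by at least `margin`.
    return F.relu(margin - pos + neg).mean()
```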
(4) The advantages of using the multi-hop neighbor-similarity based loss for the global long-term graph are as follows:
(1) The overall loss $L$ is the weighted aggregation of the two losses $L_T$ and $L_N$ as follows:
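The aggregation equation is omitted; a standard weighted form consistent with the description, with a hypothetical trade-off weight $\lambda$:

$$L = L_T + \lambda L_N$$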
(2) The advantages of our asynchronous optimization are listed as follows:
In this section, we conduct experiments to answer the following research questions:
(1) We implement several competitive baselines for evaluation. First, we conduct four widely-used ranking models as follows:
(2) These baselines use the same user features, internal behaviors, and contexts as LSTTM, and are optimized on the same training set with the cross-entropy loss.
(3) For fair comparisons, we also implement two enhanced DeepFM models armed with external behaviors and sequence modeling.
(4) Finally, since we conduct the temporal MAML for online updating, we also implement two SOTA meta-learning methods based on SML [35] in online news recommendation as follows:
(5) Note that we do not compare with other meta-learning recommendation methods such as MeLU [10], since they focus on different tasks (e.g., cold-start users or domains) and are not suitable for our temporal setting. To further verify the effectiveness of different components and features in LSTTM, we implement four ablation versions of LSTTM, whose results are discussed in Sec. 5.6.
We first simulate the real-world online recommendation and conduct the temporal CTR prediction task for offline evaluation.
From Table 2 we can observe that:
(1) LSTTM achieves significant improvements over all baselines in all three periods, with significance level $\alpha = 0.01$. It consistently outperforms strong baselines across all 24 hours (see Fig. 3). The deviation is less than ±0.002. Considering the large size of our test set, the 1.1%–1.7% AUC improvements over the best baseline are impressive and solid. This verifies the effectiveness and robustness of LSTTM in modeling both short-term and long-term preferences from users' internal and external behaviors.
(2) LSTTM (final) consistently outperforms LSTTM (w/o Meta) and SML on all tasks. It confirms the advantages of temporal MAML stated in Sec. 3.5.1. Thanks to the MAML-based training, LSTTM is more sensitive to new global trends in communities. Hence, it can better capture users' short-term preferences via good model initialization, and thus can quickly adapt to hot topics over time in online recommendation. Nevertheless, LSTTM (w/o Meta) still performs better than the baselines, which reflects the effectiveness of our global long-term and internal short-term graphs as well as the gating fusion. Sec. 5.6 gives more details of different ablation versions.
(3) We also find that models armed with external behaviors consistently outperform the same models without external behaviors (e.g., see LSTTM in Sec. 5.6 and DeepFM in Table 2). It verifies the importance of external behaviors in real-world scenarios, which works as a strong supplement to the internal behaviors. The external behaviors will be more significant in few-shot scenarios.
(4) Comparing models in different periods, we find that LSTTM achieves larger improvements in periods 2 and 3 compared to LSTTM (w/o Meta). This is because (a) users and hot topics are often more active in periods 2 and 3, where temporal MAML is superior to baselines in capturing user real-time preferences, and (b) in period 3, all models have not been fully retrained for at least 16 hours, and LSTTM has better online fine-tuning to catch up with new global interest evolutions. The cumulative effects of temporal MAML gradually show up over time with growing hot topics. Fig. 3 shows the hour-level AUC trends of four representative models.
Table 3 shows the improvement percentages over the base model. We can observe that:
(1) LSTTM achieves significant improvements on all metrics with significance level $\alpha = 0.01$. It reconfirms the effectiveness of LSTTM online. Through the asynchronous online updating with the temporal MAML, LSTTM can quickly adapt to users' real-time preferences.
(2) The improvement on CTR indicates that more appropriate items have been impressed to users (reflecting item-aspect accuracy), while the improvement on ACN represents that users are more willing to click items (reflecting user-aspect accuracy and activeness). HCR models the coverage of users that have clicked news, which implies the impacts of our recommendation function. LSTTM also outperforms the online baseline on dwell time of items, which reflects the real user satisfaction on the item contents. In conclusion, LSTTM achieves comprehensive improvements on all online metrics, which confirms the robustness of our model.
We further conduct an ablation test to verify the effectiveness of different components in LSTTM. Table 4 shows the results of different ablation settings. We observe that all components significantly benefit the recommendation. Specifically, we find that:
(1) In this work, we propose LSTTM for online recommendation, which jointly considers users' internal/external behaviors and short-/long-term preferences.
(2) In the future, we will polish the temporal MAML to build a more robust adaptation, and transfer the idea of temporal MAML to other temporal tasks. We will also explore enhanced combinations with other online learning and meta-learning methods.