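Paper download link: https://doi.org/10.1145/3292500.3330939
Venue: KDD
Publication year: 2019
Authors and affiliations:
Datasets: described in the main text
Code:
Other:
Articles written by others
Brief summary of innovations: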
- (1) Limited attention is very important to social recommendation, as it has been shown to have a significant impact on users’ online behaviours.
- (2) Therefore, we propose to incorporate limited attention, a well-studied social science notion, into social recommendation in an appropriate way.
- We first formulate the optimal limited attention problem, aiming to optimally bring the concept of limited attention into social recommendation.
- Then we develop a novel model which efficiently finds an optimal number of friends whose preferences have the best impact on the target user and adaptively learns an optimal personalized attention towards every selected friend, as well as the latent preference for each user.
- (3) We explain in detail our proposed OLA-Rec model, which incorporates the concept of optimal limited attention into social recommendation.
- (4) We incorporate the concept of optimal limited attention into social recommendation through combining the optimal $k^*$ and $\alpha^*$ with matrix factorization. Generally, we estimate user $i$’s rating on item $j$, $R_{ij}$, through the dot product of the social factor $\phi_i$ and item $j$’s latent feature vector $V_j$.
- (5) We then develop a novel algorithm that employs an EM-style strategy to jointly optimize users’ latent preferences, the optimal number of their best influential friends and the corresponding attentions.
- (6) Details
- We define users who have rated fewer than five items as cold-start users. Figure 5 depicts the performances of various methods on cold-start users.
- We remove users with fewer than 2 ratings and select 80% of each user’s ratings at random for training, leaving the remainder as the test set.
• Information systems → Social recommendation.
Recommendation, User Behavior Modeling, Limited Attention
(1) Being capable of efficiently filtering the exploding information on the Internet, recommender systems have become an indispensable tool for recommending relevant items that may be attractive to users. As a hot research topic, recommendation has received a lot of attention from both academia and industry [1, 28]. Nevertheless, traditional recommender systems suffer from data sparsity, which is caused by the fact that the number of items is normally very large while users commonly consume only a very small portion of these items. In addition, traditional recommendation approaches perform poorly on new users without any historical behaviours, resulting in the cold start problem. This motivates social recommendation, which utilizes social information from social connections (such as friends) to mitigate the above two problems [11, 19].
(2) Although there has been a lot of work on social recommendation, most of it ignores the attention factor: each individual can process only a small portion of information in real time due to her limited mental capacity [13, 27].
(3) To address the above problem in social recommendation, we borrow the idea of limited attention, a well-documented psychological and cognitive concept from social science that can affect user behaviours.
(4) Therefore, two challenges exist:
(5) To handle these two challenges, we elegantly combine social science concepts with machine learning techniques and formulate the problem of optimal limited attention in the context of social recommendation.
We then propose a novel social recommendation model capable of
To be more concrete,
Experiments on real-world datasets demonstrate the improvement of our proposed model over state-of-the-art approaches.
(6) To recapitulate, the highlight of this paper is that, inspired by sociological discoveries, we develop a model which combines social science concepts and mathematical formulations in an elegant way. We address the challenges raised in social science by means of machine learning techniques in the context of social recommendation. We believe our combination of machine learning with social science can help to achieve a performance boost in terms of social recommendation accuracy.
The contributions of this paper are summarized as follows.
(1) In recommender systems, we are given a set of users $U$ and a set of items $I$,
(2) A matrix factorization model assumes the rating matrix $R$ can be approximated by a product of rank-$d$ factors,
(1) Given a set of users, their social linkage information, a set of items as well as a subset of user-item ratings as input in the context of social recommendation, for each user select an optimal subset of her friends such that these friends’ preferences can best influence this user, and learn an optimal attention for each of these selected friends.
(2) Existing social recommendation models either simply treat different social connections equally or employ the Pearson Correlation Coefficient (PCC) to calculate similarities between users.
(3) Furthermore, we can also observe from (3) that PCC is static and independent of user latent feature vectors. Therefore, applying PCC similarity to the calculation of attentions between users will lead to a suboptimal result.
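To make the point about PCC being static more concrete, here is a minimal sketch of a PCC similarity computed over co-rated items. The function name `pcc`, the dictionary representation and the choice to take means over the co-rated items only are assumptions for illustration; the paper's equation (3) is not reproduced in these notes and may use a different variant. Note that the result depends only on the observed ratings, never on learned latent vectors.

```python
from math import sqrt

def pcc(ratings_i, ratings_u):
    """Pearson correlation between two users over their co-rated items.

    ratings_i, ratings_u: dicts mapping item id -> rating (hypothetical layout).
    Returns 0.0 when the correlation is undefined (assumed convention).
    """
    common = set(ratings_i) & set(ratings_u)
    if len(common) < 2:
        return 0.0
    mean_i = sum(ratings_i[j] for j in common) / len(common)
    mean_u = sum(ratings_u[j] for j in common) / len(common)
    cov = sum((ratings_i[j] - mean_i) * (ratings_u[j] - mean_u) for j in common)
    var_i = sum((ratings_i[j] - mean_i) ** 2 for j in common)
    var_u = sum((ratings_u[j] - mean_u) ** 2 for j in common)
    if var_i == 0 or var_u == 0:
        return 0.0
    return cov / sqrt(var_i * var_u)
```

Because the value is fixed once the ratings are fixed, it cannot adapt while the latent features are being learned, which is the limitation the text points out.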
(4) In conclusion, all the existing approaches fail to solve the optimal limited attention problem.
(1) To solve Problem 1, we propose a novel algorithm, OLA-Rec, which is capable of finding an optimal number of best influential friends and their corresponding attentions from each target user.
(2) We begin by introducing a new $d \times 1$ vector $\phi_i$ for each user $i$, such that
We minimize the absolute difference between $\phi_i$ and $\alpha_{iu}U_u$ so that they are close to each other:
(2) As we discussed, the challenge in Problem 1 is to find an optimal number of best influential friends for each individual and learn the optimal attention for them with respect to the best recommendation accuracy. We start to tackle this challenge by considering the following inequality based on (5):
(3) The assumption of a Lipschitz continuous function, on the other hand, is required to bound the so-called bias term. Thus another optimization problem arises from (6), such that solving it yields a guarantee for (5) with high probability:
(4) and user $i$’s friends $u \in F(i)$ are assumed to be in ascending order with respect to $d(U_u, U_i)$.
(5) Inspired by Anava’s work [2], we arrive at the following theorem and corollary:
(1) Consider the alternative expression in (8), i.e., $\min_{\alpha_i} C\left(\|\alpha_i\|_2 + \alpha_i^T \beta\right)$. By ignoring $C$ and introducing the Lagrange multipliers, we have:
(2) Given the convexity of (8), a global optimum is guaranteed for any solution satisfying the KKT conditions. Take the partial derivative of the Lagrangian with respect to $\alpha_i$ and set it to 0:
(3) Thus for any optimal attention $\alpha^*_{iu} > 0$, we have:
(4) Further combining (12) with the constraint that $\sum^{|F(i)|}_{u=1} \alpha_{iu} = 1$, any $\alpha^*_{iu} > 0$ can be calculated as follows:
A direct statement from Theorem 3.1 is as follows.
*There exists $1 \le k^*_i \le |F(i)|$ whose relation to $\alpha^*_i$ in Theorem 3.1 is as follows: $\forall u > k^*_i$, $\alpha^*_{iu} = 0$ and $\forall u \le k^*_i$, $\alpha^*_{iu} > 0$.*
(5) Theorem 3.1 and Corollary 3.2 confirm the existence of an optimal solution for Problem 1: for each target user $i$, $k^*_i$ is the optimal number of best influential friends, whose attentions from $i$ should be non-zero and correspond to the $k^*_i$ smallest values of $\beta_i$.
(6) Through rewriting (14) in a quadratic form, we have (15):
(7) Thus, $\lambda$ for user $i$ can be calculated in (16) through solving (15). We note that we only keep the solution satisfying $\alpha_{iu} \ge 0, \forall u \in F(i)$.
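Since the equations themselves are not reproduced in these notes, the following is a sketch of how the KKT steps described above plausibly fit together, reconstructed from the surrounding text under stated assumptions: $\theta_u \ge 0$ denotes the multipliers of the non-negativity constraints (a symbol introduced here), friends are indexed in ascending order of $\beta_u$, and all sums run over the $k_i^*$ friends with non-zero attention. The paper's own numbering and notation may differ.

```latex
\begin{align*}
\mathcal{L}(\alpha_i, \lambda, \theta)
  &= \|\alpha_i\|_2 + \alpha_i^T \beta
     + \lambda \Big(1 - \sum_{u} \alpha_{iu}\Big) - \sum_{u} \theta_u \alpha_{iu} \\
\frac{\partial \mathcal{L}}{\partial \alpha_{iu}} = 0
  &\;\Rightarrow\; \frac{\alpha_{iu}}{\|\alpha_i\|_2} = \lambda - \beta_u + \theta_u \\
\alpha^*_{iu} > 0 \ (\text{so } \theta_u = 0)
  &\;\Rightarrow\; \frac{\alpha^*_{iu}}{\|\alpha^*_i\|_2} = \lambda - \beta_u \\
\sum_{u} \alpha^*_{iu} = 1
  &\;\Rightarrow\; \alpha^*_{iu} = \frac{\lambda - \beta_u}{\sum_{v \le k^*_i} (\lambda - \beta_v)} \\
\sum_{u \le k^*_i} \Big(\frac{\alpha^*_{iu}}{\|\alpha^*_i\|_2}\Big)^2 = 1
  &\;\Rightarrow\; k^*_i \lambda^2 - 2\lambda \sum_{u \le k^*_i} \beta_u
     + \sum_{u \le k^*_i} \beta_u^2 - 1 = 0 \\
  &\;\Rightarrow\; \lambda = \frac{1}{k^*_i} \Bigg( \sum_{u \le k^*_i} \beta_u
     + \sqrt{k^*_i + \Big(\sum_{u \le k^*_i} \beta_u\Big)^2 - k^*_i \sum_{u \le k^*_i} \beta_u^2} \Bigg)
\end{align*}
```

The larger root of the quadratic is the one kept, since it is the root compatible with $\lambda > \beta_u$ for every friend with non-zero attention, which is what $\alpha_{iu} \ge 0$ requires.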
(8) Therefore, given $k^*_i$, the optimal attention $\alpha^*_i$ can be obtained through substituting the value of $\lambda$ computed by (16) into (10). Algorithm 1 presents the details for finding the optimal number $k^*_i$ and the optimal attention $\alpha^*_i$ for target user $i$.
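Algorithm 1 itself is not reproduced in these notes. As a rough illustration, the sketch below follows the greedy procedure of the k*-nearest-neighbours method that the paper cites as inspiration [2]: friends are sorted by $\beta$, the active set grows one friend at a time, and $\lambda$ is recomputed from the quadratic until it no longer exceeds the next $\beta$. The function name `optimal_attention`, the list-based interface and the numerical guard inside the square root are assumptions, and the paper's actual Algorithm 1 may differ in detail.

```python
from math import sqrt

def optimal_attention(beta):
    """Return (k_star, alpha) minimizing ||alpha||_2 + alpha . beta on the simplex.

    beta: non-empty list with one non-negative value per friend (hypothetical input).
    alpha is returned in the same order as beta.
    """
    n = len(beta)
    order = sorted(range(n), key=lambda u: beta[u])  # friends by ascending beta
    b = [beta[u] for u in order]
    lam, k = b[0] + 1.0, 0
    s = sq = 0.0  # running sum and sum of squares of the active betas
    while k < n and lam > b[k]:
        s += b[k]
        sq += b[k] ** 2
        k += 1
        # larger root of k*lam^2 - 2*lam*s + sq - 1 = 0 (guarded against round-off)
        lam = (s + sqrt(max(k + s * s - k * sq, 0.0))) / k
    weights = [max(lam - x, 0.0) for x in b]
    total = sum(weights)
    alpha = [0.0] * n
    for pos, u in enumerate(order):
        alpha[u] = weights[pos] / total
    return k, alpha
```

For instance, with `beta = [5.0, 0.1, 0.2]` the far-away friend (value 5.0) receives zero attention and the two close friends share all of it, with the closest one weighted slightly higher.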
(9) Next we incorporate the concept of optimal limited attention into social recommendation through combining the optimal $k^*$ and $\alpha^*$ with matrix factorization. Generally, we estimate user $i$’s rating on item $j$, $R_{ij}$, through the dot product of the social factor $\phi_i$ and item $j$’s latent feature vector $V_j$: $R_{ij} \approx \phi_i^T V_j$.
(10) Thus, we keep $R_{ij}$ and $\phi^T_i V_j$ close to each other through minimizing the square loss shown in (18):
(11) Besides, given the additional social information for user $i$, we also hope that $U_i$ is close to $\phi_i$, and that $\phi_i$ in turn is close to $\sum_{u \in F(i)_{k^*}} \alpha_{iu}U_u$ as well:
where $F(i)_{k^*}$ denotes user $i$’s $k^*_i$ best influential friends and $\alpha^*_{iu}$ is the optimal attention from user $i$ to $u$.
(12) Putting (18), (19) and (20) together, our objective function is:
where $\sum U^T_i U_i$ and $\sum V^T_j V_j$ are regularization terms preventing overfitting.
(13) Assuming the optimal $k^*_i$ and attention $\alpha^*_{iu}$ for user $i$ are known, a local minimum of (21) can be found by taking the derivative and performing gradient descent on $U_i$, $V_j$, $\phi_i$ separately. The corresponding partial derivatives are shown as follows:
(14) We close this section by presenting the whole picture of our proposed OLA-Rec model. We employ an Expectation-Maximization (EM) [6] style optimization strategy to alternately learn the parameters $k^*$, $\alpha^*$, $\phi$, $U$, $V$ that minimize $\mathcal{L}$.
In each iteration, the optimal number $k^*$ and optimal attention $\alpha^*$ for each user are calculated based on the current $\phi$ and $U$ through employing Algorithm 1.
Given the optimal $k^*$ and $\alpha^*$ obtained from the E-step, $\phi$, $U$, $V$ are updated using standard gradient descent:
where $\eta$ is the learning rate and $x \in \{U, V, \phi\}$ denotes any model parameter.
(15) Finally, the whole procedure terminates when the absolute difference between the losses in two consecutive iterations is less than $10^{-5}$.
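The alternating procedure described above can be sketched roughly as follows. Since equations (18) to (21) are not reproduced in these notes, the objective used here is an assumed form: a squared rating loss on $\phi_i^T V_j$, squared penalties pulling $U_i$ towards $\phi_i$ and $\phi_i$ towards $\sum_u \alpha_{iu} U_u$, and L2 regularization on $U$ and $V$. The names `fit`, `e_step`, `attention_row`, the weights `lam_u`, `lam_phi`, `lam_reg`, the scaling `ratio`, the use of Euclidean distance for $d(U_u, U_i)$ and the dense NumPy layout are all hypothetical choices, not the authors' implementation.

```python
import numpy as np

def attention_row(beta):
    """Greedy solve of min ||a||_2 + a . beta on the simplex (k*-NN style, see [2])."""
    order = np.argsort(beta)
    b = beta[order]
    lam, k = b[0] + 1.0, 0
    while k < len(b) and lam > b[k]:
        k += 1
        s, sq = b[:k].sum(), (b[:k] ** 2).sum()
        lam = (s + np.sqrt(max(k + s * s - k * sq, 0.0))) / k
    w = np.maximum(lam - b, 0.0)
    a = np.zeros_like(beta)
    a[order] = w / w.sum()
    return a

def e_step(U, friends, ratio):
    """Attention matrix A (row i = user i's attention over friends), from current U."""
    n = U.shape[0]
    A = np.zeros((n, n))
    for i, fr in friends.items():
        if fr:
            idx = np.asarray(fr)
            beta = ratio * np.linalg.norm(U[idx] - U[i], axis=1)  # assumed beta
            A[i, idx] = attention_row(beta)
    return A

def loss(R, M, U, V, Phi, A, lam_u, lam_phi, lam_reg):
    h = A.sum(axis=1, keepdims=True)  # 1 for users with friends, else 0
    E = M * (R - Phi @ V.T)
    S = h * (Phi - A @ U)
    return 0.5 * ((E ** 2).sum() + lam_u * ((U - Phi) ** 2).sum()
                  + lam_phi * (S ** 2).sum()
                  + lam_reg * ((U ** 2).sum() + (V ** 2).sum()))

def fit(R, M, friends, d=4, ratio=1.0, eta=0.01, lam_u=1.0, lam_phi=1.0,
        lam_reg=0.1, tol=1e-5, max_iter=500, seed=0):
    """R: ratings (n x m), M: 0/1 mask of observed entries, friends: dict user -> list."""
    rng = np.random.default_rng(seed)
    n, m = R.shape
    U, Phi, V = (0.1 * rng.standard_normal(s) for s in ((n, d), (n, d), (m, d)))
    prev = np.inf
    for _ in range(max_iter):
        A = e_step(U, friends, ratio)              # E-step: k* and alpha* per user
        h = A.sum(axis=1, keepdims=True)
        E = M * (R - Phi @ V.T)
        S = h * (Phi - A @ U)
        g_U = lam_u * (U - Phi) - lam_phi * (A.T @ S) + lam_reg * U
        g_Phi = -E @ V + lam_u * (Phi - U) + lam_phi * S
        g_V = -E.T @ Phi + lam_reg * V
        U, Phi, V = U - eta * g_U, Phi - eta * g_Phi, V - eta * g_V  # M-step
        cur = loss(R, M, U, V, Phi, A, lam_u, lam_phi, lam_reg)
        if abs(prev - cur) < tol:                  # stopping rule from the text
            break
        prev = cur
    return U, V, Phi, A
```

The number of non-zero entries in row `i` of the returned `A` plays the role of $k^*_i$; users without friends keep an all-zero row and are excluded from the social penalty in this sketch.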
(16) We close this section by pointing out that limited attention is a well-studied cognitive factor in social science, which holds that each individual can process only a small portion of information in real time due to her limited mental capacity.
In this section, we compare our proposed algorithm (OLA-Rec) with several state-of-the-art methods on four real-world datasets to demonstrate the superiority of OLA-Rec model over the others with respect to various evaluation metrics.
The following metrics are used to measure the recommendation accuracy.
(1) Mean Absolute Error (MAE).
(3) Recall@K.
This metric quantifies the fraction of consumed items that are in the top-K ranking list sorted by their estimated rankings. For each user $u$ we define $S(K;u)$ as the set of already-consumed items in the test set that appear in the top-K list and $S(u)$ as the set of all items consumed by this user in the test set. Then, we have
(4) Precision@K.
This measures the fraction of the top-K items that are indeed consumed by the user (test set):
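The metric formulas are not reproduced in these notes, so the sketch below uses the standard definitions of MAE and RMSE together with the per-user Recall@K and Precision@K as described in the text, $|S(K;u)| / |S(u)|$ and $|S(K;u)| / K$. The function names and the list and set inputs are assumptions for illustration, and averaging over users is left to the caller.

```python
from math import sqrt

def mae(predicted, actual):
    """Mean absolute error over paired rating lists."""
    return sum(abs(p - a) for p, a in zip(predicted, actual)) / len(actual)

def rmse(predicted, actual):
    """Root mean squared error over paired rating lists."""
    return sqrt(sum((p - a) ** 2 for p, a in zip(predicted, actual)) / len(actual))

def recall_at_k(ranked_items, consumed, k):
    """|S(K;u)| / |S(u)|: share of the user's test items found in the top-K list."""
    hits = set(ranked_items[:k]) & set(consumed)
    return len(hits) / len(consumed) if consumed else 0.0

def precision_at_k(ranked_items, consumed, k):
    """|S(K;u)| / K: share of the top-K list that the user actually consumed."""
    hits = set(ranked_items[:k]) & set(consumed)
    return len(hits) / k
```

A usage example: for a ranked list `["a", "b", "c", "d"]`, test items `{"b", "d", "e"}` and `k = 2`, only `"b"` is a hit, so recall is 1/3 and precision is 1/2.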
Our experiments are performed on four real-world datasets whose detailed filtering information is presented in Appendix A. Table 1 gives a summary of their basic statistics.
The following seven recommendation methods, including our proposed OLA-Rec model, are compared.
In Table 2, we show the performances of the above seven comparative models on four datasets, in terms of RMSE, MAE, Precision@5 (Pre@5) and Recall@5 (Rec@5). We conduct paired difference tests for the two ranking metrics, Pre@5 and Rec@5, and ∗ indicates significance at p < 0.05, with degrees of freedom equal to (number of users - 1) on each dataset.
(1) RMSE and MAE.
(2) Pre@5 and Rec@5.
(3) Precision vs. Recall.
(4) Impact of $L_C$ Ratio.
(5) Histograms on $\frac{k^*}{n}$.
(6) Cold Start Problem.
Douban. This is a public dataset from a Chinese movie forum (http://movie.douban.com/), containing user-user friendships and user-movie ratings; it is available from https://www.cse.cuhk.edu.hk/irwin.king.new/pub/data/douban.
CiaoDVD. The trust relationships among users of CiaoDVD, as well as their ratings on DVDs, are included. It was crawled from the entire DVD category of a UK DVD community website (http://dvd.ciao.co.uk) in December 2013.
Epinions. This dataset (http://www.trustlet.org/wiki/Epinions_dataset) is extracted from the American consumer review website Epinions (http://www.epinions.com/) and contains user-user trust relationships and numerical user-item ratings.
Flixster. This dataset (http://www.cs.ubc.ca/~jamalim/datasets/) contains user-movie ratings as well as user-user friendships from Flixster, an American social movie site for discovering new movies (http://www.flixster.com/).
We remove users with fewer than 2 ratings and select 80% of each user’s ratings at random for training, leaving the remainder as the test set.
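The filtering and per-user split described above could look roughly like this. The function name `filter_and_split`, the `(user, item, rating)` triple format, the fixed seed and the rounding rule for small rating counts are assumptions; the paper only states the threshold of 2 ratings and the 80% training share.

```python
import random
from collections import defaultdict

def filter_and_split(ratings, min_ratings=2, train_frac=0.8, seed=0):
    """Drop users with fewer than min_ratings, then split each user's ratings.

    ratings: iterable of (user, item, rating) triples (hypothetical layout).
    Returns (train, test) lists of triples.
    """
    by_user = defaultdict(list)
    for user, item, value in ratings:
        by_user[user].append((user, item, value))
    rng = random.Random(seed)
    train, test = [], []
    for rows in by_user.values():
        if len(rows) < min_ratings:
            continue  # user removed by the filtering rule
        rows = rows[:]
        rng.shuffle(rows)
        # assumed rounding: at least one training rating per kept user
        cut = max(1, int(train_frac * len(rows)))
        train.extend(rows[:cut])
        test.extend(rows[cut:])
    return train, test
```

With five ratings for one user and a single rating for another, the second user is dropped and the first contributes four training ratings and one test rating.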