Paper DOI: 10.1109/ICDM.2019.00072
Venue: ICDM
Publish time: 2019
Authors and affiliations:
Datasets: described in the paper
Code:
Other:
Write-ups by others
Key novelties in brief: (1) uses both ratings and reviews; (2) the task is trust prediction
- We propose a novel deep user model for trust prediction based on user similarity measurement.
- It is a comprehensive, data-sparsity-insensitive model that combines a user's review behavior with the characteristics of the items the user is interested in.
- With this user model, we first generate a user's latent features, mined from the user's review behavior and the properties of the items the user cares about.
- Then we develop a pair-wise deep neural network to further learn and represent these user features.
- Finally, we measure the trust relation between a pair of users by computing the cosine similarity of their feature vectors.
Keywords: Trust prediction, online social networks, user modeling
(1) Having an effective way to predict trust relations among people on social media can support product marketing, awareness promotion, and decision making.
(2) In recent years, a few trust prediction works have been reported in the literature, which can be roughly categorized into two groups:
(3) As a social concept, the issue of trust has been extensively studied in social science.
Fig. 1. Example of users providing reviews and ratings for products and services on online social networks. Some trust relations already exist among users. The challenge is to predict and establish trust relations between all the users based on the available information.
(4) As can be seen in Fig. 1, a user can express her/his opinion on an item by providing ratings and writing reviews on social media.
(5) The advancement of deep learning technologies has shown great success in many application areas such as natural language processing [13], [14] and recommendation systems [15]–[17].
(6) In order to tackle the challenges discussed above, we develop a novel deep user model that has the following features:
(7) The main contributions of this paper are summarized as follows.
(8) The rest of the paper is structured as follows. Section 2 discusses the related work. Section 3 formulates the trust prediction problem. Section 4 proposes our deep trust prediction framework. Section 5 explains the experiments conducted and analyses the results. Finally, Section 6 concludes the work.
(1) Trust network structure has been widely exploited by existing trust prediction methods, as it is the most important available resource.
(2) However, approaches based on trust network structure normally suffer from the data sparsity problem, since the number of trust relations may not be sufficient to guarantee their success.
(1) Low-rank approximation based methods are widely employed in various applications such as collaborative filtering [23], [24] and document clustering [5], [25].
(2) Different from the trust network structure based approaches, the low-rank approximation based approaches do not have to rely on the paths between users. However, all the aforementioned low-rank approximation based models still significantly suffer from the data sparsity problem, since they conduct factorization directly on the sparse trust relation matrix.
(1) As on most product review websites, there usually exist users, items, user-to-user interactions, and user-to-item interactions. The goal here is to predict the trust degree between users based on their similarities with respect to the items they rated or commented on. To prepare for similarity calculation and trust prediction, we specify the elements and their relationships used in the calculation as follows:
(2) As discussed before, we will consider the features of both users' review behaviors and the items for which the users have provided ratings and reviews.
(3) Now we concatenate the features $P_i$, $Q_j$, $RE^U_i$, $RE^I_j$ as user $u_i$'s features, denoted as $F_{u_i}$.
(1) Homophily theory is the most important social science theory indicating that trust relations are more likely to be established between similar users, which is widely observed in social networks [30].
(2) To answer the above questions, we first measure users' similarity by the cosine similarity of their rating vectors.
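As a sketch, this rating-vector cosine similarity can be computed as below; the two five-item rating vectors are hypothetical, with 0 marking unrated items:

```python
import numpy as np

def cosine_sim(a: np.ndarray, b: np.ndarray) -> float:
    """Cosine similarity between two rating vectors (0 where unrated)."""
    denom = np.linalg.norm(a) * np.linalg.norm(b)
    return float(a @ b / denom) if denom else 0.0

# Hypothetical rating vectors over the same five items.
u = np.array([5.0, 4.0, 0.0, 3.0, 0.0])
v = np.array([4.0, 5.0, 0.0, 3.0, 1.0])
print(round(cosine_sim(u, v), 3))  # 0.97
```

A high value like this is what the homophily check looks for between trusted pairs.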
(3) To answer the second question, we examine users with explicit trust relations in two groups.
(4) According to the above investigation, positive answers to both questions verify the existence of the homophily effect in trust relations.
(1) In this section, we describe how to leverage a user's comprehensive features to model his/her characteristics. We mainly consider a user's review behavior and the properties of the items related to the user.
(2) Modeling User Review Behavior: On a product review website such as Epinions, a user's review behavior usually includes ratings and reviews, both of which indicate a user's characteristics in different forms.
(3) To obtain the user feature information delivered by his/her review behavior, we use Matrix Factorization (MF) [31] and Doc2Vec [32] to process ratings and reviews, respectively.
(4) Doc2Vec learns $k$-dimensional vector representations for variable-length pieces of reviews.
(5) In our model, we set $k = 32$ in both MF and Doc2Vec based on preliminary experiment results.
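On the rating side, a minimal matrix-factorization sketch with plain SGD and $k = 32$ might look like the following; the rating matrix here is synthetic, and the review side would analogously embed each user's review text into a $k$-dimensional vector with a Doc2Vec implementation such as gensim's (not shown):

```python
import numpy as np

rng = np.random.default_rng(0)
n_users, n_items, k = 20, 30, 32
R = rng.integers(0, 6, size=(n_users, n_items)).astype(float)  # 0 = unrated
P = 0.1 * rng.standard_normal((n_users, k))   # user latent factors
Q = 0.1 * rng.standard_normal((n_items, k))   # item latent factors
lr, reg = 0.01, 0.02
rows, cols = np.nonzero(R)                    # only observed ratings
for _ in range(50):                           # a few SGD epochs
    for i, j in zip(rows, cols):
        err = R[i, j] - P[i] @ Q[j]
        P[i] += lr * (err * Q[j] - reg * P[i])
        Q[j] += lr * (err * P[i] - reg * Q[j])
print(P.shape, Q.shape)  # each user/item now has a 32-d factor vector
```

The rows of `P` serve as the rating-side user features $P_i$, and the rows of `Q` as the item features used later.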
(1) Items rated and reviewed by a user naturally reflect the interests of the user; thus the properties behind an item should be treated as part of the user's features.
(2) In the real world, a user $u_i$ usually rates/reviews a list of items, denoted as $\{v_1, ..., v_j\}$, and thus there is a corresponding set of item feature vectors $\{V_1, ..., V_j\}$.
In order to obtain the representative item features related to $u_i$, we average all the item features related to $u_i$ as follows:
(1) According to the aforementioned user modeling steps, the items' properties are mined from all the received ratings and reviews, and then they are considered as part of the user's features. Together with the user review behavior features, the user feature $F_{u_i}$ is calculated as:
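A sketch of assembling $F_{u_i}$ from the four parts, using random stand-in vectors for the MF factors and Doc2Vec embeddings ($k = 32$, so the concatenated feature is 128-dimensional):

```python
import numpy as np

rng = np.random.default_rng(2)
k = 32
P_i = rng.standard_normal(k)                      # user rating factors (MF)
Q_avg = rng.standard_normal((5, k)).mean(axis=0)  # mean of rated items' factors
RE_u = rng.standard_normal(k)                     # user review embedding
RE_i = rng.standard_normal(k)                     # item review embedding (avg)
F_ui = np.concatenate([P_i, Q_avg, RE_u, RE_i])
print(F_ui.shape)  # (128,)
```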
(2) We want to mention that the concatenated vector obtained from the pre-trained MF and Doc2Vec guarantees the accuracy of the input to the following pair-wise deep neural network, which contributes to the final superior performance.
(3) To the best of our knowledge, this is the first comprehensive user modeling method that considers and integrates all the above features.
(1) After user feature modeling, we feed the concatenated user feature representations of pairs of users into a pair-wise deep neural network to further capture and represent users' characteristics, which provides a deep understanding of each user.
(2) The pair-wise deep neural network has a multi-layer perceptron unit and a similarity calculation unit. The input of the deep neural network is the pair $(F_{u_i}, F_{u_j})$ for each pair $(u_i, u_j)$. The similarity calculation unit then calculates the similarity between the user pair $(u_i, u_j)$.
(3) Formally, for a deep neural network, if we denote the input vector as $x$, the final output latent representation vector as $y$, the output of the intermediate hidden layers by $l_i, i = 1, ..., N-1$, the $i$-th weight matrix by $W_i$, and the $i$-th bias term by $b_i$, we have
(4) Therefore, for each trusted pair of users $u_i$ and $u_j$, the features $F_{u_i}$ and $F_{u_j}$ are finally mapped to low-dimensional vectors in a latent space through our pair-wise deep neural network, as shown in Equation 7.
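A common formulation for such a network is $l_1 = W_1 x$ and $l_i = f(W_i l_{i-1} + b_i)$ for the remaining layers, with $y$ the final layer's output. The sketch below applies one such shared-weight MLP to both users of a pair; the layer sizes and the tanh activation are assumptions:

```python
import numpy as np

def mlp_forward(x, Ws, bs, f=np.tanh):
    """Shared MLP: l1 = W1 @ x, then l_i = f(W_i @ l_{i-1} + b_i)."""
    l = Ws[0] @ x                      # first layer: linear projection
    for W, b in zip(Ws[1:], bs[1:]):   # remaining layers: affine + activation
        l = f(W @ l + b)
    return l

rng = np.random.default_rng(3)
dims = [128, 64, 32]                   # hypothetical layer sizes
Ws = [0.1 * rng.standard_normal((o, i)) for i, o in zip(dims[:-1], dims[1:])]
bs = [np.zeros(o) for o in dims[1:]]
F_ui, F_uj = rng.standard_normal(128), rng.standard_normal(128)
y_i = mlp_forward(F_ui, Ws, bs)        # the same weights are applied to
y_j = mlp_forward(F_uj, Ws, bs)        # both users of the pair
print(y_i.shape, y_j.shape)            # (32,) (32,)
```

Sharing the weights across the pair is what makes the network "pair-wise": both users are embedded into the same latent space before comparison.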
(1) Based on the user latent features learned by the deep neural network, we then calculate the similarity between user latent features. Finally, the similarity value is used to measure the trust relation between pairs of users. In this paper, we employ cosine similarity to calculate the similarity between $u_i$ and $u_j$, which is calculated as:
(2) For optimization, our model is trained with the squared loss function $L_{sqr}$, which is widely used in existing work:
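A sketch of the scoring and loss for one pair, with random stand-in latent vectors and an illustrative binary trust label:

```python
import numpy as np

rng = np.random.default_rng(4)
y_i, y_j = rng.standard_normal(32), rng.standard_normal(32)  # stand-in latents
sim = float(y_i @ y_j / (np.linalg.norm(y_i) * np.linalg.norm(y_j)))
label = 1.0                        # 1 = observed trust relation, 0 = none
loss = (label - sim) ** 2          # per-pair squared loss L_sqr
print(sim, loss)
```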
(3) Finally, all the testing pairs are ranked in decreasing order as a list according to the calculated similarity score.
We carry out a set of experiments against two real-world datasets to answer the following questions:
To illustrate the performance of our model and answer question Q1, we compare our proposed deepTrust prediction model with six baseline approaches, including both classical and state-of-the-art methods, which are introduced in detail as follows:
Random: This is a basic baseline approach that randomly selects trust relations among the pairs of users.
TP: Trust propagation evaluates trust relations along a path between users; it is the most typical trust prediction approach based on trust network structure [4].
RS: Rating similarity predicts trust values between users by calculating the similarity of their ratings [33].
MF: Matrix Factorization is a classical low-rank approximation based trust prediction approach, which performs matrix factorization on the matrix representation of trust relations [5].
hTrust: hTrust adds rating similarity as regularization to trust matrix factorization for trust prediction [1]. It is a state-of-the-art trust prediction approach combining additional knowledge with the low-rank approximation model.
Power-Law: The power-law distribution aware trust prediction approach is a state-of-the-art low-rank approximation based approach, which models the power-law distribution property of trust relations in online social networks by learning both the low-rank and the sparse parts of a trust network [3].
The best trust prediction accuracies of our model and the baseline models are summarized in Table II. The ratio of training size to testing size is 80%:20%. Given N labeled trust relations, of which M pairs are predicted as trust relations, the trust prediction accuracy is calculated as M/N × 100%.
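For example, with purely illustrative counts:

```python
# Accuracy as defined above: of N labeled trust relations, M pairs are
# predicted as trust relations (both counts are illustrative).
N, M = 200, 150
accuracy = M / N * 100
print(f"{accuracy:.0f}%")  # 75%
```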
As can be seen in Table II, our proposed method performs the best, clearly outperforming the six comparison approaches by an average of 31%, ranging from 25% to 47%. In addition, our model is significantly better than Power-Law, hTrust, and MF, while Power-Law, MF, and hTrust are generally better than TP. This result indicates that our deep trust prediction model is better than the low-rank approximation based models, and the low-rank approximation based models are better than the trust network structure based models, which shows that our new deep prediction model has superior performance compared with existing works.
As can be seen in Table II, hTrust achieves better performance than MF by incorporating rating similarity as regularization, which shows that user similarities play an important role in trust prediction. The power-law distribution aware trust prediction approach achieves much better results than the other baseline approaches, which also shows that considering the sparsity problem in trust networks is quite helpful for improving trust prediction accuracy. These results clearly confirm that incorporating the similarity between user features alleviates the data sparsity problem and can improve the accuracy of trust prediction.
Therefore, we believe that the reasons for our better performance compared with existing approaches are:
In this section, we mainly illustrate the influence of two important parameters, namely the number of negative instances and the number of hidden layers in the network, to answer question Q2. The results over the two datasets are shown in Fig. 4 (a) Epinions and Fig. 4 (b) Ciao, respectively.
To answer question Q3, we conduct experiments with different trust relation sparsity degrees to show that our model is insensitive to the sparsity of trust relations. In reality, online social networks are usually very sparse, and how to overcome the data sparsity problem has been a long-standing challenge.
For a dataset with $m$ users, if there exist $t_r$ trust relations, then the trust relation sparsity degree is $t_r / (m \times m)$. The sparsity degree indicates how sparse the dataset is: the smaller the sparsity degree, the sparser the trust relations.
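For example, with illustrative counts chosen to reproduce the 0.2445% Epinions figure below:

```python
# Sparsity degree t_r / (m * m); the user and relation counts are
# illustrative, not taken from the paper's dataset statistics.
m, t_r = 10_000, 244_500
sparsity = t_r / (m * m) * 100     # as a percentage
print(f"{sparsity:.4f}%")  # 0.2445%
```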
Keeping the same number of ratings/reviews for each user, we process the Epinions and Ciao datasets with sparsity degrees of 0.2445%, 0.1869%, 0.1237% and 0.1245%, 0.1037%, 0.0863%, respectively.
In the proposed deepTrust method, user modeling includes three parts: user reviews, user ratings, and item properties.
As can be seen in Table III, compared with deepTrust, the prediction accuracy decreases by about 10%, 7%, and 3% in deepTrust-review, deepTrust-rating, and deepTrust-item, respectively.
(1) In this paper, we propose deepTrust, a novel deep user modeling framework for trust prediction between people based on homophily theory.
(2) In this paper, we mainly focus on the networks extracted from Epinions and Ciao.