This paper was published at the SIAM International Conference on Data Mining, 2013 (a CCF Class B conference).
The abstract matters a great deal when writing a paper, but for us newcomers it is usually hard to master. So I will use a seven-step approach to explain the abstract of this paper (Multi-view Clustering via Joint Nonnegative Matrix Factorization).
Step 1: Set the background: the ubiquity and importance of multi-view data (Many real-world datasets are comprised of different representations or views which often provide information complementary to each other).
Step 2: Summarize existing approaches (To integrate information from multiple views in the unsupervised setting, multi-view clustering algorithms have been developed to cluster multiple views simultaneously to derive a solution which uncovers the common latent structure shared by multiple views.).
Step 3: Point out, in general terms, the shortcomings of existing methods (this paper does not actually do so in the abstract; it comes up later in the introduction. In our own writing this depends on the specific situation).
Step 4: Present the proposed method (In this paper, we propose a novel NMF-based multi-view clustering algorithm by searching for a factorization that gives compatible clustering solutions across multiple views).
Step 5: After stating the method, give a rough description of it (The key idea is to formulate a joint matrix factorization process with the constraint that pushes clustering solution of each view towards a common consensus instead of fixing it directly).
Step 6: Step 5 gave the conceptual statement. This step is usually one or two sentences on how the proposed algorithm is optimized; it cannot be long, because of the word limit. (The main challenge is how to keep clustering solutions across different views meaningful and comparable. To tackle this challenge, we design a novel and effective normalization strategy inspired by the connection between NMF and PLSA.)
Step 7: Briefly mention the experiments; the pattern here is fairly standard. (Experimental results on synthetic and several real datasets demonstrate the effectiveness of our approach.)
That is the rough flow. I am still learning too; if anything is lacking, please kindly point it out. Many thanks.
The first paragraph of the introduction is usually an overview of the research background. Since this paper studies multi-view data clustering, its introduction starts by describing multi-view data and giving real-world examples of it. Then comes the usual argument: using only a single view clearly wastes information, so exploiting multiple views together tends to outperform any single view. The paper's opening paragraph (quoted below in the original English) lists concrete examples: the same story told by different articles, a document translated into several languages, research communities viewed by topic or by co-authorship, web pages classified by content or by anchor text, and so on.
Many datasets in real world are naturally comprised of different representations or views [5]. For example, the same story can be told in articles from different news sources, one document may be translated into multiple different languages, research communities are formed based on research topics as well as co-authorship links, web pages can be classified based on both content and anchor text leading to hyperlinks, and so on. In these applications, each data set is represented by attributes that can naturally be split into different subsets, any of which suffices for mining knowledge. Observing that these multiple representations often provide compatible and complementary information, it becomes natural for one to integrate them together to obtain better performance rather than relying on a single view. The key of learning from multiple views (multi-view) is to leverage each view's own knowledge base in order to outperform simply concatenating views.
Next comes a transition. What transition? This paper does multi-view clustering, but machine learning has classification, regression, and many other tasks; how do you steer toward multi-view data clustering specifically? The paper uses this sentence: "As unlabeled data are plentiful in real life and increasing quantities of them come in multiple views from diverse sources, the problem of unsupervised learning from multiple views of unlabeled data has attracted attention [3, 17], referred to as multi-view clustering," which introduces the multi-view clustering task.
Shortcomings of existing multi-view clustering methods: 1. First category: "Algorithms in the first category [3, 17] incorporate multi-view integration into the clustering process directly through optimizing certain loss functions." 2. Second category: "algorithms in the second category such as the ones based on Canonical Correlation Analysis [8, 4] first project multi-view data into a common lower dimensional subspace and then apply any clustering algorithm such as k-means to learn the partition." 3. Third category: "a clustering solution is derived from each individual view and then all the solutions are fused based on consensus [7, 13]."
As unlabeled data are plentiful in real life and increasing quantities of them come in multiple views from diverse sources, the problem of unsupervised learning from multiple views of unlabeled data has attracted attention [3, 17], referred to as multi-view clustering. The goal of multi-view clustering is to partition objects into clusters based on multiple representations of the object. Existing multi-view clustering algorithms can be roughly classified into three categories. Algorithms in the first category [3, 17] incorporate multi-view integration into the clustering process directly through optimizing certain loss functions. In contrast, algorithms in the second category such as the ones based on Canonical Correlation Analysis [8, 4] first project multi-view data into a common lower dimensional subspace and then apply any clustering algorithm such as k-means to learn the partition. The third category is called late integration or late fusion, in which a clustering solution is derived from each individual view and then all the solutions are fused based on consensus [7, 13].
This paragraph presents the proposed method: a novel multi-view clustering algorithm built on a technique that is highly effective in single-view clustering, i.e., nonnegative matrix factorization (NMF). Note the usage of "i.e.," here, meaning "that is".
At this point you may wonder: what exactly is NMF? (Perhaps you know NMF well and feel this is unnecessary. But there is still a question: among so many techniques, why choose NMF?) So next, the authors give a brief introduction to NMF.
After that brief introduction to NMF, is it enough? Clearly not; the case is not strong enough yet. You also need to cite some recent work applying it in the clustering direction.
In this paper, we propose a new multi-view clustering approach based on a highly effective technique in single-view clustering, i.e., non-negative matrix factorization (NMF) [18].
NMF, which was originally introduced as a dimensionality reduction technique [18], has been shown to be useful in many research areas such as information retrieval [20] and pattern recognition [18]. NMF has received much attention because of its straightforward interpretability for applications, i.e., we can explain each observation as an additive linear combination of nonnegative basis vectors.
Recently, NMF has become a popular technique for data clustering, and it is reported to achieve competitive performance compared with most of the state-of-the-art unsupervised algorithms. For example, Xu et al. [20] applied NMF to text clustering and gained superior performance, and Brunet et al. [6] achieved similar success on biological data clustering.
Having introduced NMF above, should the discussion of NMF continue here? Why? To lead into the proposed method: NMF is an excellent technique, but so far it has been confined to the single-view setting, and NMF-based multi-view clustering work remains limited.
What needs to be stated next are the difficulties and challenges of applying NMF to multi-view clustering, right? The main problem is how to obtain meaningful, comparable clustering results across multiple views simultaneously. Moreover, the normalization strategies traditionally used in NMF become hard to optimize in the multi-view setting, or fail to yield meaningful clustering results. In this paper, the authors solve these problems by proposing a novel normalization strategy, following the principle that the factors representing clustering structure must be regularized toward a common consensus.
As NMF has been shown to generate superior clustering results which are easy to interpret, it will be very useful to have an NMF-based multi-view clustering approach. However, studies on NMF-based multi-view approaches for clustering are still limited.
The main challenge of applying NMF to multi-view clustering is how to limit the search of factorizations to those that give meaningful and comparable clustering solutions across multiple views simultaneously. Moreover, traditional normalization strategies proposed for standard NMF are either difficult to be optimized in the multi-view setting [20], or cannot generate meaningful clustering results [21, 10]. In this paper, we approach this problem by proposing a novel normalization strategy and following the principle that factors representing clustering structures learnt from multiple views should be regularized toward a common consensus.
In paper writing it is best to summarize some contributions. Contributions of this paper: 1. The first multi-view clustering algorithm based on joint NMF, unlike traditional approaches that simply fix a single one-side factor shared across views.
1. As far as we know, this is the first exploration towards a multi-view clustering approach based on joint nonnegative matrix factorization, which is different from traditional approaches simply fixing the shared one-side factor among multiple views.
2. As discussed, existing normalization strategies for standard NMF cannot keep factors from different views comparable and meaningful in the multi-view setting for clustering, making the fusion of views difficult and inconclusive. To tackle this challenge, we develop a novel normalization procedure inspired by the connection between NMF and PLSA.
This part changes little from one paper to the next.
The structure of the rest of the paper is as follows:
The rest of this paper is organized as follows. In the next section, a brief overview of NMF and its relationship with PLSA is provided. The proposed multi-view NMF algorithm is then presented in Section 3. Extensive experimental results are shown in Section 4. A discussion of related work is given in Section 5. Finally, in Section 6 we provide conclusions.
Let $X \in \mathbb{R}_{+}^{M \times N}$ denote a nonnegative data matrix, where each column represents a sample and each row represents an attribute. NMF seeks two low-rank nonnegative matrices $U \in \mathbb{R}_{+}^{M \times K}$ and $V \in \mathbb{R}_{+}^{N \times K}$ whose product approximates the original data matrix well:

$$X \approx U V^{T} \tag{2.1}$$
A common reconstruction procedure is formulated as an optimization problem under the Frobenius norm:

$$\min_{U,\,V}\ \left\| X - U V^{T} \right\|_{F}^{2} \quad \text{s.t.}\ U \ge 0,\ V \ge 0 \tag{2.2}$$
The objective above is not jointly convex in $U$ and $V$, so expecting an algorithm to find the global minimum is unrealistic. [18] proposed the multiplicative update rules, executed iteratively to minimize the objective, as follows:

$$U_{ij} \leftarrow U_{ij}\,\frac{(X V)_{ij}}{(U V^{T} V)_{ij}}, \qquad V_{ij} \leftarrow V_{ij}\,\frac{(X^{T} U)_{ij}}{(V U^{T} U)_{ij}}$$
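To make the update rules concrete, here is a minimal NumPy sketch (my own code, not the paper's implementation; a small `eps` is added to the denominators to avoid division by zero):

```python
import numpy as np

def nmf(X, K, n_iter=300, eps=1e-10, seed=0):
    """Basic NMF via multiplicative updates: X (M x N) ~= U (M x K) @ V.T (K x N)."""
    rng = np.random.default_rng(seed)
    M, N = X.shape
    U = rng.random((M, K))
    V = rng.random((N, K))
    for _ in range(n_iter):
        U *= (X @ V) / (U @ (V.T @ V) + eps)    # U_ij <- U_ij (XV)_ij / (U V^T V)_ij
        V *= (X.T @ U) / (V @ (U.T @ U) + eps)  # V_ij <- V_ij (X^T U)_ij / (V U^T U)_ij
    return U, V

# toy check on an exactly rank-3 nonnegative matrix
rng = np.random.default_rng(42)
X = rng.random((20, 3)) @ rng.random((3, 15))
U, V = nmf(X, K=3)
rel_err = np.linalg.norm(X - U @ V.T) / np.linalg.norm(X)
```

Because the updates only ever multiply by nonnegative ratios, $U$ and $V$ stay nonnegative throughout, which is exactly what makes this scheme attractive for NMF.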
The NMF formulation (2.1) has more than one solution: for any invertible matrix $Q \in \mathbb{R}^{K \times K}$ such that $UQ$ and $VQ^{-T}$ remain nonnegative, we have

$$X \approx U V^{T} = (U Q)\left(V Q^{-T}\right)^{T}. \tag{2.3}$$

Therefore, in clustering some constraints must be added so that the clustering result is unique.
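A quick numeric illustration of this non-uniqueness (variable names are mine): rescaling the factors by a positive diagonal $Q$ leaves the product, and hence the fit, unchanged:

```python
import numpy as np

rng = np.random.default_rng(0)
U = rng.random((5, 3))
V = rng.random((4, 3))

Q = np.diag([2.0, 0.5, 3.0])   # a positive diagonal Q keeps both factors nonnegative
U2 = U @ Q                     # U Q
V2 = V @ np.linalg.inv(Q).T    # V Q^{-T}

# Both factorizations reconstruct exactly the same matrix.
same_product = np.allclose(U @ V.T, U2 @ V2.T)
```

Since infinitely many $(U, V)$ pairs fit equally well, the scale of $V$ carries no meaning by itself, which is precisely why the normalization introduced later is needed.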
Next, we introduce the relationship between NMF and probabilistic latent semantic analysis (PLSA) [15], a relationship that is helpful for clustering. PLSA is a classical topic-modeling technique for document analysis. It models the term-document co-occurrence matrix (each entry is the count of a word in a document) as generated by a mixture model with $K$ components:

$$P(w_i, d_j) = \sum_{k=1}^{K} P(k)\, P(w_i \mid k)\, P(d_j \mid k),$$

where the parameters are estimated by maximizing the likelihood through the Expectation-Maximization (EM) algorithm.
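The correspondence can be written out side by side (a standard identification between the two models, not a formula quoted from this paper's text):

```latex
X_{ij} \;\approx\; \sum_{k=1}^{K} U_{ik}\, V_{jk}
\qquad \longleftrightarrow \qquad
P(w_i, d_j) \;=\; \sum_{k=1}^{K} P(w_i \mid k)\, P(d_j, k)
```

With each column of $U$ normalized to unit $\ell_1$ norm, $U_{ik}$ plays the role of $P(w_i \mid k)$ and $V_{jk}$ the role of the joint probability $P(d_j, k)$; maximizing the PLSA likelihood corresponds to NMF under a KL-divergence objective rather than the Frobenius norm.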
Here the ideas of PLSA and NMF are brought together; it is worth pausing to appreciate the connection.
The basic idea is that a data point is likely to be assigned to the same cluster in different views. In matrix-factorization terms, therefore, the coefficient matrices learned from the different views are pushed toward a common consensus matrix by a soft regularization. Besides that, to keep the factorization results consistent and meaningful, some normalization must be applied during the optimization. Traditional normalization schemes for NMF are either very hard to optimize here or cannot produce meaningful clustering results, so the authors design a novel normalization strategy: during the optimization, each column of the basis matrix $U^{(v)}$, i.e., each basis vector, is normalized. Inspired by the connection between NMF and PLSA, this gives the coefficient matrices of the different views a probabilistic interpretation, making them comparable during optimization and meaningful for clustering.
After obtaining the coefficient matrices of the different views, to keep them as consistent as possible the authors introduce a regularization that fuses the coefficient matrices of the different views toward a consensus matrix $V^{*}$. The proposed loss measures the disagreement between each view's coefficient matrix $V^{(v)}$ and the consensus matrix:

$$\left\| V^{(v)} - V^{*} \right\|_{F}^{2}$$
Note that the $V^{(v)}$ from different views may not be comparable on the same scale, and by themselves they are not meaningful for clustering, because the product of $U^{(v)}$ with an arbitrary invertible matrix (with the compensating change to $V^{(v)}$) can still be a solution, according to Eq. (2.3). To give the different $V^{(v)}$ a proper disagreement measure against the same consensus, and to make them theoretically meaningful for clustering, one possible solution is to normalize each $V^{(v)}$ and then compute the distance measure. However, such a normalization makes the optimization intractable.
To solve this, the authors instead impose a normalization on every basis vector, i.e., require $\|U^{(v)}_{\cdot,k}\|_{1} = 1$. In this way, $\sum_{k} V^{(v)}_{j,k}$ is approximately equal to 1. Since the coefficient matrices of the different views are then on the same scale, comparability between each coefficient matrix and the consensus matrix is guaranteed to be reasonable. And here the text finally ties back to PLSA: after each normalization, every element of $V^{(v)}$ has a nice probabilistic interpretation, because $V^{(v)}_{j,k}$ can be viewed as the posterior $P(k \mid d_j)$, which makes the consensus matrix more meaningful for clustering (i.e., each element of $V^{*}$ is a consensus of $P(k \mid d_j)$, weighted across the different views).
The final consensus loss function is as follows:

$$\sum_{v=1}^{n_v} \lambda_v \left\| V^{(v)} Q^{(v)} - V^{*} \right\|_{F}^{2} \tag{3.4}$$
The constraint on $U^{(v)}$ can be implemented by introducing an auxiliary variable $Q^{(v)}$:

$$Q^{(v)} = \operatorname{Diag}\!\left( \sum_{m=1}^{M} U^{(v)}_{m,1},\ \sum_{m=1}^{M} U^{(v)}_{m,2},\ \ldots,\ \sum_{m=1}^{M} U^{(v)}_{m,K} \right) \tag{3.5}$$
Combining formulas (3.4) and (3.5) above, the final loss function becomes:

$$\min\ \sum_{v=1}^{n_v} \left\| X^{(v)} - U^{(v)} \left( V^{(v)} \right)^{T} \right\|_{F}^{2} + \sum_{v=1}^{n_v} \lambda_v \left\| V^{(v)} Q^{(v)} - V^{*} \right\|_{F}^{2} \quad \text{s.t.}\ U^{(v)} \ge 0,\ V^{(v)} \ge 0,\ V^{*} \ge 0$$
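As a concrete sketch of how the normalization and the joint objective fit together, here is a small NumPy example (the helper names are mine, not from the paper; it only evaluates the loss for given factors, it is not the full alternating solver):

```python
import numpy as np

def normalize_view(U, V):
    """Apply Q = Diag(column sums of U): U <- U Q^{-1}, V <- V Q.
    Column sums of the returned U are 1, and U @ V.T is unchanged."""
    q = U.sum(axis=0)
    return U / q, V * q

def joint_loss(Xs, Us, Vs, V_star, lams):
    """Per-view reconstruction error plus weighted disagreement with the consensus V*."""
    total = 0.0
    for X, U, V, lam in zip(Xs, Us, Vs, lams):
        Un, Vn = normalize_view(U, V)
        total += np.linalg.norm(X - Un @ Vn.T) ** 2       # ||X^(v) - U^(v) V^(v)T||_F^2
        total += lam * np.linalg.norm(Vn - V_star) ** 2   # lam_v ||V^(v) Q^(v) - V*||_F^2
    return total

# two views with different feature dimensions but the same 8 samples (columns)
rng = np.random.default_rng(0)
Xs = [rng.random((6, 8)), rng.random((5, 8))]
Us = [rng.random((6, 2)), rng.random((5, 2))]
Vs = [rng.random((8, 2)), rng.random((8, 2))]
V_star = rng.random((8, 2))
loss = joint_loss(Xs, Us, Vs, V_star, lams=[0.01, 0.01])
```

Note that the rescaling satisfies $(U Q^{-1})(V Q)^{T} = U V^{T}$, so the normalization does not change the reconstruction term at all; it only puts the $V^{(v)}$ on a common scale so the consensus term is meaningful.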
The subsequent optimization is omitted here; please see the original paper for the derivation.
Read it yourself.............. (this part really deserves a careful look on your own: the datasets, a basic introduction to the compared algorithms, and, most importantly, the analysis of the experimental results)
Omitted.......
The conclusion reads much like the abstract.......
Read good papers, enjoy reading papers; reading papers does you good.
-----dugking