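I've pulled together some material and am posting it as-is for now; I'll expand it over time.
(Not a complete list, just the papers I run into most often.)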
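2004 Finding and evaluating community structure in networks
2006 Modularity and community structure in networks
Introduce the well-known modularity measure, which scores communities that are densely linked internally and sparsely linked to each other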
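To make the modularity idea concrete, here is a minimal sketch in plain Python. It assumes an undirected, unweighted graph given as an edge list and uses the common form Q = sum over communities c of [L_c/m - (d_c/2m)^2], where m is the number of edges, L_c the number of edges inside c and d_c the total degree of c. The function name and input layout are illustrative choices, not taken from the papers.

```python
from collections import defaultdict

def modularity(edges, community):
    """Modularity Q of a partition of an undirected, unweighted graph.

    edges: list of (u, v) pairs, each undirected edge listed once.
    community: dict mapping every node to its community label.
    """
    m = len(edges)
    internal = defaultdict(int)    # L_c: edges with both endpoints in c
    degree_sum = defaultdict(int)  # d_c: sum of degrees of the nodes in c
    for u, v in edges:
        degree_sum[community[u]] += 1
        degree_sum[community[v]] += 1
        if community[u] == community[v]:
            internal[community[u]] += 1
    return sum(internal[c] / m - (degree_sum[c] / (2 * m)) ** 2
               for c in degree_sum)

# Two triangles joined by a single bridge edge (2, 3).
edges = [(0, 1), (1, 2), (0, 2), (3, 4), (4, 5), (3, 5), (2, 3)]
split = {0: "A", 1: "A", 2: "A", 3: "B", 4: "B", 5: "B"}
merged = {v: "A" for v in range(6)}
q_split = modularity(edges, split)
q_merged = modularity(edges, merged)
```

Splitting the two triangles scores higher than lumping everything into one community, which is the behaviour the measure is meant to reward.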
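2006 Mixture models and exploratory analysis in networks
Looks for communities without knowing in advance what the community structure looks like, which is a bit mind-bending; the objective function actually groups nodes that share the same linking pattern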
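2008 Hierarchical structure and the prediction of missing links in networks
Published in Nature. Hierarchical structure can describe the structure of complex networks and can then be used to predict edges. Still judging a hierarchy by how accurate its communities are? That's weak; the experts use the hierarchy to reconstruct the network directly!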
2011 Stochastic blockmodels and community structure in networks
The degree-corrected stochastic block model. All hail the block model
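2012 Communities, modules and large-scale structure
An introductory read on community detection, published in Nature Physics. There is a Chinese translation (I can't find the URL right now; leave your email if you want it)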
2010 Networks: An Introduction
A book by Newman; the table of contents is on the website, and the material is fairly basic
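2007 An algorithm to find overlapping community structure in networks
Extends the GN algorithm to overlapping communities, roughly by allowing vertices to be split as well
The old gentleman is fond of this kind of extension: another of his papers turns any non-overlapping algorithm into an overlapping one, roughly by first splitting vertices with the method here and then running the non-overlapping method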
2010 Finding overlapping communities in networks by label propagation
A label propagation method
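For readers new to label propagation, the sketch below shows the plain single-label variant, in which each node repeatedly adopts the label most common among its neighbours. This is deliberately not the overlapping algorithm of the paper above; it only illustrates the underlying idea. The function name, the adjacency-dict input and the rule "keep the current label when it is already among the most frequent" are assumptions made for this example.

```python
import random
from collections import Counter

def label_propagation(adj, seed=0):
    """Plain (non-overlapping) label propagation.

    adj: dict mapping each node to the set of its neighbours.
    Returns a dict mapping each node to its final label.
    """
    rng = random.Random(seed)
    labels = {v: v for v in adj}      # every node starts with its own label
    nodes = list(adj)
    changed = True
    while changed:
        changed = False
        rng.shuffle(nodes)            # asynchronous updates in random order
        for v in nodes:
            if not adj[v]:
                continue              # isolated nodes keep their own label
            counts = Counter(labels[u] for u in adj[v])
            best = max(counts.values())
            # Switch only if the current label is not already a most frequent one.
            if counts[labels[v]] < best:
                candidates = sorted(l for l, c in counts.items() if c == best)
                labels[v] = rng.choice(candidates)
                changed = True
    return labels

# Two disconnected triangles: each one must end up with a single label.
adj = {0: {1, 2}, 1: {0, 2}, 2: {0, 1},
       3: {4, 5}, 4: {3, 5}, 5: {3, 4}}
labels = label_propagation(adj)
```

Every label change strictly increases the number of edges whose endpoints agree, so the loop always terminates. On real graphs the outcome depends on the update order and the tie-breaking, so different seeds can give different partitions.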
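2011 Fuzzy overlapping communities in networks
Argues that overlap comes in two kinds, crisp and fuzzy, roughly hard overlap and soft overlap, and evaluates how well current methods detect each kind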
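2010 Link communities reveal multiscale complexity in networks
It feels like edge communities suddenly took off once this paper appeared in Nature = =
The method is simple: define a similarity between edges, then run hierarchical clustering
The experiments are very extensive!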
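As an illustration of the "similarity between edges" step, here is a small sketch. It assumes the similarity is the Jaccard index of the inclusive neighbourhoods (a node plus its neighbours) of the two non-shared endpoints, defined only for pairs of distinct edges that share exactly one node; that is my reading of the paper, so treat the details as an assumption. The hierarchical clustering step is left out, and the function name is illustrative.

```python
def edge_similarity(adj, e1, e2):
    """Similarity of two distinct edges that share exactly one node.

    adj: dict mapping each node to the set of its neighbours.
    e1, e2: edges given as (u, v) tuples.
    Returns 0.0 when the edges do not share exactly one endpoint.
    """
    shared = set(e1) & set(e2)
    if len(shared) != 1:
        return 0.0
    (i,) = set(e1) - shared           # endpoint of e1 that is not shared
    (j,) = set(e2) - shared           # endpoint of e2 that is not shared
    n_i = adj[i] | {i}                # inclusive neighbourhood of i
    n_j = adj[j] | {j}                # inclusive neighbourhood of j
    return len(n_i & n_j) / len(n_i | n_j)

triangle = {0: {1, 2}, 1: {0, 2}, 2: {0, 1}}
path = {0: {1}, 1: {0, 2}, 2: {1}}
s_triangle = edge_similarity(triangle, (0, 1), (1, 2))
s_path = edge_similarity(path, (0, 1), (1, 2))
```

Two edges of a triangle are maximally similar, while two consecutive edges of a bare path share only the middle node. Feeding these pairwise scores into a single-linkage hierarchical clustering would then group edges rather than nodes.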
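2009 Line graphs, link partitions, and overlapping communities
When it comes to edge communities, you can't leave out Evans's line graph: he maps edges to nodes, so traditional node-based methods can be used to obtain edge communities.
Evans and Ahn also wrote a statement saying that the two of them did their work independently and both just happened to be about edge communities ╮( ̄▽ ̄)╭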
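The edge-to-node mapping can be sketched as follows. This builds the simplest, unweighted line graph, in which two edges of the original graph become neighbours when they share an endpoint. The paper also discusses weighted variants, which this sketch does not attempt, and the function name is illustrative.

```python
from collections import defaultdict
from itertools import combinations

def line_graph(edges):
    """Unweighted line graph of an undirected graph.

    edges: list of (u, v) tuples, each undirected edge listed once.
    Returns a dict mapping each original edge to the set of edges
    that share an endpoint with it.
    """
    incident = defaultdict(list)      # node -> edges touching that node
    for e in edges:
        u, v = e
        incident[u].append(e)
        incident[v].append(e)
    lg = {e: set() for e in edges}
    for touching in incident.values():
        for a, b in combinations(touching, 2):
            lg[a].add(b)
            lg[b].add(a)
    return lg

# Path 0 - 1 - 2 - 3: the middle edge touches both outer edges.
lg = line_graph([(0, 1), (1, 2), (2, 3)])
```

Any node-partitioning method run on `lg` assigns each original edge to one community, and a node of the original graph then belongs to every community that one of its edges belongs to, which is how overlap arises.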
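2010 Community Structure in Time-Dependent, Multiscale, and Multiplex Networks
Published in Science. Covers multislice networks, for example time-dependent ones, ones with several edge types, and ones viewed at several resolutions. The method is clever: it adds an edge between the copies of the same node in the different networks, which joins all the networks into one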
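The "link each node to its own copies" trick can be sketched as below. The sketch assumes the slices are ordered (for example in time) and couples only adjacent slices; the coupling weight `omega`, the function name and the (node, slice) naming of the copies are hypothetical choices for this example, not the paper's notation or code.

```python
def couple_slices(slices, omega=1.0):
    """Join several network slices into one weighted edge list.

    slices: list of edge lists, one per slice, e.g. one per time step.
    omega: weight of the edges linking copies of the same node
           in adjacent slices (hypothetical parameter name).
    Returns a list of ((node, slice), (node, slice), weight) triples.
    """
    joined = []
    for s, slice_edges in enumerate(slices):
        for u, v in slice_edges:
            joined.append(((u, s), (v, s), 1.0))   # ordinary edge inside slice s
    node_sets = [{n for u, v in sl for n in (u, v)} for sl in slices]
    for s in range(len(slices) - 1):
        for n in sorted(node_sets[s] & node_sets[s + 1]):
            joined.append(((n, s), (n, s + 1), omega))  # copy-to-copy edge
    return joined

# Node "a" appears in both slices, so its two copies get linked.
joined = couple_slices([[("a", "b")], [("a", "c")]], omega=0.5)
```

For slices without a natural order, such as different edge types, the same idea applies with every pair of slices coupled instead of only adjacent ones.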
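2008 Fast unfolding of communities in large networks
BGLL, an (unrivaled) fast method for non-overlapping community detection; the objective function is modularity. I read its code closely; it is written in C++, to the point that the code I wrote afterwards ended up in the same style…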
2005 Uncovering the overlapping community structure of complex networks in nature and society
2007 Quantifying social group evolution
Two Nature papers in a row: the 2005 one covers the classic clique method; the 2007 one covers how communities evolve
2006 CFinder: locating cliques and overlapping modules in biological networks
CFinder, the tool for the classic clique method; free to use after filling in a form
Line graphs, in collaboration with Evans; BGLL, in collaboration with Blondel
Keywords: community detection, social network, social network analysis, complex network, cluster, graph partition
Nature
Science
AAAI
WWW
ICDM
SIGKDD
SIGMOD
PKDD
PAKDD
TKDD
SDM
CIKM
Proceedings of the National Academy of Sciences 9.681
New Journal of Physics 4.177
Physical Review E 2.255
Journal of Statistical Mechanics: Theory and Experiment 1.7
Journal of Physics A: Mathematical and Theoretical 1.540
The European Physical Journal B 1.534
Physica A: Statistical Mechanics and its Applications 1.373
EPL (Europhysics Letters)
PLOS One 4.096
Complex networks
Social networks 2.931
Network Science
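The numbers in the right-hand column are impact factors; they change every year, and I've forgotten which year these are from…
The above are again just the venues I see most often. Besides the data-mining ones there is a big block of physics journals; yes, a large group of physicists works on this, Mark Newman for example = = In fact people from biology, sociology, physics, mathematics and computer science all work on it; it is an interdisciplinary field, after all
Related wiki pages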
http://en.wikipedia.org/wiki/Community_structure
http://en.wikipedia.org/wiki/Cluster_analysis
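Diagram of how the disciplines relate
The research focus of a paper can be roughly described along the following dimensions (my own summary; criticism welcome)
Flat cluster: the clustering result is a partition of the network; most results look like this
Hierarchical cluster: hierarchical clustering; the result is a tree diagram (dendrogram) of which communities contain which
Overlapping (fuzzy/crisp assignment): a member can belong to several communities
Non-overlapping (hard assignment): a member can belong to only one community
Static network: the network is fixed and does not change over time; this is the usual case
Dynamic network: the network changes over time
Multiplex network: the edges of the network come in several types
Bipartite network: the nodes of the network come in two types (and by extension there can be more than two)
Density community: the target is communities with dense internal links
Bipartite community: the target is communities with sparse internal links, usually by dividing the network into a bipartite or multipartite graph
Mixture community: the target is communities whose members have similar linking patterns, a mix of the two above
Come to think of it, most definitions of a community come from the algorithm: whatever the algorithm detects is what gets defined as a community ==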
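Global: uses global information to detect the community partition of the whole network
Local: uses local information, for example looking only at a node's neighbours when considering that node. It can detect communities in one part of the network, for example the communities around a given node. This is very practical, especially when the data is very large
Incremental (online): the algorithm supports online updates, so when some nodes (edges) are added or removed it only needs a simple adjustment instead of a full rerun; this suits large networks that change in real time.
Further research topics include
Node properties (hub, periphery): studies the properties of nodes, for example whether a node is a key node, a central node, a peripheral node, a leader, a follower, and so on
Spread process: studies how information spreads, for example the spread of public opinion or of a virus.
Link prediction: predicts missing edges, which is really just recommendation
Evaluation: judging how good a detection result is requires evaluation metrics, and there is still no generally accepted good one. Comparing directly against labelled real networks has problems: small networks are not convincing, and for large data sets the definition of a community may not even be the same. Some good papers build their own data sets and measure with their own metrics. So some people have run dedicated series of experiments to evaluate current algorithms from a more objective standpoint, which is a research topic in its own right.
Visualization: evaluation metrics give a quantitative analysis, but that is still just a pile of numbers and people prefer to see a picture; how to display community structure visually is another open problem.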
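As one concrete example for the Evaluation item above, the sketch below computes normalized mutual information (NMI) between a detected partition and a labelled ground truth. NMI is only one commonly used score, not an agreed standard, and it handles non-overlapping partitions only. The function name and the normalization by the sum of the two entropies are choices made for this example.

```python
from collections import Counter
from math import log

def nmi(labels_a, labels_b):
    """Normalized mutual information of two non-overlapping partitions.

    labels_a, labels_b: equal-length sequences; position i holds the
    community label of node i in each partition.
    Returns a value between 0 (independent) and 1 (identical up to renaming).
    """
    n = len(labels_a)
    count_a = Counter(labels_a)
    count_b = Counter(labels_b)
    joint = Counter(zip(labels_a, labels_b))
    mutual = sum(c / n * log(c * n / (count_a[a] * count_b[b]))
                 for (a, b), c in joint.items())
    entropy_a = -sum(c / n * log(c / n) for c in count_a.values())
    entropy_b = -sum(c / n * log(c / n) for c in count_b.values())
    if entropy_a + entropy_b == 0:
        return 1.0                    # both partitions are a single community
    return 2 * mutual / (entropy_a + entropy_b)

same = nmi([0, 0, 1, 1], [1, 1, 0, 0])       # same grouping, labels renamed
unrelated = nmi([0, 0, 1, 1], [0, 1, 0, 1])  # groupings carry no shared information
```

The score ignores how communities are named, which matters because detection algorithms assign arbitrary labels.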
From http://blog.sciencenet.cn/blog-798640-677758.html
http://blog.sina.com.cn/s/blog_63891e610101722t.html
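(Leaving a blank here to write my own summary)
2010 Community detection in graphs
A survey that reads like a reference handbook = =
2012 Communities, modules and large-scale structure
An introductory read on community detection, published in Nature Physics; there is a Chinese translation
2012 Temporal networks
Summarizes methods for analyzing network structure that changes over time
2013 Overlapping community detection in networks: The state-of-the-art and comparative study
A survey of overlapping algorithms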
Gephi is an interactive visualization and exploration platform for all kinds of networks and complex systems, dynamic and hierarchical graphs.
Runs on Windows, Linux and Mac OS X. Gephi is open-source and free.
http://gephi.org/users/download/
NetLogo is a multi-agent programmable modeling environment. It is used by tens of thousands of students, teachers and researchers worldwide. It also powers HubNet participatory simulations. It is authored by Uri Wilensky and developed at the CCL. You can download it free of charge.
http://ccl.northwestern.edu/netlogo/download.shtml
Pajek (Slovene word for Spider) is a program, for Windows, for analysis and visualization of large networks. It is freely available, for noncommercial use, at its download page.
http://pajek.imfm.si/doku.php?id=download
igraph is a free software package for creating and manipulating undirected and directed graphs. It includes implementations for classic graph theory problems like minimum spanning trees and network flow, and also implements algorithms for some recent network analysis methods, like community structure search.
http://igraph.sourceforge.net/download.html
Cytoscape is an open source software platform for visualizing complex networks and integrating these with any type of attribute data. A lot of Apps are available for various kinds of problem domains, including bioinformatics, social network analysis, and semantic web.
http://www.cytoscape.org/download.html
http://code.google.com/p/community-detection/ (C++)
http://code.google.com/p/linloglayout/ (Java)
From <http://blog.sina.com.cn/s/blog_67532f7c0100qakz.html>
http://blog.sciencenet.cn/blog-404069-297233.html (tools)
MatlabBGL is a Matlab package for working with graphs. It uses the Boost Graph Library to efficiently implement the graph algorithms. MatlabBGL is designed to work with large sparse graphs with hundreds of thousands of nodes.
From <https://www.cs.purdue.edu/homes/dgleich/packages/matlab_bgl/>
http://www.cs.cmu.edu/~enron/
http://www.informatik.uni-trier.de/~ley/db/
http://socialnetworks.mpi-sws.org/data-imc2007.html
http://www.cs.bris.ac.uk/~steve/networks/
http://www.cs.bris.ac.uk/~steve/networks/peacockpaper/
http://cran.r-project.org/web/packages/timeordered/index.html
http://www.facebook.com/press/info.php?statistics
http://www.cs.cornell.edu/projects/kddcup/datasets.html
http://www-personal.umich.edu/~mejn/netdata/
http://www.cise.ufl.edu/research/sparse/mat/Pajek/
http://arnetminer.org/download
http://yeast-complexes.russelllab.org/complexview.pl?rm=complex_list
http://thebiogrid.org/
http://mips.helmholtz-muenchen.de/genre/proj/yeast/
http://www.yeastgenome.org/
http://vlado.fmf.uni-lj.si/pub/networks/data/
http://archive.routeviews.org/
http://blog.sciencenet.cn/blog-40109-279160.html
http://deim.urv.cat/~aarenas/data/welcome.htm
https://www.coursera.org/course/sna
This course will use social network analysis, both its theory and computational tools, to make sense of the social and information networks that have been fueled and rendered accessible by the internet.
http://cm.dce.harvard.edu/2014/01/14328/publicationListing.shtml