David M. Blei, Andrew Y. Ng, and Michael I. Jordan. Latent Dirichlet allocation. J. Mach. Learn. Res., 3:993–1022, March 2003.
Rickjin. LDA数学八卦 (LDA Math Gossip). 2013.2.8
Ian Porteous, David Newman, Alexander Ihler, Arthur Asuncion, Padhraic Smyth, and Max Welling. Fast collapsed Gibbs sampling for latent Dirichlet allocation. In Proceeding of the 14th ACM SIGKDD international conference on Knowledge discovery and data mining, KDD ’08, pages 569–577, New York, NY, USA, 2008. ACM.
Matthew Hoffman, David M. Blei, and Francis Bach. Online learning for latent Dirichlet allocation. In NIPS, 2010.
Arindam Banerjee and Sugato Basu. Topic Models over Text Streams: A Study of Batch and Online Unsupervised Learning. In SDM. SIAM, 2007.
Limin Yao, David Mimno, and Andrew McCallum. Efficient methods for topic model inference on streaming document collections. In Proceedings of the 15th ACM SIGKDD international conference on Knowledge discovery and data mining, KDD ’09, pages 937–946, New York, NY, USA, 2009. ACM.
Feng Yan, Ningyi Xu, and Yuan Qi. Parallel inference for latent Dirichlet allocation on graphics processing units. In NIPS, 2009.
D. Newman, A. Asuncion, P. Smyth, and M. Welling. Distributed Inference for Latent Dirichlet Allocation. 2007.
Zhiyuan Liu, Yuzhou Zhang, Edward Y. Chang, and Maosong Sun. PLDA+: Parallel latent Dirichlet allocation with data placement and pipeline processing. ACM Trans. Intell. Syst. Technol., 2:26:1–26:18, May 2011.
Arthur Asuncion, Padhraic Smyth, and Max Welling. Asynchronous distributed learning of topic models. In NIPS, pages 81–88, 2008.
T. L. Griffiths and M. Steyvers. Finding scientific topics. Proceedings of the National Academy of Sciences, 101(Suppl. 1):5228–5235, April 2004.
1. Breaking the original exchangeability assumption
David M. Blei and John D. Lafferty. A correlated topic model of science. Annals of Applied Statistics, 1(1):17–35, 2007.
Wei Li and Andrew McCallum. Pachinko allocation: DAG-structured mixture models of topic correlations. In ICML, 2006.
Jonathan Chang and David Blei. Relational topic models for document networks. In AISTATS, 2009.
Xuerui Wang, Andrew McCallum, and Xing Wei. Topical n-grams: Phrase and topic discovery, with an application to information retrieval. In Proceedings of the 2007 Seventh IEEE International Conference on Data Mining, pages 697–702, Washington, DC, USA, 2007. IEEE Computer Society.
Yue Lu and ChengXiang Zhai. Opinion integration through semi-supervised topic modeling. In Proceeding of the 17th international conference on World Wide Web, WWW ’08, pages 121–130, New York, NY, USA, 2008. ACM.
Qiaozhu Mei, Deng Cai, Duo Zhang, and ChengXiang Zhai. Topic modeling with network regularization. In Proceeding of the 17th international conference on World Wide Web, WWW ’08, pages 101–110, New York, NY, USA, 2008. ACM.
2. Variants based on nonparametric Bayesian methods
Y. W. Teh. Dirichlet processes. In Encyclopedia of Machine Learning. Springer, 2010.
Y. W. Teh, M. I. Jordan, M. J. Beal, and D. M. Blei. Hierarchical Dirichlet processes. Journal of the American Statistical Association, 101(476):1566–1581, 2006.
Yee Whye Teh. A hierarchical Bayesian language model based on Pitman-Yor processes. In Proceedings of the 21st International Conference on Computational Linguistics and the 44th annual meeting of the Association for Computational Linguistics, ACL-44, pages 985–992, Stroudsburg, PA, USA, 2006. Association for Computational Linguistics.
Issei Sato and Hiroshi Nakagawa. Topic models with power-law using Pitman-Yor process. In Proceedings of the 16th ACM SIGKDD international conference on Knowledge discovery and data mining, KDD ’10, pages 673–682, New York, NY, USA, 2010. ACM.
3. From unstructured information to structured or semi-structured information
David M. Blei and Jon D. McAuliffe. Supervised topic models. In NIPS, 2007.
David Mimno and Andrew McCallum. Topic models conditioned on arbitrary features with Dirichlet-multinomial regression. In UAI, 2008.
1. Sentiment analysis
Ivan Titov and Ryan McDonald. Modeling online reviews with multi-grain topic models. In Proceeding of the 17th international conference on World Wide Web, WWW ’08, pages 111–120, New York, NY, USA, 2008. ACM.
Ivan Titov and Ryan McDonald. A joint model of text and aspect ratings for sentiment summarization. In Proceedings of ACL-08: HLT, pages 308–316, Columbus, Ohio, June 2008. Association for Computational Linguistics.
Qiaozhu Mei, Xu Ling, Matthew Wondra, Hang Su, and ChengXiang Zhai. Topic sentiment mixture: modeling facets and opinions in weblogs. In Proceedings of the 16th international conference on World Wide Web, WWW ’07, pages 171–180, New York, NY, USA, 2007. ACM.
Chenghua Lin and Yulan He. Joint sentiment/topic model for sentiment analysis. In Proceeding of the 18th ACM conference on Information and knowledge management, CIKM ’09, pages 375–384, New York, NY, USA, 2009. ACM.
Xin Zhao, Jing Jiang, Hongfei Yan, and Xiaoming Li. Jointly modeling aspects and opinions with a MaxEnt-LDA hybrid. In Proceedings of the 2010 Conference on Empirical Methods in Natural Language Processing, pages 56–65, Cambridge, MA, October 2010. Association for Computational Linguistics.
2. Academic article mining
Michal Rosen-Zvi, Tom Griffiths, Mark Steyvers, and Padhraic Smyth. The author-topic model for authors and documents. In UAI, 2004.
Ramesh M. Nallapati, Amr Ahmed, Eric P. Xing, and William W. Cohen. Joint latent topic models for text and citations. In Proceeding of the 14th ACM SIGKDD international conference on Knowledge discovery and data mining, KDD ’08, pages 542–550, New York, NY, USA, 2008. ACM.
T. L. Griffiths and M. Steyvers. Finding scientific topics. Proceedings of the National Academy of Sciences, 101(Suppl. 1):5228–5235, April 2004.
Ding Zhou, Xiang Ji, Hongyuan Zha, and C. Lee Giles. Topic evolution and social interactions: how authors effect research. In Proceedings of the 15th ACM international conference on Information and knowledge management, CIKM ’06, pages 248–257, New York, NY, USA, 2006. ACM.
Gideon S. Mann, David Mimno, and Andrew McCallum. Bibliometric impact measures leveraging topic analysis. In Proceedings of the 6th ACM/IEEE-CS joint conference on Digital libraries, JCDL ’06, pages 65–74, New York, NY, USA, 2006. ACM.
3. Social media
Wayne Xin Zhao, Jing Jiang, Jianshu Weng, Jing He, Ee-Peng Lim, Hongfei Yan, and Xiaoming Li. Comparing Twitter and traditional media using topic models. In ECIR, pages 338–349, 2011.
4. Temporal text streams
David M. Blei and John D. Lafferty. Dynamic topic models. In ICML, 2006.
Xuerui Wang and Andrew McCallum. Topics over time: a non-Markov continuous-time model of topical trends. In Proceedings of the 12th ACM SIGKDD international conference on Knowledge discovery and data mining, KDD ’06, pages 424–433, New York, NY, USA, 2006. ACM.
Xuanhui Wang, ChengXiang Zhai, Xiao Hu, and Richard Sproat. Mining correlated bursty topic patterns from coordinated text streams. In Proceedings of the 13th ACM SIGKDD international conference on Knowledge discovery and data mining, 2007.
Qiaozhu Mei and ChengXiang Zhai. Discovering evolutionary theme patterns from text: an exploration of temporal text mining. In Proceedings of the eleventh ACM SIGKDD international conference on Knowledge discovery in data mining, KDD ’05, pages 198–207, New York, NY, USA, 2005. ACM.
5. Network-structured data
Jonathan Chang and David Blei. Relational topic models for document networks. In AISTATS, 2009.
Qiaozhu Mei, Deng Cai, Duo Zhang, and ChengXiang Zhai. Topic modeling with network regularization. In Proceeding of the 17th international conference on World Wide Web, WWW ’08, pages 101–110, New York, NY, USA, 2008. ACM.