This is a running list of information extraction papers from well-known conferences in recent years; it is updated continually.
[1] Rui Cai, Jiang-Ming Yang, Wei Lai, Yida Wang, and Lei Zhang. iRobot: An Intelligent Crawler for Web Forums. WWW 2008.
[2] Yan Guo, Kui Li, Kai Zhang, and Gang Zhang. Board forum crawling: a Web crawling method for Web forum. In Proc. 2006 IEEE/WIC/ACM Int. Conf. Web Intelligence, pages 745-748, Hong Kong, Dec. 2006.
[3] Ying Liu, Kun Bai, Prasenjit Mitra, and C. Lee Giles. Automatic Searching of Tables in Digital Libraries. WWW 2007.
[4] Ying Liu, Prasenjit Mitra, C. Lee Giles, and Kun Bai. Automatic Extraction of Table Metadata from Digital Documents. The 6th ACM/IEEE-CS joint conference on Digital libraries (JCDL’06).
[5] Bingjun Sun, Qingzhao Tan, Prasenjit Mitra, and C. Lee Giles. Extraction and Search of Chemical Formulae in Text Documents on the Web. WWW 2007.
[6] Yaoyong Li and Kalina Bontcheva. Hierarchical, Perceptron-like Learning for Ontology Based Information Extraction. WWW 2007.
[7] Wolfgang Gatterbauer, Paul Bohunsky, Marcus Herzog, Bernhard Krupl, and Bernhard Pollak. Towards Domain-Independent Information Extraction from Web Tables. WWW 2007.
[8] Zaiqing Nie, Yunxiao Ma, Shuming Shi, Ji-Rong Wen, and Wei-Ying Ma. Web Object Retrieval. WWW 2007.
[9] Utku Irmak and Torsten Suel. Interactive Wrapper Generation with Minimal User Effort. WWW 2006.
[10] Zhang Kuo, Wu Gang, and Li JuanZi. Logical Structure Based Semantic Relationship Extraction from Semi-Structured Documents. WWW 2006.
[11] Marek Kowalkiewicz, Maria E. Orlowska, Tomasz Kaczmarek, and Witold Abramowicz. Robust Web Content Extraction. WWW 2006.
[12] Suhit Gupta, Hila Becker, Gail Kaiser, and Salvatore Stolfo. Verifying Genre-based Clustering Approach to Content Extraction. WWW 2006.
[13] Jochen L. Leidner. Resource Monitoring in Information Extraction. SIGIR 2007.
[14] Jennifer Chu-Carroll and John Prager. An Experimental Study of the Impact of Information Extraction Accuracy on Semantic Search Performance. CIKM 2007.
[15] Meishan Hu, Aixin Sun and Ee-Peng Lim. Comments-Oriented Blog Summarization by Sentence Extraction. CIKM 2007.
[16] Marius Pasca, Benjamin Van Durme, and Nikesh Garera. The Role of Documents vs. Queries in Extracting Class Attributes from Text. CIKM 2007.
[17] Sreenivas Gollapudi, Rina Panigrahy. Exploiting Asymmetry in Hierarchical Topic Extraction. CIKM 2006.
[18] Li Zhuang, Feng Jing, Xiao-Yan Zhu. Movie Review Mining and Summarization. CIKM 2006.
[19] Mstislav Maslennikov, and Tat-Seng Chua. A Multi-resolution Framework for Information Extraction from Free Text. ACL 2007.
[20] Yu Wang, Bingxing Fang, Xueqi Cheng, Li Guo, and Hongbo Xu. Incremental Web Page Template Detection. WWW 2008.
[21] Rupesh R. Mehta, Amit Madaan. Web Page Sectioning Using Regex-based Template. WWW 2008.
[22] Zhou GuoDong, Su Jian, Zhang Min. Modeling Commonality among Related Classes in Relation Extraction. ACL 2006.
[23] Jinxiu Chen, Donghong Ji, Chew Lim Tan, Zhengyu Niu. Relation Extraction Using Label Propagation Based Semi-supervised Learning. ACL 2006.
[24] Zhenmei Gu, Nick Cercone. Segment-based Hidden Markov Models for Information Extraction. ACL 2006.
[25] Jizhou Huang, Ming Zhou, and Dan Yang. Extracting Chatbot Knowledge from Online Discussion Forums. IJCAI 2007.
[26] Michele Banko, Michael J Cafarella, Stephen Soderland, Matt Broadhead and Oren Etzioni. Open Information Extraction from the Web. IJCAI 2007.
[27] Doug Downey, Oren Etzioni, and Stephen Soderland. A Probabilistic Model of Redundancy in Information Extraction. IJCAI 2005.
[28] Sanda Harabagiu, Cosmin Adrian Bejan, and Paul Morarescu. Shallow Semantics for Relation Extraction. IJCAI 2005.
[29] Jun Zhu, Zaiqing Nie, Bo Zhang, and Ji-Rong Wen. Dynamic Hierarchical Markov Random Fields and their Application to Web Data Extraction. ICML 2007.
[30] Shui-Lung Chuang, Kevin Chen-Chuan Chang, and ChengXiang Zhai. Collaborative Wrapping: A Turbo Framework for Web Data Extraction. ICDE 2007.
[31] Shui-Lung Chuang, Kevin Chen-Chuan Chang, and ChengXiang Zhai. Context-Aware Wrapping: Synchronized Data Extraction. VLDB 2007.
[32] Eric Chu, Akanksha Baid, Ting Chen, AnHai Doan, and Jeffrey Naughton. A Relational Approach to Incrementally Extracting and Querying Structure in Unstructured Data. VLDB 2007.
[33] Warren Shen, AnHai Doan, Jeffrey F. Naughton, and Raghu Ramakrishnan. Declarative Information Extraction Using Datalog with Embedded Extraction Predicates. VLDB 2007.
[34] Hongkun Zhao, Weiyi Meng, and Clement Yu. Automatic Extraction of Dynamic Record Sections From Search Engine Result Pages. VLDB 2006.
[35] Marek Kowalkiewicz, Tomasz Kaczmarek, Witold Abramowicz. myPortal: Robust Extraction and Aggregation of Web Content. VLDB 2006.
[36] Shuyi Zheng, Di Wu, Ruihua Song, and Ji-Rong Wen. Towards Joint Optimization of Wrapper Generation and Template Detection. SIGKDD 2007.
[37] Jun Zhu, Zaiqing Nie, Ji-Rong Wen, Bo Zhang, and Hsiao-Wuen Hon. Webpage Understanding: an Integrated Approach. SIGKDD 2007.
[38] Andrew McCallum. Information Extraction, Data Mining and Joint Inference. SIGKDD 2006. Invited Talk.
[39] Jun Zhu, Zaiqing Nie, Ji-Rong Wen, Bo Zhang, and Wei-Ying Ma. Simultaneous Record Detection and Attribute Labeling in Web Data Extraction. SIGKDD 2006.
[40] Rahul Gupta, Sunita Sarawagi. Creating Probabilistic Databases from Information Extraction Models. VLDB 2006.
[41] Boris Chidlovskii, Bruno Roustant, and Marc Brette. Documentum ECI Self-Repairing Wrappers: Performance Analysis. SIGMOD 2006.
[42] Y. Zhai and B. Liu. Web data extraction based on partial tree alignment. In Proc. of the 14th Int. World Wide Web Conf., 2005.
[43] B. Liu and Y. Zhai. NET - a system for extracting web data from flat and nested data records. In Proc. of the 6th Int. Conf. on Web Information Systems Engineering, 2005.