Lin-Hai Song
P.O. Box 2704 No.6, South Road, KeXueYuan, Beijing, 100190
Phone: 010-62600994
Mobile Phone: 15810436414
Email: [email protected] and [email protected]
Personal Site: http://sites.google.com/site/linhaisongsite/cv
OBJECTIVE
A Ph.D. student position in areas related to Statistical Learning and Complex Networks
EDUCATION
Institute of Computing Technology, Chinese Academy of Sciences 9/2007 - Present
· Studying in Key Laboratory of Net Science and Technology
· Supervised by Professor Xue-Qi Cheng
· Overall GPA: 88.7/100
· Mathematics Curriculum GPA: 95.7/100
Huazhong University of Science and Technology 9/2003 – 6/2007
· Majoring in Software Engineering
· Overall GPA: 92.1/100
· Overall Rank: 1/305
· Mathematics Curriculum GPA: 98/100
RESEARCH
Refereed Publications
· Linhai Song, Xueqi Cheng, Yan Guo, Yue Liu and Guodong Ding, “ContentEx: A Framework for Automatic Content Extraction Programs,” In ISI'2009: Proceedings of IEEE Intelligence and Security Informatics 2009, Dallas, Texas, USA, 2009.
· Linhai Song, Xueqi Cheng, Yan Guo, Bo Wu and Yu Wang, “Blog Post Extraction Using Title Finding,” In CCIR'2009: Proceedings of Chinese Conference on Information Retrieval, Shanghai, China, 2009.
· Bo Wu, Xueqi Cheng, Yu Wang, Yan Guo and Linhai Song, “Simultaneous Product Attribute Name and Value Extraction from Web Pages,” In WI-IAT: IEEE/WIC/ACM International Joint Conference on Web Intelligence and Intelligent Agent Technology, 2009.
· Yu Wang, Bingxing Fang, Bo Wu, Linhai Song and Yan Guo, “Schema Matching Incorporating with Attribute Distribution Features,” In CCIR'2009: Proceedings of Chinese Conference on Information Retrieval, Shanghai, China, 2009. (In Chinese)
Work in Progress
· Yu Wang, Xueqi Cheng, Yan Guo, Linhai Song and Bo Wu, “Recent Advances in Web Information Extraction,” submitted to Journal of Chinese Information Processing. (In Chinese)
· Yu Wang, Bingxing Fang, Linhai Song, Bo Wu and Yan Guo, “Topic Relevance Based Content Extraction,” submitted to High Technology Letters. (In Chinese)
Demos
· Content Extraction from Web Pages: an online demonstration of the content extraction module in the Golaxy Network Monitoring System. Demo [here].
· Online Web Page Annotation Tool: an online demonstration of a wrapper induction system for the SoftMealy extraction program. Still under construction.
PROJECTS
POC: An Archive for Public Opinions
· My master's thesis
· Builds an archive of News Articles, Blog Posts and Forum Posts and Replies, all of which are sources of public opinion
· First-order logic is used to express heuristic rules for crawling and content extraction as a set of formulae
· Suitable learning algorithms are used to learn a weight for each formula
TREC'09 at CAS-ICT
· Responsible for data preprocessing and content extraction
· Participated in the Faceted Blog Distillation Task
· A centroid classifier refined by the DragPushing strategy is used to judge the “Personal” and “In-depth” qualities of each blog post in the ad-hoc list.
Golaxy Network Monitoring System
· Responsible for the content extraction module
· Statistical models and heuristic rules are used to extract News Articles, Blog Posts and Forum Posts and Replies from web pages
Hash Functions Used in Data Packet Processing Engine
· My undergraduate thesis
· Advised by Professor Zhong-Ping Qin
· Designed an efficient hash function to decide which flow a data packet belongs to
· Won the Third-Class Outstanding Scientific Results award of Hubei Province, China, 2007
AWARDS AND HONORS
· HUST Distinguished Graduate in 2007 (Top 10% of graduates)
· HUST First-Class Scholarship in 2004, 2005 and 2006 (Top 10% undergraduates)
· HUST Top Academic Student Award in 2004, 2005 and 2006 (Top 1% undergraduates)
· HUST Excellent Student Leader Award in 2005 and 2006
STANDARD TESTS
· GRE: 460 (Verbal) + 800 (Quantitative) + 3.5 (Analytical Writing)
· TOEFL iBT: 28 (Reading) + 24 (Listening) + 22 (Speaking) + 25 (Writing)