Chinese Word Segmentation in .NET with Lucene.Net and the PanGu Segmentation Library


This example uses the following libraries and files:

Lucene.Net.dll http://www.apache.org/dist/incubator/lucene.net/binaries/2.9.4g-incubating/

PanGu.dll http://pangusegment.codeplex.com/releases/view/50811

PanGu.Lucene.Analyzer.dll

Dictionary files http://pangusegment.codeplex.com/releases/view/31531

Sample code:

using System;
using System.IO;
using Lucene.Net.Analysis;
using Lucene.Net.Analysis.PanGu;

namespace FcCApp
{
    class Program
    {
        static void Main(string[] args)
        {
            string text = "基于java语言开发的轻量级的中文分词工具包";
            Analyzer analyzer = new PanGuAnalyzer(); // use the PanGu segmenter
            StringReader reader = new StringReader(text);
            // No index field is involved here, so the field name is left empty.
            TokenStream ts = analyzer.ReusableTokenStream("", reader);
            Token t;
            // Next() and TermText() belong to the older token API of Lucene.Net 2.9.x.
            while ((t = ts.Next()) != null)
            {
                Console.Write(t.TermText() + "|");
            }
        }
    }
}


Output:

基于|java|语言|开发|的|轻量级|的|中文|分词|工具包|
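
To give a rough feel for how a dictionary file can drive this kind of output, here is a minimal sketch of forward maximum matching, a simple dictionary-based segmentation technique. It is not PanGu's actual algorithm. The class name MaxMatchSketch, the tiny in-memory dictionary, and the rule that keeps ASCII letter runs together are all assumptions made for illustration, and the sketch needs neither Lucene.Net nor PanGu.

```csharp
using System;
using System.Collections.Generic;

static class MaxMatchSketch
{
    // Hypothetical mini dictionary; a real dictionary file holds far more entries.
    static readonly HashSet<string> Dict = new HashSet<string>
    {
        "基于", "语言", "开发", "轻量级", "中文", "分词", "工具", "工具包"
    };

    // Length of the longest entry in the dictionary above.
    const int MaxWordLength = 3;

    static bool IsAsciiLetterOrDigit(char c)
    {
        return c < 128 && char.IsLetterOrDigit(c);
    }

    // Forward maximum matching: at each position take the longest dictionary
    // word that starts there; if none matches, emit a single character.
    public static List<string> Segment(string text)
    {
        var tokens = new List<string>();
        int i = 0;
        while (i < text.Length)
        {
            // Assumption: a run of ASCII letters/digits (e.g. "java") is one token.
            if (IsAsciiLetterOrDigit(text[i]))
            {
                int j = i;
                while (j < text.Length && IsAsciiLetterOrDigit(text[j])) j++;
                tokens.Add(text.Substring(i, j - i));
                i = j;
                continue;
            }

            // Try the longest candidate first, then shrink down to one character.
            int len = Math.Min(MaxWordLength, text.Length - i);
            while (len > 1 && !Dict.Contains(text.Substring(i, len))) len--;
            tokens.Add(text.Substring(i, len));
            i += len;
        }
        return tokens;
    }

    static void Main()
    {
        List<string> tokens = Segment("基于java语言开发的轻量级的中文分词工具包");
        Console.WriteLine(string.Join("|", tokens));
    }
}
```

With this hand-picked dictionary the sketch splits the sample sentence into the same ten tokens as the output above. A production segmenter such as PanGu relies on a much larger dictionary and more elaborate disambiguation, so the two will generally differ on other input.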


Sample download:


http://download.csdn.net/detail/lijun7788/4412762

