Using Coreseek for Chinese full-text search

Coreseek in our application (touch-screen HTML5 version)

For installation details, see: http://www.coreseek.cn/products-install/install_on_bsd_linux/

The steps below use CentOS 6.2 as the example (the installation commands are taken from the official site):

1. Install the dependency packages

yum install make gcc g++ gcc-c++ libtool autoconf automake imake mysql-devel libxml2-devel expat-devel

2. Install coreseek
$ wget http://www.coreseek.cn/uploads/csft/3.2/coreseek-3.2.14.tar.gz
##or: http://www.coreseek.cn/uploads/csft/4.0/coreseek-4.0.1-beta.tar.gz
##or: http://www.coreseek.cn/uploads/csft/4.0/coreseek-4.1-beta.tar.gz
$ tar xzvf coreseek-3.2.14.tar.gz  #or coreseek-4.0.1-beta.tar.gz / coreseek-4.1-beta.tar.gz
$ cd coreseek-3.2.14  #or coreseek-4.0.1-beta / coreseek-4.1-beta
 
 
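##Prerequisite: install the OS base development libraries and the MySQL dependency libraries beforehand, so that the MySQL and XML data sources are supported
##Install mmseg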
$ cd mmseg-3.2.14
$ ./bootstrap  #warnings in the output can be ignored; any error must be fixed before continuing
$ ./configure --prefix=/usr/local/mmseg3
$ make && make install
$ cd ..
 
 
##Install coreseek
$ cd csft-3.2.14  #or csft-4.0.1 / csft-4.1
$ sh buildconf.sh  #warnings in the output can be ignored; any error must be fixed before continuing
$ ./configure --prefix=/usr/local/coreseek --without-unixodbc --with-mmseg --with-mmseg-includes=/usr/local/mmseg3/include/mmseg/ --with-mmseg-libs=/usr/local/mmseg3/lib/ --with-mysql  ##if configure reports a MySQL problem, see the MySQL data source installation notes
$ make && make install
$ cd ..
 
 
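##Test mmseg word segmentation and coreseek search (set the locale to zh_CN.UTF-8 beforehand so that Chinese text displays correctly)
$ cd testpack
$ cat var/test/test.xml  #Chinese text should display correctly at this point
$ /usr/local/mmseg3/bin/mmseg -d /usr/local/mmseg3/etc var/test/test.xml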
$ /usr/local/coreseek/bin/indexer -c etc/csft.conf --all
$ /usr/local/coreseek/bin/search -c etc/csft.conf 网络搜索
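
The test commands above only give readable output when the shell runs in a UTF-8 locale. A minimal sketch of a pre-flight check follows; the function name is_utf8_locale is made up for illustration and is not part of coreseek or mmseg.

```shell
#!/bin/sh
# is_utf8_locale VALUE: succeed if VALUE (e.g. "zh_CN.UTF-8") names a UTF-8 locale.
# Hypothetical helper, not shipped with coreseek.
is_utf8_locale() {
    case "$1" in
        *.[Uu][Tt][Ff]-8*|*.[Uu][Tt][Ff]8*) return 0 ;;
        *) return 1 ;;
    esac
}

# Effective character-type locale: LC_ALL overrides LC_CTYPE, which overrides LANG.
current="${LC_ALL:-${LC_CTYPE:-${LANG:-}}}"

if is_utf8_locale "$current"; then
    echo "locale OK: $current"
else
    echo "locale is '$current'; try: export LANG=zh_CN.UTF-8" >&2
fi
```

The check only inspects the environment variables; it does not verify that the zh_CN.UTF-8 locale is actually installed on the machine.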

3. Configuration

[zdh@gy03 coreseek]$ cat etc/movie.conf
#
# Minimal Sphinx configuration sample (clean, simple, functional)
#
source movie
{
    type = mysql

    sql_host = dbIP
    sql_user = dbuser
    sql_pass = dbpassword
    sql_db = dbname
    sql_port = 3306 # optional, default is 3306
    sql_query_pre = SET NAMES utf8
    sql_query = \
        SELECT index_id, movie_id, movie_name, movie_name_alias, movie_name_pinyin, starin, starin_pinyin, director, director_pinyin, movie_type, show_time, region \
        FROM index_movie

    sql_attr_uint = movie_id
    #sql_attr_timestamp = date_added

    sql_query_info = SELECT movie_id, movie_name, starin, director, movie_type, show_time, region FROM index_movie WHERE index_id=$id
}


index movie
{
    source = movie
    path = /usr/local/coreseek/var/data/movie
    charset_type = zh_cn.utf-8
    #charset_table = 0..9, A..Z->a..z, _, a..z, U+410..U+42F->U+430..U+44F, U+430..U+44F
    charset_dictpath = /usr/local/coreseek/etc/
    morphology = none
    docinfo = extern
    mlock = 0
    min_stemming_len = 1
    ngram_len = 0
    min_word_len = 1
    html_strip = 0
    #ngram_chars = U+3000..U+2FA1F
}

indexer
{
    mem_limit = 64M
}

searchd
{
    listen = 9312
    #listen = 9306:mysql41
    log = /usr/local/coreseek/var/log/searchd.log
    query_log = /usr/local/coreseek/var/log/query.log
    read_timeout = 5
    max_children = 30
    pid_file = /usr/local/coreseek/var/log/searchd.pid
    max_matches = 1000
    seamless_rotate = 1
    preopen_indexes = 1
    unlink_old = 1
    #workers = threads # for RT to work
    #binlog_path = /usr/local/coreseek/var/data
}
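
The configuration points at several files under /usr/local/coreseek/var (index path, log, query_log, pid_file), and indexer or searchd will generally fail if a containing directory does not exist. The following is a rough sketch of a check that lists those directories from a config file; conf_dirs is a hypothetical helper written for this article, and the simple line-based parsing assumes each setting sits on a single line as in movie.conf.

```shell
#!/bin/sh
# conf_dirs CONF: print the parent directory of each path, log, query_log and
# pid_file setting found in CONF, one per line, without duplicates.
# Commented-out settings (lines starting with #) are skipped.
conf_dirs() {
    sed -E -n 's/^[[:space:]]*(path|log|query_log|pid_file)[[:space:]]*=[[:space:]]*([^#[:space:]]+).*$/\2/p' "$1" |
    while IFS= read -r value; do
        dirname "$value"
    done | sort -u
}

# Report missing directories; the default config path is an assumption
# based on the "cat etc/movie.conf" command shown in this article.
CONF=${1:-etc/movie.conf}
if [ -r "$CONF" ]; then
    conf_dirs "$CONF" | while IFS= read -r dir; do
        [ -d "$dir" ] || echo "missing directory: $dir"
    done
else
    echo "cannot read $CONF" >&2
fi
```

For the movie.conf above, conf_dirs would list /usr/local/coreseek/var/data and /usr/local/coreseek/var/log.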


4. Starting and stopping

[zdh@gy03 coreseek]$ searchd -c etc/movie.conf #start

[zdh@gy03 coreseek]$ searchd -c etc/movie.conf --stop #stop

[zdh@gy03 coreseek]$ indexer -c /usr/local/coreseek/etc/movie.conf --all --rotate

#If you build the index before starting searchd, omit --rotate
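
Whether to pass --rotate depends on whether searchd is already running, so a rebuild script can decide this from the pid_file set in the configuration. Below is a minimal sketch under that assumption; rotate_flag is a hypothetical helper, and the paths are the ones used in this article.

```shell
#!/bin/sh
# rotate_flag PID_FILE: print "--rotate" if PID_FILE holds the PID of a live
# process, otherwise print nothing. Hypothetical helper, not part of coreseek.
rotate_flag() {
    pid_file=$1
    # kill -0 sends no signal; it only tests whether the process can be signalled.
    if [ -s "$pid_file" ] && kill -0 "$(cat "$pid_file")" 2>/dev/null; then
        echo "--rotate"
    fi
}

# Paths assumed from the configuration shown in this article.
PID_FILE=/usr/local/coreseek/var/log/searchd.pid
CONF=/usr/local/coreseek/etc/movie.conf

# Print the command instead of running it, so the sketch is safe to try.
echo /usr/local/coreseek/bin/indexer -c "$CONF" --all $(rotate_flag "$PID_FILE")
```

One limitation: kill -0 also fails when the process belongs to another user, so the script should run as the same user as searchd (or as root).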


5. Appendix: structure of the index_movie table:

CREATE TABLE `index_movie` (
`index_id` int(11) NOT NULL AUTO_INCREMENT,
`movie_id` int(11) DEFAULT NULL,
`movie_name` varchar(32) DEFAULT NULL,
`movie_name_alias` varchar(255) DEFAULT NULL,
`movie_name_pinyin` varchar(255) DEFAULT NULL,
`starin` varchar(128) DEFAULT NULL,
`starin_pinyin` varchar(256) DEFAULT NULL,
`director` varchar(64) DEFAULT NULL,
`director_pinyin` varchar(64) DEFAULT NULL,
`movie_type` varchar(64) DEFAULT NULL,
`show_time` varchar(10) DEFAULT NULL,
`region` varchar(32) DEFAULT NULL,
`movie_desc` varchar(1024) DEFAULT NULL,
`weight` int(11) DEFAULT '0',
`dp_class_type` varchar(128) DEFAULT NULL,
`dp_district_type` varchar(128) DEFAULT NULL,
`dp_age_type` varchar(64) DEFAULT NULL,
`show_time_format` varchar(64) DEFAULT NULL,
`resource_flag` int(10) DEFAULT '0' COMMENT '1 0',
PRIMARY KEY (`index_id`),
KEY `index_movie_id` (`movie_id`)
) ENGINE=InnoDB AUTO_INCREMENT=196606 DEFAULT CHARSET=utf8;




This article comes from the "zhangdh开放空间" blog; please contact the author before reposting.
