Trying out Sphinx

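I recently downloaded Sphinx and tried it out. Since all of our data is in XML, I used the xmlpipe2 data source type. The configuration file and the data file are shown below.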

source src1
{
    type            = xmlpipe2
    xmlpipe_command = cat /home/admin/sphinx-2.0.1-beta/conf/data
}

index test1
{
    source       = src1
    path         = /home/admin/sphinx-2.0.1-beta/conf/test1
    docinfo      = extern
    mlock        = 0
    min_word_len = 1
    charset_type = utf-8
    html_strip   = 0
}

indexer
{
    mem_limit = 32M
}

searchd
{
    listen            = 9312
    listen            = 9306:mysql41
    log               = @CONFDIR@/log/searchd.log
    query_log         = @CONFDIR@/log/query.log
    read_timeout      = 5
    client_timeout    = 300
    max_children      = 30
    pid_file          = @CONFDIR@/log/searchd.pid
    max_matches       = 1000
    seamless_rotate   = 1
    preopen_indexes   = 1
    unlink_old        = 1
    mva_updates_pool  = 1M
    max_packet_size   = 8M
    max_filters       = 256
    max_filter_values = 4096
    max_batch_queries = 32
}

 

Data file

<?xml version="1.0" encoding="utf-8"?>
<sphinx:docset>

<sphinx:schema>
    <sphinx:field name="subject"/>
    <sphinx:field name="content"/>
    <sphinx:attr name="published" type="timestamp"/>
    <sphinx:attr name="author_id" type="int" bits="16" default="1"/>
</sphinx:schema>

<sphinx:document id="1234">
    <content>this is the main content <![CDATA[[and this <cdata> entry must be handled properly by xml parser lib]]></content>
    <published>1012325463</published>
    <subject>note how field/attr tags can be in <b>randomized</b> order</subject>
    <misc>some undeclared element</misc>
</sphinx:document>

</sphinx:docset>
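As a rough illustration, a data file like the one above could also be produced by a small script instead of being written by hand. The sketch below is only an assumption about how one might do that: the sample rows in DOCS and the function name build_docset are made up for this example, and the schema is a trimmed copy of the one above.

```python
import sys
from xml.sax.saxutils import escape, quoteattr

# Hypothetical sample rows; a real script would read them from your own data.
DOCS = [
    {
        "id": 1234,
        "subject": "first note",
        "content": "main content with <b>markup</b> & symbols",
        "published": 1012325463,
    },
]


def build_docset(docs):
    """Return an xmlpipe2 document stream as a str."""
    lines = [
        '<?xml version="1.0" encoding="utf-8"?>',
        "<sphinx:docset>",
        "<sphinx:schema>",
        '<sphinx:field name="subject"/>',
        '<sphinx:field name="content"/>',
        '<sphinx:attr name="published" type="timestamp"/>',
        "</sphinx:schema>",
    ]
    for doc in docs:
        # quoteattr adds the surrounding quotes and escapes the value.
        lines.append("<sphinx:document id=%s>" % quoteattr(str(doc["id"])))
        for key in ("subject", "content", "published"):
            # escape() replaces &, < and > so the text cannot break the XML.
            lines.append("<%s>%s</%s>" % (key, escape(str(doc[key])), key))
        lines.append("</sphinx:document>")
    lines.append("</sphinx:docset>")
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    # Write raw UTF-8 bytes so the output does not depend on the locale.
    sys.stdout.buffer.write(build_docset(DOCS).encode("utf-8"))
```

If the script were saved somewhere (the location is up to you), xmlpipe_command could point at it in place of the cat command, since the indexer only reads whatever the command prints to standard output.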

 

Note: the data source here is xmlpipe2, so the corresponding charset_type must be utf-8.
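Because the data file also declares encoding="utf-8", it can be worth confirming that its bytes really are valid UTF-8 before running the indexer. The following is a minimal sketch of such a check; the function name first_invalid_utf8_offset is hypothetical and the script is not part of Sphinx.

```python
import sys


def first_invalid_utf8_offset(data: bytes):
    """Return the byte offset of the first invalid UTF-8 sequence, or None."""
    try:
        data.decode("utf-8")
    except UnicodeDecodeError as err:
        return err.start
    return None


if __name__ == "__main__" and len(sys.argv) > 1:
    # Usage: pass the path of the data file as the first argument.
    with open(sys.argv[1], "rb") as handle:
        offset = first_invalid_utf8_offset(handle.read())
    if offset is None:
        print("file is valid UTF-8")
    else:
        print("invalid UTF-8 starting at byte %d" % offset)
```

A file saved in another encoding such as GBK will usually fail this check as soon as it reaches non-ASCII text, which makes the mismatch easier to spot than reading the indexer's parser errors.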

 

 
