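Learning Solr from the Official Docs (2): Documents and Fields

        The previous post (http://blog.csdn.net/peppengliu/article/details/51463918) covered installing a Solr test environment. This post looks at documents and fields in Solr.

        The sections below introduce document, field, and the related concepts:

        document: the basic unit of indexing. A document is a set of descriptions of the data to be indexed, and it is made up of fields. A simple example document: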

        

{"id":"138761234112","goods_name":"product","value":12}
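
        To make the "a document is made up of fields" point concrete, the same sample document can be written in Solr's XML update format, where every field is its own element. This is only an illustrative rendering of the JSON example above:

```xml
<!-- The sample document from above, expressed in Solr's XML update format.
     Each <field> element carries one field of the document. -->
<add>
  <doc>
    <field name="id">138761234112</field>
    <field name="goods_name">product</field>
    <field name="value">12</field>
  </doc>
</add>
```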

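        field: a component of a document that describes the data to be indexed in more detail. The schema defines the data type of every field in a document. Each field is declared with a <field> element, which takes two required attributes, name and type, plus a number of optional attributes. name is the field name of the indexed data, and type is its data type. The optional attributes are described below (excerpted from the official wiki):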

default
The default value for this field if none is provided while adding documents
indexed=true|false
True if this field should be "indexed". If (and only if) a field is indexed, then it is searchable, sortable, and facetable.
stored=true|false
True if the value of the field should be retrievable during a search, or if you're using highlighting or MoreLikeThis.
compressed=true|false
True if this field should be stored using gzip compression. (This will only apply if the field type is compressible; among the standard field types, only TextField and StrField are.)
compressThreshold=<integer>
multiValued=true|false
True if this field may contain multiple values per document, i.e. if it can appear multiple times in a document
omitNorms=true|false
This is arguably an advanced option.
Set to true to omit the norms associated with this field (this disables length normalization and index-time boosting for the field, and saves some memory). Only full-text fields or fields that need an index-time boost need norms.
termVectors=false|true (since Solr 1.1)
If set, include full term vector info.
If enabled, often also used with termPositions="true" and termOffsets="true".
To use interactively, requires TermVectorComponent
Corresponds to TV button in Luke, and V field attribute.
omitTermFreqAndPositions=true|false (since Solr 1.4)
If set, omit term freq, positions and payloads from postings for this field. This can be a performance boost for fields that don't require that information and reduces storage space required for the index. Queries that rely on position that are issued on a field with this option fail with an exception. Prior to Solr 4.0 the queries would silently fail to find documents.
omitPositions=true|false (since Solr 3.4)
If set, omits positions, but keeps term frequencies
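
        As a sketch of how these attributes fit together, the fields of the sample document might be declared as follows. The type names "string" and "int" are assumptions: they presume fieldTypes with those names exist in the schema, as they do in Solr's stock example schema.

```xml
<!-- Hypothetical declarations for the sample document's three fields.
     Assumes fieldTypes named "string" and "int" are defined elsewhere
     in schema.xml. -->
<field name="id"         type="string" indexed="true" stored="true"/>
<field name="goods_name" type="string" indexed="true" stored="true" multiValued="false"/>
<!-- default supplies a value when a document omits this field -->
<field name="value"      type="int"    indexed="true" stored="true" default="0"/>
```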
        Fields fall into the following categories: explicitly defined fields (declared with field), copyField (declared with copyField), and dynamicField (declared with dynamicField).
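
        The two less obvious categories can be sketched as follows. The field name "text" and the "*_s" pattern are illustrative assumptions, and the fieldTypes "string" and "text_general" are presumed to be defined elsewhere in the schema.

```xml
<!-- dynamicField: applies to any incoming field whose name matches the
     pattern, so fields such as "color_s" need no explicit declaration. -->
<dynamicField name="*_s" type="string" indexed="true" stored="true"/>

<!-- copyField: copies the value of goods_name into a catch-all field at
     index time. The destination must itself be a declared field. -->
<field name="text" type="text_general" indexed="true" stored="false" multiValued="true"/>
<copyField source="goods_name" dest="text"/>
```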

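        field analysis: the analyzer for field values, which defines how the values of a field are analyzed. It must be defined whenever a field needs extra processing such as tokenizing or filtering. A typical configuration looks like this: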
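
        A sketch of such a definition, modeled on the text_general type in the example schema shipped with Solr. It is not necessarily the exact listing intended here, and the file names stopwords.txt and synonyms.txt are that example's, assumed to exist in the core's conf directory:

```xml
<!-- Sketch of a text_general field type with separate index-time and
     query-time analyzers, modeled on Solr's stock example schema. -->
<fieldType name="text_general" class="solr.TextField" positionIncrementGap="100">
  <analyzer type="index">
    <!-- Split text into tokens, drop stop words, then lowercase -->
    <tokenizer class="solr.StandardTokenizerFactory"/>
    <filter class="solr.StopFilterFactory" ignoreCase="true" words="stopwords.txt"/>
    <filter class="solr.LowerCaseFilterFactory"/>
  </analyzer>
  <analyzer type="query">
    <tokenizer class="solr.StandardTokenizerFactory"/>
    <filter class="solr.StopFilterFactory" ignoreCase="true" words="stopwords.txt"/>
    <!-- Synonyms are expanded only at query time in this sketch -->
    <filter class="solr.SynonymFilterFactory" synonyms="synonyms.txt" ignoreCase="true" expand="true"/>
    <filter class="solr.LowerCaseFilterFactory"/>
  </analyzer>
</fieldType>
```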


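        This configuration defines a field type named text_general. When a field's type is text_general, the field's values are automatically processed with the analyzers defined in that fieldType element.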


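        All of the settings above live in schema.xml. For the other configuration options, see: http://wiki.apache.org/solr/SchemaXml

        This post is mainly based on http://wiki.apache.org/solr/SchemaXml and https://cwiki.apache.org/confluence/display/solr/Apache+Solr+Reference+Guide. If anything in this summary is lacking, corrections in the comments are welcome.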

        
