1.6.9 UIMA Integration

1. UIMA Integration

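  You can integrate Solr with Apache's Unstructured Information Management Architecture (UIMA). UIMA lets you define custom pipelines of analysis engines that incrementally add metadata to your documents in the form of annotations.

  For more information about Solr UIMA integration, see https://wiki.apache.org/solr/SolrUIMA.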

1.1 Configuring UIMA

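  Solr's UIMA UpdateRequestProcessor is a custom update request processor. It takes the documents being indexed, sends them to a UIMA pipeline, and returns them enriched with the extracted metadata. To configure UIMA, follow these steps: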

  1. Copy /solr-4.x.y/dist/solr-uima-4.x.y.jar and the libraries under contrib/uima/lib into a Solr library directory, or point to them with <lib/> directives in solrconfig.xml:

<lib dir="../../contrib/uima/lib" />
<lib dir="../../dist/" regex="solr-uima-\d.*\.jar" />

 

  2. In schema.xml, add the metadata fields:

<field name="language" type="string" indexed="true" stored="true"  required="false" />
<field name="concept" type="string" indexed="true" stored="true" multiValued="true" required="false" />
<field name="sentence" type="text" indexed="true" stored="true" multiValued="true" required="false" />

 

  3. In solrconfig.xml, add the following snippet:

<updateRequestProcessorChain name="uima">
    <processor
        class="org.apache.solr.uima.processor.UIMAUpdateRequestProcessorFactory">
        <lst name="uimaConfig">
            <!-- API keys passed to the external analysis services at runtime -->
            <lst name="runtimeParameters">
                <str name="keyword_apikey">VALID_ALCHEMYAPI_KEY</str>
                <str name="concept_apikey">VALID_ALCHEMYAPI_KEY</str>
                <str name="lang_apikey">VALID_ALCHEMYAPI_KEY</str>
                <str name="cat_apikey">VALID_ALCHEMYAPI_KEY</str>
                <str name="entities_apikey">VALID_ALCHEMYAPI_KEY</str>
                <str name="oc_licenseID">VALID_OPENCALAIS_KEY</str>
            </lst>
            <str name="analysisEngine">/org/apache/uima/desc/OverridingParamsExtServicesAE.xml</str>
            <!-- true: keep indexing a document even if its UIMA analysis fails -->
            <bool name="ignoreErrors">true</bool>
            <!-- fields whose content is sent to the UIMA pipeline -->
            <lst name="analyzeFields">
                <bool name="merge">false</bool>
                <arr name="fields">
                    <str>text</str>
                </arr>
            </lst>
            <!-- map UIMA annotation features to the Solr fields added in schema.xml -->
            <lst name="fieldMappings">
                <lst name="type">
                    <str name="name">org.apache.uima.alchemy.ts.concept.ConceptFS</str>
                    <lst name="mapping">
                        <str name="feature">text</str>
                        <str name="field">concept</str>
                    </lst>
                </lst>
                <lst name="type">
                    <str name="name">org.apache.uima.alchemy.ts.language.LanguageFS</str>
                    <lst name="mapping">
                        <str name="feature">language</str>
                        <str name="field">language</str>
                    </lst>
                </lst>
                <lst name="type">
                    <str name="name">org.apache.uima.SentenceAnnotation</str>
                    <lst name="mapping">
                        <str name="feature">coveredText</str>
                        <str name="field">sentence</str>
                    </lst>
                </lst>
            </lst>
        </lst>
    </processor>
    <processor class="solr.LogUpdateProcessorFactory" />
    <processor class="solr.RunUpdateProcessorFactory" />
</updateRequestProcessorChain>

  4. In solrconfig.xml, replace the existing UpdateRequestHandler or create a new one:

<requestHandler name="/update" class="solr.XmlUpdateRequestHandler">
  <lst name="defaults">
    <str name="update.processor">uima</str>
  </lst>
</requestHandler>
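
Step 4 also allows creating a new handler instead of replacing the existing one. The following is a minimal sketch of that variant, not a configuration taken from the original text. It assumes a Solr 4.x installation, where (as far as I know) the unified handler class is solr.UpdateRequestHandler and the chain is selected with the update.chain parameter rather than update.processor. The path /update/uima is a hypothetical name chosen for illustration.

```xml
<!-- Hypothetical dedicated handler: only requests sent to /update/uima
     pass through the "uima" chain; the regular /update handler is left as it is. -->
<requestHandler name="/update/uima" class="solr.UpdateRequestHandler">
  <lst name="defaults">
    <!-- Assumption: Solr 4.x names this parameter update.chain -->
    <str name="update.chain">uima</str>
  </lst>
</requestHandler>
```

Keeping the UIMA chain on its own path means ordinary updates are not slowed down by calls to the external analysis services, and only documents that need enrichment are sent to the dedicated handler.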

 
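
To check the configuration, you can post a small test document to the update handler from step 4. The document below is only a sketch: it assumes the schema uses id as its unique key and already defines the text field listed under analyzeFields, and the id value and sample sentence are made up for illustration.

```xml
<!-- Hypothetical test document in Solr's XML update format -->
<add>
  <doc>
    <!-- Assumption: "id" is the uniqueKey field of the schema -->
    <field name="id">uima-test-1</field>
    <!-- "text" is the field that the uima chain sends to the UIMA pipeline -->
    <field name="text">Apache Solr is an open source search platform. It is written in Java.</field>
  </doc>
</add>
```

After a commit, querying for this document and requesting the language, concept, and sentence fields should show whatever metadata the pipeline produced. Because ignoreErrors is set to true in the chain above, a failed analysis (for example, an invalid API key) would still index the document, only without the extra fields.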

Reposted from: https://www.cnblogs.com/a198720/p/4323208.html
