Solr 5.3.1 Source Code Reading - 2

Solr 5.3.1 request flow notes, part 2
 


1. Main actions
/$[new_core]/update/extract, /$[new_core]/update/
where $[new_core] is the name of the actual core.
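
To make the path layout concrete, here is a minimal sketch that assembles the extract URL for one core. The base URL http://localhost:8983/solr, the helper name extractUrl, and the literal.id / commit request parameters are not from the notes above; they are assumptions for illustration (the two parameters are the usual ones for this handler, as far as I know).

```java
import java.io.UnsupportedEncodingException;
import java.net.URLEncoder;

public class ExtractUrlSketch {
    // Hypothetical helper: builds the URL of the extract action for one core.
    // baseUrl and coreName are placeholders supplied by the caller.
    static String extractUrl(String baseUrl, String coreName, String docId) {
        try {
            return baseUrl + "/" + coreName + "/update/extract"
                    + "?literal.id=" + URLEncoder.encode(docId, "UTF-8")
                    + "&commit=true";
        } catch (UnsupportedEncodingException e) {
            // UTF-8 is always available on a JVM, so this should not happen.
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        // "new_core" stands in for $[new_core].
        System.out.println(extractUrl("http://localhost:8983/solr", "new_core", "doc 1"));
    }
}
```

The file to index would then be sent as the body of a POST request to that URL.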


2. The corresponding entry in the configuration file conf/solrconfig.xml:
<requestHandler name="/update/extract" startup="lazy" class="solr.extraction.ExtractingRequestHandler">
<lst name="defaults">
<str name="lowernames">true</str>
<str name="fmap.meta">ignored_</str>
<str name="fmap.content">_text_</str>
</lst>
</requestHandler>
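
The defaults above control how extracted field names are renamed. The following is a simplified model of that renaming as I understand it (lowernames first, then an fmap.<name> lookup), not the actual Solr code. The class and method names are hypothetical, and other options such as uprefix are not modeled.

```java
import java.util.HashMap;
import java.util.Map;

public class FieldMappingSketch {
    // Hypothetical helper: applies "lowernames", then looks up "fmap.<name>".
    static String mapFieldName(String name, boolean lowerNames, Map<String, String> params) {
        String fname = name;
        if (lowerNames) {
            // Lowercase letters and digits, replace everything else with '_'.
            StringBuilder sb = new StringBuilder();
            for (int i = 0; i < name.length(); i++) {
                char ch = name.charAt(i);
                sb.append(Character.isLetterOrDigit(ch) ? Character.toLowerCase(ch) : '_');
            }
            fname = sb.toString();
        }
        // Fall back to the (possibly lowercased) name when no mapping is configured.
        String mapped = params.get("fmap." + fname);
        return mapped != null ? mapped : fname;
    }

    // The two fmap entries from the <lst name="defaults"> block.
    static Map<String, String> defaults() {
        Map<String, String> params = new HashMap<>();
        params.put("fmap.meta", "ignored_");
        params.put("fmap.content", "_text_");
        return params;
    }

    public static void main(String[] args) {
        Map<String, String> params = defaults();
        System.out.println(mapFieldName("content", true, params));
        System.out.println(mapFieldName("Content-Type", true, params));
        System.out.println(mapFieldName("meta", true, params));
    }
}
```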


Source file location:
solr/contrib/extraction/src/java/org/apache/solr/handler/extraction/ExtractingRequestHandler.java


3. Main entry point (a servlet filter): SolrDispatchFilter


abstract interface javax.servlet.Filter ...
public abstract void init(FilterConfig arg0) throws ServletException;
public abstract void doFilter(ServletRequest arg0, ServletResponse arg1, FilterChain arg2) throws...
public abstract void destroy();


abstract class BaseSolrFilter implements Filter ...
Contains only a static initializer block (related to logging configuration)


SolrDispatchFilter extends BaseSolrFilter ...


public void init(FilterConfig config) ...
|-> o.a.s.c.CoreContainer
cores = new CoreContainer(nodeConfig, extraProperties, true);
cores.load();
|-> org.apache.solr.handler.component.HttpShardHandlerFactory
|-> org.apache.solr.update.UpdateShardHandler
|-> zkSys.initZooKeeper(this, solrHome, cfg.getCloudConfig());
|-> o.a.s.c.CoresLocator


public void doFilter(ServletRequest request, ServletResponse response, FilterChain chain) ...
|-> HttpSolrCall call = getHttpSolrCall(...
|-> Action result = call.call();
|-> private void init() throws...
|-> handler = cores.getRequestHandler(path);


public void destroy() {...
|-> cores.shutdown();
DirectUpdateHandler2
SolrCoreState
CachingDirectoryFactory
NRTCachingDirectoryFactory 
ContextHandler
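
The init / doFilter / destroy life cycle traced above can be modeled without the servlet API. This is a rough sketch only: MiniContainer, MiniDispatchFilter and Handler are hypothetical stand-ins for CoreContainer, SolrDispatchFilter and a request handler, and the "pass-through" result is my assumption about what happens to paths no handler claims.

```java
import java.util.HashMap;
import java.util.Map;

public class DispatchSketch {
    interface Handler {
        String handle(String path);
    }

    // Stand-in for CoreContainer: owns the handler registry and its life cycle.
    static class MiniContainer {
        private final Map<String, Handler> handlers = new HashMap<>();

        void load() {
            handlers.put("/update/extract", path -> "handled " + path);
        }

        Handler getRequestHandler(String path) {
            return handlers.get(path);
        }

        void shutdown() {
            handlers.clear();
        }
    }

    // Stand-in for SolrDispatchFilter.
    static class MiniDispatchFilter {
        private MiniContainer cores;

        // init(): create the container and load it once.
        void init() {
            cores = new MiniContainer();
            cores.load();
        }

        // doFilter(): look up the handler by path and call it.
        String doFilter(String path) {
            Handler handler = cores.getRequestHandler(path);
            if (handler == null) {
                // A real servlet filter would typically pass the request down the chain here.
                return "pass-through";
            }
            return handler.handle(path);
        }

        // destroy(): release everything the container holds.
        void destroy() {
            cores.shutdown();
        }
    }

    public static void main(String[] args) {
        MiniDispatchFilter filter = new MiniDispatchFilter();
        filter.init();
        System.out.println(filter.doFilter("/update/extract"));
        System.out.println(filter.doFilter("/index.html"));
        filter.destroy();
    }
}
```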




4. Class hierarchy
SolrInfoMBean                                                     interface
SolrRequestHandler extends SolrInfoMBean
RequestHandlerBase implements SolrRequestHandler, SolrInfoMBean, NestedRequestHandler
ContentStreamHandlerBase extends RequestHandlerBase
ExtractingRequestHandler extends ContentStreamHandlerBase implements SolrCoreAware


Main methods:
interface SolrRequestHandler


public void handleRequest(SolrQueryRequest req, SolrQueryResponse rsp);


abstract class RequestHandlerBase implements SolrRequestHandler...


public abstract void handleRequestBody( SolrQueryRequest req, SolrQueryResponse rsp ) throws Exception;


public void handleRequest(SolrQueryRequest req, SolrQueryResponse rsp) {...
|-> handleRequestBody( req, rsp );
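
The relationship between handleRequest and handleRequestBody is the template method pattern: the base class fixes the outer flow and subclasses fill in the body. A minimal sketch follows, with hypothetical names (BaseHandler, EchoHandler) and plain strings instead of SolrQueryRequest / SolrQueryResponse. The error handling is a simplification, not what RequestHandlerBase actually does.

```java
public class TemplateMethodSketch {
    static abstract class BaseHandler {
        // Fixed outer flow: run the subclass body and turn failures into an error result.
        public final String handleRequest(String req) {
            try {
                return handleRequestBody(req);
            } catch (Exception e) {
                return "error: " + e.getMessage();
            }
        }

        // The only part a concrete handler has to provide.
        protected abstract String handleRequestBody(String req) throws Exception;
    }

    static class EchoHandler extends BaseHandler {
        @Override
        protected String handleRequestBody(String req) throws Exception {
            if (req.isEmpty()) {
                throw new IllegalArgumentException("empty request");
            }
            return "echo: " + req;
        }
    }

    public static void main(String[] args) {
        BaseHandler handler = new EchoHandler();
        System.out.println(handler.handleRequest("ping"));
        System.out.println(handler.handleRequest(""));
    }
}
```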




5. Data-related methods
abstract class ContentStreamHandlerBase extends RequestHandlerBase ...


public void handleRequestBody(SolrQueryRequest req, SolrQueryResponse rsp) throws Exception {...
|-> ContentStreamLoader documentLoader = newLoader(req, processor);
|-> documentLoader.load(req, rsp, stream, processor);


ExtractingRequestHandler extends ContentStreamHandlerBase ...


protected ContentStreamLoader newLoader(SolrQueryRequest req, UpdateRequestProcessor processor) {...
|-> new ExtractingDocumentLoader(req, processor, config, factory);


As shown above, the overriding handler creates its matching document loader class, and that class's load method does the actual processing of the input content.
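
This is a factory method: the base handler owns the loop over the input streams, and the subclass only decides which loader to create. A small sketch of that split, using hypothetical names (StreamHandlerBase, UpperCaseHandler, Loader) and strings in place of ContentStream and UpdateRequestProcessor:

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class LoaderFactorySketch {
    // Stand-in for ContentStreamLoader.
    interface Loader {
        void load(String content, List<String> sink);
    }

    // Stand-in for ContentStreamHandlerBase.
    static abstract class StreamHandlerBase {
        // Factory method: each concrete handler supplies its own loader.
        protected abstract Loader newLoader();

        // Shared flow: create one loader, then feed every stream through it.
        public List<String> handleRequestBody(List<String> streams) {
            List<String> sink = new ArrayList<>();
            Loader documentLoader = newLoader();
            for (String stream : streams) {
                documentLoader.load(stream, sink);
            }
            return sink;
        }
    }

    // Stand-in for ExtractingRequestHandler; upper-casing replaces real extraction.
    static class UpperCaseHandler extends StreamHandlerBase {
        @Override
        protected Loader newLoader() {
            return (content, sink) -> sink.add(content.toUpperCase());
        }
    }

    public static void main(String[] args) {
        System.out.println(new UpperCaseHandler().handleRequestBody(Arrays.asList("a", "b")));
    }
}
```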


Next, let's look at what this document loader actually does.
abstract class ContentStreamLoader ...


  public abstract void load(SolrQueryRequest req, 
      SolrQueryResponse rsp, 
      ContentStream stream, 
      UpdateRequestProcessor processor) throws Exception;


ExtractingDocumentLoader extends ContentStreamLoader ...


public void load(SolrQueryRequest req, SolrQueryResponse rsp, ...




The main methods called are (to be analyzed in detail later):
1)
SolrContentHandler handler = factory.createSolrContentHandler(metadata, params, req.getSchema());
ContentHandler parsingHandler = handler;




2)
parser.parse(inputStream, parsingHandler, metadata, context);




tika-core-1.7.jar (Tika was formerly a subproject of Lucene)
org.apache.tika.parser.Parser




3)
addDoc(handler);
|-> processor.processAdd(template);
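
The three steps above (create a content handler, let the parser push events into it, then add the collected document) can be sketched using only the SAX types from the JDK. Everything here is a stand-in: CollectingHandler plays the role of SolrContentHandler, the parse method imitates a Tika parser by emitting one characters event per input line, and a plain list replaces processor.processAdd. None of it is the real Solr or Tika code.

```java
import java.util.ArrayList;
import java.util.List;
import org.xml.sax.ContentHandler;
import org.xml.sax.SAXException;
import org.xml.sax.helpers.DefaultHandler;

public class LoadPipelineSketch {
    // Stand-in for SolrContentHandler: collects the character data it receives.
    static class CollectingHandler extends DefaultHandler {
        final StringBuilder text = new StringBuilder();

        @Override
        public void characters(char[] ch, int start, int length) {
            text.append(ch, start, length);
        }
    }

    // Stand-in for a Tika parser: emits each input line as a SAX characters event.
    static void parse(String input, ContentHandler handler) throws SAXException {
        handler.startDocument();
        for (String line : input.split("\n")) {
            char[] chars = (line.trim() + " ").toCharArray();
            handler.characters(chars, 0, chars.length);
        }
        handler.endDocument();
    }

    // Runs the three steps; the returned list plays the role of the index.
    static List<String> load(String input) {
        List<String> index = new ArrayList<>();
        try {
            CollectingHandler handler = new CollectingHandler(); // 1) create the handler
            ContentHandler parsingHandler = handler;
            parse(input, parsingHandler);                        // 2) parse into it
            index.add(handler.text.toString().trim());           // 3) add the document
        } catch (SAXException e) {
            throw new IllegalStateException(e);
        }
        return index;
    }

    public static void main(String[] args) {
        System.out.println(load("Hello\n  Solr "));
    }
}
```

The point of the handler indirection is that the parser never needs to know about Solr documents; it only emits events, and the handler decides what to keep.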
