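Elasticsearch provides a rich set of sorting options, most of which rank documents by a relevance score computed with a TF/IDF-style similarity (BM25 is the default since 5.0). Sometimes, however, the business needs a custom ordering: compute a score from your own rule, then sort by that score. There are currently two ways to implement custom sorting:
- Function Score
- Script
  - Groovy scripts
  - Native Scripts
This article focuses on implementing custom sorting in Elasticsearch as a Native Scripts plugin.
Note: Elasticsearch releases move quickly and the implementation differs between versions, so check your Elasticsearch version when following this article. Native Scripts work normally in 5.0 to 5.4, are deprecated in 5.5, and are removed entirely in 6.0. From 5.5 onward, use ScriptEngine instead.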
- Sometimes groovy and expression aren’t enough. For those times you can implement a native script.
- Native Scripts were deprecated in v5.5.0 and removed in v6.0.0. Consider migrating your native scripts to the ScriptEngine.
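The version note above can be written down as a small lookup. This is only a sketch: `NativeScriptSupport`, `Status`, and `statusFor` are hypothetical names invented for illustration and are not part of Elasticsearch. It encodes nothing beyond the 5.0 to 5.4 / 5.5+ / 6.0+ split stated in this article.

```java
/** Hypothetical helper (not an Elasticsearch API) that encodes the version note above. */
final class NativeScriptSupport {

    enum Status { SUPPORTED, DEPRECATED, REMOVED }

    /** Parses "major.minor[.patch]" and returns the native-script status for that release. */
    static Status statusFor(String version) {
        String[] parts = version.trim().split("\\.");
        int major = Integer.parseInt(parts[0]);
        int minor = parts.length > 1 ? Integer.parseInt(parts[1]) : 0;
        if (major < 5) {
            // The article only covers 5.0 and later, so earlier versions are rejected.
            throw new IllegalArgumentException("version outside the range covered here: " + version);
        }
        if (major >= 6) {
            return Status.REMOVED;      // removed in 6.0
        }
        if (minor >= 5) {
            return Status.DEPRECATED;   // deprecated in 5.5, ScriptEngine recommended
        }
        return Status.SUPPORTED;        // 5.0 to 5.4
    }

    public static void main(String[] args) {
        for (String v : new String[] {"5.3.0", "5.5.0", "6.0.0"}) {
            System.out.println(v + " -> " + statusFor(v));
        }
    }
}
```

The plugin in this article targets 5.3.0, which falls in the supported range.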
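Implementing a simple sort with a ScriptPlugin plugin:
Define a "feature" field whose scoring rule we control ourselves. The rules are:
- If the query's feature has the same length as the document's feature, the document scores 90
- If the query's feature is shorter than the document's feature, the document scores 60
- If the query's feature is longer than the document's feature, the document scores 30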
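Before wiring the rule into Elasticsearch, it may help to see it as a plain function. The sketch below is independent of any Elasticsearch class. `FeatureLengthScore` and `score` are names chosen here for illustration, and the sample strings are the ones used in the test later in this article.

```java
/** Standalone sketch of the three scoring rules; no Elasticsearch dependency. */
final class FeatureLengthScore {

    /** Compares the query feature's length with the document feature's length. */
    static double score(String queryFeature, String docFeature) {
        int queryLen = queryFeature.length();
        int docLen = docFeature.length();
        if (queryLen == docLen) {
            return 90;  // equal lengths
        }
        return queryLen < docLen ? 60 : 30;  // query shorter: 60, query longer: 30
    }

    public static void main(String[] args) {
        System.out.println(score("aaaaaa", "123456"));     // equal lengths
        System.out.println(score("aaaaaa", "def789kkk"));  // query is shorter
        System.out.println(score("aaaaaa", "abc"));        // query is longer
    }
}
```

Keeping the rule in a pure function like this also makes it easy to unit test without starting a node.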
Following the Native (Java) Scripts documentation, the plugin is implemented as follows:
package com.zz.localservice.es.plugin;

import org.apache.logging.log4j.LogManager;
import org.apache.logging.log4j.Logger;
import org.elasticsearch.common.Nullable;
import org.elasticsearch.common.xcontent.support.XContentMapValues;
import org.elasticsearch.plugins.Plugin;
import org.elasticsearch.plugins.ScriptPlugin;
import org.elasticsearch.script.AbstractDoubleSearchScript;
import org.elasticsearch.script.ExecutableScript;
import org.elasticsearch.script.NativeScriptFactory;

import java.util.Collections;
import java.util.List;
import java.util.Map;

/**
 * Registers the native script "my_script", which scores a document by comparing
 * the length of the "feature" request parameter with the length of the
 * document's "feature" field.
 */
public class MyNativeScriptPlugin extends Plugin implements ScriptPlugin {

    private static final Logger logger = LogManager.getLogger(MyNativeScriptPlugin.class);

    @Override
    public List<NativeScriptFactory> getNativeScripts() {
        return Collections.singletonList(new MyNativeScriptFactory());
    }

    public static class MyNativeScriptFactory implements NativeScriptFactory {

        @Override
        public ExecutableScript newScript(@Nullable Map<String, Object> params) {
            // Read the "feature" parameter from the request; null if it is absent.
            String feature = params == null ? null : XContentMapValues.nodeStringValue(params.get("feature"), null);
            if (feature == null) {
                logger.info("feature is null!");
            }
            return new MyNativeScript(feature);
        }

        @Override
        public boolean needsScores() {
            // The script does not read the query's own _score.
            return false;
        }

        @Override
        public String getName() {
            // The name referenced by the search request.
            return "my_script";
        }
    }

    public static class MyNativeScript extends AbstractDoubleSearchScript {

        private final String feature;

        public MyNativeScript(String feature) {
            this.feature = feature;
        }

        @Override
        public double runAsDouble() {
            Object sourceValue = source().get("feature");
            // Return 0 instead of throwing when either side is missing or not a string.
            if (feature == null || !(sourceValue instanceof String)) {
                return 0;
            }
            int len1 = feature.length();
            int len2 = ((String) sourceValue).length();
            if (len1 == len2) {
                return 90;
            } else if (len1 < len2) {
                return 60;
            } else {
                return 30;
            }
        }
    }
}
Every Elasticsearch plugin must contain a plugin-descriptor.properties file in the elasticsearch folder of the plugin archive.
The plugin-descriptor.properties file is as follows:
description=${project.description}.
version=${project.version}
name=${project.artifactId}
classname=com.zz.localservice.es.plugin.MyNativeScriptPlugin
java.version=1.8
elasticsearch.version=5.3.0
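A missing descriptor entry is an easy mistake to make, so a quick sanity check before packaging can be useful. The sketch below is a hypothetical helper, not part of Elasticsearch or Maven: `DescriptorCheck` and `missingKeys` are names invented here, and the expected key list is simply the set of keys in the descriptor above.

```java
import java.io.IOException;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Properties;

/** Hypothetical pre-packaging check for plugin-descriptor.properties. */
final class DescriptorCheck {

    // Assumption: these are the keys used by the descriptor shown in this article.
    static final List<String> EXPECTED_KEYS = Arrays.asList(
            "description", "version", "name", "classname", "java.version", "elasticsearch.version");

    /** Returns the expected keys that are absent or blank in the given descriptor text. */
    static List<String> missingKeys(String descriptorText) {
        Properties props = new Properties();
        try {
            props.load(new StringReader(descriptorText));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        List<String> missing = new ArrayList<>();
        for (String key : EXPECTED_KEYS) {
            String value = props.getProperty(key);
            if (value == null || value.trim().isEmpty()) {
                missing.add(key);
            }
        }
        return missing;
    }

    public static void main(String[] args) {
        String descriptor = "description=Only for test.\n"
                + "version=1.0-SNAPSHOT\n"
                + "name=es-sort\n"
                + "java.version=1.8\n"
                + "elasticsearch.version=5.3.0\n";
        // classname is deliberately left out to show what the check reports.
        System.out.println("missing: " + missingKeys(descriptor));
    }
}
```

The check could run against the filtered descriptor in the built zip, where the `${project.*}` placeholders have already been replaced by Maven.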
To make sure both the descriptor file and the jar end up inside the elasticsearch folder, use an assembly descriptor, plugin.xml, configured as follows:
<assembly>
    <id>plugin</id>
    <formats>
        <format>zip</format>
    </formats>
    <includeBaseDirectory>false</includeBaseDirectory>
    <files>
        <file>
            <source>${project.basedir}/src/main/resources/plugin-descriptor.properties</source>
            <outputDirectory>elasticsearch</outputDirectory>
            <filtered>true</filtered>
        </file>
    </files>
    <dependencySets>
        <dependencySet>
            <outputDirectory>elasticsearch</outputDirectory>
            <useProjectArtifact>true</useProjectArtifact>
            <useTransitiveFiltering>true</useTransitiveFiltering>
        </dependencySet>
    </dependencySets>
</assembly>
To compile and package correctly, the pom file must be set up properly. The pom.xml is configured as follows:
<project xmlns="http://maven.apache.org/POM/4.0.0"
         xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
         xsi:schemaLocation="http://maven.apache.org/POM/4.0.0 http://maven.apache.org/xsd/maven-4.0.0.xsd">
    <modelVersion>4.0.0</modelVersion>
    <groupId>com.zz.localservice.ai</groupId>
    <artifactId>es-sort</artifactId>
    <version>1.0-SNAPSHOT</version>
    <name>Plugin: Basic</name>
    <description>Only for test</description>
    <properties>
        <es.version>5.3.0</es.version>
        <lucene.version>6.4.1</lucene.version>
    </properties>
    <dependencies>
        <dependency>
            <groupId>org.elasticsearch</groupId>
            <artifactId>elasticsearch</artifactId>
            <version>${es.version}</version>
            <scope>provided</scope>
        </dependency>
        <dependency>
            <groupId>org.apache.logging.log4j</groupId>
            <artifactId>log4j-api</artifactId>
            <version>2.7</version>
            <scope>provided</scope>
        </dependency>
        <dependency>
            <groupId>org.apache.logging.log4j</groupId>
            <artifactId>log4j-core</artifactId>
            <version>2.7</version>
            <scope>test</scope>
        </dependency>
        <dependency>
            <groupId>org.elasticsearch.test</groupId>
            <artifactId>framework</artifactId>
            <version>${es.version}</version>
            <scope>test</scope>
        </dependency>
        <dependency>
            <groupId>org.apache.lucene</groupId>
            <artifactId>lucene-test-framework</artifactId>
            <version>${lucene.version}</version>
            <scope>test</scope>
        </dependency>
    </dependencies>
    <build>
        <resources>
            <resource>
                <directory>src/main/resources</directory>
                <filtering>false</filtering>
                <excludes>
                    <exclude>*.properties</exclude>
                </excludes>
            </resource>
        </resources>
        <plugins>
            <plugin>
                <artifactId>maven-assembly-plugin</artifactId>
                <version>2.3</version>
                <configuration>
                    <appendAssemblyId>false</appendAssemblyId>
                    <outputDirectory>${project.build.directory}/releases/</outputDirectory>
                    <descriptors>
                        <descriptor>${basedir}/src/main/assemblies/plugin.xml</descriptor>
                    </descriptors>
                </configuration>
                <executions>
                    <execution>
                        <phase>package</phase>
                        <goals>
                            <goal>single</goal>
                        </goals>
                    </execution>
                </executions>
            </plugin>
            <plugin>
                <groupId>org.apache.maven.plugins</groupId>
                <artifactId>maven-compiler-plugin</artifactId>
                <version>3.7.0</version>
                <configuration>
                    <source>1.8</source>
                    <target>1.8</target>
                </configuration>
            </plugin>
        </plugins>
    </build>
</project>
Packaging works the same way as for any other Maven project. Run the following command:
mvn clean install
After packaging, the plugin is the es-sort-1.0-SNAPSHOT.zip file in the target/releases/ directory.
Once the Java plugin is built, it must be installed. Install it the same way as any other plugin: bin/elasticsearch-plugin install file:///path/to/your/plugin
Note:
To let Elasticsearch run dynamic scripts, add the following settings to the elasticsearch.yml configuration file:
script.inline: true
script.stored: true
Without these settings, the test below fails with the following error:
Failed to compile inline script [my_script] using lang [native]
To test the plugin, create an index, index three documents, and run a search that uses the script:
PUT my_index_test
{
  "mappings": {
    "my_type": {
      "properties": {
        "feature": {
          "type": "keyword"
        },
        "tag": {
          "type": "keyword"
        },
        "testname": {
          "type": "text"
        }
      }
    }
  }
}
PUT /my_index_test/my_type/1
{
  "feature": "abc",
  "tag": "mytagabc",
  "testname": "Hello world"
}
PUT /my_index_test/my_type/2
{
  "feature": "123456",
  "tag": "2mytagabc",
  "testname": "2Hello world"
}
PUT /my_index_test/my_type/3
{
  "feature": "def789kkk",
  "tag": "3mytagabc",
  "testname": "3Hello world"
}
POST /my_index_test/my_type/_search?pretty
{
  "query": {
    "function_score": {
      "query": {
        "match_all": {}
      },
      "functions": [
        {
          "script_score": {
            "script": {
              "inline": "my_script",
              "lang": "native",
              "params": {
                "feature": "aaaaaa"
              }
            }
          }
        }
      ]
    }
  }
}
The test result is:
{
  "took" : 101,
  "timed_out" : false,
  "_shards" : {
    "total" : 5,
    "successful" : 5,
    "failed" : 0
  },
  "hits" : {
    "total" : 3,
    "max_score" : 90.0,
    "hits" : [
      {
        "_index" : "my_index_test",
        "_type" : "my_type",
        "_id" : "2",
        "_score" : 90.0,
        "_source" : {
          "feature" : "123456",
          "tag" : "2mytagabc",
          "testname" : "2Hello world"
        }
      },
      {
        "_index" : "my_index_test",
        "_type" : "my_type",
        "_id" : "3",
        "_score" : 60.0,
        "_source" : {
          "feature" : "def789kkk",
          "tag" : "3mytagabc",
          "testname" : "3Hello world"
        }
      },
      {
        "_index" : "my_index_test",
        "_type" : "my_type",
        "_id" : "1",
        "_score" : 30.0,
        "_source" : {
          "feature" : "abc",
          "tag" : "mytagabc",
          "testname" : "Hello world"
        }
      }
    ]
  }
}
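The ordering in this result can be reproduced offline. The sketch below assumes two defaults: a match_all query scores every document 1.0, and function_score multiplies the query score by the function score (the default boost_mode), so the final _score equals the script's return value. `RankingSketch`, `rank`, and `sampleDocs` are hypothetical names used only for this illustration.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Offline sketch of how the three test documents are ranked for a query feature. */
final class RankingSketch {

    /** Same length rule as the native script: 90 if equal, 60 if query is shorter, 30 otherwise. */
    static double scriptScore(String queryFeature, String docFeature) {
        int queryLen = queryFeature.length();
        int docLen = docFeature.length();
        if (queryLen == docLen) {
            return 90;
        }
        return queryLen < docLen ? 60 : 30;
    }

    /** Returns document ids ordered by descending final score. */
    static List<String> rank(String queryFeature, Map<String, String> featureById) {
        // Assumption: match_all contributes 1.0 and boost_mode is multiply.
        final double matchAllScore = 1.0;
        List<String> ids = new ArrayList<>(featureById.keySet());
        ids.sort((a, b) -> Double.compare(
                matchAllScore * scriptScore(queryFeature, featureById.get(b)),
                matchAllScore * scriptScore(queryFeature, featureById.get(a))));
        return ids;
    }

    /** The three documents indexed in the test, keyed by _id. */
    static Map<String, String> sampleDocs() {
        Map<String, String> docs = new LinkedHashMap<>();
        docs.put("1", "abc");
        docs.put("2", "123456");
        docs.put("3", "def789kkk");
        return docs;
    }

    public static void main(String[] args) {
        System.out.println(rank("aaaaaa", sampleDocs()));
    }
}
```

With the query feature "aaaaaa" (length 6), document 2 matches in length, document 3 is longer, and document 1 is shorter, which gives the 90, 60, 30 ordering seen in the response.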
References:
Elastic 5.3 Native (Java) Scripts
Help for plugin authors
elasticsearch series (7): defining score in Java
Elasticsearch 2.0 custom sorting plugin implementation
elasticsearch 5.2.2 plugin development (3): implementing ScriptPlugin