Preface
Data source: forward DNS (FDNS) data collected by Rapid7 and made available for download
https://scans.io/study/sonar.fdns
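Download Elasticsearch 2.3
Elasticsearch is a search server built on Lucene with distributed, multi-tenant capabilities. It is an open-source project written in Java (Apache License) and exposes a RESTful web interface. It provides near-real-time search, is stable, reliable and fast, and is easy to install and use. It also scales horizontally very well, and nodes can be added without restarting the service.
There are subtle differences between newer and older Elasticsearch versions, and most Chinese-language documentation covers the older ones.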
https://www.elastic.co/downloads/past-releases/elasticsearch-2-3-0
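Install the head plugin
elasticsearch-head is a web front-end tool for interacting visually with an Elasticsearch cluster.
With a JDK installed, start Elasticsearch and then install the plugin: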
bin/elasticsearch.bat
bin/plugin.bat install mobz/elasticsearch-head
https://github.com/mobz/elasticsearch-head
Create an index and define its mapping
PUT /test
{
"settings": {
"index": {
"number_of_shards": "5",
"number_of_replicas": "0"
}
},
"mappings": {
"my_type": {
"properties": {
"title": {
"type": "string",
"index": "not_analyzed"
},
"name" : {
"type" : "string"
}
}
}
}
}
Test the mapping
GET /test/_analyze
{
"field": "title",
"text": "[email protected]"
}
Index a single document
POST /test/my_type/
{
"title": "[email protected]",
"name": "[email protected]"
}
Simple search
GET /test/my_type/_search?q=name:cats
https://www.elastic.co/guide/en/elasticsearch/reference/2.3/search-uri-request.html
Structured search with a request body
GET /test/my_type/_search
{
"query": {
"prefix": {
"name": "blacdfdsfk"
}
}
}
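Custom analyzer
An analyzer consists of three parts: character filters, a tokenizer, and token filters.
Because this is DNS data, the analyzer has to be customized for it, for example by tokenizing in reverse and using "." as the delimiter, so that a domain name is indexed under each of its parent suffixes.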
PUT /my_index
{
"settings": {
"analysis": {
"analyzer": {
"domain_name_analyzer": {
"filter":"lowercase",
"tokenizer": "domain_name_tokenizer",
"type": "custom"
}
},
"tokenizer": {
"domain_name_tokenizer": {
"type": "PathHierarchy",
"delimiter": ".",
"reverse": true
}
}
}
}
}
PUT /my_index/_mapping/site
{
"properties": {
"url": {
"type": "string",
"analyzer": "domain_name_analyzer"
}
}
}
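To make the effect of the analyzer above easier to see, here is a minimal Python sketch that approximates the tokens it should produce for a domain name. It is not Elasticsearch code: the function name domain_name_tokens is hypothetical, and edge cases such as trailing dots or empty labels are ignored.

```python
def domain_name_tokens(domain, delimiter="."):
    """Approximate the tokens of domain_name_analyzer for one value.

    Assumption: a reversed path-hierarchy tokenizer keeps the right-hand
    end of the name fixed and drops labels from the left one at a time;
    the lowercase filter then lowercases every token.
    """
    labels = domain.lower().split(delimiter)
    return [delimiter.join(labels[i:]) for i in range(len(labels))]


if __name__ == "__main__":
    # Each suffix of the name becomes its own token.
    print(domain_name_tokens("WWW.Qidian.com"))
```

Under this approximation, a document whose url is www.qidian.com is also findable by the exact terms qidian.com and com, which is what makes suffix lookups cheap.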
Importing data for testing
PUT /dnsrecords
{
"settings": {
"index": {
"number_of_shards": "5",
"number_of_replicas": "0"
},
"analysis": {
"analyzer": {
"domain_name_analyzer": {
"filter":"lowercase",
"tokenizer": "domain_name_tokenizer",
"type": "custom"
}
},
"tokenizer": {
"domain_name_tokenizer": {
"type": "PathHierarchy",
"delimiter": ".",
"reverse": true
}
}
}
},
"mappings": {
"forward": {
"properties": {
"domain": {
"type": "string",
"analyzer": "domain_name_analyzer"
},
"type" : {
"type" : "string",
"index": "not_analyzed"
},
"record" :{
"type": "string",
"index": "not_analyzed"
}
}
}
}
}
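The request above only creates the index. Loading the records could be done through the _bulk API, and the sketch below builds a bulk request body in Python. It assumes each input line has the form domain,type,record, which matches the three mapped fields but should be checked against the file actually downloaded. The function name bulk_body and the sample records are hypothetical.

```python
import json


def bulk_body(lines, index="dnsrecords", doc_type="forward"):
    """Build a newline-delimited _bulk body from 'domain,type,record' lines."""
    action = json.dumps({"index": {"_index": index, "_type": doc_type}})
    out = []
    for line in lines:
        line = line.strip()
        if not line:
            continue
        # Split on the first two commas only, so a record value that
        # itself contains commas stays in one piece.
        parts = line.split(",", 2)
        if len(parts) != 3:
            continue  # skip lines that do not match the assumed format
        domain, rtype, record = parts
        out.append(action)
        out.append(json.dumps({"domain": domain, "type": rtype, "record": record}))
    # The bulk API requires the body to end with a newline.
    return "\n".join(out) + "\n" if out else ""


if __name__ == "__main__":
    sample = [
        "www.example.com,a,192.0.2.1",          # made-up sample records
        "example.com,mx,10 mail.example.com",
    ]
    print(bulk_body(sample), end="")
```

The resulting text can be sent as the body of POST /_bulk. For a large dataset it is safer to send it in batches of a few thousand lines than in one request.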
Query
GET /dnsrecords/forward/_search
{
"query": {
"term": {
"domain": "qidian.com"
}
}
}
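The same query can be issued from a script. The following sketch only builds the HTTP request with the Python standard library and does not send it. The address http://localhost:9200 is an assumption (a default local node), and build_term_search is a hypothetical helper name. The term is lowercased because term queries are not analyzed, while the indexed tokens went through the lowercase filter.

```python
import json
import urllib.request

ES_URL = "http://localhost:9200"  # assumption: default local node


def build_term_search(domain, index="dnsrecords", doc_type="forward"):
    """Build (but do not send) a term query for a domain suffix."""
    # Term queries are not analyzed, so lowercase the input ourselves
    # to match the lowercased tokens stored in the index.
    body = {"query": {"term": {"domain": domain.lower()}}}
    return urllib.request.Request(
        url="{}/{}/{}/_search".format(ES_URL, index, doc_type),
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",  # _search accepts POST as well as GET with a body
    )


if __name__ == "__main__":
    req = build_term_search("Qidian.com")
    print(req.get_method(), req.full_url)
    print(req.data.decode("utf-8"))
    # To actually run the search against a live node:
    # with urllib.request.urlopen(req) as resp:
    #     print(json.load(resp)["hits"]["total"])
```

Because the domain field is indexed under every suffix of the name, this term query should return records for qidian.com itself and for its subdomains.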
References
https://github.com/Pynow/elasticsearch
http://wiki.jikexueyuan.com/project/elasticsearch-definitive-guide-cn/