Importing Data from a Database into Elasticsearch with river-jdbc

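Elasticsearch provides a River module for pulling data from other data sources. The feature is delivered as plugins, and the River plugins currently available include: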

River Plugins

Supported by Elasticsearch

  • CouchDB River Plugin

  • RabbitMQ River Plugin

  • Twitter River Plugin

  • Wikipedia River Plugin

Supported by the community

  • ActiveMQ River Plugin (by Dominik Dorn)

  • Amazon SQS River Plugin (by Alex Bogdanovski)

  • CSV River Plugin (by Martin Bednar)

  • Dropbox River Plugin (by David Pilato)

  • FileSystem River Plugin (by David Pilato)

  • Git River Plugin (by Olivier Bazoud)

  • GitHub River Plugin (by uberVU)

  • Hazelcast River Plugin (by Steve Samuel)

  • JDBC River Plugin (by Jörg Prante)

  • JMS River Plugin (by Steve Sarandos)

  • Kafka River Plugin (by Endgame Inc.)

  • Kafka River Plugin 2 (by Mariam Hakobyan)

  • LDAP River Plugin (by Tanguy Leroux)

  • MongoDB River Plugin (by Richard Louapre)

  • Neo4j River Plugin (by Steve Samuel)

  • Open Archives Initiative (OAI) River Plugin (by Jörg Prante)

  • Redis River Plugin (by Steve Samuel)

  • RethinkDB River Plugin (by RethinkDB)

  • RSS River Plugin (by David Pilato)

  • Sofa River Plugin (by adamlofts)

  • Solr River Plugin (by Luca Cavanna)

  • St9 River Plugin (by Sunny Gleason)

  • Subversion River Plugin (by Pascal Lombard)

  • DynamoDB River Plugin (by Kevin Wang)

  • IMAP/POP3 Email River Plugin (by Hendrik Saly)

  • Web River Plugin (by CodeLibs Project)

  • EEA ElasticSearch RDF River Plugin (by the European Environment Agency)

  • Amazon S3 River Plugin (by Laurent Broudoux)

  • Google Drive River Plugin (by Laurent Broudoux)

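As the list shows, these plugins cover most common data sources. For relational databases in particular, the JDBC river gives a single, uniform way to pull data. The source code of elasticsearch-river-jdbc is at github.com/jprante/elasticsearch-river-jdbc, and the project is documented in detail. The rest of this post uses SQL Server as an example to show briefly how to use it.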

First, install elasticsearch-river-jdbc by running the following command in the elasticsearch directory:

./bin/plugin --install jdbc --url http://xbib.org/repository/org/xbib/elasticsearch/plugin/elasticsearch-river-jdbc/1.5.0.0/elasticsearch-river-jdbc-1.5.0.0.zip

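Next, install the JDBC library for SQL Server, which is available as the Microsoft JDBC Driver. Copy 'sqljdbc4.jar' from that package into the lib folder of the elasticsearch installation directory.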

In an Elasticsearch cluster, both of the steps above must be performed on every node.
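Because both steps have to be repeated on every node, a small check run on each node can help confirm that nothing was skipped. The following is only a sketch: the function name is made up for illustration, and the plugin location `plugins/jdbc` is an assumption based on the name passed to `--install`, so adjust it if your layout differs.

```python
from pathlib import Path


def missing_prerequisites(es_home):
    """Return the expected paths that do not exist under one node's install directory."""
    home = Path(es_home)
    expected = [
        home / "plugins" / "jdbc",      # assumed directory of the installed river plugin
        home / "lib" / "sqljdbc4.jar",  # SQL Server driver jar copied into lib
    ]
    return [str(path) for path in expected if not path.exists()]
```

Calling `missing_prerequisites("/path/to/elasticsearch")` on a node returns an empty list when both items are in place, and otherwise lists what still has to be installed there.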

The last and most important step is to create the river in Elasticsearch, so that Elasticsearch pulls data from SQL Server automatically.

PUT /_river/mytest_river/_meta
{
    "type" : "jdbc",
    "jdbc" : {
        "driver" : "com.microsoft.sqlserver.jdbc.SQLServerDriver",
        "url" : "jdbc:sqlserver://MYSQLSERVERNAME;databaseName=MYProductDatabase",
        "user" : "admin",
        "password" : "Password",
        "sql" : "select ProductID as _id, CategoryID,ManufacturerID,MfName,ProductTitle,MfgPartNumber from MyProductsTable(nolock)",
        "poll" : "10m",
        "strategy" : "simple",
        "index" : "myinventory",
        "type" : "product",
        "bulk_size" : 100,
        "max_retries" : 5,
        "max_retries_wait" : "30s",
        "max_bulk_requests" : 5,
        "bulk_flush_interval" : "5s"
    }
}
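The same request can also be issued from a script instead of a REST console. The sketch below uses only the Python standard library and reuses the river definition from this post. The node address `http://localhost:9200` and the helper names `build_river_request` and `register_river` are assumptions made for illustration, not part of the plugin.

```python
import json
import urllib.request


def build_river_request(base_url, river_name, definition):
    """Build (without sending) the PUT request that stores a river definition."""
    url = f"{base_url.rstrip('/')}/_river/{river_name}/_meta"
    body = json.dumps(definition).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        method="PUT",
        headers={"Content-Type": "application/json"},
    )


def register_river(request):
    """Send the request to the node and return the raw response body as text."""
    with urllib.request.urlopen(request) as response:
        return response.read().decode("utf-8")


# River definition copied from the request shown in this post.
river_definition = {
    "type": "jdbc",
    "jdbc": {
        "driver": "com.microsoft.sqlserver.jdbc.SQLServerDriver",
        "url": "jdbc:sqlserver://MYSQLSERVERNAME;databaseName=MYProductDatabase",
        "user": "admin",
        "password": "Password",
        "sql": "select ProductID as _id, CategoryID,ManufacturerID,MfName,"
               "ProductTitle,MfgPartNumber from MyProductsTable(nolock)",
        "poll": "10m",
        "strategy": "simple",
        "index": "myinventory",
        "type": "product",
        "bulk_size": 100,
        "max_retries": 5,
        "max_retries_wait": "30s",
        "max_bulk_requests": 5,
        "bulk_flush_interval": "5s",
    },
}

# Assumed node address; replace with the address of one of your nodes.
request = build_river_request("http://localhost:9200", "mytest_river", river_definition)
```

Nothing is sent until `register_river(request)` is called, so the URL and body can be inspected first.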

