Logstash Installation and Test Data Import

1. Related links

  • Download the smallest MovieLens test dataset: https://grouplens.org/datasets/movielens/
  • Logstash download: https://www.elastic.co/cn/downloads/logstash
  • Logstash reference documentation: https://www.elastic.co/guide/en/logstash/current/index.html

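2. Install Logstash

Download the Logstash release whose version number matches your Elasticsearch (ES) version, and extract it to a directory of your choice.

Edit the logstash.conf file in the logstash/conf directory.

Set path to the actual full path of your movies.csv: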

input {
  file {
    path => "YOUR_FULL_PATH_OF_movies.csv"
    start_position => "beginning"
    sincedb_path => "/dev/null"
  }
}

Start the Elasticsearch instance first, then start Logstash with the configuration file specified so that the data is imported:

bin/logstash -f /YOUR_PATH_of_logstash.conf
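
Once Logstash has been running for a while, it helps to confirm that documents actually reached Elasticsearch. The following is a minimal sketch, assuming an unsecured Elasticsearch on http://localhost:9200 and an index named movies (both taken from the output section of the configuration used in this article). The function names count_url, parse_count, and fetch_count are hypothetical helpers, not part of any Elasticsearch client.

```python
import json
import urllib.request


def count_url(host: str, index: str) -> str:
    """Build the URL of the _count endpoint for one index."""
    return f"{host.rstrip('/')}/{index}/_count"


def parse_count(body: str) -> int:
    """Extract the document count from a _count JSON response body."""
    return int(json.loads(body)["count"])


def fetch_count(host: str = "http://localhost:9200", index: str = "movies") -> int:
    """Ask Elasticsearch how many documents the index currently holds.

    Assumes the cluster is reachable without authentication.
    """
    with urllib.request.urlopen(count_url(host, index)) as resp:
        return parse_count(resp.read().decode("utf-8"))
```

Calling fetch_count() after the import should return a number close to the number of lines in movies.csv; a connection error usually means Elasticsearch is not running on the assumed host.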

Windows version (here sincedb_path points to an ordinary file, so read positions persist between runs):

input {
  file {
    path => ["D:/logstash-7.1.1/movielens/ml-latest-small/movies.csv"]
    start_position => "beginning"
    sincedb_path => "D:/logstash-7.1.1/123"
  }
}
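
The file input expects an absolute path, and the Windows example above writes it with forward slashes. As a small sketch of how to obtain such a string for your own copy of movies.csv, the hypothetical helper logstash_path below resolves a path and renders it with forward slashes; the name is an illustration only and not part of Logstash.

```python
from pathlib import Path


def logstash_path(raw: str) -> str:
    """Return an absolute, forward-slash path suitable for pasting into `path =>`."""
    # resolve() makes the path absolute; as_posix() swaps backslashes for "/"
    return Path(raw).resolve().as_posix()
```

For example, running logstash_path("movies.csv") from inside the dataset directory prints the string to put between the quotes of the path setting.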

3. Import the test data

logstash.conf   // configuration file for Logstash 7.x
logstash6.conf  // configuration file for Logstash 6.x

input {
  file {
    path => "/Users/izaodao/Documents/es_file/ml-25m/movies.csv"
    start_position => "beginning"
    sincedb_path => "/dev/null"
  }
}
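filter {
  # Parse each line into three named columns
  csv {
    separator => ","
    columns => ["id","content","genre"]
  }

  # Turn "A|B|C" into an array of genres
  mutate {
    split => { "genre" => "|" }
    remove_field => ["path", "host","@timestamp","message"]
  }

  # Split "Title (1995)" at "(" and copy both pieces into new fields
  mutate {
    split => ["content", "("]
    add_field => { "title" => "%{[content][0]}"}
    add_field => { "year" => "%{[content][1]}"}
  }

  # Convert year to an integer, trim title, drop fields no longer needed
  mutate {
    convert => {
      "year" => "integer"
    }
    strip => ["title"]
    remove_field => ["path", "host","@timestamp","message","content"]
  }
}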
output {
  elasticsearch {
    hosts => "http://localhost:9200"
    index => "movies"
    document_id => "%{id}"
  }
  stdout {}
}
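
To make the effect of the filter section easier to follow, here is a rough Python sketch of what it does to a single line of movies.csv. It is an approximation written for illustration, not Logstash's actual implementation: parse_movie_line is a hypothetical name, and returning None when no year can be read is a choice of this sketch rather than documented Logstash behavior.

```python
import csv
import re
from typing import Optional


def parse_movie_line(line: str) -> dict:
    """Approximate the csv + mutate steps for one CSV line."""
    # csv filter: three columns; quoted commas inside a title are respected
    movie_id, content, genre = next(csv.reader([line]))

    # first mutate: "A|B|C" becomes a list of genres
    genres = genre.split("|")

    # second mutate: split the content column at "("
    parts = content.split("(")

    # third mutate: trim the title and read the leading digits as the year
    title = parts[0].strip()
    year: Optional[int] = None
    if len(parts) > 1:
        match = re.match(r"\s*(\d+)", parts[1])
        if match:
            year = int(match.group(1))

    return {"id": movie_id, "title": title, "year": year, "genre": genres}
```

Because the split happens at the first "(", a title that itself contains parentheses will not yield a usable year with this approach; the same limitation applies to the configuration it mirrors.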
