Importing MovieLens test data with Logstash

1. MovieLens data

https://grouplens.org/dataset...
For learning and practice, the smallest dataset is enough:
[ml-latest-small](https://files.grouplens.org/d...)
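
Before writing the Logstash filter it helps to look at the layout of movies.csv. The following is a minimal sketch, assuming the file starts with the header row movieId,title,genres and that titles containing commas are quoted; the helper name `preview_rows` and the sample rows are illustrative only.

```python
import csv
import io
from itertools import islice


def preview_rows(csv_file, limit=3):
    """Return the header row and the first `limit` data rows of a CSV file."""
    reader = csv.reader(csv_file)
    header = next(reader)
    rows = list(islice(reader, limit))
    return header, rows


# Illustrative sample in the movies.csv layout. For the real file, pass
# open("ml-latest-small/movies.csv", newline="", encoding="utf-8") instead.
sample = io.StringIO(
    "movieId,title,genres\n"
    "1,Toy Story (1995),Adventure|Animation|Children|Comedy|Fantasy\n"
    '11,"American President, The (1995)",Comedy|Drama|Romance\n'
)

header, rows = preview_rows(sample)
print(header)
for row in rows:
    print(row)
```

Each data row has three columns: a numeric id, a title with the release year in parentheses, and a list of genres separated by "|". The filter in the next step relies on exactly this shape.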

2. Logstash configuration file

In the Logstash config directory, make a copy of logstash-sample.conf, name it logstash-movies.conf, and give it the following content:

# Logstash configuration for a simple
# CSV file -> Logstash -> Elasticsearch pipeline.

input {
  file {
    path => "/export/_backup/elk_bak/ml-latest-small/movies.csv"
    start_position => "beginning"
    sincedb_path => "/dev/null"
  }
}

filter {
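  csv {
    separator => ","
    columns => ["id", "content", "genre"]
  }

  # Skip the CSV header row (movieId,title,genres) so it is not indexed as a movie
  if [id] == "movieId" {
    drop { }
  }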
  
  mutate {
    split => { "genre" => "|"}
    remove_field => ["path", "host", "@timestamp", "message"]
  }
  
  # Split "Title (Year)" at "(": element 0 is the title, element 1 starts with the year
  mutate {
    split => { "content" => "(" }
    add_field => {
      "title" => "%{[content][0]}"
      "year" => "%{[content][1]}"
    }
  }
  
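  mutate {
    # Integer conversion keeps the leading digits, so "1991)" becomes 1991
    convert => {
      "year" => "integer"
    }
    strip => ["title"]
    # path, host and @timestamp were already removed by the first mutate
    remove_field => ["content"]
  }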
}

output {
  elasticsearch {
    hosts => ["http://localhost:9200"]
    index => "movies"
    document_id => "%{id}"
    #user => "user"
    #password => "password"
  }
  stdout {}
}
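
To make the filter chain easier to follow, here is a rough Python sketch of what it does to a single line of movies.csv. It is an approximation, not Logstash's implementation: the function name `transform` is hypothetical, and the year parsing only imitates the leading-digit behaviour visible in the console output of step 3 (falling back to 0 when no digits lead, which is an assumption).

```python
import csv
import re


def transform(line):
    """Approximate the csv + mutate steps of the pipeline for one CSV line."""
    # csv filter: three columns named id, content, genre
    movie_id, content, genre = next(csv.reader([line]))

    # mutate split on "(": element 0 is the title, element 1 starts with the year
    parts = content.split("(")
    title = parts[0].strip()  # mutate strip
    raw_year = parts[1] if len(parts) > 1 else ""

    # Stand-in for the integer conversion: keep leading digits ("1991)" -> 1991).
    # Returning 0 when there are none is an assumption of this sketch.
    match = re.match(r"\s*(\d+)", raw_year)
    year = int(match.group(1)) if match else 0

    # mutate split on "|" turns the genre string into a list
    return {"id": movie_id, "title": title, "year": year, "genre": genre.split("|")}


doc = transform("193609,Andrew Dice Clay: Dice Rules (1991),Comedy")
print(doc)
```

One thing the sketch suggests: because the year is taken from the element right after the first "(", a title with an extra parenthesised part, such as an alternative title before the year, would not yield the release year. If that matters for your use, check a few such rows after the import.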

3. Run the import

bin/logstash -f config/logstash-movies.conf
Logstash takes a while to start.
The console then prints the imported documents, for example:
......
{
          "id" => "193609",
       "genre" => [
        [0] "Comedy"
    ],
       "title" => "Andrew Dice Clay: Dice Rules",
    "@version" => "1",
        "year" => 1991
}

Once the console stops printing new output, press Ctrl+C to stop Logstash.

4. Check in Kibana that the data reached the index

If the imported index (movies) appears in Kibana's index management, the import succeeded.
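
The same check can be made without Kibana by asking Elasticsearch how many documents the index holds. This is a minimal sketch, assuming Elasticsearch is reachable at http://localhost:9200 without authentication (as in the output section of the config) and using the `_count` endpoint; the function names are illustrative.

```python
import json
import urllib.request


def count_url(host, index):
    """Build the URL of the _count endpoint for an index."""
    return f"{host.rstrip('/')}/{index}/_count"


def parse_count(body):
    """Extract the document count from a _count response body."""
    return json.loads(body)["count"]


def fetch_count(host="http://localhost:9200", index="movies"):
    """Ask a running Elasticsearch node for the number of documents in an index."""
    with urllib.request.urlopen(count_url(host, index)) as resp:
        return parse_count(resp.read())
```

With Elasticsearch running, `print(fetch_count())` should print a number close to the number of data rows in movies.csv. If security is enabled on the cluster, the request would also need credentials, which this sketch leaves out.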