Installing Pinot and deploying a simple test environment

1. Download the source code

$ git clone https://github.com/linkedin/pinot.git

2. Build Pinot

$ cd pinot
$ mvn install -DskipTests

3. Deploy and start

$ cd pinot-distribution/target/pinot-0.016-pkg
$ nohup ./bin/quick-start-offline.sh &
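
Because the quick-start script is launched in the background, the following steps can fail if they run before the services are listening. Below is a minimal sketch of a wait loop; `wait_for_url` is a hypothetical helper (not part of Pinot), and it assumes `curl` is installed and that the web UI address from step 9 (http://localhost:9000) is the right endpoint to probe.

```shell
#!/bin/sh
# Poll a URL until it answers or the attempts run out.
# usage: wait_for_url URL [ATTEMPTS] [DELAY_SECONDS]
wait_for_url() {
  url=$1
  attempts=${2:-30}
  delay=${3:-2}
  i=0
  while [ "$i" -lt "$attempts" ]; do
    # curl exits with 0 as soon as any HTTP response is received
    if curl -s -o /dev/null --max-time 5 "$url"; then
      return 0
    fi
    i=$((i + 1))
    sleep "$delay"
  done
  return 1
}
```

Usage, under the assumptions above: `wait_for_url http://localhost:9000/ 30 2 || echo "Pinot did not come up in time"`.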

4. Create the schema

$ ./bin/pinot-admin.sh AddSchema -schemaFile cjf_sample/flights-schema.json

5. Create the table

$ ./bin/pinot-admin.sh AddTable -filePath ./cjf_sample/flights-definition.json

6. Generate random data

$ ./bin/pinot-admin.sh GenerateData -numRecords 1000000 -numFiles 1 -outDir ./flights_outdir -schemaFile ./cjf_sample/flights-schema.json

7. Generate a segment

$ ./bin/pinot-admin.sh CreateSegment -format AVRO -dataDir ./flights_outdir -tableName flights -segmentName flights_seg1 -schemaFile ./cjf_sample/flights-schema.json -outDir ./flights_seg1_outdir

8. Upload the segment

$ ./bin/pinot-admin.sh UploadSegment -segmentDir ./flights_seg1_outdir

9. Query the table

Pinot can be queried in several ways: through the web UI (http://localhost:9000/query/index.html) or through the API. Here we query the API with curl:

$ curl -X POST -d '{"pql":"select count(*) from flights"}' http://localhost:8099/query
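
When running several queries, hand-writing the JSON body each time is error-prone. The following is a minimal sketch that wraps a PQL string in the same `{"pql": ...}` body used above. `pql_payload` and `pinot_query` are hypothetical helper names, and the `PINOT_BROKER` variable is an assumption introduced here for illustration; the default address is the one from the curl command above.

```shell
#!/bin/sh
# Build the JSON body for the /query endpoint.
# Backslashes and double quotes in the query are escaped so the JSON stays valid.
pql_payload() {
  escaped=$(printf '%s' "$1" | sed -e 's/\\/\\\\/g' -e 's/"/\\"/g')
  printf '{"pql":"%s"}' "$escaped"
}

# Hypothetical helper: POST a PQL query to the query endpoint.
# PINOT_BROKER is an assumed override; it defaults to the address used in step 9.
pinot_query() {
  curl -s -X POST -d "$(pql_payload "$1")" "${PINOT_BROKER:-http://localhost:8099}/query"
}
```

For example, `pinot_query "select count(*) from flights"` sends the same request as the curl command above.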


Appendix 1: flights-schema.json

{
  "dimensionFieldSpecs" : [
    {
      "name": "Year",
      "dataType" : "INT",
      "delimiter" : null,
      "singleValueField" : true
    },
    {
      "name": "Month",
      "dataType" : "INT",
      "delimiter" : null,
      "singleValueField" : true
    },
    {
      "name": "Carrier",
      "dataType" : "STRING",
      "delimiter" : null,
      "singleValueField" : true
    },
    {
      "name": "Origin",
      "dataType" : "STRING",
      "delimiter" : null,
      "singleValueField" : true
    },
    {
      "name": "Dest",
      "dataType" : "STRING",
      "delimiter" : null,
      "singleValueField" : true
    },
    {
      "name": "DivAirports",
      "dataType" : "STRING",
      "delimiter" : null,
      "singleValueField" : false
    }
  ],
  "timeFieldSpec" : {
    "incomingGranularitySpec" : {
      "timeType" : "DAYS",
      "dataType" : "INT",
      "name" : "DaysSinceEpoch"
    }
  },
  "metricFieldSpecs" : [
    {
      "name" : "Delayed",
      "dataType" : "INT",
      "delimiter" : null,
      "singleValueField" : true
    },
    {
      "name" : "Cancelled",
      "dataType" : "INT",
      "delimiter" : null,
      "singleValueField" : true
    },
    {
      "name" : "Diverted",
      "dataType" : "INT",
      "delimiter" : null,
      "singleValueField" : true
    }
   ],
  "schemaName" : "flights"
}

Appendix 2: flights-definition.json

{
    "tableName":"flights",
    "segmentsConfig" : {
        "retentionTimeUnit":"DAYS",
        "retentionTimeValue":"700000",
        "segmentPushFrequency":"daily",
        "segmentPushType":"APPEND",
        "replication" : "3",
        "schemaName" : "flights",
        "timeColumnName" : "DaysSinceEpoch",
        "timeType" : "DAYS",
        "segmentAssignmentStrategy" : "BalanceNumSegmentAssignmentStrategy"
    },
    "tenants" : {
        "broker":"brokerOne",
        "server":"serverOne"
    },
    "tableIndexConfig" : {
        "invertedIndexColumns" : ["Carrier"],
        "loadMode"  : "HEAP",
        "lazyLoad"  : "false"
    },
    "tableType":"OFFLINE",
    "metadata": {}
}
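
The `timeColumnName` in the table definition should refer to the time column declared under `timeFieldSpec` in the schema. Below is a minimal sketch of a pre-flight check that compares the two names exactly (case-sensitively, which is an assumption about how Pinot matches column names). The function names are hypothetical, and the `sed` patterns are a rough line-based extraction that relies on the one-key-per-line layout of the two appendix files; they are not a general JSON parser.

```shell
#!/bin/sh
# Print the "name" declared inside the schema's timeFieldSpec block.
schema_time_column() {
  sed -n '/"timeFieldSpec"/,/}/p' "$1" |
    sed -n 's/.*"name"[[:space:]]*:[[:space:]]*"\([^"]*\)".*/\1/p' | head -n 1
}

# Print the timeColumnName from the table definition.
table_time_column() {
  sed -n 's/.*"timeColumnName"[[:space:]]*:[[:space:]]*"\([^"]*\)".*/\1/p' "$1" | head -n 1
}

# Compare the two names; returns non-zero on a mismatch.
# usage: check_time_column SCHEMA_FILE TABLE_DEFINITION_FILE
check_time_column() {
  s=$(schema_time_column "$1")
  t=$(table_time_column "$2")
  if [ "$s" = "$t" ]; then
    echo "OK: time column '$s' matches"
  else
    echo "MISMATCH: schema has '$s', table definition has '$t'"
    return 1
  fi
}
```

Usage with the files from this walkthrough: `check_time_column ./cjf_sample/flights-schema.json ./cjf_sample/flights-definition.json`.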



References:
Pinot official wiki: https://github.com/linkedin/pinot/wiki
