Hadoop and Hive references:
Big Data Learning [04]: Hive installation and configuration:
http://blog.csdn.net/ld326/article/details/78023101
Big Data Learning [02]: Hadoop installation and configuration: http://blog.csdn.net/ld326/article/details/78004402
https://prestodb.io/
https://repo1.maven.org/maven2/com/facebook/presto/presto-server/0.184/presto-server-0.184.tar.gz
tar zxvf presto-server-0.184.tar.gz
In the same directory as presto-server-0.184, create a data/log folder, presto-data; keeping it outside the installation directory means it does not have to be touched when upgrading later.
etc
├── catalog
│ └── hive.properties
├── config.properties
├── jvm.config
├── log.properties
└── node.properties
Create an etc directory under the Presto home directory (see the command after the list below); it holds the following configuration:
● Node Properties: environmental configuration specific to each node
● JVM Config: command line options for the Java Virtual Machine
● Config Properties: configuration for the Presto server
● Catalog Properties: configuration for Connectors (data sources)
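For example, from the Presto home directory (the catalog subdirectory will hold the connector configurations):
mkdir -p etc/catalog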
To configure the node information, first create the etc/node.properties file; a node is a single installed instance of Presto on a machine. A minimal etc/node.properties:
node.environment=pcluster
node.id=111
node.data-dir=/home/hadoop/data/presto
Remember to create the /home/hadoop/data/presto folder first:
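mkdir -p /home/hadoop/data/presto
After Presto has been running, the directory will look like this: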
data
└── presto
├── etc -> /home/hadoop/presto-server-0.184/etc
├── plugin -> /home/hadoop/presto-server-0.184/plugin
└── var
├── log
│ ├── http-request.log
│ ├── launcher.log
│ └── server.log
└── run
└── launcher.pid
● node.environment: The name of the environment. All Presto nodes in a cluster must have the same environment name.
● node.id: The unique identifier for this installation of Presto. This must be unique for every node. This identifier should remain consistent across reboots or upgrades of Presto. If running multiple installations of Presto on a single machine (i.e. multiple nodes on the same machine), each installation must have a unique identifier.
● node.data-dir: The location (filesystem path) of the data directory. Presto will store logs and other data here.
File: etc/jvm.config
-server
-Xmx1G
-XX:+UseG1GC
-XX:G1HeapRegionSize=32M
-XX:+UseGCOverheadLimit
-XX:+ExplicitGCInvokesConcurrent
-XX:+HeapDumpOnOutOfMemoryError
-XX:+ExitOnOutOfMemoryError
File: etc/config.properties
A Presto server can act in two roles: coordinator and worker.
Every Presto server can function as both a coordinator and a worker, but dedicating a single machine to only perform coordination work provides the best performance on larger clusters.
coordinator[192.168.137.101]:
coordinator=true
node-scheduler.include-coordinator=false
http-server.http.port=8080
query.max-memory=1GB
query.max-memory-per-node=512MB
discovery-server.enabled=true
discovery.uri=http://hadoop01:8080
workers[192.168.137.102,192.168.137.103]:
coordinator=false
http-server.http.port=8080
query.max-memory=1GB
discovery.uri=http://hadoop01:8080
Or, to run both roles on a single machine:
coordinator=true
node-scheduler.include-coordinator=true
http-server.http.port=8080
query.max-memory=1GB
query.max-memory-per-node=512MB
discovery-server.enabled=true
discovery.uri=http://hadoop01:8080
The only adjustment here is node-scheduler.include-coordinator=true, so the coordinator machine also works as a worker node.
Note: hadoop01 is the hostname; configuring an IP address here causes errors (see the error log near the end).
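A sketch for pushing the same installation to the workers (the hostnames hadoop02/hadoop03 for 192.168.137.102/103 and passwordless ssh as user hadoop are assumptions of this sketch):
for h in hadoop02 hadoop03; do   # worker hostnames assumed
  scp -r /home/hadoop/presto-server-0.184 hadoop@"$h":/home/hadoop/
done
# afterwards, on each worker: set coordinator=false in config.properties
# and give node.id in node.properties a unique value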
Notes:
● coordinator: Allow this Presto instance to function as a coordinator (accept queries from clients and manage query execution).
● node-scheduler.include-coordinator: Allow scheduling work on the coordinator. For larger clusters, processing work on the coordinator can impact query performance because the machine’s resources are not available for the critical task of scheduling, managing and monitoring query execution.
● http-server.http.port: Specifies the port for the HTTP server. Presto uses HTTP for all communication, internal and external.
● query.max-memory: The maximum amount of distributed memory that a query may use.
● query.max-memory-per-node: The maximum amount of memory that a query may use on any one machine.
● discovery-server.enabled: Presto uses the Discovery service to find all the nodes in the cluster. Every Presto instance will register itself with the Discovery service on startup. In order to simplify deployment and avoid running an additional service, the Presto coordinator can run an embedded version of the Discovery service. It shares the HTTP server with Presto and thus uses the same port.
● discovery.uri: The URI to the Discovery server. Because we have enabled the embedded version of Discovery in the Presto coordinator, this should be the URI of the Presto coordinator. Replace example.net:8080 to match the host and port of the Presto coordinator. This URI must not end in a slash.
File: etc/log.properties
Contents:
com.facebook.presto=INFO
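The available levels are DEBUG, INFO, WARN and ERROR; for example, to make Presto quieter once everything works:
com.facebook.presto=WARN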
Connectors
● 5.1. Accumulo Connector
● 5.2. Black Hole Connector
● 5.3. Cassandra Connector
● 5.4. Hive Connector
● 5.5. Hive Security Configuration
● 5.6. JMX Connector
● 5.7. Kafka Connector
● 5.8. Kafka Connector Tutorial
● 5.9. Local File Connector
● 5.10. Memory Connector
● 5.11. MongoDB Connector
● 5.12. MySQL Connector
● 5.13. PostgreSQL Connector
● 5.14. Redis Connector
● 5.15. SQL Server Connector
● 5.16. System Connector
● 5.17. Thrift Connector
● 5.18. TPCDS Connector
● 5.19. TPCH Connector
vim etc/catalog/hive.properties
connector.name=hive-hadoop2
hive.metastore.uri=thrift://hadoop01:9083
hive.config.resources=/home/hadoop/hadoop/etc/hadoop/core-site.xml, /home/hadoop/apache-hive-1.2.1-bin/conf/hdfs-site.xml
hive.allow-drop-table=true
Note that the name hive-hadoop2 cannot be chosen freely; Presto does not recognize other names here.
For a reference of the Presto Hive configuration properties, see the Bootstrap startup log shown below.
There are two ways to start Presto:
As a daemon: bin/launcher start
In the foreground: bin/launcher run
If Presto runs under a process supervisor such as supervisord, use the foreground mode.
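The launcher supports a few more subcommands, for example:
bin/launcher status
bin/launcher stop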
Download the CLI:
https://repo1.maven.org/maven2/com/facebook/presto/presto-cli/0.184/presto-cli-0.184-executable.jar
Rename the downloaded jar to presto and give it execute permission, for example:
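mv presto-cli-0.184-executable.jar presto
chmod +x presto
Then run: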
./presto --server localhost:8080 --catalog hive --schema default
Tip: run the CLI with the --help option to see the available options.
Starting the CLI:
[hadoop@hadoop01 ~]$ ./presto.jar --server 192.168.137.101:8080 --catalog hive --schema default
The --server value must match the host and port configured earlier in Presto's config.properties.
2017-09-24T22:04:24.360+0800 INFO main Bootstrap PROPERTY DEFAULT RUNTIME DESCRIPTION
2017-09-24T22:04:24.360+0800 INFO main Bootstrap hive.allow-corrupt-writes-for-testing false false Allow Hive connector to write data even when data will likely be corrupt
2017-09-24T22:04:24.360+0800 INFO main Bootstrap hive.assume-canonical-partition-keys false false
2017-09-24T22:04:24.360+0800 INFO main Bootstrap hive.bucket-execution true true Enable bucket-aware execution: only use a single worker per bucket
2017-09-24T22:04:24.360+0800 INFO main Bootstrap hive.bucket-writing true true Enable writing to bucketed tables
2017-09-24T22:04:24.360+0800 INFO main Bootstrap hive.dfs.connect.max-retries 5 5
2017-09-24T22:04:24.360+0800 INFO main Bootstrap hive.dfs.connect.timeout 500.00ms 500.00ms
2017-09-24T22:04:24.360+0800 INFO main Bootstrap hive.dfs-timeout 60.00s 60.00s
2017-09-24T22:04:24.360+0800 INFO main Bootstrap hive.domain-compaction-threshold 100 100 Maximum ranges to allow in a tuple domain without compacting it
2017-09-24T22:04:24.360+0800 INFO main Bootstrap hive.dfs.domain-socket-path null null
2017-09-24T22:04:24.360+0800 INFO main Bootstrap hive.fs.cache.max-size 1000 1000 Hadoop FileSystem cache size
2017-09-24T22:04:24.360+0800 INFO main Bootstrap hive.force-local-scheduling false false
2017-09-24T22:04:24.360+0800 INFO main Bootstrap hive.hdfs.authentication.type NONE NONE HDFS authentication type
2017-09-24T22:04:24.360+0800 INFO main Bootstrap hive.hdfs.impersonation.enabled false false Should Presto user be impersonated when communicating with HDFS
2017-09-24T22:04:24.360+0800 INFO main Bootstrap hive.compression-codec GZIP GZIP
2017-09-24T22:04:24.360+0800 INFO main Bootstrap hive.metastore.authentication.type NONE NONE Hive Metastore authentication type
2017-09-24T22:04:24.360+0800 INFO main Bootstrap hive.storage-format RCBINARY RCBINARY
2017-09-24T22:04:24.360+0800 INFO main Bootstrap hive.immutable-partitions false false Can new data be inserted into existing partitions or existing unpartitioned tables
2017-09-24T22:04:24.360+0800 INFO main Bootstrap hive.dfs.ipc-ping-interval 10.00s 10.00s
2017-09-24T22:04:24.360+0800 INFO main Bootstrap hive.max-concurrent-file-renames 20 20
2017-09-24T22:04:24.360+0800 INFO main Bootstrap hive.max-initial-split-size 32MB 32MB
2017-09-24T22:04:24.360+0800 INFO main Bootstrap hive.max-initial-splits 200 200
2017-09-24T22:04:24.360+0800 INFO main Bootstrap hive.metastore-refresh-max-threads 100 100
2017-09-24T22:04:24.360+0800 INFO main Bootstrap hive.max-outstanding-splits 1000 1000
2017-09-24T22:04:24.360+0800 INFO main Bootstrap hive.metastore.partition-batch-size.max 100 100
2017-09-24T22:04:24.360+0800 INFO main Bootstrap hive.max-partitions-per-scan 100000 100000 Maximum allowed partitions for a single table scan
2017-09-24T22:04:24.360+0800 INFO main Bootstrap hive.max-partitions-per-writers 100 100 Maximum number of partitions per writer
2017-09-24T22:04:24.360+0800 INFO main Bootstrap hive.max-split-iterator-threads 1000 1000
2017-09-24T22:04:24.360+0800 INFO main Bootstrap hive.max-split-size 64MB 64MB
2017-09-24T22:04:24.360+0800 INFO main Bootstrap hive.metastore-cache-maximum-size 10000 10000
2017-09-24T22:04:24.360+0800 INFO main Bootstrap hive.metastore-cache-ttl 0.00s 0.00s
2017-09-24T22:04:24.360+0800 INFO main Bootstrap hive.metastore-refresh-interval 0.00s 0.00s
2017-09-24T22:04:24.360+0800 INFO main Bootstrap hive.metastore.thrift.client.socks-proxy null null
2017-09-24T22:04:24.361+0800 INFO main Bootstrap hive.metastore-timeout 10.00s 10.00s
2017-09-24T22:04:24.361+0800 INFO main Bootstrap hive.metastore.partition-batch-size.min 10 10
2017-09-24T22:04:24.361+0800 INFO main Bootstrap hive.orc.bloom-filters.enabled false false
2017-09-24T22:04:24.361+0800 INFO main Bootstrap hive.orc.default-bloom-filter-fpp 0.05 0.05 ORC Bloom filter false positive probability
2017-09-24T22:04:24.361+0800 INFO main Bootstrap hive.orc.max-buffer-size 8MB 8MB
2017-09-24T22:04:24.365+0800 INFO main Bootstrap hive.orc.max-merge-distance 1MB 1MB
2017-09-24T22:04:24.365+0800 INFO main Bootstrap hive.orc.max-read-block-size 16MB 16MB
2017-09-24T22:04:24.365+0800 INFO main Bootstrap hive.orc.optimized-writer.enabled false false
2017-09-24T22:04:24.365+0800 INFO main Bootstrap hive.orc.stream-buffer-size 8MB 8MB
2017-09-24T22:04:24.365+0800 INFO main Bootstrap hive.parquet-optimized-reader.enabled false false
2017-09-24T22:04:24.365+0800 INFO main Bootstrap hive.parquet-predicate-pushdown.enabled false false
2017-09-24T22:04:24.365+0800 INFO main Bootstrap hive.per-transaction-metastore-cache-maximum-size 1000 1000
2017-09-24T22:04:24.365+0800 INFO main Bootstrap hive.rcfile-optimized-writer.enabled true true
2017-09-24T22:04:24.365+0800 INFO main Bootstrap hive.rcfile.writer.validate false false Validate RCFile after write by re-reading the whole file
2017-09-24T22:04:24.365+0800 INFO main Bootstrap hive.recursive-directories false false
2017-09-24T22:04:24.365+0800 INFO main Bootstrap hive.config.resources null [/home/hadoop/hadoop/etc/hadoop/core-site.xml, /home/hadoop/apache-hive-1.2.1-bin/conf/hdfs-site.xml]
2017-09-24T22:04:24.365+0800 INFO main Bootstrap hive.respect-table-format true true Should new partitions be written using the existing table format or the default Presto format
2017-09-24T22:04:24.365+0800 INFO main Bootstrap hive.skip-deletion-for-alter false false Skip deletion of old partition data when a partition is deleted and then inserted in the same transaction
2017-09-24T22:04:24.365+0800 INFO main Bootstrap hive.table-statistics-enabled true true Enable use of table statistics
2017-09-24T22:04:24.365+0800 INFO main Bootstrap hive.time-zone PRC PRC
2017-09-24T22:04:24.365+0800 INFO main Bootstrap hive.orc.use-column-names false false Access ORC columns using names from the file
2017-09-24T22:04:24.365+0800 INFO main Bootstrap hive.parquet.use-column-names false false Access Parquet columns using names from the file
2017-09-24T22:04:24.365+0800 INFO main Bootstrap hive.dfs.verify-checksum true true
2017-09-24T22:04:24.365+0800 INFO main Bootstrap hive.write-validation-threads 16 16 Number of threads used for verifying data after a write
2017-09-24T22:04:24.365+0800 INFO main Bootstrap hive.non-managed-table-writes-enabled false false Enable writes to non-managed (external) tables
2017-09-24T22:04:24.365+0800 INFO main Bootstrap hive.s3.pin-client-to-current-region false false Should the S3 client be pinned to the current EC2 region
2017-09-24T22:04:24.365+0800 INFO main Bootstrap hive.s3.aws-access-key null null
2017-09-24T22:04:24.365+0800 INFO main Bootstrap hive.s3.aws-secret-key [REDACTED] [REDACTED]
2017-09-24T22:04:24.365+0800 INFO main Bootstrap hive.s3.connect-timeout 5.00s 5.00s
2017-09-24T22:04:24.365+0800 INFO main Bootstrap hive.s3.encryption-materials-provider null null Use a custom encryption materials provider for S3 data encryption
2017-09-24T22:04:24.365+0800 INFO main Bootstrap hive.s3.endpoint null null
2017-09-24T22:04:24.365+0800 INFO main Bootstrap hive.s3.kms-key-id null null Use an AWS KMS key for S3 data encryption
2017-09-24T22:04:24.365+0800 INFO main Bootstrap hive.s3.max-backoff-time 10.00m 10.00m
2017-09-24T22:04:24.365+0800 INFO main Bootstrap hive.s3.max-client-retries 5 5
2017-09-24T22:04:24.365+0800 INFO main Bootstrap hive.s3.max-connections 500 500
2017-09-24T22:04:24.366+0800 INFO main Bootstrap hive.s3.max-error-retries 10 10
2017-09-24T22:04:24.366+0800 INFO main Bootstrap hive.s3.max-retry-time 10.00m 10.00m
2017-09-24T22:04:24.366+0800 INFO main Bootstrap hive.s3.multipart.min-file-size 16MB 16MB Minimum file size for an S3 multipart upload
2017-09-24T22:04:24.366+0800 INFO main Bootstrap hive.s3.multipart.min-part-size 5MB 5MB Minimum part size for an S3 multipart upload
2017-09-24T22:04:24.366+0800 INFO main Bootstrap hive.s3.signer-type null null
2017-09-24T22:04:24.366+0800 INFO main Bootstrap hive.s3.socket-timeout 5.00s 5.00s
2017-09-24T22:04:24.366+0800 INFO main Bootstrap hive.s3.sse.enabled false false Enable S3 server side encryption
2017-09-24T22:04:24.366+0800 INFO main Bootstrap hive.s3.sse.kms-key-id null null KMS Key ID to use for S3 server-side encryption with KMS-managed key
2017-09-24T22:04:24.366+0800 INFO main Bootstrap hive.s3.sse.type S3 S3 Key management type for S3 server-side encryption (S3 or KMS)
2017-09-24T22:04:24.366+0800 INFO main Bootstrap hive.s3.ssl.enabled true true
2017-09-24T22:04:24.366+0800 INFO main Bootstrap hive.s3.staging-directory /tmp /tmp Temporary directory for staging files before uploading to S3
2017-09-24T22:04:24.366+0800 INFO main Bootstrap hive.s3.use-instance-credentials true true
2017-09-24T22:04:24.366+0800 INFO main Bootstrap hive.s3.user-agent-prefix The user agent prefix to use for S3 calls
2017-09-24T22:04:24.366+0800 INFO main Bootstrap hive.metastore.uri null [thrift://hadoop01:9083] Hive metastore URIs (comma separated)
2017-09-24T22:04:24.366+0800 INFO main Bootstrap hive.metastore thrift thrift
2017-09-24T22:04:24.366+0800 INFO main Bootstrap hive.allow-add-column false false Allow Hive connector to add column
2017-09-24T22:04:24.366+0800 INFO main Bootstrap hive.allow-drop-column false false Allow Hive connector to drop column
2017-09-24T22:04:24.366+0800 INFO main Bootstrap hive.allow-drop-table false true Allow Hive connector to drop table
2017-09-24T22:04:24.366+0800 INFO main Bootstrap hive.allow-rename-column false false Allow Hive connector to rename column
2017-09-24T22:04:24.366+0800 INFO main Bootstrap hive.allow-rename-table false false Allow Hive connector to rename table
2017-09-24T22:04:24.366+0800 INFO main Bootstrap hive.security legacy legacy
Query:
presto:default> show tables;
Table
-----------
dual
student
student01
student02
student03
student04
student05
(7 rows)
Query 20170924_141223_00007_bg85h, FINISHED, 1 node
Splits: 18 total, 18 done (100.00%)
0:01 [7 rows, 175B] [11 rows/s, 290B/s]
presto:default> select * from student05;
student_id | student_name | born | sex
------------+--------------+------------+-----
dlmu_01 | happyprince | 2015-07-08 | 0
(1 row)
Query 20170924_141229_00008_bg85h, FINISHED, 1 node
Splits: 17 total, 17 done (100.00%)
0:01 [1 rows, 358B] [1 rows/s, 448B/s]
presto:default> show tables;
Table
-----------
dual
student
student01
student02
student03
student04
student05
(7 rows)
Query 20170924_145732_00004_uaaik, FINISHED, 2 nodes
Splits: 18 total, 18 done (100.00%)
0:01 [7 rows, 175B] [9 rows/s, 238B/s]
presto:default> select * from student;
student_id | student_name
------------+--------------
dlmu_01 | happyprince
(1 row)
Query 20170924_145746_00005_uaaik, FINISHED, 2 nodes
Splits: 17 total, 17 done (100.00%)
0:05 [1 rows, 20B] [0 rows/s, 3B/s]
(The System connector is built into Presto; no configuration is needed.)
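For example, the built-in system.runtime.nodes table lists the nodes of the cluster:
select * from system.runtime.nodes;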
presto> show schemas from system;
Schema
--------------------
information_schema
jdbc
metadata
runtime
(4 rows)
Query 20170924_154001_00002_hnbrr, FINISHED, 2 nodes
Splits: 18 total, 18 done (100.00%)
0:02 [4 rows, 57B] [2 rows/s, 35B/s]
presto> show tables from system.information_schema;
Table
-------------------------
__internal_partitions__
columns
schemata
table_privileges
tables
views
(6 rows)
Query 20170924_154021_00003_hnbrr, FINISHED, 2 nodes
Splits: 18 total, 18 done (100.00%)
0:02 [6 rows, 233B] [3 rows/s, 136B/s]
presto> select * from system.information_schema.columns;
table_catalog | table_schema | table_name | column_name | ordinal_position | column_default | is_nullable | data_type | comment | extra_info
---------------+--------------------+-------------------------+------------------------------+------------------+----------------+-------------+----------------+---------+------------
system | runtime | queries | node_id | 1 | NULL | YES | varchar | NULL | NULL
system | runtime | queries | query_id | 2 | NULL | YES | varchar | NULL | NULL
system | runtime | queries | state | 3 | NULL | YES | varchar | NULL | NULL
system | runtime | queries | user | 4 | NULL | YES | varchar | NULL | NULL
system | runtime | queries | source | 5 | NULL | YES | varchar | NULL | NULL
[hadoop@hadoop01 catalog]$ vim jmx.properties
connector.name=jmx
jmx.dump-tables=java.lang:type=Runtime,com.facebook.presto.execution.scheduler:name=NodeScheduler
jmx.dump-period=10s
jmx.max-entries=86400
presto> SHOW TABLES FROM jmx.current;
Table
--------------------------------------------------------------------------------------------------------------------------------------------------
com.facebook.presto.execution.executor:name=multilevelsplitqueue
com.facebook.presto.execution.executor:name=taskexecutor
com.facebook.presto.execution.resourcegroups:name=internalresourcegroupmanager
com.facebook.presto.execution.scheduler:name=nodescheduler
com.facebook.presto.execution.scheduler:name=splitschedulerstats
com.facebook.presto.execution:name=queryexecution
com.facebook.presto.execution:name=querymanager
com.facebook.presto.execution:name=remotetaskfactory
com.facebook.presto.execution:name=taskmanager
com.facebook.presto.execution:type=queryqueue,name=global,expansion=global
com.facebook.presto.failuredetector:name=heartbeatfailuredetector
com.facebook.presto.hive.metastore:type=cachinghivemetastore,name=hive
com.facebook.presto.hive.metastore:type=thrifthivemetastore,name=hive
com.facebook.presto.hive:type=fileformatdatasourcestats,name=hive
com.facebook.presto.hive:type=namenodestats,name=hive
com.facebook.presto.hive:type=prestos3filesystem,name=hive
com.facebook.presto.memory:name=clustermemorymanager
com.facebook.presto.memory:type=clustermemorypool,name=general
com.facebook.presto.memory:type=clustermemorypool,name=reserved
com.facebook.presto.memory:type=clustermemorypool,name=system
com.facebook.presto.memory:type=memorypool,name=general
com.facebook.presto.memory:type=memorypool,name=reserved
com.facebook.presto.memory:type=memorypool,name=system
com.facebook.presto.metadata:name=discoverynodemanager
com.facebook.presto.operator.index:name=indexjoinlookupstats
com.facebook.presto.security:name=accesscontrolmanager
com.facebook.presto.server.remotetask:name=remotetaskstats
com.facebook.presto.server:name=asynchttpexecutionmbean
com.facebook.presto.server:name=exchangeexecutionmbean
com.facebook.presto.server:name=taskresource
com.facebook.presto.spiller:name=spillerfactory
com.facebook.presto.sql.gen:name=expressioncompiler
com.facebook.presto.sql.gen:name=joincompiler
com.facebook.presto.sql.gen:name=joinfilterfunctioncompiler
com.facebook.presto.sql.gen:name=joinprobecompiler
com.facebook.presto.sql.gen:name=orderingcompiler
com.facebook.presto.sql.gen:name=pagefunctioncompiler
com.facebook.presto.sql.planner.iterative:name=iterativeoptimizer,rule=addintermediateaggregations
com.facebook.presto.sql.planner.iterative:name=iterativeoptimizer,rule=aggregationexpressionrewrite
com.facebook.presto.sql.planner.iterative:name=iterativeoptimizer,rule=applyexpressionrewrite
com.facebook.presto.sql.planner.iterative:name=iterativeoptimizer,rule=createpartialtopn
com.facebook.presto.sql.planner.iterative:name=iterativeoptimizer,rule=eliminatecrossjoins
Query 20170924_154511_00014_hnbrr, FINISHED, 1 node
Splits: 18 total, 18 done (100.00%)
0:03 [177 rows, 15.7KB] [67 rows/s, 5.98KB/s]
presto> SELECT node, vmname, vmversion
-> FROM jmx.current."java.lang:type=runtime";
node | vmname | vmversion
-------+-----------------------------------+------------
12345 | Java HotSpot(TM) 64-Bit Server VM | 25.144-b01
111 | Java HotSpot(TM) 64-Bit Server VM | 25.144-b01
222 | Java HotSpot(TM) 64-Bit Server VM | 25.144-b01
(3 rows)
Query 20170924_154519_00015_hnbrr, FINISHED, 3 nodes
Splits: 19 total, 19 done (100.00%)
0:00 [3 rows, 140B] [11 rows/s, 523B/s]
presto> SELECT openfiledescriptorcount, maxfiledescriptorcount
-> FROM jmx.current."java.lang:type=operatingsystem";
openfiledescriptorcount | maxfiledescriptorcount
-------------------------+------------------------
978 | 4096
919 | 4096
925 | 4096
(3 rows)
Query 20170924_154527_00016_hnbrr, FINISHED, 3 nodes
Splits: 19 total, 19 done (100.00%)
0:00 [3 rows, 48B] [8 rows/s, 130B/s]
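Because jmx.dump-tables is set in jmx.properties above, periodic snapshots of those beans can also be read back from the jmx.history schema; a sketch:
SELECT "timestamp", "uptime" FROM jmx.history."java.lang:type=runtime";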
The local file connector allows querying data stored on the local file system of each worker.
Create the file etc/catalog/localfile.properties:
connector.name=localfile
Configuration properties:
presto-logs.http-request-log.location: Directory or file where HTTP request logs are written
presto-logs.http-request-log.pattern: If the log location is a directory, this glob is used to match file names in the directory
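For example, pointing the connector at this installation's request log (the path comes from the data-directory layout shown earlier; adjust it to your own layout):
connector.name=localfile
presto-logs.http-request-log.location=/home/hadoop/data/presto/var/log/http-request.log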
Local File Connector Schemas and Tables
The local file connector provides a single schema named logs. You can see all the available tables by running SHOW TABLES:
SHOW TABLES FROM localfile.logs;
http_request_log
This table contains the HTTP request logs from each node on the cluster.
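For example, a quick peek at the collected requests (a sketch; rows appear only after some HTTP requests have been logged):
select * from localfile.logs.http_request_log limit 5;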
Test results:
presto> show schemas from localfile;
Schema
--------------------
information_schema
logs
(2 rows)
Query 20170924_154211_00006_hnbrr, FINISHED, 2 nodes
Splits: 18 total, 18 done (100.00%)
0:00 [2 rows, 32B] [4 rows/s, 79B/s]
presto> show tables from localfile.logs;
Table
------------------
http_request_log
(1 row)
Query 20170924_154222_00007_hnbrr, FINISHED, 2 nodes
Splits: 18 total, 18 done (100.00%)
0:00 [1 rows, 30B] [2 rows/s, 89B/s]
presto> show tables from localfile.information_schema;
Table
-------------------------
__internal_partitions__
columns
schemata
table_privileges
tables
views
(6 rows)
Query 20170924_154232_00008_hnbrr, FINISHED, 1 node
Splits: 18 total, 18 done (100.00%)
0:00 [7 rows, 263B] [19 rows/s, 728B/s]
presto> select * from localfile.information_schema.tables;
table_catalog | table_schema | table_name | table_type
---------------+--------------------+-------------------------+------------
localfile | information_schema | columns | BASE TABLE
localfile | information_schema | tables | BASE TABLE
localfile | information_schema | views | BASE TABLE
localfile | information_schema | schemata | BASE TABLE
localfile | information_schema | __internal_partitions__ | BASE TABLE
localfile | information_schema | table_privileges | BASE TABLE
localfile | logs | http_request_log | BASE TABLE
(7 rows)
Query 20170924_154251_00009_hnbrr, FINISHED, 1 node
Splits: 17 total, 17 done (100.00%)
0:00 [7 rows, 466B] [32 rows/s, 2.11KB/s]
[hadoop@hadoop01 catalog]$ vim mysql.properties
Contents:
connector.name=mysql
connection-url=jdbc:mysql://hadoop01:3306
connection-user=root
connection-password=AAAaaa111
Test results:
presto> SHOW SCHEMAS FROM mysql;
Schema
--------------------
hive
information_schema
test
(3 rows)
Query 20170924_152408_00007_r5qdw, FINISHED, 2 nodes
Splits: 18 total, 18 done (100.00%)
0:00 [3 rows, 41B] [10 rows/s, 142B/s]
presto> SHOW tables FROM mysql.hive;
Table
---------------------------
bucketing_cols
cds
columns_v2
database_params
dbs
func_ru
funcs
global_privs
part_col_stats
partition_key_vals
partition_keys
partition_params
partitions
roles
sd_params
sds
sequence_table
serde_params
serdes
skewed_col_names
skewed_col_value_loc_map
skewed_string_list
skewed_string_list_values
skewed_values
sort_cols
tab_col_stats
table_params
tbls
version
(29 rows)
Query 20170924_152420_00008_r5qdw, FINISHED, 2 nodes
Splits: 18 total, 18 done (100.00%)
0:00 [29 rows, 737B] [66 rows/s, 1.66KB/s]
presto> desc mysql.hive.version;
Column | Type | Extra | Comment
-----------------+--------------+-------+---------
ver_id | bigint | |
schema_version | varchar(127) | |
version_comment | varchar(255) | |
(3 rows)
Query 20170924_152526_00009_r5qdw, FINISHED, 1 node
Splits: 18 total, 18 done (100.00%)
0:01 [3 rows, 215B] [2 rows/s, 173B/s]
presto> select * from mysql.hive.version;
ver_id | schema_version | version_comment
--------+----------------+-----------------------------------------
1 | 1.2.0 | Set by MetaStore [email protected]
(1 row)
Query 20170924_152559_00010_r5qdw, FINISHED, 1 node
Splits: 17 total, 17 done (100.00%)
0:02 [1 rows, 0B] [0 rows/s, 0B/s]
java.lang.IllegalArgumentException: No factory for connector hive
at com.google.common.base.Preconditions.checkArgument(Preconditions.java:191)
at com.facebook.presto.connector.ConnectorManager.createConnection(ConnectorManager.java:170)
at com.facebook.presto.metadata.StaticCatalogStore.loadCatalog(StaticCatalogStore.java:99)
at com.facebook.presto.metadata.StaticCatalogStore.loadCatalogs(StaticCatalogStore.java:77)
at com.facebook.presto.server.PrestoServer.run(PrestoServer.java:120)
at com.facebook.presto.server.PrestoServer.main(PrestoServer.java:67)
This happens when the connector name in hive.properties is wrong; remember that connector.name must be hive-hadoop2, as Presto does not recognize any other name.
2017-09-24T22:25:57.586+0800 ERROR main com.facebook.presto.server.PrestoServer Unable to create injector, see the following errors:
1) Error: Invalid configuration property node.environment: is malformed (for class io.airlift.node.NodeConfig.environment)
2) Configuration property 'ordinator' was not used
at io.airlift.bootstrap.Bootstrap.lambda$initialize$2(Bootstrap.java:235)
2 errors
com.google.inject.CreationException: Unable to create injector, see the following errors:
1) Error: Invalid configuration property node.environment: is malformed (for class io.airlift.node.NodeConfig.environment)
2) Configuration property 'ordinator' was not used
at io.airlift.bootstrap.Bootstrap.lambda$initialize$2(Bootstrap.java:235)
2 errors
at com.google.inject.internal.Errors.throwCreationExceptionIfErrorsExist(Errors.java:466)
at com.google.inject.internal.InternalInjectorCreator.initializeStatically(InternalInjectorCreator.java:155)
at com.google.inject.internal.InternalInjectorCreator.build(InternalInjectorCreator.java:107)
at com.google.inject.Guice.createInjector(Guice.java:96)
at io.airlift.bootstrap.Bootstrap.initialize(Bootstrap.java:242)
at com.facebook.presto.server.PrestoServer.run(PrestoServer.java:116)
at com.facebook.presto.server.PrestoServer.main(PrestoServer.java:67)
The node configuration was malformed: use a plain English string for the cluster environment name and a simple identifier for the node ID, and check every property name in the config files for typos or dropped letters. In my case the cause was a letter lost while copying the configuration ('ordinator' instead of 'coordinator').
2017-09-24T22:40:08.513+0800 ERROR Announcer-0 io.airlift.discovery.client.Announcer Cannot connect to discovery server for announce: Announcement failed for http://192.169.137.101:8080
2017-09-24T22:40:08.514+0800 ERROR Announcer-0 io.airlift.discovery.client.Announcer Service announcement failed after 1.16s. Next request will happen within 0.00s
2017-09-24T22:40:09.278+0800 ERROR Announcer-1 io.airlift.discovery.client.Announcer Service announcement failed after 744.71ms. Next request will happen within 1.00ms
2017-09-24T22:40:09.539+0800 ERROR Announcer-1 io.airlift.discovery.client.Announcer Service announcement failed after 258.22ms. Next request will happen within 2.00ms
2017-09-24T22:40:10.282+0800 ERROR Announcer-2 io.airlift.discovery.client.Announcer Service announcement failed after 739.25ms. Next request will happen within 4.00ms
2017-09-24T22:40:10.515+0800 ERROR Announcer-0 io.airlift.discovery.client.Announcer Service announcement failed after 223.25ms. Next request will happen within 8.00ms
2017-09-24T22:40:10.562+0800 ERROR Announcer-1 io.airlift.discovery.client.Announcer Service announcement failed after 30.86ms. Next request will happen within 16.00ms
2017-09-24T22:40:11.293+0800 ERROR Announcer-2 io.airlift.discovery.client.Announcer Service announcement failed after 698.12ms. Next request will happen within 32.00ms
2017-09-24T22:40:11.534+0800 ERROR Announcer-0 io.airlift.discovery.client.Announcer Service announcement failed after 175.65ms. Next request will happen within 64.00ms
2017-09-24T22:40:12.362+0800 ERROR Announcer-1 io.airlift.discovery.client.Announcer Service announcement failed after 699.87ms. Next request will happen within 128.00ms
2017-09-24T22:40:12.665+0800 ERROR Announcer-2 io.airlift.discovery.client.Announcer Service announcement failed after 45.73ms. Next request will happen within 256.00ms
^C2017-09-24T22:40:13.600+0800 ERROR Announcer-0 io.airlift.discovery.client.Announcer Service announcement failed after 422.26ms. Next request will happen within 512.00ms
2017-09-24T22:40:14.667+0800 ERROR Announcer-1 io.airlift.discovery.client.Announcer Service announcement failed after 66.66ms. Next request will happen within 1000.00ms
^C2017-09-24T22:40:16.182+0800 ERROR Announcer-2 io.airlift.discovery.client.Announcer Service announcement failed after 514.05ms. Next request will happen within 1000.00ms
^V2017-09-24T22:40:17.604+0800 ERROR Announcer-0 io.airlift.discovery.client.Announcer Service announcement failed after 421.29ms. Next request will happen within 1000.00ms
Changing the IP address to a domain or host name fixed it; the IP was not recognized.
Also, when the coordinator node is restarted, the other nodes first report that it cannot be found and then recover.
Once the Presto engine is configured, it can query the data no matter what form the source data takes. For a Windows GUI similar to Oracle's PL/SQL tools, consider this client:
Connecting to Presto: SQuirreL SQL Client installation and configuration:
http://blog.csdn.net/ld326/article/details/77987507
[Author: happyprince, http://blog.csdn.net/ld326/article/details/78088863]