Pre-installation preparation
Homebrew
See: Installing and using Homebrew on Mac
JDK installation
java -version
java version "1.8.0_181"
Java(TM) SE Runtime Environment (build 1.8.0_181-b13)
Java HotSpot(TM) 64-Bit Server VM (build 25.181-b13, mixed mode)
If Java is not installed, installing Java 8 is recommended:
brew cask install java # install the latest version
# install Java 8
brew tap caskroom/versions
brew cask install java8
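To confirm which JDK is active and locate its home directory (you will need this path again when editing hadoop-env.sh below), macOS ships a java_home helper; a quick check, assuming Java 8 was installed:
/usr/libexec/java_home -V      # list all installed JDKs
/usr/libexec/java_home -v 1.8  # print the Java 8 home directory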
Configure SSH
SSH is configured to enable passwordless login, which makes it convenient to manage Hadoop remotely and to share files across the Hadoop cluster without entering a login password.
If SSH is not yet configured on your machine, typing ssh localhost in a terminal will ask for your macOS login password.
Once SSH is set up, no password is needed.
- Open System Preferences > Sharing > turn on Remote Login
- Run in iTerm (terminal):
ssh-keygen -t rsa # then confirm the prompts (yes)
cat ~/.ssh/id_rsa.pub >> ~/.ssh/authorized_keys
Now ssh localhost in the terminal just works.
ssh localhost # ssh login
# Last login: Fri Jan 18 14:44:36 2019
exit # log out
# Connection to localhost closed.
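If ssh localhost still prompts for a password, overly loose permissions on ~/.ssh are a common cause; tightening them (a general SSH requirement, not a step from the original write-up) usually fixes it:
chmod 700 ~/.ssh
chmod 600 ~/.ssh/authorized_keys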
Install Hadoop
Download and install
- brew install hadoop (recommended); when the install finishes, Homebrew prints where it was installed (see the check below)
- Or download the tarball from the official website, extract it to a directory of your choice, and install it manually (not recommended)
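If you missed the install path in Homebrew's output, it can be looked up again; this assumes Homebrew installed the same 3.1.1 version used in the paths throughout this post:
brew info hadoop      # shows the installed version and Cellar path
brew --prefix hadoop  # prints the opt prefix, which symlinks into the Cellar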
Configure Hadoop
Configure hadoop-env.sh
hadoop-env.sh is located at:
/usr/local/Cellar/hadoop/3.1.1/libexec/etc/hadoop
Add the JAVA_HOME path:
export JAVA_HOME=/Library/Java/JavaVirtualMachines/jdk1.8.0_202.jdk/Contents/Home
# on macOS, find the JDK location with: /usr/libexec/java_home -V
Configure core-site.xml
Configure the HDFS address and port:
<configuration>
  <property>
    <name>fs.default.name</name>
    <value>hdfs://localhost:8020</value>
  </property>
  <property>
    <name>hadoop.tmp.dir</name>
    <value>/Users/your_username/data/hadoop/tmp</value>
    <description>A base for other temporary directories.</description>
  </property>
  <property>
    <name>dfs.name.dir</name>
    <value>/Users/your_username/data/hadoop/filesystem/name</value>
    <description>Determines where on the local filesystem the DFS name node should store the name table. If this is a comma-delimited list of directories then the name table is replicated in all of the directories, for redundancy.</description>
  </property>
  <property>
    <name>dfs.data.dir</name>
    <value>/Users/your_username/data/hadoop/filesystem/data</value>
    <description>Determines where on the local filesystem an DFS data node should store its blocks. If this is a comma-delimited list of directories, then data will be stored in all named directories, typically on different devices. Directories that do not exist are ignored.</description>
  </property>
</configuration>
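The local data directories above do not all exist yet; creating them up front (an optional precaution, with the placeholder username adjusted to yours) avoids permission and missing-directory errors later:
mkdir -p /Users/your_username/data/hadoop/tmp
mkdir -p /Users/your_username/data/hadoop/filesystem/name
mkdir -p /Users/your_username/data/hadoop/filesystem/data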
Configure hdfs-site.xml
Set the HDFS replication factor (the namenode and datanode storage directories were already set in core-site.xml above):
<configuration>
  <property>
    <name>dfs.replication</name>
    <value>1</value>
  </property>
</configuration>
Configure mapred-site.xml
Tell MapReduce to run on YARN (older tutorials configure the jobtracker address and port here). In version 3.1.1 this file already exists, so it can be edited directly:
<configuration>
  <property>
    <name>mapreduce.framework.name</name>
    <value>yarn</value>
  </property>
</configuration>
Configure yarn-site.xml
<configuration>
  <property>
    <name>yarn.nodemanager.aux-services</name>
    <value>mapreduce_shuffle</value>
  </property>
  <property>
    <name>yarn.nodemanager.env-whitelist</name>
    <value>JAVA_HOME,HADOOP_COMMON_HOME,HADOOP_HDFS_HOME,HADOOP_CONF_DIR,CLASSPATH_PREPEND_DISTCACHE,HADOOP_YARN_HOME,HADOOP_MAPRED_HOME</value>
  </property>
</configuration>
Format HDFS
# run from /usr/local/Cellar/hadoop/3.1.1/libexec
bin/hdfs namenode -format
Run
- sbin/start-all.sh: start everything (run from the libexec directory, as shown below)
- sbin/start-dfs.sh: start the NameNode and DataNode; NameNode web UI at http://localhost:9870 (port 50070 in the 2.7 versions)
- sbin/start-yarn.sh: start YARN; ResourceManager web UI at http://localhost:8088 (All Applications)
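A minimal start-up sketch, assuming the Homebrew install path used above:
cd /usr/local/Cellar/hadoop/3.1.1/libexec
sbin/start-dfs.sh   # NameNode, DataNode, SecondaryNameNode
sbin/start-yarn.sh  # ResourceManager, NodeManager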
Use jps to check the running processes:
jps
# 34214 NameNode
# 34313 DataNode
# 34732 NodeManager
# 34637 ResourceManager
# 34446 SecondaryNameNode
# 34799 Jps
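As a quick sanity check that HDFS is answering (the directory used here is only an example, not part of the original steps; if hdfs is not on your PATH, run bin/hdfs from the libexec directory instead):
hdfs dfs -mkdir -p /user/$(whoami)  # create a home directory in HDFS
hdfs dfs -ls /user                  # it should be listed here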
Install Hive
Download and install
brew install hive
Configure the Hive metastore database
Hive uses Derby as its metastore database by default; here we switch to the more familiar MySQL to store the metadata.
# log in to MySQL
mysql -uroot -p
# run inside MySQL
CREATE DATABASE metastore;
CREATE USER 'hive'@'localhost' IDENTIFIED BY 'hive';
# MySQL 8 defaults to the caching_sha2_password plugin, which older JDBC drivers reject with
# "Unable to load authentication plugin 'caching_sha2_password'", so switch the account to mysql_native_password:
ALTER USER 'hive'@'localhost' IDENTIFIED WITH mysql_native_password BY 'hive';
GRANT SELECT,INSERT,UPDATE,DELETE,ALTER,CREATE,INDEX,REFERENCES ON metastore.* TO 'hive'@'localhost';
FLUSH PRIVILEGES;
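Optionally verify that the new account can reach the database before wiring it into Hive (a sanity check added here, not part of the original steps):
mysql -uhive -phive -e "SHOW DATABASES;"  # the metastore database should be listed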
Configure Hive
Add the mysql-connector JAR
Download it from: https://dev.mysql.com/downloads/connector/j/
Extract the downloaded archive and copy the JAR into Hive's lib directory:
cp mysql-connector-java-5.1.44-bin.jar /usr/local/Cellar/hive/3.1.1/libexec/lib/
Configure hive-site.xml
Modify the following properties:
<configuration>
  <property>
    <name>javax.jdo.option.ConnectionURL</name>
    <value>jdbc:mysql://localhost/metastore</value>
  </property>
  <property>
    <name>javax.jdo.option.ConnectionDriverName</name>
    <value>com.mysql.jdbc.Driver</value>
  </property>
  <property>
    <name>javax.jdo.option.ConnectionUserName</name>
    <value>hive</value>  <!-- the MySQL user created for the metastore above -->
  </property>
  <property>
    <name>javax.jdo.option.ConnectionPassword</name>
    <value>hive</value>  <!-- that user's password -->
  </property>
  <property>
    <name>hive.exec.local.scratchdir</name>
    <value>/Users/your_username/data/hive</value>
  </property>
  <property>
    <name>hive.querylog.location</name>
    <value>/Users/your_username/data/hive/querylog</value>
  </property>
  <property>
    <name>hive.downloaded.resources.dir</name>
    <value>/Users/your_username/data/hive/download</value>
  </property>
  <property>
    <name>hive.server2.logging.operation.log.location</name>
    <value>/Users/your_username/data/hive/log</value>
  </property>
</configuration>
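The local scratch and log directories above may need to exist before Hive starts; creating them in advance (adjust the placeholder username to yours) avoids start-up errors:
mkdir -p /Users/your_username/data/hive/{querylog,download,log}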
Note: line 3210 of hive-site.xml may contain an invalid character entity (&#8;); it is best to delete it, otherwise initializing the metastore fails with an error like this:
SLF4J: Class path contains multiple SLF4J bindings.
SLF4J: Found binding in [jar:file:/usr/local/Cellar/hive/3.1.1/libexec/lib/log4j-slf4j-impl-2.10.0.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: Found binding in [jar:file:/usr/local/Cellar/hadoop/3.1.1/libexec/share/hadoop/common/lib/slf4j-log4j12-1.7.25.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an explanation.
SLF4J: Actual binding is of type [org.apache.logging.slf4j.Log4jLoggerFactory]
Exception in thread "main" java.lang.RuntimeException: com.ctc.wstx.exc.WstxParsingException: Illegal character entity: expansion character (code 0x8
at [row,col,system-id]: [3210,96,"file:/usr/local/Cellar/hive/3.1.1/libexec/conf/hive-site.xml"]
at org.apache.hadoop.conf.Configuration.loadResource(Configuration.java:3003)
at org.apache.hadoop.conf.Configuration.loadResources(Configuration.java:2931)
at org.apache.hadoop.conf.Configuration.getProps(Configuration.java:2806)
at org.apache.hadoop.conf.Configuration.get(Configuration.java:1460)
at org.apache.hadoop.hive.conf.HiveConf.getVar(HiveConf.java:4990)
at org.apache.hadoop.hive.conf.HiveConf.getVar(HiveConf.java:5063)
at org.apache.hadoop.hive.conf.HiveConf.initialize(HiveConf.java:5150)
at org.apache.hadoop.hive.conf.HiveConf.<init>(HiveConf.java:5098)
at org.apache.hive.beeline.HiveSchemaTool.<init>(HiveSchemaTool.java:96)
at org.apache.hive.beeline.HiveSchemaTool.main(HiveSchemaTool.java:1473)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at org.apache.hadoop.util.RunJar.run(RunJar.java:318)
at org.apache.hadoop.util.RunJar.main(RunJar.java:232)
Caused by: com.ctc.wstx.exc.WstxParsingException: Illegal character entity: expansion character (code 0x8
at [row,col,system-id]: [3210,96,"file:/usr/local/Cellar/hive/3.1.1/libexec/conf/hive-site.xml"]
at com.ctc.wstx.sr.StreamScanner.constructWfcException(StreamScanner.java:621)
at com.ctc.wstx.sr.StreamScanner.throwParseError(StreamScanner.java:491)
at com.ctc.wstx.sr.StreamScanner.reportIllegalChar(StreamScanner.java:2456)
at com.ctc.wstx.sr.StreamScanner.validateChar(StreamScanner.java:2403)
at com.ctc.wstx.sr.StreamScanner.resolveCharEnt(StreamScanner.java:2369)
at com.ctc.wstx.sr.StreamScanner.fullyResolveEntity(StreamScanner.java:1515)
at com.ctc.wstx.sr.BasicStreamReader.nextFromTree(BasicStreamReader.java:2828)
at com.ctc.wstx.sr.BasicStreamReader.next(BasicStreamReader.java:1123)
at org.apache.hadoop.conf.Configuration$Parser.parseNext(Configuration.java:3257)
at org.apache.hadoop.conf.Configuration$Parser.parse(Configuration.java:3063)
at org.apache.hadoop.conf.Configuration.loadResource(Configuration.java:2986)
... 15 more
Initialize the metastore
schematool -initSchema -dbType mysql
Now open the metastore database and you can see the metastore tables (only some of them are shown here):
mysql> show tables;
+-------------------------------+
| Tables_in_metastore |
+-------------------------------+
| AUX_TABLE |
| BUCKETING_COLS |
| CDS |
| COLUMNS_V2 |
| COMPACTION_QUEUE |
| COMPLETED_COMPACTIONS |
| COMPLETED_TXN_COMPONENTS |
| CTLGS |
| DATABASE_PARAMS |
| DB_PRIVS |
Run Hive
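With HDFS and YARN running and the metastore initialized, the Hive CLI should start and answer a simple query; this one-liner is only an illustration:
hive -e "SHOW DATABASES;"  # should print at least the built-in default database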
Install Spark
brew install apache-spark
Usually a plain install is enough; after that, just run spark-shell.
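A quick check that the installation works, assuming Homebrew put spark-shell on the PATH:
spark-shell --version  # print the installed Spark version
spark-shell            # start the Scala REPL; a SparkContext is available as sc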
In practice, a few issues may still come up after completing all of the above configuration; feel free to discuss them in the comments.