Installing and Setting Up Hadoop, HDFS, Hive, and Spark on a Mac

Prerequisites

Homebrew

See: Installing and Using Homebrew on a Mac

Installing the JDK

java -version

java version "1.8.0_181"
Java(TM) SE Runtime Environment (build 1.8.0_181-b13)
Java HotSpot(TM) 64-Bit Server VM (build 25.181-b13, mixed mode)

If it is not installed yet, installing Java 8 is recommended:

brew cask install java # installs the latest version
# install Java 8
brew tap caskroom/versions
brew cask install java8

Configuring SSH

SSH is configured to enable passwordless login, which makes it convenient to manage Hadoop remotely and lets Hadoop share files across the cluster without prompting for a login password.

If SSH is not configured on your machine, typing ssh localhost in a terminal asks for your macOS login password. Once SSH is set up, no password is needed.

  1. Open System Preferences > Sharing > enable Remote Login


  2. Run in iTerm (or any terminal):
ssh-keygen -t rsa # press Enter through the prompts
cat ~/.ssh/id_rsa.pub >> ~/.ssh/authorized_keys

Now typing ssh localhost in the terminal just works (answer yes to the host-key prompt the first time).

ssh localhost # log in over ssh
# Last login: Fri Jan 18 14:44:36 2019
exit # log out
# Connection to localhost closed.

Installing Hadoop

Download and install

  • brew install hadoop (recommended); when it finishes, Homebrew prints the install path
  • Download the tarball from the official site, extract it to a directory of your choice, and set it up manually (not recommended)

Configuring Hadoop

Configure hadoop-env.sh

Location of hadoop-env.sh:

/usr/local/Cellar/hadoop/3.1.1/libexec/etc/hadoop

Add the JAVA_HOME path:

export JAVA_HOME=/Library/Java/JavaVirtualMachines/jdk1.8.0_202.jdk/Contents/Home
# find the JDK location on a Mac with: /usr/libexec/java_home -V
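
Instead of hard-coding the JDK path, JAVA_HOME can also be resolved at runtime; a small sketch for hadoop-env.sh, assuming Java 8 is the installed version:

# resolve the Java 8 home dynamically instead of hard-coding the path
export JAVA_HOME=$(/usr/libexec/java_home -v 1.8)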

Configure core-site.xml

Configure the HDFS address and port. In the paths below, replace username with your macOS user name.
<configuration>
  <property>
    <name>fs.default.name</name>
    <value>hdfs://localhost:8020</value>
  </property>
  <property>
    <name>hadoop.tmp.dir</name>
    <value>/Users/username/data/hadoop/tmp</value>
    <description>A base for other temporary directories.</description>
  </property>
  <property>
    <name>dfs.name.dir</name>
    <value>/Users/username/data/hadoop/filesystem/name</value>
    <description>Determines where on the local filesystem the DFS name node should store the name table. If this is a comma-delimited list of directories then the name table is replicated in all of the directories, for redundancy.</description>
  </property>
  <property>
    <name>dfs.data.dir</name>
    <value>/Users/username/data/hadoop/filesystem/data</value>
    <description>Determines where on the local filesystem a DFS data node should store its blocks. If this is a comma-delimited list of directories, then data will be stored in all named directories, typically on different devices. Directories that do not exist are ignored.</description>
  </property>
</configuration>

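Hadoop creates these directories when it formats and starts, but they can also be created up front; a sketch matching the paths above (username is your macOS user):

mkdir -p ~/data/hadoop/tmp ~/data/hadoop/filesystem/name ~/data/hadoop/filesystem/data
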
Configure hdfs-site.xml

Set the HDFS replication factor. (The NameNode and DataNode storage directories were already configured in core-site.xml above.)

<configuration>
  <property>
    <name>dfs.replication</name>
    <value>1</value>
  </property>
</configuration>

Configure mapred-site.xml

Configure the MapReduce framework (this file used to hold the JobTracker address and port; in Hadoop 3 it points MapReduce at YARN instead). Version 3.1.1 ships this file, so it can be edited directly:

<configuration>
  <property>
    <name>mapreduce.framework.name</name>
    <value>yarn</value>
  </property>
</configuration>

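If MapReduce jobs submitted to YARN later fail because the containers cannot find the MapReduce classes, the Hadoop 3 single-node setup guide also adds a classpath property to mapred-site.xml; a sketch of that addition:

  <!-- optional: lets YARN containers locate the MapReduce jars -->
  <property>
    <name>mapreduce.application.classpath</name>
    <value>$HADOOP_MAPRED_HOME/share/hadoop/mapreduce/*:$HADOOP_MAPRED_HOME/share/hadoop/mapreduce/lib/*</value>
  </property>
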
Configure yarn-site.xml

<configuration>
  <property>
    <name>yarn.nodemanager.aux-services</name>
    <value>mapreduce_shuffle</value>
  </property>
  <property>
    <name>yarn.nodemanager.env-whitelist</name>
    <value>JAVA_HOME,HADOOP_COMMON_HOME,HADOOP_HDFS_HOME,HADOOP_CONF_DIR,CLASSPATH_PREPEND_DISTCACHE,HADOOP_YARN_HOME,HADOOP_MAPRED_HOME</value>
  </property>
</configuration>

Format HDFS

# run from /usr/local/Cellar/hadoop/3.1.1/libexec
bin/hdfs namenode -format

Running Hadoop

  • sbin/start-all.sh: starts everything
  • sbin/start-dfs.sh: starts the NameNode and DataNode; NameNode web UI at http://localhost:9870 (port 50070 in the 2.x releases)
  • sbin/start-yarn.sh: starts YARN; ResourceManager web UI at http://localhost:8088 (All Applications)

Use jps to check that the processes are running:

jps
# 34214 NameNode
# 34313 DataNode
# 34732 NodeManager
# 34637 ResourceManager
# 34446 SecondaryNameNode
# 34799 Jps
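
With the daemons up, a quick HDFS smoke test confirms that reads and writes work (the file and paths are just examples):

# create a home directory in HDFS and round-trip a small file
hdfs dfs -mkdir -p /user/$(whoami)
echo "hello hdfs" > /tmp/hello.txt
hdfs dfs -put /tmp/hello.txt /user/$(whoami)/
hdfs dfs -cat /user/$(whoami)/hello.txt
# hello hdfs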

Installing Hive

Download and install

brew install hive

Configure the Hive metastore database

By default Hive uses Derby as its metastore database; here we switch to the more familiar MySQL to store the metadata.

# log in to MySQL
mysql -uroot -p
# then run the following statements
CREATE DATABASE metastore;
# creating the user with MySQL 8's default plugin makes the JDBC driver fail with
# "Unable to load authentication plugin 'caching_sha2_password'",
# so create it with mysql_native_password instead:
CREATE USER 'hive'@'localhost' IDENTIFIED WITH mysql_native_password BY 'hive';
GRANT SELECT,INSERT,UPDATE,DELETE,ALTER,CREATE,INDEX,REFERENCES ON metastore.* TO 'hive'@'localhost';
FLUSH PRIVILEGES;

Configuring Hive

Install the mysql-connector JAR

Download it from https://dev.mysql.com/downloads/connector/j/
Extract the downloaded archive, then copy the JAR into Hive's lib directory:

cp mysql-connector-java-5.1.44-bin.jar /usr/local/Cellar/hive/3.1.1/libexec/lib/

Configure hive-site.xml

Modify the following properties (replace username in the paths, as before):


<property>
  <name>javax.jdo.option.ConnectionURL</name>
  <value>jdbc:mysql://localhost/metastore</value>
</property>
<property>
  <name>javax.jdo.option.ConnectionDriverName</name>
  <value>com.mysql.jdbc.Driver</value>
</property>
<property>
  <!-- the MySQL user created in the metastore setup above -->
  <name>javax.jdo.option.ConnectionUserName</name>
  <value>hive</value>
</property>
<property>
  <!-- that user's password -->
  <name>javax.jdo.option.ConnectionPassword</name>
  <value>hive</value>
</property>
<property>
  <name>hive.exec.local.scratchdir</name>
  <value>/Users/username/data/hive</value>
</property>
<property>
  <name>hive.querylog.location</name>
  <value>/Users/username/data/hive/querylog</value>
</property>
<property>
  <name>hive.downloaded.resources.dir</name>
  <value>/Users/username/data/hive/download</value>
</property>
<property>
  <name>hive.server2.logging.operation.log.location</name>
  <value>/Users/username/data/hive/log</value>
</property>

Note: line 3210 of hive-site.xml may contain an illegal control character (code 0x8, shown as �); it is best to delete it, otherwise initializing the metastore database fails with:

SLF4J: Class path contains multiple SLF4J bindings.
SLF4J: Found binding in [jar:file:/usr/local/Cellar/hive/3.1.1/libexec/lib/log4j-slf4j-impl-2.10.0.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: Found binding in [jar:file:/usr/local/Cellar/hadoop/3.1.1/libexec/share/hadoop/common/lib/slf4j-log4j12-1.7.25.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an explanation.
SLF4J: Actual binding is of type [org.apache.logging.slf4j.Log4jLoggerFactory]
Exception in thread "main" java.lang.RuntimeException: com.ctc.wstx.exc.WstxParsingException: Illegal character entity: expansion character (code 0x8
 at [row,col,system-id]: [3210,96,"file:/usr/local/Cellar/hive/3.1.1/libexec/conf/hive-site.xml"]
    at org.apache.hadoop.conf.Configuration.loadResource(Configuration.java:3003)
    at org.apache.hadoop.conf.Configuration.loadResources(Configuration.java:2931)
    at org.apache.hadoop.conf.Configuration.getProps(Configuration.java:2806)
    at org.apache.hadoop.conf.Configuration.get(Configuration.java:1460)
    at org.apache.hadoop.hive.conf.HiveConf.getVar(HiveConf.java:4990)
    at org.apache.hadoop.hive.conf.HiveConf.getVar(HiveConf.java:5063)
    at org.apache.hadoop.hive.conf.HiveConf.initialize(HiveConf.java:5150)
    at org.apache.hadoop.hive.conf.HiveConf.(HiveConf.java:5098)
    at org.apache.hive.beeline.HiveSchemaTool.(HiveSchemaTool.java:96)
    at org.apache.hive.beeline.HiveSchemaTool.main(HiveSchemaTool.java:1473)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:498)
    at org.apache.hadoop.util.RunJar.run(RunJar.java:318)
    at org.apache.hadoop.util.RunJar.main(RunJar.java:232)
Caused by: com.ctc.wstx.exc.WstxParsingException: Illegal character entity: expansion character (code 0x8
 at [row,col,system-id]: [3210,96,"file:/usr/local/Cellar/hive/3.1.1/libexec/conf/hive-site.xml"]
    at com.ctc.wstx.sr.StreamScanner.constructWfcException(StreamScanner.java:621)
    at com.ctc.wstx.sr.StreamScanner.throwParseError(StreamScanner.java:491)
    at com.ctc.wstx.sr.StreamScanner.reportIllegalChar(StreamScanner.java:2456)
    at com.ctc.wstx.sr.StreamScanner.validateChar(StreamScanner.java:2403)
    at com.ctc.wstx.sr.StreamScanner.resolveCharEnt(StreamScanner.java:2369)
    at com.ctc.wstx.sr.StreamScanner.fullyResolveEntity(StreamScanner.java:1515)
    at com.ctc.wstx.sr.BasicStreamReader.nextFromTree(BasicStreamReader.java:2828)
    at com.ctc.wstx.sr.BasicStreamReader.next(BasicStreamReader.java:1123)
    at org.apache.hadoop.conf.Configuration$Parser.parseNext(Configuration.java:3257)
    at org.apache.hadoop.conf.Configuration$Parser.parse(Configuration.java:3063)
    at org.apache.hadoop.conf.Configuration.loadResource(Configuration.java:2986)
    ... 15 more

Initialize the metastore database

schematool -initSchema -dbType mysql
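
If initialization succeeds, schematool can also report the installed schema version as a sanity check:

schematool -dbType mysql -info # prints the metastore schema version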

Now, looking inside the metastore database, the metastore tables are visible (only some of them are shown here):

mysql> show tables;
+-------------------------------+
| Tables_in_metastore           |
+-------------------------------+
| AUX_TABLE                     |
| BUCKETING_COLS                |
| CDS                           |
| COLUMNS_V2                    |
| COMPACTION_QUEUE              |
| COMPLETED_COMPACTIONS         |
| COMPLETED_TXN_COMPONENTS      |
| CTLGS                         |
| DATABASE_PARAMS               |
| DB_PRIVS                      |

Running Hive
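
The hive command starts the Hive CLI. A minimal smoke test, where the table name demo is just an example:

hive # start the Hive CLI
# then, at the hive> prompt:
CREATE TABLE demo (id INT, name STRING);
SHOW TABLES;
INSERT INTO demo VALUES (1, 'hello');
SELECT * FROM demo;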

Installing Spark

brew install apache-spark

Usually installing it directly is enough; then just run spark-shell.
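
A quick way to confirm the shell works (the arithmetic is only an example):

spark-shell
# at the scala> prompt:
# scala> sc.parallelize(1 to 100).reduce(_ + _)
# res0: Int = 5050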

In practice a few issues may still come up after all of the configuration above; feel free to discuss them in the comments.
