SSH is configured here to enable passwordless login, which makes it convenient to manage Hadoop remotely and to share files across a Hadoop cluster without entering a login password.
If SSH has not been configured, typing ssh localhost in a terminal prompts for your account password; once SSH is set up, no password is needed.
Step one: run "ssh-keygen -t rsa -P ''" in a terminal,
then hit Enter through the prompts. If you have run this before, it will ask whether to overwrite the existing key; enter y.
home:~ root$ ssh-keygen -t rsa -P ''
Generating public/private rsa key pair.
Enter file in which to save the key (/Users/root/.ssh/id_rsa):
/Users/root/.ssh/id_rsa already exists.
Overwrite (y/n)? y
Your identification has been saved in /Users/root/.ssh/id_rsa.
Your public key has been saved in /Users/root/.ssh/id_rsa.pub.
The key fingerprint is:
SHA256:5FXIbTsTcgSg4xApj2y5sY+n7+qEPKorhZaNhmioGDk [email protected]
The key's randomart image is:
+---[RSA 2048]----+
| .. .o.=o |
| . .. . +.= |
| . =. o . .+ o |
| * .o + . + |
|++++ . S o |
|EB+. |
|B*.o |
|=.o o |
|*o+*o |
+----[SHA256]-----+
Step two: run "cat ~/.ssh/id_rsa.pub >> ~/.ssh/authorized_keys" to authorize your public key for passwordless local login.
In theory, typing ssh localhost in the terminal should now log you in without a password.
home:~ root$ cat ~/.ssh/id_rsa.pub >> ~/.ssh/authorized_keys
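If ssh localhost still asks for a password after this, key-file permissions are the usual cause; OpenSSH ignores keys whose files are group- or world-accessible:
```
# Tighten permissions so OpenSSH will accept the key files.
chmod 700 ~/.ssh
chmod 600 ~/.ssh/authorized_keys
```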
Error message:
home:~ root$ ssh localhost
ssh: connect to host localhost port 22: Connection refused
Solution:
Open System Preferences -> Sharing -> enable Remote Login.
home:~ root$ ssh localhost
Last login: Fri Jan 25 17:31:54 2019 from ::1
Next, set up the Hadoop environment variables:
vim ~/.bash_profile
Add the following two lines:
export HADOOP_HOME=/Users/root/software/hadoop/hadoop3.2
export PATH=$PATH:$HADOOP_HOME/bin:$HADOOP_HOME/sbin
Apply the environment variables:
source ~/.bash_profile
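To confirm the variables took effect, a quick check (assuming the paths above match your install):
```
# Prints the Hadoop version if HADOOP_HOME/bin is on the PATH.
hadoop version
```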
In etc/hadoop/hadoop-env.sh, modify or replace the following lines:
export JAVA_HOME=${JAVA_HOME}
export HADOOP_HEAPSIZE=2000
export HADOOP_OPTS="-Djava.security.krb5.realm=OX.AC.UK -Djava.security.krb5.kdc=kdc0.ox.ac.uk:kdc1.ox.ac.uk"
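On macOS, JAVA_HOME does not have to be hard-coded; the stock java_home helper resolves the active JDK, so a common alternative for hadoop-env.sh is:
```
# /usr/libexec/java_home prints the path of the active JDK on macOS.
export JAVA_HOME=$(/usr/libexec/java_home)
```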
Edit hadoop2.9/etc/hadoop/core-site.xml and add (adjust the tmp directory to your own setup):
<property>
  <name>hadoop.tmp.dir</name>
  <value>/Users/hadoop-2.9/tmp/hadoop-${user.name}</value>
  <description>A base for other temporary directories.</description>
</property>
<property>
  <name>fs.default.name</name>
  <value>hdfs://localhost:9000</value>
</property>
Edit hadoop2.9/etc/hadoop/hdfs-site.xml:
<property>
  <name>dfs.replication</name>
  <value>1</value>
</property>
Edit hadoop2.9/etc/hadoop/mapred-site.xml:
<property>
  <name>mapred.job.tracker</name>
  <value>hdfs://localhost:9000</value>
</property>
<property>
  <name>mapred.tasktracker.map.tasks.maximum</name>
  <value>2</value>
</property>
<property>
  <name>mapred.tasktracker.reduce.tasks.maximum</name>
  <value>2</value>
</property>
Note: if mapred-site.xml does not exist, create it by copying mapred-site.xml.template and renaming it.
Edit hadoop2.9/etc/hadoop/yarn-site.xml:
<property>
  <name>yarn.nodemanager.aux-services</name>
  <value>mapreduce_shuffle</value>
</property>
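Note that every property snippet above goes inside the file's root <configuration> element; as a sketch, the complete yarn-site.xml would be:
```
<?xml version="1.0"?>
<configuration>
  <property>
    <name>yarn.nodemanager.aux-services</name>
    <value>mapreduce_shuffle</value>
  </property>
</configuration>
```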
Go into the hadoop2.9 directory and format the NameNode (run from the install directory):
bin/hdfs namenode -format
Output like the following indicates success:
2019-01-25 17:58:32,570 INFO metrics.TopMetrics: NNTop conf: dfs.namenode.top.window.num.buckets = 10
2019-01-25 17:58:32,570 INFO metrics.TopMetrics: NNTop conf: dfs.namenode.top.num.users = 10
2019-01-25 17:58:32,570 INFO metrics.TopMetrics: NNTop conf: dfs.namenode.top.windows.minutes = 1,5,25
2019-01-25 17:58:32,573 INFO namenode.FSNamesystem: Retry cache on namenode is enabled
2019-01-25 17:58:32,573 INFO namenode.FSNamesystem: Retry cache will use 0.03 of total heap and retry cache entry expiry time is 600000 millis
2019-01-25 17:58:32,575 INFO util.GSet: Computing capacity for map NameNodeRetryCache
2019-01-25 17:58:32,575 INFO util.GSet: VM type = 64-bit
2019-01-25 17:58:32,576 INFO util.GSet: 0.029999999329447746% max memory 3.6 GB = 1.1 MB
2019-01-25 17:58:32,576 INFO util.GSet: capacity = 2^17 = 131072 entries
2019-01-25 17:58:32,613 INFO namenode.FSImage: Allocated new BlockPoolId: BP-137592425-192.168.11.67-1548410312604
2019-01-25 17:58:32,629 INFO common.Storage: Storage directory /tmp/hadoop-root/dfs/name has been successfully formatted.
2019-01-25 17:58:32,642 INFO namenode.FSImageFormatProtobuf: Saving image file /tmp/hadoop-root/dfs/name/current/fsimage.ckpt_0000000000000000000 using no compression
2019-01-25 17:58:32,760 INFO namenode.FSImageFormatProtobuf: Image file /tmp/hadoop-root/dfs/name/current/fsimage.ckpt_0000000000000000000 of size 409 bytes saved in 0 seconds .
2019-01-25 17:58:32,776 INFO namenode.NNStorageRetentionManager: Going to retain 1 images with txid >= 0
2019-01-25 17:58:32,781 INFO namenode.NameNode: SHUTDOWN_MSG:
/************************************************************
SHUTDOWN_MSG: Shutting down NameNode at xlc-2.local/192.168.11.67
************************************************************/
Start HDFS and YARN separately:
sbin/start-dfs.sh
sbin/start-yarn.sh
Or start all services at once:
sbin/start-all.sh
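To verify that the daemons actually came up, jps (shipped with the JDK) lists the running Java processes:
```
# After start-all.sh you should see NameNode, DataNode, SecondaryNameNode,
# ResourceManager and NodeManager in the output.
jps
```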
If http://localhost:50070 opens the HDFS admin page, startup succeeded.
If http://localhost:8088 opens the Hadoop (YARN) resource management page, startup succeeded.
Problem:
HDFS is unreachable at http://localhost:50070/
while the YARN address works fine: http://localhost:8088
Solution:
Installing Hadoop 2.9 instead made the page accessible. (Hadoop 3.x moved the NameNode web UI from port 50070 to 9870, which is why 50070 fails under Hadoop 3.2.)
MySQL is required before installing Hive; it was already installed on this machine, so that step is skipped here.
Download: https://hive.apache.org/downloads.html
The release used here is apache-hive-3.1.1-bin.tar.gz; unpack it, rename the directory, and move it under the Hadoop install directory.
vim ~/.bash_profile
export HIVE_HOME=/usr/hadoop/hadoop2.9/hive (note: adjust to your own path)
export PATH=$PATH:$HIVE_HOME/bin:$HIVE_HOME/conf
In hive/conf, create the configuration files from their templates:
cp hive-env.sh.template hive-env.sh
cp hive-default.xml.template hive-site.xml
cp hive-log4j2.properties.template hive-log4j2.properties
cp hive-exec-log4j2.properties.template hive-exec-log4j2.properties
(Hive ships no hive-site.xml.template; hive-site.xml is created from hive-default.xml.template, and in Hive 3.x the log4j templates carry the log4j2 name.)
Edit hive-site.xml and set the metastore connection properties (inside XML, & must be escaped as &amp;):
<property>
  <name>hive.metastore.local</name>
  <value>true</value>
</property>
<property>
  <name>javax.jdo.option.ConnectionURL</name>
  <value>jdbc:mysql://master:3306/hive?createDatabaseIfNotExist=true&amp;useUnicode=true&amp;characterEncoding=UTF-8&amp;autoReconnect=true&amp;useSSL=false</value>
</property>
<property>
  <name>javax.jdo.option.ConnectionDriverName</name>
  <value>com.mysql.jdbc.Driver</value>
</property>
<property>
  <name>javax.jdo.option.ConnectionUserName</name>
  <value>root</value>
</property>
<property>
  <name>javax.jdo.option.ConnectionPassword</name>
  <value>111111</value>
</property>
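Because a single unescaped & makes the XML unparseable, it is worth validating the file after editing; macOS ships xmllint for this:
```
# Prints nothing if hive-site.xml is well-formed, otherwise reports the offending line.
xmllint --noout hive-site.xml
```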
In hive-env.sh, set:
HADOOP_HOME=/usr/hadoop/hadoop2.9
export HIVE_CONF_DIR=/usr/hadoop/hadoop2.9/hive/conf
Copy the MySQL JDBC driver jar into hive/lib, e.g. the one already used by your Java projects:
mysql-connector-java-5.1.46.jar
1) If this is the first time starting Hive, initialize the metastore schema first:
schematool -dbType mysql -initSchema
XLC-2:bin xianglingchuan$ schematool -dbType mysql -initSchema
SLF4J: Class path contains multiple SLF4J bindings.
SLF4J: Found binding in [jar:file:/Users/xianglingchuan/software/hadoop/hadoop2.9/hive3.1/lib/log4j-slf4j-impl-2.10.0.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: Found binding in [jar:file:/Users/xianglingchuan/software/hadoop/hadoop2.9/share/hadoop/common/lib/slf4j-log4j12-1.7.25.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an explanation.
SLF4J: Actual binding is of type [org.apache.logging.slf4j.Log4jLoggerFactory]
Metastore connection URL: jdbc:mysql://master:3306/hive?createDatabaseIfNotExist=true&useUnicode=true&characterEncoding=UTF-8&autoReconnect=true&useSSL=false
Metastore Connection Driver : com.mysql.jdbc.Driver
Metastore connection User: root
org.apache.hadoop.hive.metastore.HiveMetaException: Failed to get schema version.
Underlying cause: com.mysql.jdbc.exceptions.jdbc4.MySQLNonTransientConnectionException : Could not create connection to database server. Attempted reconnect 3 times. Giving up.
SQL Error code: 0
Use --verbose for detailed stacktrace.
*** schemaTool failed ***
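The failure above means schematool could not reach the MySQL server configured in hive-site.xml (host master, port 3306). One way to test the connection outside of Hive, assuming the same host and credentials as above:
```
# Should print the server version; if this fails too, fix the host
# (e.g. localhost instead of master), user, or password in hive-site.xml
# and rerun schematool.
mysql -h master -P 3306 -u root -p -e "SELECT VERSION();"
```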
Start Hive:
XLC-2:bin xianglingchuan$ bin/hive
19/01/26 23:37:57 DEBUG util.VersionInfo: version: 2.9.2
SLF4J: Class path contains multiple SLF4J bindings.
SLF4J: Found binding in [jar:file:/Users/xianglingchuan/software/hadoop/hadoop2.9/hive3.1/lib/log4j-slf4j-impl-2.10.0.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: Found binding in [jar:file:/Users/xianglingchuan/software/hadoop/hadoop2.9/share/hadoop/common/lib/slf4j-log4j12-1.7.25.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an explanation.
SLF4J: Actual binding is of type [org.apache.logging.slf4j.Log4jLoggerFactory]
Hive Session ID = 96860f82-17bb-48ca-9475-700e2ebffc6f
Logging initialized using configuration in file:/Users/xianglingchuan/software/hadoop/hadoop2.9/hive3.1/conf/hive-log4j2.properties Async: true
Hive-on-MR is deprecated in Hive 2 and may not be available in the future versions. Consider using a different execution engine (i.e. spark, tez) or using Hive 1.X releases.
hive>
Problem: starting the services complains about a bad SSH configuration option:
home:sbin root$ ./start-all.sh
WARNING: Attempting to start all Apache Hadoop daemons as xianglingchuan in 10 seconds.
WARNING: This is not a recommended production deployment configuration.
WARNING: Use CTRL-C to abort.
Starting namenodes on [localhost]
localhost: /Users/root/.ssh/config: line 8: Bad configuration option: usekeychain
home:sbin root$ ssh localhost
/Users/root/.ssh/config: line 8: Bad configuration option: usekeychain
/Users/root/.ssh/config: line 21: Bad configuration option: usekeychain
/Users/root/.ssh/config: line 30: Bad configuration option: usekeychain
/Users/root/.ssh/config: terminating, 3 bad configuration options
Solution:
Empty out the ~/.ssh/config file and regenerate the key.
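Alternatively, instead of wiping the file, OpenSSH's IgnoreUnknown option lets you keep UseKeychain (a macOS-specific option) without breaking clients that do not understand it; a sketch of the relevant ~/.ssh/config lines:
```
# IgnoreUnknown makes ssh skip options it does not recognize,
# so UseKeychain only takes effect where it is supported.
Host *
  IgnoreUnknown UseKeychain
  UseKeychain yes
```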
Another problem, hit when starting DFS:
home:sbin root$ sbin/start-dfs.sh
19/01/26 17:36:51 DEBUG util.Shell: setsid is not available on this machine. So not using it.
19/01/26 17:36:51 DEBUG util.Shell: setsid exited with exit code 0
19/01/26 17:36:51 ERROR conf.Configuration: error parsing conf core-site.xml
com.ctc.wstx.exc.WstxIOException: Invalid UTF-8 start byte 0xa0 (at char #766, byte #37)
Solution:
Strip the stray whitespace from the property nodes newly added to core-site.xml and check the file's encoding; the 0xa0 byte is a non-breaking space, typically pasted in from a web page, and the file must be plain UTF-8.
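To locate the offending byte, one option is to grep the file for anything outside printable ASCII (adjust the path to your install):
```
# Prints every line of core-site.xml containing non-ASCII bytes such as 0xa0.
LC_ALL=C grep -n '[^[:print:][:space:]]' etc/hadoop/core-site.xml
```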
# Useful notes
## Turning Hadoop debug output on and off
On: export HADOOP_ROOT_LOGGER=DEBUG,console
Off: export HADOOP_ROOT_LOGGER=INFO,console
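For example, to trace a single HDFS command with debug output and then restore normal logging:
```
export HADOOP_ROOT_LOGGER=DEBUG,console
hdfs dfs -ls /    # runs with DEBUG logging printed to the console
export HADOOP_ROOT_LOGGER=INFO,console
```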
## What each configuration file does
core-site.xml: the service URI, the cluster's temporary directory, and other core settings
hdfs-site.xml: HDFS aliases, communication addresses, ports, and related settings
mapred-site.xml: the compute framework's resource manager name, job history access address, and related settings
yarn-site.xml: resource manager (YARN) settings
fair-scheduler.xml: configuration for the Hadoop FairScheduler scheduling policy