2023-09-12 14:00 to 2023-09-13 20:06
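Contents
00. Software versions
01. Deploying Hadoop on Alibaba Cloud servers
1.1. Modify the four configuration files
1.1.1. core-site.xml
1.1.2. hdfs-site.xml
1.1.3. mapred-site.xml
1.1.4. yarn-site.xml
1.2. Modify /etc/hosts and the system environment variables
1.2.1. Modify the hostname resolution file /etc/hosts
1.2.2. Modify the system environment variables in /etc/profile.d/my_env.sh
02. Deploying Elasticsearch on Alibaba Cloud servers
2.1. Steps common to all three nodes
2.2. Modify the Elasticsearch elasticsearch.yml file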
Environment and software versions:
- CentOS 7
- jdk-1.8
- hadoop-3.3.4
- elasticsearch-7.17.6
Install hadoop-3.3.4 following the 尚硅谷 (Atguigu) tutorial, 尚硅谷大数据技术之Hadoop.docx. The four configuration files below are in:
/opt/module/hadoop/hadoop-3.3.4/etc/hadoop
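<!-- core-site.xml -->
<configuration>
  <property><name>fs.defaultFS</name><value>hdfs://bd1:8020</value></property>
  <property><name>hadoop.tmp.dir</name><value>/opt/module/hadoop/hadoop-3.3.4/data</value></property>
  <property><name>hadoop.http.staticuser.user</name><value>xxh</value></property>
  <property><name>hadoop.proxyuser.xxh.hosts</name><value>*</value></property>
  <property><name>hadoop.proxyuser.xxh.groups</name><value>*</value></property>
  <property><name>hadoop.proxyuser.xxh.users</name><value>*</value></property>
</configuration>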
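<!-- hdfs-site.xml -->
<configuration>
  <property><name>dfs.namenode.http-address</name><value>bd1:9870</value></property>
  <property><name>dfs.namenode.secondary.http-address</name><value>bd3:9868</value></property>
  <property><name>dfs.replication</name><value>3</value></property>
  <property><name>dfs.permissions</name><value>false</value></property>
</configuration>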
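<!-- mapred-site.xml -->
<configuration>
  <property><name>mapreduce.framework.name</name><value>yarn</value></property>
  <property><name>mapreduce.jobhistory.address</name><value>bd1:10020</value></property>
  <property><name>mapreduce.jobhistory.webapp.address</name><value>bd1:19888</value></property>
</configuration>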
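<!-- yarn-site.xml -->
<configuration>
  <property><name>yarn.nodemanager.aux-services</name><value>mapreduce_shuffle</value></property>
  <property><name>yarn.resourcemanager.hostname</name><value>bd2</value></property>
  <property><name>yarn.nodemanager.env-whitelist</name><value>JAVA_HOME,HADOOP_COMMON_HOME,HADOOP_HDFS_HOME,HADOOP_CONF_DIR,CLASSPATH_PREPEND_DISTCACHE,HADOOP_YARN_HOME,HADOOP_MAPRED_HOME</value></property>
  <property><name>yarn.scheduler.minimum-allocation-mb</name><value>512</value></property>
  <property><name>yarn.scheduler.maximum-allocation-mb</name><value>4096</value></property>
  <property><name>yarn.nodemanager.resource.memory-mb</name><value>4096</value></property>
  <property><name>yarn.nodemanager.pmem-check-enabled</name><value>false</value></property>
  <property><name>yarn.nodemanager.vmem-check-enabled</name><value>false</value></property>
  <property><name>yarn.log-aggregation-enable</name><value>true</value></property>
  <property><name>yarn.log.server.url</name><value>http://bd1:19888/jobhistory/logs</value></property>
  <property><name>yarn.log-aggregation.retain-seconds</name><value>604800</value></property>
</configuration>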
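After editing the four files, it can help to read a value back and confirm it matches what was intended. The following is a minimal sketch, not part of the original notes: `get_hadoop_prop` is a hypothetical helper that does plain text matching, and it assumes the `<name>` and `<value>` tags contain no extra whitespace.

```shell
# Print the <value> of a named property from a Hadoop *-site.xml file.
# Hypothetical helper: plain text matching, not a real XML parser.
get_hadoop_prop() {
  # $1 = path to a *-site.xml file, $2 = property name
  tr '\n' ' ' < "$1" | awk -v name="$2" '{
    # Split the flattened file into one chunk per <property> element.
    n = split($0, props, "</property>")
    for (i = 1; i <= n; i++) {
      p = props[i]
      if (index(p, "<name>" name "</name>") == 0) continue
      s = index(p, "<value>")
      e = index(p, "</value>")
      # "<value>" is 7 characters long, so the value starts at s + 7.
      if (s > 0 && e > s) print substr(p, s + 7, e - s - 7)
    }
  }'
}
```

For example, `get_hadoop_prop /opt/module/hadoop/hadoop-3.3.4/etc/hadoop/core-site.xml fs.defaultFS` should print hdfs://bd1:8020 if the file matches the settings above. On a node where Hadoop is on the PATH, `hdfs getconf -confKey fs.defaultFS` reports the value Hadoop itself resolves.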
[root@bd1 ~]# vim /etc/hosts
# Public IP addresses
x.x.x.x bd1
x.x.x.x bd2
x.x.x.x bd3
# Private IP addresses (check with the ifconfig command)
x.x.x.x bd1
x.x.x.x bd2
x.x.x.x bd3
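The listing above shows two blocks for the same three hostnames. If both were left active in one file, each name would be defined twice, and the resolver typically uses the first matching line. As a rough sketch (the function name `dup_hostnames` is hypothetical, not from the original notes), the following lists hostnames that appear on more than one active line:

```shell
# List hostnames that appear on more than one non-comment line of a hosts file.
dup_hostnames() {
  # $1 = hosts file to inspect (defaults to /etc/hosts)
  awk '
    { sub(/#.*/, "") }                      # strip comments; fields are re-split
    NF >= 2 {
      for (i = 2; i <= NF; i++) count[$i]++ # field 1 is the IP, the rest are names
    }
    END {
      for (h in count) if (count[h] > 1) print h
    }
  ' "${1:-/etc/hosts}" | sort
}
```

Running `dup_hostnames` with no argument inspects /etc/hosts. On a stock CentOS 7 file, localhost is usually reported too, because it is mapped for both IPv4 and IPv6.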
[root@bd1 ~]# vim /etc/profile.d/my_env.sh
# HADOOP_HOME
export HADOOP_HOME=/opt/module/hadoop/hadoop-3.3.4
export PATH=$PATH:$HADOOP_HOME/bin
export PATH=$PATH:$HADOOP_HOME/sbin
# Hadoop-related settings [critical: these let the root user run Hadoop directly]
export HDFS_NAMENODE_USER=root
export HDFS_DATANODE_USER=root
export HDFS_SECONDARYNAMENODE_USER=root
export YARN_RESOURCEMANAGER_USER=root
export YARN_NODEMANAGER_USER=root
# JAVA_HOME
export JAVA_HOME=/opt/module/jdk1.8.0_212
export PATH=$PATH:$JAVA_HOME/bin
# HADOOP_HOME
export HADOOP_HOME=/opt/module/hadoop/hadoop-3.3.4
export PATH=$PATH:$HADOOP_HOME/bin
export PATH=$PATH:$HADOOP_HOME/sbin
# zookeeper
export ZK_HOME=/opt/module/zookeeper
export PATH=$ZK_HOME/bin:$PATH
# kafka
#KAFKA_HOME
export KAFKA_HOME=/opt/module/kafka
export PATH=$PATH:$KAFKA_HOME/bin
export PATH=$PATH:/opt/software/tool
# Hadoop-related settings
export HDFS_NAMENODE_USER=root
export HDFS_DATANODE_USER=root
export HDFS_SECONDARYNAMENODE_USER=root
export YARN_RESOURCEMANAGER_USER=root
export YARN_NODEMANAGER_USER=root
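Once the file is saved, the variables only take effect in a new login shell or after running `source /etc/profile`. The sketch below is an illustration rather than part of the original notes; `missing_env_vars` is a hypothetical helper that reports which of the Java and Hadoop variables from my_env.sh are still unset or empty in the current shell.

```shell
# Print the names of the Java/Hadoop variables from my_env.sh that are unset or empty.
missing_env_vars() {
  missing=""
  for name in JAVA_HOME HADOOP_HOME \
              HDFS_NAMENODE_USER HDFS_DATANODE_USER HDFS_SECONDARYNAMENODE_USER \
              YARN_RESOURCEMANAGER_USER YARN_NODEMANAGER_USER; do
    # Look up the variable whose name is stored in $name.
    eval "value=\${$name:-}"
    [ -n "$value" ] || missing="$missing $name"
  done
  # Strip the leading space; prints an empty line when nothing is missing.
  echo "${missing# }"
}
```

An empty result from `missing_env_vars` suggests the file was picked up; `hadoop version` is a further check that the PATH entries work.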
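Elasticsearch installation tutorials
- "Linux搭建es集群详细教程(最终版)" (Detailed tutorial on building an ES cluster on Linux, final version), Nick丶Xin's blog, CSDN
- "Linux安装elk" (Installing ELK on Linux), upward337's blog, CSDN
- "[2020-04-06T12:57:13,793][WARN ][o.e.b.ElasticsearchUncaughtExceptionHandler] [node-1] uncaught exce", Lan_Se_Tian_Ma's blog, CSDN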
Each server in the three-node cluster needs the following:
- Create the es user: useradd es, then passwd es
- Install Elasticsearch: tar -zxvf elasticsearch-7.17.6-linux-x86_64.tar.gz -C /opt/module/es/
- Change ownership of the Elasticsearch directory: chown -R es:es /opt/module/es/
- Edit several configuration files under /etc/...: vi /etc/security/limits.conf, vi /etc/security/limits.d/20-nproc.conf, vi /etc/sysctl.conf
- Edit /opt/module/es/elasticsearch-7.17.6/config/jvm.options.
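The notes do not list the values written to these files. As a point of reference that is not from the original notes, Elasticsearch's bootstrap checks expect at least 65535 open file descriptors, at least 4096 threads for the user running it, and vm.max_map_count of at least 262144. A minimal sketch (both function names are hypothetical) that compares the current limits with those minimums:

```shell
# Succeed when the current value ($2) meets the required minimum ($3); print a line either way.
check_min() {
  if [ "$2" = "unlimited" ] || [ "$2" -ge "$3" ]; then
    echo "OK   $1 = $2 (minimum $3)"
  else
    echo "LOW  $1 = $2 (minimum $3)"
    return 1
  fi
}

# Intended to be run as the es user on each node (bash on CentOS 7 assumed).
check_es_limits() {
  rc=0
  check_min "open files (ulimit -n)"  "$(ulimit -n)" 65535 || rc=1
  check_min "max threads (ulimit -u)" "$(ulimit -u)" 4096  || rc=1
  check_min "vm.max_map_count" "$(cat /proc/sys/vm/max_map_count)" 262144 || rc=1
  return $rc
}
```

Calling `check_es_limits` after logging in again as es shows whether the edits to limits.conf, 20-nproc.conf, and sysctl.conf took effect; changes to sysctl.conf need `sysctl -p` (as root) or a reboot to apply.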
Elasticsearch must be started as the es user. Switch to es, then start it in the background on each node:
- [es@bd1 root]$ nohup /opt/module/es/elasticsearch-7.17.6/bin/elasticsearch & # run elasticsearch in the background
- [es@bd2 root]$ nohup /opt/module/es/elasticsearch-7.17.6/bin/elasticsearch & # run elasticsearch in the background
- [es@bd3 root]$ nohup /opt/module/es/elasticsearch-7.17.6/bin/elasticsearch & # run elasticsearch in the background
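Edit elasticsearch.yml on every server (/opt/module/es/elasticsearch-7.17.6/config/elasticsearch.yml). The following two settings differ on each server:
- node.name: node-1 # node name; must be unique per node
- network.host: <private IP address> # private IP address; must be unique per node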
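To make the per-node difference concrete, here is a small sketch that prints the two lines for a given node. It is only an illustration: the notes show node-1 alone, so the names node-2 and node-3 for bd2 and bd3 are an assumption, and `es_node_settings` is a hypothetical helper.

```shell
# Print the two per-node settings for one node.
# Assumption: nodes are named node-1, node-2, node-3 for bd1, bd2, bd3.
es_node_settings() {
  # $1 = node index (1, 2 or 3), $2 = that node's private IP address
  printf 'node.name: node-%s\n' "$1"
  printf 'network.host: %s\n' "$2"
}
```

For instance, `es_node_settings 2 <private IP of bd2>` prints the pair to place in elasticsearch.yml on bd2; every other setting stays identical across the three files.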
# /opt/module/es/elasticsearch-7.17.6/config/elasticsearch.yml
# Add the following settings for es
# Cluster name
cluster.name: cluster-es-7.17.6
# Node name; must be unique per node
node.name: node-1
# Private IP address; must be unique per node
network.host: <private IP address>
# Whether this node is master-eligible (node.master) and holds data (node.data)
node.master: true
node.data: true
# HTTP port
http.port: 9200
# Port for node-to-node transport
transport.port: 9300
# Storage paths for data files and logs
path.data: /opt/module/es/elasticsearch-7.17.6/data
path.logs: /opt/module/es/elasticsearch-7.17.6/logs
# The head plugin needs these two settings enabled
http.cors.allow-origin: "*"
http.cors.enabled: true
http.max_content_length: 200mb
# New in es7.x: required to elect a master when bootstrapping a brand-new cluster
cluster.initial_master_nodes: ["node-1"]
# New in es7.x: node discovery
discovery.seed_hosts: ["bd1:9300","bd2:9300","bd3:9300"]
gateway.recover_after_nodes: 2
network.tcp.keep_alive: true
network.tcp.no_delay: true
transport.tcp.compress: true
# Number of shard rebalancing operations allowed concurrently across the cluster; default 2
cluster.routing.allocation.cluster_concurrent_rebalance: 16
# Number of concurrent shard recoveries allowed per node when nodes are added or removed or shards are rebalanced; default 2
cluster.routing.allocation.node_concurrent_recoveries: 16
# Number of concurrent primary shard recoveries per node during initial recovery; default 4
cluster.routing.allocation.node_initial_primaries_recoveries: 16
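With all three nodes started on this configuration, the `_cluster/health` endpoint is a quick way to see whether they formed one cluster. The sketch below is not from the original notes; the two function names are hypothetical, and they pull single fields out of the JSON with sed rather than a real JSON parser.

```shell
# Extract the "status" field (green, yellow or red) from _cluster/health JSON on stdin.
cluster_status() {
  sed -n 's/.*"status"[[:space:]]*:[[:space:]]*"\([a-z]*\)".*/\1/p'
}

# Extract the "number_of_nodes" field from the same JSON.
cluster_node_count() {
  sed -n 's/.*"number_of_nodes"[[:space:]]*:[[:space:]]*\([0-9][0-9]*\).*/\1/p'
}
```

Assuming bd1 resolves to the address Elasticsearch is bound to, `curl -s http://bd1:9200/_cluster/health | cluster_node_count` should report 3 once every node has joined, and `cluster_status` on the same output shows the cluster state.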
Keep going~