Spark Cluster Port Usage

| Service | Port | Notes |
| --- | --- | --- |
| spark-master | 7077 | |
| spark-slave | | |
| hadoop-master | 9000 | |
| kafka-zookeeper | 2181 | |
| kafka-master | 9092 | |
Note: the ports of the *-master services must be exposed to application code. Part of the communication between the Hadoop master/slaves and the Spark master/slaves goes over SSH, so passwordless SSH login must be enabled between the master and its slaves.
cat ~/.ssh/id_rsa.pub >> ~/.ssh/authorized_keys
# ensure the .ssh directory is mode 700
# authorized_keys mode 744
# id_rsa.pub mode 744
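Passwordless login must also work from the master to each slave, not just locally. A quick sketch for pushing the key out (the slave hostnames and user below are assumptions; substitute your own):

```
# hypothetical slave hosts -- replace with the real ones
ssh-copy-id user@spark-slave1
ssh-copy-id user@spark-slave2
# verify: this should print the remote hostname without asking for a password
ssh user@spark-slave1 hostname
```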
# /etc/hosts entries on every node; replace {host ip} with the real IPs
{host ip} spark-master
{host ip} hdfs-master
{host ip} kafka-master
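A quick check that each alias resolves as expected:

```
# each alias should print the IP configured above
getent hosts spark-master hdfs-master kafka-master
```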
# download
wget http://********/hadoop-2.7.3/hadoop-2.7.3.tar.gz
# extract
tar -zxvf hadoop-2.7.3.tar.gz
cd hadoop-2.7.3
vim ~/.bashrc
export JAVA_HOME=/usr/java/latest
export HADOOP_HOME=/*****/hadoop-2.7.3
export JRE_HOME=$JAVA_HOME/jre
export CLASSPATH=$JAVA_HOME/lib:$JRE_HOME/lib:$CLASSPATH
export HADOOP_HOME_WARN_SUPPRESS=1
export PATH=$JAVA_HOME/bin:$JRE_HOME/bin:$HADOOP_HOME/bin:$PATH
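After editing ~/.bashrc, reload it and sanity-check the installation:

```
source ~/.bashrc
hadoop version   # should report Hadoop 2.7.3
```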
vim conf/core-site.xml

<configuration>
  <property>
    <name>fs.defaultFS</name>
    <value>hdfs://hdfs-master:9000</value>
  </property>
  <property>
    <name>hadoop.tmp.dir</name>
    <value>********/hadoop/tmp</value>
  </property>
  <property>
    <name>fs.trash.interval</name>
    <value>1440</value>
  </property>
</configuration>
mkdir -p ******/hadoop/tmp
vim conf/hdfs-site.xml

<configuration>
  <property>
    <name>dfs.replication</name>
    <value>1</value>
  </property>
  <property>
    <name>dfs.permissions</name>
    <value>false</value>
  </property>
  <property>
    <name>dfs.support.append</name>
    <value>true</value>
  </property>
  <property>
    <name>dfs.client.block.write.replace-datanode-on-failure.enable</name>
    <value>true</value>
  </property>
  <property>
    <name>dfs.client.block.write.replace-datanode-on-failure.policy</name>
    <value>never</value>
  </property>
</configuration>
vim conf/log4j.properties

# silence the NativeCodeLoader warning
log4j.logger.org.apache.hadoop.util.NativeCodeLoader=ERROR
cp conf/mapred-site.xml.template conf/mapred-site.xml
vim conf/mapred-site.xml

<configuration>
  <property>
    <name>mapreduce.framework.name</name>
    <value>yarn</value>
  </property>
</configuration>
vim conf/yarn-site.xml
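The yarn-site.xml contents are not listed here; a minimal configuration that fits this single-master setup might look like the sketch below (the ResourceManager hostname is an assumption):

```
<configuration>
  <property>
    <name>yarn.nodemanager.aux-services</name>
    <value>mapreduce_shuffle</value>
  </property>
  <!-- assumption: run the ResourceManager on the HDFS master host -->
  <property>
    <name>yarn.resourcemanager.hostname</name>
    <value>hdfs-master</value>
  </property>
</configuration>
```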
hadoop namenode -format
If the last few lines of the output contain

xx/xx/xx xx:xx:xx INFO common.Storage: Storage directory /*****/hadoop/tmp/dfs/name has been successfully formatted.

the NameNode was formatted successfully.
$HADOOP_HOME/sbin/start-dfs.sh
hdfs dfs -put README.txt /README.txt
hdfs dfs -cat /README.txt
# if the README contents are printed, HDFS is working
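Optionally, confirm the DataNode has registered with the NameNode:

```
# prints cluster capacity and the list of live datanodes
hdfs dfsadmin -report
```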
Hadoop (HDFS) uses port hdfs-master:9000.
wget http://*****/spark-2.1.0-bin-hadoop2.7.tgz
tar -zxvf spark-2.1.0-bin-hadoop2.7.tgz
cd spark-2.1.0-bin-hadoop2.7
vim ~/.bashrc
export SPARK_HOME=/home/*****/spark-2.1.0-bin-hadoop2.7
cd conf
cp spark-defaults.conf.template spark-defaults.conf
vim spark-defaults.conf
spark.master spark://spark-master:7077
spark.eventLog.enabled true
spark.serializer org.apache.spark.serializer.KryoSerializer
spark.driver.memory 1g
spark.executor.extraJavaOptions -XX:+PrintGCDetails -Dkey=value -Dnumbers="one two three"
spark.ui.enabled false
spark.executor.memory 1g
cp log4j.properties.template log4j.properties
# change the root logger so it also writes to the file appender
log4j.rootCategory=INFO, console, file
# add a daily rolling file appender
log4j.appender.file=org.apache.log4j.DailyRollingFileAppender
log4j.appender.file.File=/home/******/spark/log
log4j.appender.file.DatePattern='.'yyyy-MM-dd
log4j.appender.file.layout=org.apache.log4j.PatternLayout
log4j.appender.file.layout.ConversionPattern=%d{yyyy-MM-dd HH:mm:ss} %-5p %c{1} - %m%n
mkdir -p /home/******/spark/
$SPARK_HOME/sbin/start-all.sh
Run `jps` to check the running processes:

```
$ jps
14340 SecondaryNameNode
14132 DataNode
13960 NameNode
14760 Master
14953 Jps
14892 Worker
```

If Master and Worker appear, Spark has started.
The Spark master uses port spark-master:7077.
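To exercise the cluster end to end, the bundled SparkPi example can be submitted (the jar path below is the stock location inside this Spark 2.1.0 distribution):

```
$SPARK_HOME/bin/spark-submit \
  --class org.apache.spark.examples.SparkPi \
  --master spark://spark-master:7077 \
  $SPARK_HOME/examples/jars/spark-examples_2.11-2.1.0.jar 100
# the driver output should contain a line like "Pi is roughly 3.14..."
```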
wget http://apache.mirror.iweb.ca/kafka/0.10.2.0/kafka_2.11-0.10.2.0.tgz
tar -zxvf kafka_2.11-0.10.2.0.tgz
cd kafka_2.11-0.10.2.0
vim config/server.properties
############################# Socket Server Settings #############################
# The address the socket server listens on. It will get the value returned from
# java.net.InetAddress.getCanonicalHostName() if not configured.
# FORMAT:
# listeners = listener_name://host_name:port
# EXAMPLE:
# listeners = PLAINTEXT://your.host.name:9092
#listeners=PLAINTEXT://:9092
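If remote clients cannot reach the broker, a common fix is to uncomment the listener and bind it explicitly to the hostname mapped earlier, e.g.:

```
listeners=PLAINTEXT://kafka-master:9092
```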
nohup bin/zookeeper-server-start.sh config/zookeeper.properties >/dev/null 2>&1 &
nohup bin/kafka-server-start.sh config/server.properties >/dev/null 2>&1 &
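Check that both services came up:

```
jps                                  # should list QuorumPeerMain (ZooKeeper) and Kafka
netstat -ltn | grep -E '2181|9092'   # both ports should be in LISTEN state
```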
# create the log topic
bin/kafka-topics.sh --create --zookeeper kafka-master:2181 --replication-factor 1 --partitions 1 --topic log
# create a test topic
bin/kafka-topics.sh --create --zookeeper kafka-master:2181 --replication-factor 1 --partitions 1 --topic test
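List the topics to confirm both were created:

```
bin/kafka-topics.sh --list --zookeeper kafka-master:2181
# expected output: log and test, one per line
```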
# send a message
bin/kafka-console-producer.sh --broker-list kafka-master:9092 --topic test
> it's a test message!
bin/kafka-console-consumer.sh --bootstrap-server kafka-master:9092 --topic test --from-beginning
# if "it's a test message!" is received, Kafka is working