[Big Data] Setting Up a Big Data Cluster from Scratch

Three virtual machines

Three nodes

Edit /etc/hosts

<IP address> hadoop1
<IP address> hadoop2
<IP address> hadoop3
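A minimal sketch of appending these mappings on every node (the IP addresses are placeholders and must be replaced with the real ones):

cat >> /etc/hosts << EOF
<ip-of-hadoop1> hadoop1
<ip-of-hadoop2> hadoop2
<ip-of-hadoop3> hadoop3
EOF
# verify name resolution
ping -c 1 hadoop2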

Passwordless SSH login

Generate a key pair

ssh-keygen -t rsa

Copy the public key

ssh-copy-id root@hadoop1   (run on each of the three machines, three times: once for each host)
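For example, assuming the root account and the hostnames above, each node runs the copy three times and then verifies that a remote command works without a password prompt:

ssh-copy-id root@hadoop1
ssh-copy-id root@hadoop2
ssh-copy-id root@hadoop3
# should print the remote hostname without asking for a password
ssh root@hadoop2 hostname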

Permanently disable the firewall

systemctl disable firewalld
chkconfig iptables off
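To also stop the firewall for the current session and confirm it is off (a CentOS 7 / systemd setup is assumed here):

systemctl stop firewalld
systemctl status firewalld    # should report inactive (dead)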

Extract the archives

tar -zxvf xxx.tar.gz -C /xxx/
tar -xvf xxx.tar -C /xxx/
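For example, assuming the tarballs were uploaded to /opt and /opt/bigdata is the install root (the file names here are illustrative):

tar -zxvf /opt/zookeeper-3.4.12.tar.gz -C /opt/bigdata/
tar -zxvf /opt/hadoop-2.8.5.tar.gz -C /opt/bigdata/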

ZooKeeper installation

Create the data and logs folders under /opt/bigdata/zookeeper-3.4.12/

cd /opt/bigdata/zookeeper-3.4.12/conf/

cp zoo_sample.cfg zoo.cfg

dataDir=/opt/bigdata/zookeeper-3.4.12/data
dataLogDir=/opt/bigdata/zookeeper-3.4.12/logs
server.1=hadoop1:2888:3888
server.2=hadoop2:2888:3888
server.3=hadoop3:2888:3888
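Each server.N entry must match a myid file in the data directory, and ZooKeeper is then started on every node; a sketch, assuming the dataDir above:

echo 1 > /opt/bigdata/zookeeper-3.4.12/data/myid   # on hadoop1
echo 2 > /opt/bigdata/zookeeper-3.4.12/data/myid   # on hadoop2
echo 3 > /opt/bigdata/zookeeper-3.4.12/data/myid   # on hadoop3

# start and check on every node; one node should report leader, the others follower
/opt/bigdata/zookeeper-3.4.12/bin/zkServer.sh start
/opt/bigdata/zookeeper-3.4.12/bin/zkServer.sh status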

Hadoop fully distributed mode, non-HA

core-site.xml


<property>
    <name>fs.defaultFS</name>
    <value>hdfs://hadoop3:9000</value>
</property>
<property>
    <name>hadoop.tmp.dir</name>
    <value>/opt/bigdata/hadoop-2.8.5/data/tmp</value>
</property>


hdfs-site.xml


<property>
    <name>dfs.replication</name>
    <value>3</value>
</property>
<property>
    <name>dfs.permissions</name>
    <value>false</value>
</property>
<property>
    <name>dfs.namenode.secondary.http-address</name>
    <value>hadoop2:50090</value>
</property>

The remaining configuration files are the same as in the HA setup below.
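For the non-HA cluster, a typical first start looks like the following (a sketch assuming the Hadoop bin and sbin directories are on PATH; the NameNode runs on hadoop3 per fs.defaultFS, the ResourceManager on hadoop2 per yarn-site.xml below):

hdfs namenode -format          # only once, on hadoop3, before the first start
start-dfs.sh                   # NameNode, SecondaryNameNode, DataNodes
start-yarn.sh                  # on hadoop2: ResourceManager, NodeManagers
# web UIs: NameNode at hadoop3:50070, ResourceManager at hadoop2:8088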

Hadoop HA configuration (hit an 8485 port timeout failure; solved by editing the security group to open TCP ports 1-65535)

yarn-site.xml


<property>
    <name>yarn.nodemanager.aux-services</name>
    <value>mapreduce_shuffle</value>
</property>
<property>
    <name>yarn.resourcemanager.hostname</name>
    <value>hadoop2</value>
</property>
<property>
    <name>yarn.log-aggregation-enable</name>
    <value>true</value>
</property>
<property>
    <name>yarn.log-aggregation.retain-seconds</name>
    <value>640800</value>
</property>

slaves

hadoop1
hadoop2
hadoop3

mapred-site.xml


<property>
    <name>mapreduce.framework.name</name>
    <value>yarn</value>
</property>
<property>
    <name>mapreduce.jobhistory.address</name>
    <value>hadoop3:10020</value>
</property>
<property>
    <name>mapreduce.jobhistory.webapp.address</name>
    <value>hadoop3:19888</value>
</property>

hdfs-site.xml


<property>
    <name>dfs.nameservices</name>
    <value>ns1</value>
</property>
<property>
    <name>dfs.ha.namenodes.ns1</name>
    <value>nn1,nn2</value>
</property>
<property>
    <name>dfs.namenode.rpc-address.ns1.nn1</name>
    <value>hadoop1:9000</value>
</property>
<property>
    <name>dfs.namenode.rpc-address.ns1.nn2</name>
    <value>hadoop3:9000</value>
</property>
<property>
    <name>dfs.namenode.http-address.ns1.nn1</name>
    <value>hadoop1:50070</value>
</property>
<property>
    <name>dfs.namenode.http-address.ns1.nn2</name>
    <value>hadoop3:50070</value>
</property>
<property>
    <name>dfs.namenode.shared.edits.dir</name>
    <value>qjournal://hadoop1:8485;hadoop2:8485;hadoop3:8485/ns1</value>
</property>
<property>
    <name>dfs.journalnode.edits.dir</name>
    <value>/opt/bigdata/hadoop-2.8.5/data/tmp/dfs/jn</value>
</property>
<property>
    <name>dfs.client.failover.proxy.provider.ns1</name>
    <value>org.apache.hadoop.hdfs.server.namenode.ha.ConfiguredFailoverProxyProvider</value>
</property>
<property>
    <name>dfs.ha.automatic-failover.enabled</name>
    <value>true</value>
</property>
<property>
    <name>dfs.replication</name>
    <value>3</value>
</property>
<property>
    <name>dfs.permissions</name>
    <value>false</value>
</property>

hadoop-env.sh

export HADOOP_IDENT_STRING=$USER
export JAVA_HOME=/usr/local/java/jdk1.8.0_25
export HADOOP_PREFIX=/opt/bigdata/hadoop-2.8.5

core-site.xml


<property>
    <name>dfs.ha.fencing.methods</name>
    <value>sshfence</value>
</property>
<property>
    <name>dfs.ha.fencing.ssh.private-key-files</name>
    <value>/root/.ssh/id_rsa</value>
</property>
<property>
    <name>fs.defaultFS</name>
    <value>hdfs://ns1</value>
</property>
<property>
    <name>ha.zookeeper.quorum</name>
    <value>hadoop1:2181,hadoop2:2181,hadoop3:2181</value>
</property>
<property>
    <name>hadoop.tmp.dir</name>
    <value>/opt/bigdata/hadoop-2.8.5/data/tmp</value>
</property>
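With all of the HA files in place and copied to every node, the usual first-start order is roughly the following (a sketch using the standard Hadoop 2.x scripts, not taken verbatim from these notes):

# 0. sync the configured Hadoop directory to the other nodes
scp -r /opt/bigdata/hadoop-2.8.5 root@hadoop2:/opt/bigdata/
scp -r /opt/bigdata/hadoop-2.8.5 root@hadoop3:/opt/bigdata/

# 1. start ZooKeeper on all three nodes, then the JournalNodes
hadoop-daemon.sh start journalnode            # on hadoop1, hadoop2, hadoop3

# 2. format and start the first NameNode (hadoop1)
hdfs namenode -format
hadoop-daemon.sh start namenode

# 3. sync the standby NameNode and initialize the failover state in ZooKeeper
hdfs namenode -bootstrapStandby               # on hadoop3
hdfs zkfc -formatZK                           # on hadoop1

# 4. start everything
start-dfs.sh                                  # NameNodes, DataNodes, JournalNodes, ZKFCs
start-yarn.sh                                 # on hadoop2 (ResourceManager)
mr-jobhistory-daemon.sh start historyserver   # on hadoop3, web UI at hadoop3:19888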


Spark

spark-env.sh

export JAVA_HOME=/usr/local/java/jdk1.8.0_25
export SCALA_HOME=/opt/bigdata/scala-2.10.7
export HADOOP_HOME=/opt/bigdata/hadoop-2.8.5
export HADOOP_CONF_DIR=/opt/bigdata/hadoop-2.8.5/etc/hadoop
#export SPARK_MASTER_HOST=hadoop1
export SPARK_WORKER_MEMORY=1g
export SPARK_WORKER_CORES=2
export SPARK_HOME=/opt/bigdata/spark-1.6.3/
export SPARK_DIST_CLASSPATH=$(/opt/bigdata/hadoop-2.8.5/bin/hadoop classpath)

export SPARK_DAEMON_JAVA_OPTS="
-Dspark.deploy.recoveryMode=ZOOKEEPER 
-Dspark.deploy.zookeeper.url=hadoop1,hadoop2,hadoop3
-Dspark.deploy.zookeeper.dir=/spark"
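With recovery delegated to ZooKeeper and SPARK_MASTER_HOST commented out, a standby master can run on a second node; a sketch assuming the Spark sbin directory:

/opt/bigdata/spark-1.6.3/sbin/start-all.sh       # on hadoop1: master plus workers (per slaves file)
/opt/bigdata/spark-1.6.3/sbin/start-master.sh    # on hadoop2: standby master
# submit jobs against both masters, e.g. --master spark://hadoop1:7077,hadoop2:7077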

Hive

Hive installation guide

/。。。。。。
