The big-data ecosystem puts version requirements on each of its components, and mismatched versions are a frequent source of trouble. Hadoop 1.x, 2.x, and 3.x differ substantially from one another, so when building something like a Hive data warehouse or an HBase database on top of HDFS, settling on compatible versions comes first. Compatibility notes for the common versions are easy to find online, but material based on Hadoop 2.10.0 is relatively scarce, so this post pulls a working combination together.

Resources (Hive, Hadoop, ZooKeeper, HBase, the MySQL JDBC driver, and so on): https://pan.baidu.com/s/1n4wRfi9G5Ff9yfcKlMdVLg (extraction code: s8yx)

Part 1: Installing Hadoop 2.10.0

Reference environment:
16 GB MacBook Pro running Parallels Desktop, with CentOS 7 installed in the VMs
JDK: Sun JDK 1.8
Hadoop 2.10.0 in high-availability (HA) mode; Hive 2.3.7 single node; HBase 2.2.4 cluster (no backup master configured); ZooKeeper 3.4.14 three-node ensemble; Hive metadata stored in MySQL
The Hadoop cluster runs both HDFS and YARN across four CentOS VMs

Preparation: install JDK 1.8 on each of the four VMs and set the JAVA_HOME and PATH variables in /etc/profile, for example:

export JAVA_HOME=/usr/local/jdk1.8.0_65
export HADOOP_HOME=/home/hadoop/hadoop-2.10.0
export HIVE_HOME=/home/hadoop/apache-hive-2.3.7-bin
export HBASE_HOME=/home/hadoop/hbase-2.2.4
export PATH=$JAVA_HOME/bin:$PATH:$HADOOP_HOME/bin:$HADOOP_HOME/sbin:$HIVE_HOME/bin:$HBASE_HOME/bin
export CLASSPATH=.:$JAVA_HOME/lib/dt.jar:$JAVA_HOME/lib/tools.jar
export HADOOP_CLASSPATH=${JAVA_HOME}/lib/tools.jar
export HADOOP_CONF_DIR=$HADOOP_HOME/etc/hadoop
export HADOOP_COMMON_HOME=$HADOOP_HOME
export HADOOP_HDFS_HOME=$HADOOP_HOME
export HADOOP_MAPRED_HOME=$HADOOP_HOME
export HADOOP_YARN_HOME=$HADOOP_HOME
export HADOOP_OPTS="-Djava.library.path=$HADOOP_HOME/lib/native"
export HADOOP_COMMON_LIB_NATIVE_DIR=$HADOOP_HOME/lib/native
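After sourcing the profile, a quick check confirms the variables took effect (standard commands, nothing specific to this setup):

source /etc/profile
java -version            # should report 1.8.0_65
hadoop version           # should report Hadoop 2.10.0
echo $HADOOP_CONF_DIR    # should point at $HADOOP_HOME/etc/hadoop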
Set the node names: in /etc/hosts, map node01 through node04 to their IP addresses. All four machines get the same file; distribute it with scp. Reference /etc/hosts:

19.211.55.3 node01
19.211.55.4 node02
19.211.55.5 node03
19.211.55.6 node04

Configure passwordless SSH login on all four VMs (set it up on every machine, as root; the keys end up under /root/.ssh/):

ssh-keygen
ssh-copy-id -i /root/.ssh/id_rsa.pub node01   (repeat for each node name)
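Rather than running ssh-copy-id once per target by hand, a short loop on each machine covers every node, and the same pattern distributes /etc/hosts (a sketch, using the host names configured above):

# run on each machine after ssh-keygen
for h in node01 node02 node03 node04; do
  ssh-copy-id -i /root/.ssh/id_rsa.pub "$h"
done
# from the machine holding the finished /etc/hosts
for h in node02 node03 node04; do
  scp /etc/hosts "$h":/etc/hosts
done

Verify with something like ssh node02 hostname; it should log in without prompting for a password.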
Synchronize the clocks of the four VMs against Aliyun's NTP service:

yum install ntpdate
ntpdate ntp1.aliyun.com

Upload the Hadoop 2.10.0 tar.gz to /home/hadoop on the VM and unpack it there (for convenience no dedicated user is created; everything, including starting services, is done as root). Set the /etc/profile environment variables as shown earlier, run source /etc/profile when done, and distribute the file to the other nodes for the same treatment.

Prepare the ZooKeeper ensemble on node02, node03, and node04. Reference /home/hadoop/zookeeper-3.4.14/conf/zoo.cfg (also create a myid file under /var/zfg/zookeeper on each server, containing 1, 2, or 3 respectively, so ZooKeeper can tell the nodes apart):

# The number of milliseconds of each tick
tickTime=2000
initLimit=10
syncLimit=5
dataDir=/var/zfg/zookeeper
server.1=node02:2888:3888
server.2=node03:2888:3888
server.3=node04:2888:3888
clientPort=2181
#maxClientCnxns=60
#autopurge.snapRetainCount=3
#autopurge.purgeInterval=1
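With zoo.cfg in place, each of the three nodes writes its myid and starts its own instance; a quick sketch using the standard zkServer.sh commands (the echoed number must match that host's server.N entry in zoo.cfg):

# on node02; use 2 on node03 and 3 on node04
mkdir -p /var/zfg/zookeeper
echo 1 > /var/zfg/zookeeper/myid
/home/hadoop/zookeeper-3.4.14/bin/zkServer.sh start
/home/hadoop/zookeeper-3.4.14/bin/zkServer.sh status   # expect one leader, two followers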
Hadoop configuration (write the HDFS and YARN configuration files at the same time, so one distribution pass covers both): add and adjust settings in hdfs-site.xml, mapred-site.xml, core-site.xml, yarn-site.xml, and slaves. Also set the hadoop-env.sh parameters, mainly the JDK directory. Reference hdfs-site.xml:

<configuration>
  <property>
    <name>dfs.nameservices</name>
    <value>mycluster</value>
  </property>
  <property>
    <name>dfs.ha.namenodes.mycluster</name>
    <value>nn1,nn2</value>
  </property>
  <property>
    <name>dfs.namenode.rpc-address.mycluster.nn1</name>
    <value>node01:8020</value>
  </property>
  <property>
    <name>dfs.namenode.rpc-address.mycluster.nn2</name>
    <value>node02:8020</value>
  </property>
  <property>
    <name>dfs.namenode.http-address.mycluster.nn1</name>
    <value>node01:50070</value>
  </property>
  <property>
    <name>dfs.namenode.http-address.mycluster.nn2</name>
    <value>node02:50070</value>
  </property>
  <property>
    <name>dfs.namenode.shared.edits.dir</name>
    <value>qjournal://node01:8485;node02:8485;node03:8485/mycluster</value>
  </property>
  <property>
    <name>dfs.journalnode.edits.dir</name>
    <value>/var/sxt/hadoop/ha/jn</value>
  </property>
  <property>
    <name>dfs.client.failover.proxy.provider.mycluster</name>
    <value>org.apache.hadoop.hdfs.server.namenode.ha.ConfiguredFailoverProxyProvider</value>
  </property>
  <property>
    <name>dfs.ha.fencing.methods</name>
    <value>sshfence</value>
  </property>
  <property>
    <name>dfs.ha.fencing.ssh.private-key-files</name>
    <value>/root/.ssh/id_rsa</value>
  </property>
  <property>
    <name>dfs.ha.automatic-failover.enabled</name>
    <value>true</value>
  </property>
  <property>
    <name>dfs.replication</name>
    <value>3</value>
  </property>
</configuration>
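For a fresh HA cluster, the first start follows a fixed order; a sketch using the stock Hadoop 2.x scripts:

# on node01, node02, node03: start the JournalNodes first
hadoop-daemon.sh start journalnode
# on node01: format and start the first NameNode
hdfs namenode -format
hadoop-daemon.sh start namenode
# on node02: copy the metadata over from node01
hdfs namenode -bootstrapStandby
# on node01: register the HA state in ZooKeeper, then bring up HDFS
hdfs zkfc -formatZK
start-dfs.sh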
Reference core-site.xml:
<configuration>
  <property>
    <name>fs.defaultFS</name>
    <value>hdfs://mycluster</value>
  </property>
  <property>
    <name>ha.zookeeper.quorum</name>
    <value>node02:2181,node03:2181,node04:2181</value>
  </property>
</configuration>
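A quick sanity check that the client side resolves the nameservice (hdfs getconf is a standard utility and does not need a running cluster):

hdfs getconf -confKey fs.defaultFS          # hdfs://mycluster
hdfs getconf -confKey ha.zookeeper.quorum   # node02:2181,node03:2181,node04:2181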
Reference yarn-site.xml:
<configuration>
  <property>
    <name>yarn.nodemanager.aux-services</name>
    <value>mapreduce_shuffle</value>
  </property>
  <property>
    <name>yarn.resourcemanager.ha.enabled</name>
    <value>true</value>
  </property>
  <property>
    <name>yarn.resourcemanager.zk-address</name>
    <value>node02:2181,node03:2181,node04:2181</value>
  </property>
  <property>
    <name>yarn.resourcemanager.cluster-id</name>
    <value>mashibing</value>
  </property>
  <property>
    <name>yarn.resourcemanager.ha.rm-ids</name>
    <value>rm1,rm2</value>
  </property>
  <property>
    <name>yarn.resourcemanager.hostname.rm1</name>
    <value>node03</value>
  </property>
  <property>
    <name>yarn.resourcemanager.hostname.rm2</name>
    <value>node04</value>
  </property>
  <property>
    <name>yarn.nodemanager.resource.memory-mb</name>
    <value>2048</value>
  </property>
  <property>
    <name>yarn.nodemanager.resource.cpu-vcores</name>
    <value>1</value>
  </property>
  <!-- rm1 lives on node03, so all of its addresses use node03 -->
  <property>
    <name>yarn.resourcemanager.address.rm1</name>
    <value>node03:8032</value>
  </property>
  <property>
    <name>yarn.resourcemanager.scheduler.address.rm1</name>
    <value>node03:8030</value>
  </property>
  <property>
    <name>yarn.resourcemanager.webapp.address.rm1</name>
    <value>node03:8088</value>
  </property>
  <property>
    <name>yarn.resourcemanager.resource-tracker.address.rm1</name>
    <value>node03:8031</value>
  </property>
  <property>
    <name>yarn.resourcemanager.admin.address.rm1</name>
    <value>node03:8033</value>
  </property>
  <property>
    <name>yarn.resourcemanager.ha.admin.address.rm1</name>
    <value>node03:23142</value>
  </property>
  <property>
    <name>yarn.resourcemanager.address.rm2</name>
    <value>node04:8032</value>
  </property>
  <property>
    <name>yarn.resourcemanager.scheduler.address.rm2</name>
    <value>node04:8030</value>
  </property>
  <property>
    <name>yarn.resourcemanager.webapp.address.rm2</name>
    <value>node04:8088</value>
  </property>
  <property>
    <name>yarn.resourcemanager.resource-tracker.address.rm2</name>
    <value>node04:8031</value>
  </property>
  <property>
    <name>yarn.resourcemanager.admin.address.rm2</name>
    <value>node04:8033</value>
  </property>
  <property>
    <name>yarn.resourcemanager.ha.admin.address.rm2</name>
    <value>node04:23142</value>
  </property>
</configuration>
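One quirk of the Hadoop 2.x scripts worth knowing: start-yarn.sh only brings up the ResourceManager on the machine where it is run, so the standby RM has to be started separately. A sketch, with the standard state check afterwards:

# on node03 (rm1): starts its RM plus the NodeManagers listed in slaves
start-yarn.sh
# on node04 (rm2): start the standby RM by hand
yarn-daemon.sh start resourcemanager
# from any node: confirm one active and one standby
yarn rmadmin -getServiceState rm1
yarn rmadmin -getServiceState rm2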
Reference slaves file:

node02
node03
node04

Reference mapred-site.xml:
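The body of mapred-site.xml is not reproduced above; a minimal sketch, assuming nothing beyond the stock Hadoop 2.x property that routes MapReduce jobs onto YARN:

<!-- minimal mapred-site.xml sketch: the single standard property
     telling MapReduce to execute on YARN (assumed, not from the original) -->
<configuration>
  <property>
    <name>mapreduce.framework.name</name>
    <value>yarn</value>
  </property>
</configuration>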
Part 2: Installing Hive 2.3.7

Upload the Hive 2.3.7 package to /home/hadoop and unpack it (matching the HIVE_HOME set in /etc/profile earlier). The properties below configure the MySQL-backed metastore and belong in hive-site.xml:

<configuration>
  <property>
    <name>hive.metastore.warehouse.dir</name>
    <value>/user/hive/warehouse</value>
  </property>
  <property>
    <name>javax.jdo.option.ConnectionURL</name>
    <value>jdbc:mysql://46.77.56.200:3306/hive?createDatabaseIfNotExist=true</value>
  </property>
  <property>
    <name>javax.jdo.option.ConnectionDriverName</name>
    <value>com.mysql.jdbc.Driver</value>
  </property>
  <property>
    <name>javax.jdo.option.ConnectionUserName</name>
    <value>root</value>
  </property>
  <property>
    <name>javax.jdo.option.ConnectionPassword</name>
    <value>123456</value>
  </property>
</configuration>
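With hive-site.xml pointing at MySQL, the metastore schema is initialized with Hive's schematool (a sketch; run it only after the MySQL JDBC driver jar is in $HIVE_HOME/lib, as the next step describes):

schematool -dbType mysql -initSchema
# quick smoke test once initialization succeeds
hive -e "show databases;"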
Configure the /etc/profile environment variables, copy the MySQL JDBC driver jar into Hive's lib directory, and initialize the Hive metadata store in MySQL as sketched above.

Part 3: Installing the HBase 2.2.4 cluster

Prerequisites: upload the HBase 2.2.4 tar.gz to /home/hadoop and unpack it. Configure hbase-site.xml and hbase-env.sh, and copy the Hadoop cluster's hdfs-site.xml into HBase's conf directory. Reference hbase-site.xml: