CentOS Hadoop2 Environment Setup

Java environment configuration

See the earlier CentOS Hadoop environment setup article.

Hadoop2

  1. Installation package: hadoop-2.6.1.tar.gz
    After extracting, change into the directory /usr/local/src/hadoop-2.6.1/etc/hadoop
  2. Set JAVA_HOME in hadoop-env.sh
    export JAVA_HOME=/usr/local/src/jdk1.7.0_45
  3. Set JAVA_HOME in yarn-env.sh
    export JAVA_HOME=/usr/local/src/jdk1.7.0_45
  4. List the slave nodes in the slaves file

    slave1
    slave2
  5. Add the following properties to core-site.xml

    <configuration>
      <property>
        <name>fs.defaultFS</name>
        <value>hdfs://master:9000/</value>
      </property>
      <property>
        <name>hadoop.tmp.dir</name>
        <value>file:/usr/local/src/hadoop-2.6.1/tmp</value>
      </property>
    </configuration>

  6. Create the following directories under HADOOP_HOME:
    mkdir tmp
    mkdir -p dfs/name
    mkdir -p dfs/data
  7. Edit hdfs-site.xml

    <configuration>
      <property>
        <name>dfs.namenode.secondary.http-address</name>
        <value>master:9001</value>
      </property>
      <property>
        <name>dfs.namenode.name.dir</name>
        <value>file:/usr/local/src/hadoop-2.6.1/dfs/name</value>
      </property>
      <property>
        <name>dfs.datanode.data.dir</name>
        <value>file:/usr/local/src/hadoop-2.6.1/dfs/data</value>
      </property>
      <property>
        <name>dfs.replication</name>
        <value>3</value>
      </property>
    </configuration>

  8. cp mapred-site.xml.template mapred-site.xml
    • Configure mapred-site.xml

    <configuration>
      <property>
        <name>mapreduce.framework.name</name>
        <value>yarn</value>
      </property>
    </configuration>

  9. Configure yarn-site.xml

    <configuration>
      <property>
        <name>yarn.nodemanager.aux-services</name>
        <value>mapreduce_shuffle</value>
      </property>
      <property>
        <name>yarn.nodemanager.aux-services.mapreduce.shuffle.class</name>
        <value>org.apache.hadoop.mapred.ShuffleHandler</value>
      </property>
      <property>
        <name>yarn.resourcemanager.address</name>
        <value>master:8032</value>
      </property>
      <property>
        <name>yarn.resourcemanager.scheduler.address</name>
        <value>master:8030</value>
      </property>
      <property>
        <name>yarn.resourcemanager.resource-tracker.address</name>
        <value>master:8035</value>
      </property>
      <property>
        <name>yarn.resourcemanager.admin.address</name>
        <value>master:8033</value>
      </property>
      <property>
        <name>yarn.resourcemanager.webapp.address</name>
        <value>master:8088</value>
      </property>
    </configuration>

  10. Distribute the configuration to the slave nodes, then run the following under HADOOP_HOME to start the cluster

    ./bin/hadoop namenode -format   # format the NameNode
    ./sbin/start-dfs.sh             # these two commands are equivalent to start-all.sh
    ./sbin/start-yarn.sh

Spark Setup (on top of the YARN cluster)

  1. Installation package: spark-1.6.0-bin-hadoop2.6.tgz; extract it and change into the directory
  2. Edit conf/spark-env.sh

    export SCALA_HOME=/usr/local/src/scala-2.11.4
    export JAVA_HOME=/usr/local/src/jdk1.7.0_45
    export HADOOP_HOME=/usr/local/src/hadoop-2.6.1
    export HADOOP_CONF_DIR=$HADOOP_HOME/etc/hadoop
    SPARK_MASTER_IP=master
    SPARK_LOCAL_DIRS=/usr/local/src/spark-1.6.0-bin-hadoop2.6
    SPARK_DRIVER_MEMORY=1G
  3. cp slaves.template slaves, and set its contents to:

    slave1
    slave2
  4. Distribute the configuration to the slave nodes
  5. Start Spark: ./sbin/start-all.sh
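
As a quick sanity check that Spark can reach both the cluster and HDFS, a few lines can be run in ./bin/spark-shell. This is a minimal sketch; the HDFS path below is a hypothetical example and assumes a test file has already been uploaded (e.g. with ./bin/hadoop fs -put):

    // inside spark-shell, where the SparkContext `sc` is already available
    val lines = sc.textFile("hdfs://master:9000/tmp/README.md")  // hypothetical path
    println(lines.count())                 // number of lines in the file
    lines.take(3).foreach(println)         // print the first few lines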

Spark Development Environment Setup

Installation package: sbt-0.13.15.tgz; extract it and change into the corresponding directory
  1. Edit ~/.bashrc

    # sbt config
    export SBT_HOME=/usr/local/src/sbt
    export PATH=$PATH:$SBT_HOME/bin
  2. In the development directory, create the project layout:

    [root@master spark_test]# mkdir -p spark_wordcount/lib
    [root@master spark_test]# mkdir -p spark_wordcount/project
    [root@master spark_test]# mkdir -p spark_wordcount/src
    [root@master spark_test]# mkdir -p spark_wordcount/target
    [root@master spark_test]# mkdir -p spark_wordcount/src/main/scala
  3. Copy spark-assembly-1.6.0-hadoop2.6.0.jar from the lib directory of the Spark installation into spark_wordcount/lib
  4. Create a build.sbt file in the spark_wordcount directory with the following content (a minimal WordCount.scala to place under src/main/scala is sketched after this list)
name := "WordCount"
version := "1.6.0"
scalaVersion := "2.11.4"
  5. Run the build from the spark_wordcount directory: sbt compile. The first run downloads many jars and takes quite a while.
  6. After development is finished, build the jar: sbt package
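
For completeness, below is a minimal sketch of the application source to place under spark_wordcount/src/main/scala, as mentioned in step 4. The object name WordCount is an assumption chosen to match the project name, and the input/output paths passed as arguments are hypothetical examples:

    import org.apache.spark.{SparkConf, SparkContext}

    // Minimal word-count job: reads text from args(0), writes (word, count) pairs to args(1).
    object WordCount {
      def main(args: Array[String]): Unit = {
        val conf = new SparkConf().setAppName("WordCount")
        val sc = new SparkContext(conf)

        val counts = sc.textFile(args(0))            // e.g. hdfs://master:9000/tmp/input.txt
          .flatMap(_.split("\\s+"))                  // split each line into words
          .map(word => (word, 1))
          .reduceByKey(_ + _)

        counts.saveAsTextFile(args(1))               // e.g. hdfs://master:9000/tmp/wc_out
        sc.stop()
      }
    }

After sbt package, the resulting jar (typically target/scala-2.11/wordcount_2.11-1.6.0.jar) can be submitted with spark-submit, passing an input path and a not-yet-existing output path as the two arguments.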
