Hadoop Installation

These notes follow the official documentation.

Operating system: Ubuntu 14.04, 32-bit.
Java version: 1.7.

Install ssh and rsync:
sudo apt-get install ssh
sudo apt-get install rsync
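Before moving on, it can help to confirm that both tools are actually on the PATH. The following is a minimal sketch; the function name missing_cmds is hypothetical and not part of Hadoop or Ubuntu.

```shell
# Print the name of every command in the argument list that is not on PATH.
# missing_cmds is a hypothetical helper, not part of Hadoop.
missing_cmds() {
    for cmd in "$@"; do
        command -v "$cmd" >/dev/null 2>&1 || printf '%s\n' "$cmd"
    done
}
```

Running missing_cmds ssh rsync prints nothing when both are installed, and otherwise lists whichever command is missing.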


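Download Hadoop:
Download page: http://www.apache.org/dyn/closer.cgi/hadoop/common/ (I downloaded 2.4.1, the latest version at the time).
Move the archive to a directory of your choice (mine is /home/pmonkey/hadoop), then extract it:
tar zxvf hadoop-2.4.1.tar.gz
This creates the hadoop-2.4.1 directory; for convenience, rename it to hadoop:
mv hadoop-2.4.1 hadoop
The Hadoop directory is then /home/pmonkey/hadoop/hadoop.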
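The extract-and-rename steps can also be wrapped in a small function. This is only a sketch: unpack_hadoop is a hypothetical name, and it assumes the archive's top-level folder has the same name as the archive without the .tar.gz suffix (as with hadoop-2.4.1.tar.gz here).

```shell
# Unpack a Hadoop release archive into a target directory and rename the
# extracted folder to "hadoop". unpack_hadoop is a hypothetical helper.
# Assumption: the archive's top-level folder is named like the archive
# without its .tar.gz suffix.
unpack_hadoop() {
    tarball=$1      # e.g. hadoop-2.4.1.tar.gz
    dest=$2         # e.g. /home/pmonkey/hadoop
    name=$(basename "$tarball" .tar.gz)
    if [ -e "$dest/hadoop" ]; then
        echo "refusing to overwrite $dest/hadoop" >&2
        return 1
    fi
    mkdir -p "$dest" &&
    tar -xzf "$tarball" -C "$dest" &&
    mv "$dest/$name" "$dest/hadoop"
}
```

For the layout used in these notes, the call would be unpack_hadoop hadoop-2.4.1.tar.gz /home/pmonkey/hadoop. The function refuses to run if a hadoop folder already exists, so an earlier installation is not overwritten by accident.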

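Configure the Hadoop environment variables:
Edit etc/hadoop/hadoop-env.sh (paths here and below are relative to the Hadoop directory):
sudo gedit etc/hadoop/hadoop-env.sh
Add the JAVA_HOME and HADOOP_PREFIX settings: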
export JAVA_HOME=/home/pmonkey/java/java7
export HADOOP_PREFIX=/home/pmonkey/hadoop/hadoop
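If the file is edited more than once, duplicate export lines are easy to pile up. The sketch below sets a variable so that an existing export of the same name is replaced rather than repeated; set_export is a hypothetical helper, not something Hadoop ships.

```shell
# Set "export VAR=value" in a shell config file, replacing an existing
# export of the same variable instead of appending a duplicate.
# set_export is a hypothetical helper, not part of Hadoop.
set_export() {
    file=$1 var=$2 value=$3
    tmp=$(mktemp) || return 1
    # Keep every line except earlier exports of this variable.
    sed "/^export $var=/d" "$file" > "$tmp" || return 1
    printf 'export %s=%s\n' "$var" "$value" >> "$tmp"
    # Write back through the original file so its permissions are kept.
    cat "$tmp" > "$file" && rm -f "$tmp"
}
```

With the paths from these notes, the calls would be set_export etc/hadoop/hadoop-env.sh JAVA_HOME /home/pmonkey/java/java7 and set_export etc/hadoop/hadoop-env.sh HADOOP_PREFIX /home/pmonkey/hadoop/hadoop. The new line always lands at the end of the file, which is fine for plain exports like these.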

Running Hadoop:
Standalone mode:
mkdir input
cp etc/hadoop/*.xml input
bin/hadoop jar share/hadoop/mapreduce/hadoop-mapreduce-examples-2.4.1.jar grep input output 'dfs[a-z.]+'
cat output/*
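The example job finds every string in the input files that matches the regular expression and counts how often each one occurs. To get a feel for the result without Hadoop, roughly the same count can be produced with standard Unix tools. This is a sketch only: local_grep_count is a hypothetical name, it relies on the -o and -h options of GNU grep, and the real job's output may differ in detail.

```shell
# Count occurrences of each match of a regex in the *.xml files of a
# directory, most frequent first. local_grep_count is a hypothetical helper
# that only imitates the example job with standard Unix tools.
local_grep_count() {
    dir=$1
    regex=$2
    # -o: print only the matched text, -h: no file names, -E: extended regex
    grep -ohE "$regex" "$dir"/*.xml |
        sort | uniq -c | sort -rn |
        awk '{ print $1 "\t" $2 }'
}
```

Calling local_grep_count input 'dfs[a-z.]+' in the Hadoop directory gives one line per distinct match, with the count first, which can be compared against cat output/*.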

Pseudo-distributed mode:
sudo gedit etc/hadoop/core-site.xml
Add the following configuration:
<configuration>
    <property>
        <name>fs.defaultFS</name>
        <value>hdfs://localhost:9000</value>
    </property>
</configuration>
sudo gedit etc/hadoop/hdfs-site.xml
Add the following configuration:
<configuration>
    <property>
        <name>dfs.replication</name>
        <value>1</value>
    </property>
</configuration>
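Both site files above have the same shape, a single property inside a configuration element, so they can also be generated from the shell instead of being edited by hand. The following is a sketch under that assumption; write_site_xml is a hypothetical helper, it overwrites the target file, and it does not escape XML special characters in the value.

```shell
# Write a Hadoop site file containing exactly one property.
# write_site_xml is a hypothetical helper; it OVERWRITES the target file
# and does not escape XML special characters.
write_site_xml() {
    file=$1 name=$2 value=$3
    cat > "$file" <<EOF
<?xml version="1.0"?>
<configuration>
    <property>
        <name>$name</name>
        <value>$value</value>
    </property>
</configuration>
EOF
}
```

For the two files in these notes, the calls would be write_site_xml etc/hadoop/core-site.xml fs.defaultFS hdfs://localhost:9000 and write_site_xml etc/hadoop/hdfs-site.xml dfs.replication 1.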
Set up passwordless ssh login:
Check that ssh localhost works without a passphrase. If it fails, run:
ssh-keygen -t dsa -P '' -f ~/.ssh/id_dsa
cat ~/.ssh/id_dsa.pub >> ~/.ssh/authorized_keys
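Running the append step more than once adds the same key again each time. The sketch below appends a public key only if it is not already present and tightens the file permissions, since sshd may reject an authorized_keys file that others can write to. authorize_key is a hypothetical helper. As a further assumption beyond the original setup, newer OpenSSH releases disable DSA keys by default, so on a newer system an RSA key (ssh-keygen -t rsa) may be needed instead.

```shell
# Append a public key to an authorized_keys file unless it is already there,
# then restrict the file to its owner.
# authorize_key is a hypothetical helper.
authorize_key() {
    pub=$1      # e.g. ~/.ssh/id_dsa.pub
    auth=$2     # e.g. ~/.ssh/authorized_keys
    touch "$auth" || return 1
    # -F: fixed string, -x: match the whole line, -q: quiet
    grep -qxF "$(cat "$pub")" "$auth" || cat "$pub" >> "$auth"
    chmod 600 "$auth"
}
```

After authorize_key ~/.ssh/id_dsa.pub ~/.ssh/authorized_keys, a quick non-interactive check is ssh -o BatchMode=yes localhost true, which fails instead of prompting when passwordless login is not working.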

Run a MapReduce job locally (format the filesystem, then start the HDFS daemons):
bin/hdfs namenode -format
sbin/start-dfs.sh
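To check that the daemons really came up, the jps tool from the JDK lists running Java processes as lines of the form "pid name". The sketch below reads such a listing and reports which expected daemons are absent; missing_daemons is a hypothetical helper.

```shell
# Read `jps` output on stdin and print each expected daemon name that does
# not appear in it. missing_daemons is a hypothetical helper.
missing_daemons() {
    listing=$(cat)
    for daemon in "$@"; do
        # jps prints "<pid> <name>" per line; compare the name exactly.
        printf '%s\n' "$listing" |
            awk -v d="$daemon" '$2 == d { found = 1 } END { exit !found }' ||
            printf '%s\n' "$daemon"
    done
}
```

After sbin/start-dfs.sh, running jps | missing_daemons NameNode DataNode SecondaryNameNode should print nothing. If a name is printed, the corresponding log file under the logs directory is the first place to look.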

The NameNode web interface is available at http://localhost:50070/
Set up the input and output directories in HDFS (mind the paths):
bin/hdfs dfs -mkdir -p /home/pmonkey/hadoop/hadoop
bin/hdfs dfs -put etc/hadoop /home/pmonkey/hadoop/hadoop/input
bin/hadoop jar share/hadoop/mapreduce/hadoop-mapreduce-examples-2.4.1.jar grep /home/pmonkey/hadoop/hadoop/input /home/pmonkey/hadoop/hadoop/output 'dfs[a-z.]+'
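Copy the output from HDFS to the local filesystem and view it (remove any local output directory left over from the standalone run first):
bin/hdfs dfs -get /home/pmonkey/hadoop/hadoop/output output
cat output/*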
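Because the same HDFS base path appears in every command, a typo in one of them is easy to make. One way to keep the paths consistent is to generate the commands from a single variable. The sketch below only prints the commands and executes nothing; hdfs_job_commands is a hypothetical helper, and its last command views the result directly in HDFS with -cat.

```shell
# Print the HDFS job commands for a given HDFS base directory so that the
# input and output paths stay consistent. hdfs_job_commands is a hypothetical
# helper; it only prints, nothing is executed.
hdfs_job_commands() {
    base=$1
    jar=share/hadoop/mapreduce/hadoop-mapreduce-examples-2.4.1.jar
    printf '%s\n' \
        "bin/hdfs dfs -mkdir -p $base" \
        "bin/hdfs dfs -put etc/hadoop $base/input" \
        "bin/hadoop jar $jar grep $base/input $base/output 'dfs[a-z.]+'" \
        "bin/hdfs dfs -cat '$base/output/*'"
}
```

Running hdfs_job_commands /home/pmonkey/hadoop/hadoop prints the four commands for review before they are run from the Hadoop directory. The glob in the last command is quoted so that the local shell does not expand it against a local directory of the same name.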

Run a MapReduce job on YARN in pseudo-distributed mode:
Configure etc/hadoop/mapred-site.xml (if the file does not exist, copy it from etc/hadoop/mapred-site.xml.template):
<configuration>
    <property>
        <name>mapreduce.framework.name</name>
        <value>yarn</value>
    </property>
</configuration>

Configure etc/hadoop/yarn-site.xml:
<configuration>
    <property>
        <name>yarn.nodemanager.aux-services</name>
        <value>mapreduce_shuffle</value>
    </property>
</configuration>

sbin/start-yarn.sh

The ResourceManager web interface is available at http://localhost:8088/
