Hadoop ports

----------------

1.namenode 50070

http://namenode:50070/

2.resourcemanager:8088

http://localhost:8088/

3.historyServer: 19888

http://hs:19888/

4.namenode rpc (remote procedure call): 8020

hdfs://namenode:8020/


Running commands remotely with ssh

---------------------

$>ssh s300 rm -rf /xx/x/x


Remote copy with scp

--------------------

$>scp -r /xxx/x ubuntu@s200:/path



Write a script that remote-copies a file or directory to all nodes.

xcopy.sh

--------------------

scp -r path ubuntu@s200:/path 


Delete

------

xrm.sh a.txt

ssh s200 rm -rf path
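
A minimal sketch of xrm.sh, mirroring the host loop of the xcp.sh script below (the s200..s500 host names come from that script; adjust to the actual cluster):

[/usr/local/sbin/xrm.sh]
#!/bin/bash
if [ $# -lt 1 ] ; then
  echo no args
  exit;
fi
cuser=`whoami`
for (( i=200;i<=500;i=i+100 )) ; do
  echo ----- deleting "$@" on s$i ------
  # remove the given paths on each remote host
  ssh $cuser@s$i rm -rf "$@"
done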


Remote file copy script

[/usr/local/sbin/xcp.sh]
#!/bin/bash
if [ $# -lt 1 ] ;then
  echo no args
  exit;
fi
#get first argument
arg1=$1;
cuser=`whoami`
fname=`basename $arg1`
dir=`dirname $arg1`
if [ "$dir" = "." ]; then
  dir=`pwd`
fi
for (( i=200;i<=500;i=i+100)) ;
do
  echo ----- copying $arg1 to s$i ------;
  if [ -d "$arg1" ] ; then
    scp -r "$arg1" $cuser@s$i:"$dir"
  else
    scp "$arg1" $cuser@s$i:"$dir"
  fi
  echo
done
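
Usage sketch, assuming the script is executable, on the PATH, and passwordless ssh to s200..s500 is configured (the file name a.txt is only an example):

$>xcp.sh /home/ubuntu/a.txt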


slaves

----------

master

masters


Handling the hadoop-2.7.2 source distribution

-----------------------

1.Download and extract the hadoop-2.7.2.tar.gz file

2.Sort the jars into groups such as CONF, LIB, SOURCES, TEST (a sketch follows)
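
A rough shell sketch of step 2, covering only the sources/tests/libs split; all collected jars are assumed to sit in one directory, _libs matches the paths used below, and _sources/_tests are assumed names:

$>mkdir -p _libs _sources _tests
$>mv *-sources.jar _sources/
$>mv *-tests.jar _tests/
$>mv *.jar _libs/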

Extracting all configuration defaults from the jars

------------------------

1.core-default.xml

D:\downloads\bigdata\hadoop-2.7.2\_libs\hadoop-common-2.7.2.jar

2.hdfs-default.xml

D:\downloads\bigdata\hadoop-2.7.2\_libs\hadoop-hdfs-2.7.2.jar

3.mapred-default.xml

D:\downloads\bigdata\hadoop-2.7.2\_libs\hadoop-mapreduce-client-core-2.7.2.jar

4.yarn-default.xml

D:\downloads\bigdata\hadoop-2.7.2\_libs\hadoop-yarn-common-2.7.2.jar
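
Each of these files can be pulled out of its jar (a jar is a zip archive), e.g. with the jar tool or unzip; a sketch run from the _libs directory above:

$>jar xf hadoop-common-2.7.2.jar core-default.xml
$>unzip -p hadoop-hdfs-2.7.2.jar hdfs-default.xml > hdfs-default.xml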


master node == NameNode

------------------------


{hadoop}/sbin/start-all.sh

--------------------------------------

1.{hadoop}/libexec/hadoop-config.sh

HADOOP_CONF_DIR=...        // set from the --config option

2./sbin/start-dfs.sh --config $HADOOP_CONF_DIR

3./sbin/start-yarn.sh --config $HADOOP_CONF_DIR


{hadoop_home}/sbin/start-dfs.sh

--------------------------------

1.{hadoop}/libexec/hadoop-config.sh

HADOOP_CONF_DIR=...        // set from the --config option

2.NAMENODES={hadoop_home}/bin/hdfs getconf -namenodes        // extract the namenode hostnames

3.{hadoop_home}/sbin/hadoop-daemons.sh --config ... --hostnames ... --script "{hadoop_home}/bin/hdfs" start namenode $nameStartOpt

4.{hadoop_home}/sbin/hadoop-daemons.sh --config ... --hostnames ... --script "{hadoop_home}/bin/hdfs" start datanode $dataStartOpt

5.{hadoop_home}/sbin/hadoop-daemons.sh --config ... --hostnames ... --script "{hadoop_home}/bin/hdfs" start secondarynamenode



{hadoop_home}/sbin/hadoop-daemons.sh

---------------------------------------

1.{hadoop}/libexec/hadoop-config.sh

HADOOP_CONF_DIR=...        // set from the --config option

2.exec "$bin/slaves.sh" --config $HADOOP_CONF_DIR cd "$HADOOP_PREFIX" \; "$bin/hadoop-daemon.sh" --config $HADOOP_CONF_DIR "$@"



{hadoop_home}/sbin/slaves.sh

-----------------------------

1.{hadoop}/libexec/hadoop-config.sh

HADOOP_CONF_DIR=...        // set from the --config option

2."${HADOOP_CONF_DIR}/hadoop-env.sh"

3.extract all hostnames from the slaves file --> SLAVE_NAMES

4.for each hostname in SLAVE_NAMES --> ssh $hostname ...        (a sketch follows)
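
A simplified sketch of steps 3-4 (not the literal script; variable names follow the notes above):

SLAVE_NAMES=`sed 's/#.*$//;/^$/d' "$HADOOP_CONF_DIR/slaves"`   # drop comments and blank lines
for slave in $SLAVE_NAMES ; do
  # run the forwarded command on every slave, prefixing its output with the hostname
  ssh $HADOOP_SSH_OPTS "$slave" "$@" 2>&1 | sed "s/^/$slave: /" &
done
wait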


"$bin/hadoop-daemon.sh"

-----------------------------

1.{hadoop}/libexec/hadoop-config.sh

HADOOP_CONF_DIR=...        // set from the --config option

2.namenode|datanode|secondarynamenode|..

  --> invokes "bin/hdfs <command> ..." for the requested daemon


Running the 2NN (secondary namenode) on a dedicated host

--------------------

Default value (hdfs-default.xml):

<property>
  <name>dfs.namenode.secondary.http-address</name>
  <value>0.0.0.0:50090</value>
  <description>The secondary namenode http server address and port.</description>
</property>
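
To run the 2NN on its own machine, override the property in hdfs-site.xml; the host name s500 below is only an assumption, substitute the node that should host it:

[hdfs-site.xml]
<property>
  <name>dfs.namenode.secondary.http-address</name>
  <value>s500:50090</value>
</property>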

 


Change the default Hadoop temp directory

-------------------------

[core-site.xml]

hadoop.tmp.dir=/home/ubuntu/hadoop/
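
The same setting as an XML snippet (the path comes from the line above):

<property>
  <name>hadoop.tmp.dir</name>
  <value>/home/ubuntu/hadoop</value>
</property>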


Change the block size (the default is 128m)

-----------------------------

[hdfs-site.xml]

dfs.blocksize=8m
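
The equivalent hdfs-site.xml snippet (dfs.blocksize also accepts a plain byte count such as 8388608):

<property>
  <name>dfs.blocksize</name>
  <value>8m</value>
</property>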


1.How to test

put a file larger than 8m, then check the block size in the web UI (a command sketch follows)
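
A sketch of the test; the file name big.dat and the HDFS target path are assumptions:

$>hdfs dfs -put big.dat /user/ubuntu/big.dat
$>hdfs fsck /user/ubuntu/big.dat -files -blocks

The block list can also be inspected via http://namenode:50070/.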