Hadoop can be installed and deployed in three modes:
Standalone (local) mode: runs on a single machine with no distribution and no HDFS; mainly used for local development and debugging.
Pseudo-distributed mode: all Hadoop services run on one machine, with each Hadoop daemon in its own JVM process; commonly used for debugging.
Fully distributed mode: the real production setup, running across multiple machines.
Standalone mode installation
1. Install the JDK
Unpack the JDK archive, then edit the profile:
vi /etc/profile
##jdk config
export JAVA_HOME=/usr/java/jdk1.8.0_171
export CLASSPATH=.:${JAVA_HOME}/jre/lib/rt.jar:${JAVA_HOME}/lib/dt.jar:${JAVA_HOME}/lib/tools.jar
export PATH=$PATH:${JAVA_HOME}/bin
Make the configuration take effect:
source /etc/profile
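To confirm the exports resolved, you would normally run `java -version` after sourcing the profile. The snippet below simulates the two profile lines with local variables so it can run on any machine; the paths are the ones used in this guide.

```shell
# Simulate the profile exports and verify that JAVA_HOME/bin lands on PATH.
# (On the real machine, just run: source /etc/profile && java -version)
JAVA_HOME=/usr/java/jdk1.8.0_171
PATH="$PATH:$JAVA_HOME/bin"

echo "JAVA_HOME=$JAVA_HOME"
case ":$PATH:" in
  *":$JAVA_HOME/bin:"*) echo "JAVA_HOME/bin is on PATH" ;;
  *)                    echo "JAVA_HOME/bin is MISSING from PATH" ;;
esac
```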
2. Install Hadoop
Unpack the archive:
tar -zxvf hadoop-1.2.1.tar.gz
Configure the environment variables by appending the following to the end of /etc/profile:
##hadoop config
export HADOOP_HOME=/usr/hadoop/hadoop-1.2.1
export PATH=$PATH:${HADOOP_HOME}/bin
As root, make the configuration take effect:
source /etc/profile
Test whether the installation and configuration succeeded: type `hadoop` and the command's usage message should print.
Configure the JDK path inside Hadoop (change it to wherever you installed your JDK):
cd /usr/hadoop/hadoop-1.2.1/conf  # change into Hadoop's configuration directory
vi hadoop-env.sh
The file contains a commented-out JAVA_HOME line; uncomment it and change the path to your JDK installation (here /usr/java/jdk1.8.0_171).
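For reference, the edit looks roughly like this (the shipped placeholder path below is from Hadoop 1.x and may differ slightly by release):

```shell
# conf/hadoop-env.sh -- before (as shipped, commented out):
# export JAVA_HOME=/usr/lib/j2sdk1.5-sun

# after (uncommented, pointing at your own JDK):
export JAVA_HOME=/usr/java/jdk1.8.0_171
```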
Test a MapReduce program:
Under /usr, create a directory data, and under it a directory input (the directory names are arbitrary; just use them consistently):
mkdir -p /usr/data/input
Copy Hadoop's XML configuration files over to serve as the MapReduce input:
cp /usr/hadoop/hadoop-1.2.1/conf/*.xml /usr/data/input
In the Hadoop directory, run the bundled example jar to test MapReduce:
hadoop jar hadoop-examples-1.2.1.jar grep /usr/data/input /usr/data/output 'dfs[a-z.]+'
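What the grep example does: it scans every input file for matches of the regular expression `dfs[a-z.]+` (strings beginning with "dfs"), counts each distinct match, and writes the counts to the output directory. A rough local illustration of that match-and-count logic, using plain `grep` on a stand-in XML line (the property names below are made up for the demo):

```shell
# Extract every match of 'dfs[a-z.]+' and count the distinct matches --
# the same regex the example job uses. The input lines are stand-ins.
printf '<name>dfs.replication</name>\n<name>io.sort.mb</name>\n' \
  | grep -Eo 'dfs[a-z.]+' \
  | sort | uniq -c
# only the first line matches, so this prints a count of 1 for dfs.replication
```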
You can follow the MapReduce execution in the log below (note that the grep example actually runs two chained jobs, visible as job ..._0001 and ..._0002):
18/05/29 21:11:35 INFO util.NativeCodeLoader: Loaded the native-hadoop library
18/05/29 21:11:35 WARN snappy.LoadSnappy: Snappy native library not loaded
18/05/29 21:11:35 INFO mapred.FileInputFormat: Total input paths to process : 7
18/05/29 21:11:36 INFO mapred.JobClient: Running job: job_local2023822132_0001
18/05/29 21:11:36 INFO mapred.LocalJobRunner: Waiting for map tasks
18/05/29 21:11:36 INFO mapred.LocalJobRunner: Starting task: attempt_local2023822132_0001_m_000000_0
18/05/29 21:11:36 INFO util.ProcessTree: setsid exited with exit code 0
18/05/29 21:11:36 INFO mapred.Task: Using ResourceCalculatorPlugin : org.apache.hadoop.util.LinuxResourceCalculatorPlugin@58ee6b86
18/05/29 21:11:36 INFO mapred.MapTask: Processing split: file:/usr/data/input/capacity-scheduler.xml:0+7457
18/05/29 21:11:36 INFO mapred.MapTask: numReduceTasks: 1
18/05/29 21:11:36 INFO mapred.MapTask: io.sort.mb = 100
18/05/29 21:11:37 INFO mapred.JobClient: map 0% reduce 0%
18/05/29 21:11:37 INFO mapred.MapTask: data buffer = 79691776/99614720
18/05/29 21:11:37 INFO mapred.MapTask: record buffer = 262144/327680
18/05/29 21:11:37 INFO mapred.MapTask: Starting flush of map output
18/05/29 21:11:37 INFO mapred.Task: Task:attempt_local2023822132_0001_m_000000_0 is done. And is in the process of commiting
18/05/29 21:11:37 INFO mapred.LocalJobRunner: file:/usr/data/input/capacity-scheduler.xml:0+7457
18/05/29 21:11:37 INFO mapred.Task: Task 'attempt_local2023822132_0001_m_000000_0' done.
18/05/29 21:11:37 INFO mapred.LocalJobRunner: Finishing task: attempt_local2023822132_0001_m_000000_0
18/05/29 21:11:37 INFO mapred.LocalJobRunner: Starting task: attempt_local2023822132_0001_m_000001_0
18/05/29 21:11:37 INFO mapred.Task: Using ResourceCalculatorPlugin : org.apache.hadoop.util.LinuxResourceCalculatorPlugin@56f687af
18/05/29 21:11:37 INFO mapred.MapTask: Processing split: file:/usr/data/input/hadoop-policy.xml:0+4644
18/05/29 21:11:37 INFO mapred.MapTask: numReduceTasks: 1
18/05/29 21:11:37 INFO mapred.MapTask: io.sort.mb = 100
18/05/29 21:11:37 INFO mapred.MapTask: data buffer = 79691776/99614720
18/05/29 21:11:37 INFO mapred.MapTask: record buffer = 262144/327680
18/05/29 21:11:37 INFO mapred.MapTask: Starting flush of map output
18/05/29 21:11:37 INFO mapred.MapTask: Finished spill 0
18/05/29 21:11:37 INFO mapred.Task: Task:attempt_local2023822132_0001_m_000001_0 is done. And is in the process of commiting
18/05/29 21:11:37 INFO mapred.LocalJobRunner: file:/usr/data/input/hadoop-policy.xml:0+4644
18/05/29 21:11:37 INFO mapred.Task: Task 'attempt_local2023822132_0001_m_000001_0' done.
18/05/29 21:11:37 INFO mapred.LocalJobRunner: Finishing task: attempt_local2023822132_0001_m_000001_0
18/05/29 21:11:37 INFO mapred.LocalJobRunner: Starting task: attempt_local2023822132_0001_m_000002_0
18/05/29 21:11:37 INFO mapred.Task: Using ResourceCalculatorPlugin : org.apache.hadoop.util.LinuxResourceCalculatorPlugin@6796c850
18/05/29 21:11:37 INFO mapred.MapTask: Processing split: file:/usr/data/input/mapred-queue-acls.xml:0+2033
18/05/29 21:11:37 INFO mapred.MapTask: numReduceTasks: 1
18/05/29 21:11:37 INFO mapred.MapTask: io.sort.mb = 100
18/05/29 21:11:37 INFO mapred.MapTask: data buffer = 79691776/99614720
18/05/29 21:11:37 INFO mapred.MapTask: record buffer = 262144/327680
18/05/29 21:11:37 INFO mapred.MapTask: Starting flush of map output
18/05/29 21:11:37 INFO mapred.Task: Task:attempt_local2023822132_0001_m_000002_0 is done. And is in the process of commiting
18/05/29 21:11:37 INFO mapred.LocalJobRunner: file:/usr/data/input/mapred-queue-acls.xml:0+2033
18/05/29 21:11:37 INFO mapred.Task: Task 'attempt_local2023822132_0001_m_000002_0' done.
18/05/29 21:11:37 INFO mapred.LocalJobRunner: Finishing task: attempt_local2023822132_0001_m_000002_0
18/05/29 21:11:37 INFO mapred.LocalJobRunner: Starting task: attempt_local2023822132_0001_m_000003_0
18/05/29 21:11:37 INFO mapred.Task: Using ResourceCalculatorPlugin : org.apache.hadoop.util.LinuxResourceCalculatorPlugin@16efbfc4
18/05/29 21:11:37 INFO mapred.MapTask: Processing split: file:/usr/data/input/fair-scheduler.xml:0+327
18/05/29 21:11:37 INFO mapred.MapTask: numReduceTasks: 1
18/05/29 21:11:37 INFO mapred.MapTask: io.sort.mb = 100
18/05/29 21:11:37 INFO mapred.MapTask: data buffer = 79691776/99614720
18/05/29 21:11:37 INFO mapred.MapTask: record buffer = 262144/327680
18/05/29 21:11:37 INFO mapred.MapTask: Starting flush of map output
18/05/29 21:11:37 INFO mapred.Task: Task:attempt_local2023822132_0001_m_000003_0 is done. And is in the process of commiting
18/05/29 21:11:37 INFO mapred.LocalJobRunner: file:/usr/data/input/fair-scheduler.xml:0+327
18/05/29 21:11:37 INFO mapred.Task: Task 'attempt_local2023822132_0001_m_000003_0' done.
18/05/29 21:11:37 INFO mapred.LocalJobRunner: Finishing task: attempt_local2023822132_0001_m_000003_0
18/05/29 21:11:37 INFO mapred.LocalJobRunner: Starting task: attempt_local2023822132_0001_m_000004_0
18/05/29 21:11:37 INFO mapred.Task: Using ResourceCalculatorPlugin : org.apache.hadoop.util.LinuxResourceCalculatorPlugin@493129d5
18/05/29 21:11:37 INFO mapred.MapTask: Processing split: file:/usr/data/input/core-site.xml:0+178
18/05/29 21:11:37 INFO mapred.MapTask: numReduceTasks: 1
18/05/29 21:11:37 INFO mapred.MapTask: io.sort.mb = 100
18/05/29 21:11:38 INFO mapred.MapTask: data buffer = 79691776/99614720
18/05/29 21:11:38 INFO mapred.MapTask: record buffer = 262144/327680
18/05/29 21:11:38 INFO mapred.MapTask: Starting flush of map output
18/05/29 21:11:38 INFO mapred.Task: Task:attempt_local2023822132_0001_m_000004_0 is done. And is in the process of commiting
18/05/29 21:11:38 INFO mapred.LocalJobRunner: file:/usr/data/input/core-site.xml:0+178
18/05/29 21:11:38 INFO mapred.Task: Task 'attempt_local2023822132_0001_m_000004_0' done.
18/05/29 21:11:38 INFO mapred.LocalJobRunner: Finishing task: attempt_local2023822132_0001_m_000004_0
18/05/29 21:11:38 INFO mapred.LocalJobRunner: Starting task: attempt_local2023822132_0001_m_000005_0
18/05/29 21:11:38 INFO mapred.Task: Using ResourceCalculatorPlugin : org.apache.hadoop.util.LinuxResourceCalculatorPlugin@17ce0e1f
18/05/29 21:11:38 INFO mapred.MapTask: Processing split: file:/usr/data/input/hdfs-site.xml:0+178
18/05/29 21:11:38 INFO mapred.MapTask: numReduceTasks: 1
18/05/29 21:11:38 INFO mapred.MapTask: io.sort.mb = 100
18/05/29 21:11:38 INFO mapred.JobClient: map 71% reduce 0%
18/05/29 21:11:38 INFO mapred.MapTask: data buffer = 79691776/99614720
18/05/29 21:11:38 INFO mapred.MapTask: record buffer = 262144/327680
18/05/29 21:11:38 INFO mapred.MapTask: Starting flush of map output
18/05/29 21:11:38 INFO mapred.Task: Task:attempt_local2023822132_0001_m_000005_0 is done. And is in the process of commiting
18/05/29 21:11:38 INFO mapred.LocalJobRunner: file:/usr/data/input/hdfs-site.xml:0+178
18/05/29 21:11:38 INFO mapred.Task: Task 'attempt_local2023822132_0001_m_000005_0' done.
18/05/29 21:11:38 INFO mapred.LocalJobRunner: Finishing task: attempt_local2023822132_0001_m_000005_0
18/05/29 21:11:38 INFO mapred.LocalJobRunner: Starting task: attempt_local2023822132_0001_m_000006_0
18/05/29 21:11:38 INFO mapred.Task: Using ResourceCalculatorPlugin : org.apache.hadoop.util.LinuxResourceCalculatorPlugin@4a78398e
18/05/29 21:11:38 INFO mapred.MapTask: Processing split: file:/usr/data/input/mapred-site.xml:0+178
18/05/29 21:11:38 INFO mapred.MapTask: numReduceTasks: 1
18/05/29 21:11:38 INFO mapred.MapTask: io.sort.mb = 100
18/05/29 21:11:38 INFO mapred.MapTask: data buffer = 79691776/99614720
18/05/29 21:11:38 INFO mapred.MapTask: record buffer = 262144/327680
18/05/29 21:11:38 INFO mapred.MapTask: Starting flush of map output
18/05/29 21:11:38 INFO mapred.Task: Task:attempt_local2023822132_0001_m_000006_0 is done. And is in the process of commiting
18/05/29 21:11:38 INFO mapred.LocalJobRunner: file:/usr/data/input/mapred-site.xml:0+178
18/05/29 21:11:38 INFO mapred.Task: Task 'attempt_local2023822132_0001_m_000006_0' done.
18/05/29 21:11:38 INFO mapred.LocalJobRunner: Finishing task: attempt_local2023822132_0001_m_000006_0
18/05/29 21:11:38 INFO mapred.LocalJobRunner: Map task executor complete.
18/05/29 21:11:38 INFO mapred.Task: Using ResourceCalculatorPlugin : org.apache.hadoop.util.LinuxResourceCalculatorPlugin@2a27f691
18/05/29 21:11:38 INFO mapred.LocalJobRunner:
18/05/29 21:11:38 INFO mapred.Merger: Merging 7 sorted segments
18/05/29 21:11:38 INFO mapred.Merger: Down to the last merge-pass, with 1 segments left of total size: 21 bytes
18/05/29 21:11:38 INFO mapred.LocalJobRunner:
18/05/29 21:11:38 INFO mapred.Task: Task:attempt_local2023822132_0001_r_000000_0 is done. And is in the process of commiting
18/05/29 21:11:38 INFO mapred.LocalJobRunner:
18/05/29 21:11:38 INFO mapred.Task: Task attempt_local2023822132_0001_r_000000_0 is allowed to commit now
18/05/29 21:11:38 INFO mapred.FileOutputCommitter: Saved output of task 'attempt_local2023822132_0001_r_000000_0' to file:/usr/hadoop/hadoop-1.2.1/grep-temp-1671205082
18/05/29 21:11:38 INFO mapred.LocalJobRunner: reduce > reduce
18/05/29 21:11:38 INFO mapred.Task: Task 'attempt_local2023822132_0001_r_000000_0' done.
18/05/29 21:11:39 INFO mapred.JobClient: map 100% reduce 100%
18/05/29 21:11:39 INFO mapred.JobClient: Job complete: job_local2023822132_0001
18/05/29 21:11:39 INFO mapred.JobClient: Counters: 21
18/05/29 21:11:39 INFO mapred.JobClient: Map-Reduce Framework
18/05/29 21:11:39 INFO mapred.JobClient: Spilled Records=2
18/05/29 21:11:39 INFO mapred.JobClient: Map output materialized bytes=61
18/05/29 21:11:39 INFO mapred.JobClient: Reduce input records=1
18/05/29 21:11:39 INFO mapred.JobClient: Virtual memory (bytes) snapshot=0
18/05/29 21:11:39 INFO mapred.JobClient: Map input records=369
18/05/29 21:11:39 INFO mapred.JobClient: SPLIT_RAW_BYTES=637
18/05/29 21:11:39 INFO mapred.JobClient: Map output bytes=17
18/05/29 21:11:39 INFO mapred.JobClient: Reduce shuffle bytes=0
18/05/29 21:11:39 INFO mapred.JobClient: Physical memory (bytes) snapshot=0
18/05/29 21:11:39 INFO mapred.JobClient: Map input bytes=14995
18/05/29 21:11:39 INFO mapred.JobClient: Reduce input groups=1
18/05/29 21:11:39 INFO mapred.JobClient: Combine output records=1
18/05/29 21:11:39 INFO mapred.JobClient: Reduce output records=1
18/05/29 21:11:39 INFO mapred.JobClient: Map output records=1
18/05/29 21:11:39 INFO mapred.JobClient: Combine input records=1
18/05/29 21:11:39 INFO mapred.JobClient: CPU time spent (ms)=0
18/05/29 21:11:39 INFO mapred.JobClient: Total committed heap usage (bytes)=1303212032
18/05/29 21:11:39 INFO mapred.JobClient: File Input Format Counters
18/05/29 21:11:39 INFO mapred.JobClient: Bytes Read=14995
18/05/29 21:11:39 INFO mapred.JobClient: FileSystemCounters
18/05/29 21:11:39 INFO mapred.JobClient: FILE_BYTES_WRITTEN=1570178
18/05/29 21:11:39 INFO mapred.JobClient: FILE_BYTES_READ=1272608
18/05/29 21:11:39 INFO mapred.JobClient: File Output Format Counters
18/05/29 21:11:39 INFO mapred.JobClient: Bytes Written=123
18/05/29 21:11:39 INFO mapred.FileInputFormat: Total input paths to process : 1
18/05/29 21:11:39 INFO mapred.JobClient: Running job: job_local251880250_0002
18/05/29 21:11:39 INFO mapred.LocalJobRunner: Waiting for map tasks
18/05/29 21:11:39 INFO mapred.LocalJobRunner: Starting task: attempt_local251880250_0002_m_000000_0
18/05/29 21:11:39 INFO mapred.Task: Using ResourceCalculatorPlugin : org.apache.hadoop.util.LinuxResourceCalculatorPlugin@50ab4b47
18/05/29 21:11:39 INFO mapred.MapTask: Processing split: file:/usr/hadoop/hadoop-1.2.1/grep-temp-1671205082/part-00000:0+111
18/05/29 21:11:39 INFO mapred.MapTask: numReduceTasks: 1
18/05/29 21:11:39 INFO mapred.MapTask: io.sort.mb = 100
18/05/29 21:11:39 INFO mapred.MapTask: data buffer = 79691776/99614720
18/05/29 21:11:39 INFO mapred.MapTask: record buffer = 262144/327680
18/05/29 21:11:39 INFO mapred.MapTask: Starting flush of map output
18/05/29 21:11:39 INFO mapred.MapTask: Finished spill 0
18/05/29 21:11:39 INFO mapred.Task: Task:attempt_local251880250_0002_m_000000_0 is done. And is in the process of commiting
18/05/29 21:11:39 INFO mapred.LocalJobRunner: file:/usr/hadoop/hadoop-1.2.1/grep-temp-1671205082/part-00000:0+111
18/05/29 21:11:39 INFO mapred.Task: Task 'attempt_local251880250_0002_m_000000_0' done.
18/05/29 21:11:39 INFO mapred.LocalJobRunner: Finishing task: attempt_local251880250_0002_m_000000_0
18/05/29 21:11:39 INFO mapred.LocalJobRunner: Map task executor complete.
18/05/29 21:11:39 INFO mapred.Task: Using ResourceCalculatorPlugin : org.apache.hadoop.util.LinuxResourceCalculatorPlugin@4d0a2b92
18/05/29 21:11:39 INFO mapred.LocalJobRunner:
18/05/29 21:11:39 INFO mapred.Merger: Merging 1 sorted segments
18/05/29 21:11:39 INFO mapred.Merger: Down to the last merge-pass, with 1 segments left of total size: 21 bytes
18/05/29 21:11:39 INFO mapred.LocalJobRunner:
18/05/29 21:11:39 INFO mapred.Task: Task:attempt_local251880250_0002_r_000000_0 is done. And is in the process of commiting
18/05/29 21:11:39 INFO mapred.LocalJobRunner:
18/05/29 21:11:39 INFO mapred.Task: Task attempt_local251880250_0002_r_000000_0 is allowed to commit now
18/05/29 21:11:39 INFO mapred.FileOutputCommitter: Saved output of task 'attempt_local251880250_0002_r_000000_0' to file:/usr/data/output
18/05/29 21:11:39 INFO mapred.LocalJobRunner: reduce > reduce
18/05/29 21:11:39 INFO mapred.Task: Task 'attempt_local251880250_0002_r_000000_0' done.
18/05/29 21:11:40 INFO mapred.JobClient: map 100% reduce 100%
18/05/29 21:11:40 INFO mapred.JobClient: Job complete: job_local251880250_0002
18/05/29 21:11:40 INFO mapred.JobClient: Counters: 21
18/05/29 21:11:40 INFO mapred.JobClient: Map-Reduce Framework
18/05/29 21:11:40 INFO mapred.JobClient: Spilled Records=2
18/05/29 21:11:40 INFO mapred.JobClient: Map output materialized bytes=25
18/05/29 21:11:40 INFO mapred.JobClient: Reduce input records=1
18/05/29 21:11:40 INFO mapred.JobClient: Virtual memory (bytes) snapshot=0
18/05/29 21:11:40 INFO mapred.JobClient: Map input records=1
18/05/29 21:11:40 INFO mapred.JobClient: SPLIT_RAW_BYTES=114
18/05/29 21:11:40 INFO mapred.JobClient: Map output bytes=17
18/05/29 21:11:40 INFO mapred.JobClient: Reduce shuffle bytes=0
18/05/29 21:11:40 INFO mapred.JobClient: Physical memory (bytes) snapshot=0
18/05/29 21:11:40 INFO mapred.JobClient: Map input bytes=25
18/05/29 21:11:40 INFO mapred.JobClient: Reduce input groups=1
18/05/29 21:11:40 INFO mapred.JobClient: Combine output records=0
18/05/29 21:11:40 INFO mapred.JobClient: Reduce output records=1
18/05/29 21:11:40 INFO mapred.JobClient: Map output records=1
18/05/29 21:11:40 INFO mapred.JobClient: Combine input records=0
18/05/29 21:11:40 INFO mapred.JobClient: CPU time spent (ms)=0
18/05/29 21:11:40 INFO mapred.JobClient: Total committed heap usage (bytes)=322510848
18/05/29 21:11:40 INFO mapred.JobClient: File Input Format Counters
18/05/29 21:11:40 INFO mapred.JobClient: Bytes Read=123
18/05/29 21:11:40 INFO mapred.JobClient: FileSystemCounters
18/05/29 21:11:40 INFO mapred.JobClient: FILE_BYTES_WRITTEN=782023
18/05/29 21:11:40 INFO mapred.JobClient: FILE_BYTES_READ=610105
18/05/29 21:11:40 INFO mapred.JobClient: File Output Format Counters
18/05/29 21:11:40 INFO mapred.JobClient: Bytes Written=23
You can inspect the result under /usr/data/output/.
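On the install machine the check is simply `cat /usr/data/output/part-00000`. The sketch below fakes a part file in a temp directory so it can run anywhere; the match string in it is illustrative. The real output has one TAB-separated "count<TAB>match" record per distinct match:

```shell
# Each reducer writes a part-NNNNN file; the grep example's records are
# "count<TAB>match". The file below is a stand-in with sample content.
outdir=$(mktemp -d)
printf '1\tdfs.server.namenode.\n' > "$outdir/part-00000"

cat "$outdir"/part-00000   # on the real machine: cat /usr/data/output/part-00000
rm -r "$outdir"
```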
This completes the standalone deployment.
Pseudo-distributed mode installation and deployment