


---- Parallel computing framework model

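Hadoop MapReduce is a software framework that makes it easy to write applications which run on large clusters of thousands of commodity machines and process multi-terabyte data sets in parallel, in a reliable and fault-tolerant way. This definition contains the following key terms: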


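  • Mapper handles the "divide" step: it breaks a complex task into a number of "simple tasks". A "simple task" has three properties:

    • the scale of its data or computation is much smaller than that of the original task;
    • computation is placed close to the data, i.e. a task is assigned to a node that stores the data it needs;
    • the small tasks can run in parallel, with almost no dependencies between them.
  • Reducer aggregates the results of the map stage.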
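
The division of work between the two phases can be illustrated without a cluster. The following is a minimal plain-Java sketch, not Hadoop code: the class `LocalWordCount` and its method names are hypothetical and chosen only for illustration. `map` emits a (word, 1) pair for every word in a line, the pairs are grouped by key, and `reduce` sums each group.

```java
import java.util.AbstractMap;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical illustration only: simulates map -> group by key -> reduce in one JVM.
final class LocalWordCount {

    // "Map" phase: one input line becomes a list of (word, 1) pairs.
    static List<Map.Entry<String, Integer>> map(String line) {
        List<Map.Entry<String, Integer>> pairs = new ArrayList<>();
        for (String word : line.trim().split("\\s+")) {
            if (!word.isEmpty()) {
                pairs.add(new AbstractMap.SimpleEntry<>(word, 1));
            }
        }
        return pairs;
    }

    // "Reduce" phase: all values collected for one key are summed.
    static int reduce(List<Integer> values) {
        int sum = 0;
        for (int v : values) {
            sum += v;
        }
        return sum;
    }

    static Map<String, Integer> run(List<String> lines) {
        // Grouping step between the phases: collect the values emitted for each key.
        Map<String, List<Integer>> grouped = new TreeMap<>();
        for (String line : lines) {
            for (Map.Entry<String, Integer> pair : map(line)) {
                grouped.computeIfAbsent(pair.getKey(), k -> new ArrayList<>()).add(pair.getValue());
            }
        }
        // One reduce call per distinct key.
        Map<String, Integer> result = new TreeMap<>();
        for (Map.Entry<String, List<Integer>> e : grouped.entrySet()) {
            result.put(e.getKey(), reduce(e.getValue()));
        }
        return result;
    }

    public static void main(String[] args) {
        System.out.println(run(Arrays.asList("hello world", "hello hadoop")));
    }
}
```

In a real job each `map` call could run on a different node and the grouping would happen over the network, but the contract between the two phases is the same as in this single-process version.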




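Apache Hadoop YARN (Yet Another Resource Negotiator) is the newer Hadoop resource manager. It is a general-purpose resource management system that provides unified resource management and scheduling for the applications running on top of it, and its introduction brought major benefits to clusters in utilization, unified resource management, and data sharing. HBase, Hive, Spark on YARN, and MapReduce can all run on this framework.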


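  • ResourceManager: manages the cluster and schedules its resources; it receives the reports sent by the NodeManagers and monitors them.

  • NodeManager: the per-machine framework agent. It is responsible for containers, monitors their resource usage (CPU, memory, disk, network), and reports it to the ResourceManager / Scheduler.

  • App Master: monitors the tasks and handles failover while a computation runs. Each job has exactly one, and it manages that single MR job.

  • **Container:** a container for a computing process (a packaged set of computing resources). The default size is 1 GB.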
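
To make the 1 GB container size concrete, here is a minimal sketch in plain Java arithmetic, not the YARN API. It assumes that the scheduler rounds each memory request up to a multiple of the minimum container size, taken here to be the 1024 MB default stated above. The class name, method name, and example requests are hypothetical.

```java
// Hypothetical illustration only: plain arithmetic, not the YARN API.
final class ContainerSizeSketch {

    // Assumed minimum container size in MB (the 1 GB default mentioned above).
    static final int MIN_ALLOCATION_MB = 1024;

    // Round a memory request up to the next multiple of the minimum allocation.
    static int normalize(int requestedMb) {
        int atLeastMin = Math.max(requestedMb, MIN_ALLOCATION_MB);
        return ((atLeastMin + MIN_ALLOCATION_MB - 1) / MIN_ALLOCATION_MB) * MIN_ALLOCATION_MB;
    }

    public static void main(String[] args) {
        for (int request : new int[] {512, 1024, 1536}) {
            System.out.println(request + " MB requested -> " + normalize(request) + " MB container");
        }
    }
}
```

Under this assumption a request smaller than 1 GB still occupies a full 1 GB container, which is worth keeping in mind when estimating how many tasks fit on one node.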




1.run job

2.get new application

3.copy job resource

4.submit job

5.init container

6.init mrappmaster

7.retrieve input splits

8.allocate resource

9.init container (compute container)

10.retrieve job resource (fetch the job resources: code, configuration, data)

11.run the map task or the reduce task


![](assets\Yarn 计算.png)



  • Edit etc/hadoop/mapred-site.xml
    [root@node1 hadoop-2.6.0]# mv etc/hadoop/mapred-site.xml.template etc/hadoop/mapred-site.xml
  • Edit etc/hadoop/yarn-site.xml
  • Start the services


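    # Format the NameNode (only before the first start; formatting erases existing HDFS metadata)
    [root@hadoop ~]# hdfs namenode -format
    # Start HDFS
    [root@hadoop ~]# start-dfs.sh
    # Start YARN
    [root@hadoop hadoop-2.6.0]# start-yarn.sh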


  • Maven dependencies



package com.baizhi.yarn;

import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Mapper;

import java.io.IOException;

// Emits (word, 1) for every space-separated word of each input line.
public class MyMapper extends Mapper<LongWritable,Text,Text,IntWritable>{
    @Override
    protected void map(LongWritable key, Text value, Context context) throws IOException, InterruptedException {
        // key is the byte offset of the line in the file, value is the line itself
        String[] str = value.toString().split(" ");
        for (String s : str) {
            context.write(new Text(s),new IntWritable(1));
        }
    }
}
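
One detail of the mapper is worth checking: `split(" ")` splits on every single space, so consecutive spaces produce empty tokens that would then be counted as words. The small plain-Java check below compares it with trimming the line and splitting on runs of whitespace. The class and method names are hypothetical, and the second variant is offered as a possible alternative, not as the approach used in these notes.

```java
import java.util.Arrays;

// Hypothetical illustration only: compares two ways of tokenizing a line.
final class SplitCheck {

    // Same call as in the mapper: every single space is a separator.
    static String[] bySingleSpace(String line) {
        return line.split(" ");
    }

    // Alternative: drop leading/trailing whitespace, then split on runs of whitespace.
    static String[] byWhitespaceRuns(String line) {
        return line.trim().split("\\s+");
    }

    public static void main(String[] args) {
        String line = "hello  world"; // two spaces between the words
        System.out.println(Arrays.toString(bySingleSpace(line)));
        System.out.println(Arrays.toString(byWhitespaceRuns(line)));
    }
}
```

If the input is known to contain exactly one space between words, the original call is sufficient.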


package com.baizhi.yarn;

import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Reducer;

import java.io.IOException;

// Sums the counts emitted by the mappers for one word.
public class MyReduce extends Reducer<Text,IntWritable,Text,IntWritable> {
    @Override
    protected void reduce(Text key, Iterable<IntWritable> values, Context context) throws IOException, InterruptedException {
        int sum = 0;
        for (IntWritable value : values) {
            // accumulate, do not overwrite: each value is one occurrence of the word
            sum += value.get();
        }
        context.write(key,new IntWritable(sum));
    }
}


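package com.baizhi.yarn;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.lib.input.TextInputFormat;
import org.apache.hadoop.mapreduce.lib.output.TextOutputFormat;

import java.io.IOException;

public class InitMR {
    public static void main(String[] args) throws IOException, ClassNotFoundException, InterruptedException {
        // 1. Initialize the MR job object
        Configuration configuration = new Configuration();
        Job job = Job.getInstance(configuration, "Word COUNT");
        // Lets Hadoop locate the jar that contains the job classes when run through `hadoop jar`
        job.setJarByClass(InitMR.class);

        // 2. Set the input and output formats
        // inputFormat decides how the data set is split and how each split is read
        // outputFormat decides how the results are written
        job.setInputFormatClass(TextInputFormat.class);
        job.setOutputFormatClass(TextOutputFormat.class);

        // 3. Set the source of the data set and the destination of the results
        // Method 1: run on the virtual machine (packaged as a jar)
        /*TextInputFormat.addInputPath(job,new Path("hdfs://hadoop:9000/WordCount.txt"));
        TextOutputFormat.setOutputPath(job,new Path("hdfs://hadoop:9000/result1"));*/

        // Methods 2 and 3: run from the main method in IDEA
        // MR test method 2: local computation (using the local Hadoop) + local files
        /*TextInputFormat.addInputPath(job,new Path("file:///E://WordCount.txt"));
        TextOutputFormat.setOutputPath(job,new Path("file:///E://result"));*/
        // MR test method 3: local computation + remote HDFS files
        TextInputFormat.addInputPath(job,new Path("hdfs://hadoop:9000/WordCount.txt"));
        TextOutputFormat.setOutputPath(job,new Path("hdfs://hadoop:9000/result2"));

        // 4. Set the keyOut and valueOut data types
        job.setMapOutputKeyClass(Text.class);
        job.setMapOutputValueClass(IntWritable.class);
        job.setOutputKeyClass(Text.class);
        job.setOutputValueClass(IntWritable.class);

        // 5. Other settings
        // Set the implementation classes of the Map task and the Reduce task
        job.setMapperClass(MyMapper.class);
        job.setReducerClass(MyReduce.class);

        // 6. Submit the MR job and wait for it to finish
        job.waitForCompletion(true);
    }
}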
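
The comment in step 2 says that the input format decides how the data set is split, and the number of splits determines the number of map tasks. The following is a minimal sketch of that calculation in plain Java with hypothetical names. It assumes the rule used by Hadoop's `FileInputFormat`, namely split size = max(minSize, min(maxSize, blockSize)) with a trailing remainder of up to 10% folded into the last split, and it assumes a 128 MB block size. It covers non-empty files only.

```java
// Hypothetical illustration only: plain arithmetic, no Hadoop classes involved.
final class SplitSketch {

    // Assumed tolerance: the last split may be up to 10% larger than the split size.
    static final double SPLIT_SLOP = 1.1;

    // Split size is the block size clamped between the configured minimum and maximum.
    static long computeSplitSize(long blockSize, long minSize, long maxSize) {
        return Math.max(minSize, Math.min(maxSize, blockSize));
    }

    // Number of splits (and therefore map tasks) for one non-empty file.
    static int countSplits(long fileLength, long splitSize) {
        int splits = 0;
        long remaining = fileLength;
        while ((double) remaining / splitSize > SPLIT_SLOP) {
            splits++;
            remaining -= splitSize;
        }
        if (remaining != 0) {
            splits++; // the final, possibly shorter or slightly longer, split
        }
        return splits;
    }

    public static void main(String[] args) {
        long mb = 1024L * 1024L;
        long splitSize = computeSplitSize(128 * mb, 1L, Long.MAX_VALUE);
        System.out.println("300 MB file -> " + countSplits(300 * mb, splitSize) + " splits");
        System.out.println("130 MB file -> " + countSplits(130 * mb, splitSize) + " splits");
    }
}
```

Under these assumptions a 130 MB file is read by a single map task, because the 2 MB beyond the block size falls within the 10% tolerance.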




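   Package the code as a jar, copy it to the virtual machine, and run it there
 //3. Set the source of the data set and the destination of the results
        // Method 1: run on the virtual machine
        TextInputFormat.addInputPath(job,new Path("hdfs://hadoop:9000/WordCount.txt"));
        TextOutputFormat.setOutputPath(job,new Path("hdfs://hadoop:9000/result1"));
# bin/hadoop jar <path to the jar> <fully qualified main class>
[root@node1 hadoop-2.6.0]# bin/hadoop jar /mr_demo-1.0-SNAPSHOT.jar com.baizhi.yarn.InitMR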

  • 2. Run from the main method
    • Local computation + local files

      Add the following code to the InitMR.java class

            // MR test method 2: local computation (using the local Hadoop) + local files
            TextInputFormat.addInputPath(job,new Path("file:///E://WordCount.txt"));
            TextOutputFormat.setOutputPath(job,new Path("file:///E://result"));


      In the project, create the package org.apache.hadoop.io.nativeio and, inside it, a NativeIO class that replaces Hadoop's own NativeIO class.
      At line 279 of the NativeIO source, change return access0(path, desiredAccess.accessRight()); to return true;

      // Source code recreated from a .class file by IntelliJ IDEA
      // (powered by Fernflower decompiler)
      package org.apache.hadoop.io.nativeio;
      import com.google.common.annotations.VisibleForTesting;
      import java.io.Closeable;
      import java.io.File;
      import java.io.FileDescriptor;
      import java.io.FileInputStream;
      import java.io.FileOutputStream;
      import java.io.IOException;
      import java.io.RandomAccessFile;
      import java.lang.reflect.Field;
      import java.nio.ByteBuffer;
      import java.nio.MappedByteBuffer;
      import java.nio.channels.FileChannel;
      import java.util.Map;
      import java.util.concurrent.ConcurrentHashMap;
      import org.apache.commons.logging.Log;
      import org.apache.commons.logging.LogFactory;
      import org.apache.hadoop.classification.InterfaceAudience.Private;
      import org.apache.hadoop.classification.InterfaceStability.Unstable;
      import org.apache.hadoop.conf.Configuration;
      import org.apache.hadoop.fs.HardLink;
      import org.apache.hadoop.io.IOUtils;
      import org.apache.hadoop.io.SecureIOUtils.AlreadyExistsException;
      import org.apache.hadoop.util.NativeCodeLoader;
      import org.apache.hadoop.util.PerformanceAdvisory;
      import org.apache.hadoop.util.Shell;
      import sun.misc.Cleaner;
      import sun.misc.Unsafe;
      import sun.nio.ch.DirectBuffer;
      public class NativeIO {
          private static boolean workaroundNonThreadSafePasswdCalls = false;
          private static final Log LOG = LogFactory.getLog(NativeIO.class);
          private static boolean nativeLoaded = false;
          private static final Map<Long, NativeIO.CachedUid> uidCache;
          private static long cacheTimeout;
          private static boolean initialized;
          public NativeIO() {
          public static boolean isAvailable() {
              return NativeCodeLoader.isNativeCodeLoaded() && nativeLoaded;
          private static native void initNative();
          static long getMemlockLimit() {
              return isAvailable() ? getMemlockLimit0() : 0L;
          private static native long getMemlockLimit0();
          static long getOperatingSystemPageSize() {
              try {
                  Field f = Unsafe.class.getDeclaredField("theUnsafe");
                  Unsafe unsafe = (Unsafe)f.get((Object)null);
                  return (long)unsafe.pageSize();
              } catch (Throwable var2) {
                  LOG.warn("Unable to get operating system page size.  Guessing 4096.", var2);
                  return 4096L;
          private static String stripDomain(String name) {
              int i = name.indexOf(92);
              if (i != -1) {
                  name = name.substring(i + 1);
              return name;
          public static String getOwner(FileDescriptor fd) throws IOException {
              if (Shell.WINDOWS) {
                  String owner = NativeIO.Windows.getOwner(fd);
                  owner = stripDomain(owner);
                  return owner;
              } else {
                  long uid = NativeIO.POSIX.getUIDforFDOwnerforOwner(fd);
                  NativeIO.CachedUid cUid = (NativeIO.CachedUid)uidCache.get(uid);
                  long now = System.currentTimeMillis();
                  if (cUid != null && cUid.timestamp + cacheTimeout > now) {
                      return cUid.username;
                  } else {
                      String user = NativeIO.POSIX.getUserName(uid);
                      LOG.info("Got UserName " + user + " for UID " + uid + " from the native implementation");
                      cUid = new NativeIO.CachedUid(user, now);
                      uidCache.put(uid, cUid);
                      return user;
          public static FileInputStream getShareDeleteFileInputStream(File f) throws IOException {
              if (!Shell.WINDOWS) {
                  return new FileInputStream(f);
              } else {
                  FileDescriptor fd = NativeIO.Windows.createFile(f.getAbsolutePath(), 2147483648L, 7L, 3L);
                  return new FileInputStream(fd);
          public static FileInputStream getShareDeleteFileInputStream(File f, long seekOffset) throws IOException {
              if (!Shell.WINDOWS) {
                  RandomAccessFile rf = new RandomAccessFile(f, "r");
                  if (seekOffset > 0L) {
                  return new FileInputStream(rf.getFD());
              } else {
                  FileDescriptor fd = NativeIO.Windows.createFile(f.getAbsolutePath(), 2147483648L, 7L, 3L);
                  if (seekOffset > 0L) {
                      NativeIO.Windows.setFilePointer(fd, seekOffset, 0L);
                  return new FileInputStream(fd);
          public static FileOutputStream getCreateForWriteFileOutputStream(File f, int permissions) throws IOException {
              FileDescriptor fd;
              if (!Shell.WINDOWS) {
                  try {
                      fd = NativeIO.POSIX.open(f.getAbsolutePath(), 193, permissions);
                      return new FileOutputStream(fd);
                  } catch (NativeIOException var3) {
                      if (var3.getErrno() == Errno.EEXIST) {
                          throw new AlreadyExistsException(var3);
                      } else {
                          throw var3;
              } else {
                  try {
                      fd = NativeIO.Windows.createFile(f.getCanonicalPath(), 1073741824L, 7L, 1L);
                      NativeIO.POSIX.chmod(f.getCanonicalPath(), permissions);
                      return new FileOutputStream(fd);
                  } catch (NativeIOException var4) {
                      if (var4.getErrorCode() == 80L) {
                          throw new AlreadyExistsException(var4);
                      } else {
                          throw var4;
          private static synchronized void ensureInitialized() {
              if (!initialized) {
                  cacheTimeout = (new Configuration()).getLong("hadoop.security.uid.cache.secs", 14400L) * 1000L;
                  LOG.info("Initialized cache for UID to User mapping with a cache timeout of " + cacheTimeout / 1000L + " seconds.");
                  initialized = true;
          public static void renameTo(File src, File dst) throws IOException {
              if (!nativeLoaded) {
                  if (!src.renameTo(dst)) {
                      throw new IOException("renameTo(src=" + src + ", dst=" + dst + ") failed.");
              } else {
                  renameTo0(src.getAbsolutePath(), dst.getAbsolutePath());
          public static void link(File src, File dst) throws IOException {
              if (!nativeLoaded) {
                  HardLink.createHardLink(src, dst);
              } else {
                  link0(src.getAbsolutePath(), dst.getAbsolutePath());
          private static native void renameTo0(String var0, String var1) throws NativeIOException;
          private static native void link0(String var0, String var1) throws NativeIOException;
          public static void copyFileUnbuffered(File src, File dst) throws IOException {
              if (nativeLoaded && Shell.WINDOWS) {
                  copyFileUnbuffered0(src.getAbsolutePath(), dst.getAbsolutePath());
              } else {
                  FileInputStream fis = null;
                  FileOutputStream fos = null;
                  FileChannel input = null;
                  FileChannel output = null;
                  try {
                      fis = new FileInputStream(src);
                      fos = new FileOutputStream(dst);
                      input = fis.getChannel();
                      output = fos.getChannel();
                      long remaining = input.size();
                      long position = 0L;
                      for(long transferred = 0L; remaining > 0L; position += transferred) {
                          transferred = input.transferTo(position, remaining, output);
                          remaining -= transferred;
                  } finally {
                      IOUtils.cleanup(LOG, new Closeable[]{output});
                      IOUtils.cleanup(LOG, new Closeable[]{fos});
                      IOUtils.cleanup(LOG, new Closeable[]{input});
                      IOUtils.cleanup(LOG, new Closeable[]{fis});
          private static native void copyFileUnbuffered0(String var0, String var1) throws NativeIOException;
          static {
              if (NativeCodeLoader.isNativeCodeLoaded()) {
                  try {
                      nativeLoaded = true;
                  } catch (Throwable var1) {
                      PerformanceAdvisory.LOG.debug("Unable to initialize NativeIO libraries", var1);
              uidCache = new ConcurrentHashMap();
              initialized = false;
          private static class CachedUid {
              final long timestamp;
              final String username;
              public CachedUid(String username, long timestamp) {
                  this.timestamp = timestamp;
                  this.username = username;
          public static class Windows {
              public static final long GENERIC_READ = 2147483648L;
              public static final long GENERIC_WRITE = 1073741824L;
              public static final long FILE_SHARE_READ = 1L;
              public static final long FILE_SHARE_WRITE = 2L;
              public static final long FILE_SHARE_DELETE = 4L;
              public static final long CREATE_NEW = 1L;
              public static final long CREATE_ALWAYS = 2L;
              public static final long OPEN_EXISTING = 3L;
              public static final long OPEN_ALWAYS = 4L;
              public static final long TRUNCATE_EXISTING = 5L;
              public static final long FILE_BEGIN = 0L;
              public static final long FILE_CURRENT = 1L;
              public static final long FILE_END = 2L;
              public static final long FILE_ATTRIBUTE_NORMAL = 128L;
              public Windows() {
              public static native FileDescriptor createFile(String var0, long var1, long var3, long var5) throws IOException;
              public static native long setFilePointer(FileDescriptor var0, long var1, long var3) throws IOException;
              private static native String getOwner(FileDescriptor var0) throws IOException;
              private static native boolean access0(String var0, int var1);
              public static boolean access(String path, NativeIO.Windows.AccessRight desiredAccess) throws IOException {
                  // Works around an error in the Hadoop source: always report access as granted
                  return true;
              public static native void extendWorkingSetSize(long var0) throws IOException;
              static {
                  if (NativeCodeLoader.isNativeCodeLoaded()) {
                      try {
                          NativeIO.nativeLoaded = true;
                      } catch (Throwable var1) {
                          PerformanceAdvisory.LOG.debug("Unable to initialize NativeIO libraries", var1);
              public static enum AccessRight {
                  private final int accessRight;
                  private AccessRight(int access) {
                      this.accessRight = access;
                  public int accessRight() {
                      return this.accessRight;
          public static class POSIX {
              public static final int O_RDONLY = 0;
              public static final int O_WRONLY = 1;
              public static final int O_RDWR = 2;
              public static final int O_CREAT = 64;
              public static final int O_EXCL = 128;
              public static final int O_NOCTTY = 256;
              public static final int O_TRUNC = 512;
              public static final int O_APPEND = 1024;
              public static final int O_NONBLOCK = 2048;
              public static final int O_SYNC = 4096;
              public static final int O_ASYNC = 8192;
              public static final int O_FSYNC = 4096;
              public static final int O_NDELAY = 2048;
              public static final int POSIX_FADV_NORMAL = 0;
              public static final int POSIX_FADV_RANDOM = 1;
              public static final int POSIX_FADV_SEQUENTIAL = 2;
              public static final int POSIX_FADV_WILLNEED = 3;
              public static final int POSIX_FADV_DONTNEED = 4;
              public static final int POSIX_FADV_NOREUSE = 5;
              public static final int SYNC_FILE_RANGE_WAIT_BEFORE = 1;
              public static final int SYNC_FILE_RANGE_WRITE = 2;
              public static final int SYNC_FILE_RANGE_WAIT_AFTER = 4;
              private static final Log LOG = LogFactory.getLog(NativeIO.class);
              private static boolean nativeLoaded = false;
              private static boolean fadvisePossible = true;
              private static boolean syncFileRangePossible = true;
              static final String WORKAROUND_NON_THREADSAFE_CALLS_KEY = "hadoop.workaround.non.threadsafe.getpwuid";
              static final boolean WORKAROUND_NON_THREADSAFE_CALLS_DEFAULT = true;
              private static long cacheTimeout = -1L;
              private static NativeIO.POSIX.CacheManipulator cacheManipulator = new NativeIO.POSIX.CacheManipulator();
              private static final Map<Integer, NativeIO.POSIX.CachedName> USER_ID_NAME_CACHE;
              private static final Map<Integer, NativeIO.POSIX.CachedName> GROUP_ID_NAME_CACHE;
              public static final int MMAP_PROT_READ = 1;
              public static final int MMAP_PROT_WRITE = 2;
              public static final int MMAP_PROT_EXEC = 4;
              public POSIX() {
              public static NativeIO.POSIX.CacheManipulator getCacheManipulator() {
                  return cacheManipulator;
              public static void setCacheManipulator(NativeIO.POSIX.CacheManipulator cacheManipulator) {
                  cacheManipulator = cacheManipulator;
              public static boolean isAvailable() {
                  return NativeCodeLoader.isNativeCodeLoaded() && nativeLoaded;
              private static void assertCodeLoaded() throws IOException {
                  if (!isAvailable()) {
                      throw new IOException("NativeIO was not loaded");
              public static native FileDescriptor open(String var0, int var1, int var2) throws IOException;
              private static native NativeIO.POSIX.Stat fstat(FileDescriptor var0) throws IOException;
              private static native void chmodImpl(String var0, int var1) throws IOException;
              public static void chmod(String path, int mode) throws IOException {
                  if (!Shell.WINDOWS) {
                      chmodImpl(path, mode);
                  } else {
                      try {
                          chmodImpl(path, mode);
                      } catch (NativeIOException var3) {
                          if (var3.getErrorCode() == 3L) {
                              throw new NativeIOException("No such file or directory", Errno.ENOENT);
                          LOG.warn(String.format("NativeIO.chmod error (%d): %s", var3.getErrorCode(), var3.getMessage()));
                          throw new NativeIOException("Unknown error", Errno.UNKNOWN);
              static native void posix_fadvise(FileDescriptor var0, long var1, long var3, int var5) throws NativeIOException;
              static native void sync_file_range(FileDescriptor var0, long var1, long var3, int var5) throws NativeIOException;
              static void posixFadviseIfPossible(String identifier, FileDescriptor fd, long offset, long len, int flags) throws NativeIOException {
                  if (nativeLoaded && fadvisePossible) {
                      try {
                          posix_fadvise(fd, offset, len, flags);
                      } catch (UnsupportedOperationException var8) {
                          fadvisePossible = false;
                      } catch (UnsatisfiedLinkError var9) {
                          fadvisePossible = false;
              public static void syncFileRangeIfPossible(FileDescriptor fd, long offset, long nbytes, int flags) throws NativeIOException {
                  if (nativeLoaded && syncFileRangePossible) {
                      try {
                          sync_file_range(fd, offset, nbytes, flags);
                      } catch (UnsupportedOperationException var7) {
                          syncFileRangePossible = false;
                      } catch (UnsatisfiedLinkError var8) {
                          syncFileRangePossible = false;
              static native void mlock_native(ByteBuffer var0, long var1) throws NativeIOException;
              static void mlock(ByteBuffer buffer, long len) throws IOException {
                  if (!buffer.isDirect()) {
                      throw new IOException("Cannot mlock a non-direct ByteBuffer");
                  } else {
                      mlock_native(buffer, len);
              public static void munmap(MappedByteBuffer buffer) {
                  if (buffer instanceof DirectBuffer) {
                      Cleaner cleaner = ((DirectBuffer)buffer).cleaner();
              private static native long getUIDforFDOwnerforOwner(FileDescriptor var0) throws IOException;
              private static native String getUserName(long var0) throws IOException;
              public static NativeIO.POSIX.Stat getFstat(FileDescriptor fd) throws IOException {
                  NativeIO.POSIX.Stat stat = null;
                  if (!Shell.WINDOWS) {
                      stat = fstat(fd);
                      stat.owner = getName(NativeIO.POSIX.IdCache.USER, stat.ownerId);
                      stat.group = getName(NativeIO.POSIX.IdCache.GROUP, stat.groupId);
                  } else {
                      try {
                          stat = fstat(fd);
                      } catch (NativeIOException var3) {
                          if (var3.getErrorCode() == 6L) {
                              throw new NativeIOException("The handle is invalid.", Errno.EBADF);
                          LOG.warn(String.format("NativeIO.getFstat error (%d): %s", var3.getErrorCode(), var3.getMessage()));
                          throw new NativeIOException("Unknown error", Errno.UNKNOWN);
                  return stat;
              private static String getName(NativeIO.POSIX.IdCache domain, int id) throws IOException {
                  Map<Integer, NativeIO.POSIX.CachedName> idNameCache = domain == NativeIO.POSIX.IdCache.USER ? USER_ID_NAME_CACHE : GROUP_ID_NAME_CACHE;
                  NativeIO.POSIX.CachedName cachedName = (NativeIO.POSIX.CachedName)idNameCache.get(id);
                  long now = System.currentTimeMillis();
                  String name;
                  if (cachedName != null && cachedName.timestamp + cacheTimeout > now) {
                      name = cachedName.name;
                  } else {
                      name = domain == NativeIO.POSIX.IdCache.USER ? getUserName(id) : getGroupName(id);
                      if (LOG.isDebugEnabled()) {
                          String type = domain == NativeIO.POSIX.IdCache.USER ? "UserName" : "GroupName";
                          LOG.debug("Got " + type + " " + name + " for ID " + id + " from the native implementation");
                      cachedName = new NativeIO.POSIX.CachedName(name, now);
                      idNameCache.put(id, cachedName);
                  return name;
              static native String getUserName(int var0) throws IOException;
              static native String getGroupName(int var0) throws IOException;
              public static native long mmap(FileDescriptor var0, int var1, boolean var2, long var3) throws IOException;
              public static native void munmap(long var0, long var2) throws IOException;
              static {
                  if (NativeCodeLoader.isNativeCodeLoaded()) {
                      try {
                          Configuration conf = new Configuration();
                          NativeIO.workaroundNonThreadSafePasswdCalls = conf.getBoolean("hadoop.workaround.non.threadsafe.getpwuid", true);
                          nativeLoaded = true;
                          cacheTimeout = conf.getLong("hadoop.security.uid.cache.secs", 14400L) * 1000L;
                          LOG.debug("Initialized cache for IDs to User/Group mapping with a  cache timeout of " + cacheTimeout / 1000L + " seconds.");
                      } catch (Throwable var1) {
                          PerformanceAdvisory.LOG.debug("Unable to initialize NativeIO libraries", var1);
                  USER_ID_NAME_CACHE = new ConcurrentHashMap();
                  GROUP_ID_NAME_CACHE = new ConcurrentHashMap();
              private static enum IdCache {
                  private IdCache() {
              private static class CachedName {
                  final long timestamp;
                  final String name;
                  public CachedName(String name, long timestamp) {
                      this.name = name;
                      this.timestamp = timestamp;
              public static class Stat {
                  private int ownerId;
                  private int groupId;
                  private String owner;
                  private String group;
                  private int mode;
                  public static final int S_IFMT = 61440;
                  public static final int S_IFIFO = 4096;
                  public static final int S_IFCHR = 8192;
                  public static final int S_IFDIR = 16384;
                  public static final int S_IFBLK = 24576;
                  public static final int S_IFREG = 32768;
                  public static final int S_IFLNK = 40960;
                  public static final int S_IFSOCK = 49152;
                  public static final int S_IFWHT = 57344;
                  public static final int S_ISUID = 2048;
                  public static final int S_ISGID = 1024;
                  public static final int S_ISVTX = 512;
                  public static final int S_IRUSR = 256;
                  public static final int S_IWUSR = 128;
                  public static final int S_IXUSR = 64;
                  Stat(int ownerId, int groupId, int mode) {
                      this.ownerId = ownerId;
                      this.groupId = groupId;
                      this.mode = mode;
                  Stat(String owner, String group, int mode) {
                      if (!Shell.WINDOWS) {
                          this.owner = owner;
                      } else {
                          this.owner = NativeIO.stripDomain(owner);
                      if (!Shell.WINDOWS) {
                          this.group = group;
                      } else {
                          this.group = NativeIO.stripDomain(group);
                      this.mode = mode;
                  public String toString() {
                      return "Stat(owner='" + this.owner + "', group='" + this.group + "'" + ", mode=" + this.mode + ")";
                  public String getOwner() {
                      return this.owner;
                  public String getGroup() {
                      return this.group;
                  public int getMode() {
                      return this.mode;
              public static class NoMlockCacheManipulator extends NativeIO.POSIX.CacheManipulator {
                  public NoMlockCacheManipulator() {
                  public void mlock(String identifier, ByteBuffer buffer, long len) throws IOException {
                      NativeIO.POSIX.LOG.info("mlocking " + identifier);
                  public long getMemlockLimit() {
                      return 1125899906842624L;
                  public long getOperatingSystemPageSize() {
                      return 4096L;
                  public boolean verifyCanMlock() {
                      return true;
              public static class CacheManipulator {
                  public CacheManipulator() {
                  public void mlock(String identifier, ByteBuffer buffer, long len) throws IOException {
                      NativeIO.POSIX.mlock(buffer, len);
                  public long getMemlockLimit() {
                      return NativeIO.getMemlockLimit();
                  public long getOperatingSystemPageSize() {
                      return NativeIO.getOperatingSystemPageSize();
                  public void posixFadviseIfPossible(String identifier, FileDescriptor fd, long offset, long len, int flags) throws NativeIOException {
                      NativeIO.POSIX.posixFadviseIfPossible(identifier, fd, offset, len, flags);
                  public boolean verifyCanMlock() {
                      return NativeIO.isAvailable();


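    • Local computation + remote HDFS files

      Add the following code to the InitMR.java class

              // MR test method 3: local computation + remote HDFS files
              TextInputFormat.addInputPath(job,new Path("hdfs://hadoop:9000/WordCount.txt"));
              TextOutputFormat.setOutputPath(job,new Path("hdfs://hadoop:9000/result2"));


      • Disable HDFS permission checking
        • Edit hdfs-site.xml
        • Or specify a JVM argument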
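
The notes do not name the JVM argument. One common approach, stated here as an assumption, is to have the client identify itself as a user that owns the target HDFS directories through `HADOOP_USER_NAME`, either with `-DHADOOP_USER_NAME=root` on the command line or programmatically before the job is created. For the hdfs-site.xml route, the property usually involved is `dfs.permissions.enabled` set to `false`. In the sketch below the user name `root` and the class name are hypothetical.

```java
// Hypothetical illustration only: sets the user name a Hadoop client would pick up.
final class HdfsUserSketch {

    // Assumed property name read by the Hadoop client when security is not enabled.
    static final String USER_PROPERTY = "HADOOP_USER_NAME";

    // Must be called before any Hadoop FileSystem or Job object is created.
    static void actAs(String user) {
        System.setProperty(USER_PROPERTY, user);
    }

    public static void main(String[] args) {
        actAs("root"); // example user; use the owner of the HDFS paths being written
        System.out.println(USER_PROPERTY + "=" + System.getProperty(USER_PROPERTY));
    }
}
```

Setting the user name leaves permission checking switched on for everyone else, so it is the less intrusive of the two options on a shared cluster.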
  • Remote computation + remote HDFS files

  • Add the following code to the InitMR.java class, package the project as a jar, and run the main method

public static void main(String[] args) throws Exception {
    Configuration conf = new Configuration();
    conf.set("fs.defaultFS", "hdfs://hadoop:9000/");
    conf.set("mapreduce.job.jar", "file:///E:\\训练营备课
    conf.set("mapreduce.framework.name", "yarn");
    conf.set("yarn.resourcemanager.hostname", "hadoop");
    conf.set("yarn.nodemanager.aux-services", "mapreduce_shuffle");
    conf.set("mapreduce.app-submission.cross-platform", "true");
    conf.set("dfs.replication", "1");
    // ......
    // MR test method 4: remote computation + remote HDFS files
    FileInputFormat.setInputPaths(job, "/user/word.txt");
    FileOutputFormat.setOutputPath(job, new Path("/user/result"));


wordCount: counts the number of times each word occurs
flow: traffic statistics example
