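---- Parallel computing framework model
Hadoop MapReduce is a software framework for easily writing applications that run on large clusters of thousands of commodity machines and process multi-terabyte data sets in parallel, in a reliable and fault-tolerant manner. This definition contains five keywords:
(1) software framework, (2) parallel processing, (3) reliable and fault-tolerant, (4) large-scale cluster, (5) massive data sets.
MapReduce is good at processing big data. Where does this ability come from? The answer lies in its design.
The core idea of MapReduce is "divide and conquer".
The Mapper handles the "divide" step: it breaks a complex job into a number of "simple tasks" that are processed separately.
The Reducer aggregates the results of the map phase.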
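The divide-and-conquer flow can be illustrated without a cluster. The sketch below is plain Java, not Hadoop code: the class name DivideAndConquerSketch and its methods are made up for illustration, and the "shuffle" step is a simple in-memory grouping that only stands in for what the framework does between the two phases.

```java
import java.util.AbstractMap;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// In-memory illustration of map -> shuffle -> reduce (hypothetical names, no Hadoop involved).
class DivideAndConquerSketch {

    // "Map": turn one line into (word, 1) pairs. Each line can be handled independently.
    static List<Map.Entry<String, Integer>> map(String line) {
        List<Map.Entry<String, Integer>> pairs = new ArrayList<>();
        for (String word : line.trim().split("\\s+")) {
            if (!word.isEmpty()) {
                pairs.add(new AbstractMap.SimpleEntry<>(word, 1));
            }
        }
        return pairs;
    }

    // "Shuffle": group the values emitted by all map calls by key.
    static Map<String, List<Integer>> shuffle(List<Map.Entry<String, Integer>> pairs) {
        Map<String, List<Integer>> grouped = new TreeMap<>();
        for (Map.Entry<String, Integer> pair : pairs) {
            grouped.computeIfAbsent(pair.getKey(), k -> new ArrayList<>()).add(pair.getValue());
        }
        return grouped;
    }

    // "Reduce": summarize all values collected for one key.
    static int reduce(List<Integer> values) {
        int sum = 0;
        for (int v : values) {
            sum += v;
        }
        return sum;
    }

    static Map<String, Integer> wordCount(List<String> lines) {
        List<Map.Entry<String, Integer>> emitted = new ArrayList<>();
        for (String line : lines) {
            emitted.addAll(map(line));
        }
        Map<String, Integer> result = new TreeMap<>();
        for (Map.Entry<String, List<Integer>> entry : shuffle(emitted).entrySet()) {
            result.put(entry.getKey(), reduce(entry.getValue()));
        }
        return result;
    }

    public static void main(String[] args) {
        // prints {hadoop=1, hello=2, yarn=1}
        System.out.println(wordCount(Arrays.asList("hello hadoop", "hello yarn")));
    }
}
```

On a real cluster the map calls run on different machines and the grouping happens over the network, but the shape of the computation is the same.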
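----- Resource management and scheduling platform for distributed clusters
https://hadoop.apache.org/docs/current/hadoop-yarn/hadoop-yarn-site/YARN.html
Apache Hadoop YARN (Yet Another Resource Negotiator) is the resource manager introduced with Hadoop 2. It is a general-purpose resource management system that provides unified resource management and scheduling for the applications running on top of it. Its introduction brought major benefits to clusters in terms of utilization, unified resource management, and data sharing. HBase, Hive, Spark on YARN, and MapReduce can all run on this framework.
ResourceManager: manages the cluster and schedules its resources; it receives reports from the NodeManagers and monitors them.
NodeManager: the per-machine framework agent; it is responsible for containers, monitors their resource usage (CPU, memory, disk, network), and reports it to the ResourceManager / Scheduler.
App Master: monitors the tasks of a running computation and handles failover. There is exactly one per job; for MapReduce it manages that single MR job.
**Container:** a container for a compute process (a bundle of compute resources); the default size is 1 GB.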
1. run job
2. get new application
3. copy job resources
4. submit job
5. init container
6. init MRAppMaster
7. retrieve input splits
8. allocate resources
9. init container (the compute container)
10. retrieve job resources (receive the job resources: code, configuration, data)
11. run map task or reduce task
12. result
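Step 7 is where the number of map tasks is decided: one map task per input split. The sketch below reproduces, in plain Java, the split-counting rule of FileInputFormat as I understand it: split size = max(minSize, min(maxSize, blockSize)), with a 10% slack so that a small tail is merged into the last split. The class and method names are made up for illustration, the 128 MB block size is the Hadoop 2.x default and may differ on your cluster, and the sketch assumes a non-empty, splittable file.

```java
// Estimates how many input splits (and therefore map tasks) a single file produces.
// Hypothetical helper for illustration; it mirrors the rule, it does not call Hadoop.
class SplitCountSketch {

    // A trailing chunk of up to 10% over the split size is kept in the last split.
    static final double SPLIT_SLOP = 1.1;

    static long computeSplitSize(long blockSize, long minSize, long maxSize) {
        return Math.max(minSize, Math.min(maxSize, blockSize));
    }

    static int countSplits(long fileLength, long splitSize) {
        int splits = 0;
        long remaining = fileLength;
        while ((double) remaining / splitSize > SPLIT_SLOP) {
            splits++;
            remaining -= splitSize;
        }
        if (remaining != 0) {
            splits++; // the tail (possibly slightly larger than one split)
        }
        return splits;
    }

    public static void main(String[] args) {
        long mb = 1024L * 1024L;
        long splitSize = computeSplitSize(128 * mb, 1L, Long.MAX_VALUE);
        System.out.println(countSplits(300 * mb, splitSize)); // prints 3
        System.out.println(countSplits(130 * mb, splitSize)); // prints 1
    }
}
```

A 130 MB file yields a single split because the 2 MB overhang is within the 10% slack; a 300 MB file yields three.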
Make the following changes on top of the existing HDFS setup
[root@node1 hadoop-2.6.0]# mv etc/hadoop/mapred-site.xml.template etc/hadoop/mapred-site.xml
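<!-- etc/hadoop/mapred-site.xml -->
<property>
<name>mapreduce.framework.name</name>
<value>yarn</value>
</property>
<!-- etc/hadoop/yarn-site.xml -->
<property>
<name>yarn.nodemanager.aux-services</name>
<value>mapreduce_shuffle</value>
</property>
<property>
<name>yarn.resourcemanager.hostname</name>
<!-- host name of the ResourceManager machine (hadoop in this setup) -->
<value>hadoop</value>
</property>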
The HDFS services must be running as well
# Formatting the NameNode is only needed the first time Hadoop is used; do not repeat it on later starts
[root@hadoop ~]# hdfs namenode -format
# Start HDFS
[root@hadoop ~]# start-dfs.sh
# Start YARN
[root@hadoop hadoop-2.6.0]# start-yarn.sh
<dependency>
<groupId>org.apache.hadoop</groupId>
<artifactId>hadoop-common</artifactId>
<version>2.6.0</version>
</dependency>
<dependency>
<groupId>org.apache.hadoop</groupId>
<artifactId>hadoop-hdfs</artifactId>
<version>2.6.0</version>
</dependency>
<dependency>
<groupId>org.apache.hadoop</groupId>
<artifactId>hadoop-mapreduce-client-common</artifactId>
<version>2.6.0</version>
</dependency>
<dependency>
<groupId>org.apache.hadoop</groupId>
<artifactId>hadoop-mapreduce-client-core</artifactId>
<version>2.6.0</version>
</dependency>
<dependency>
<groupId>org.apache.hadoop</groupId>
<artifactId>hadoop-mapreduce-client-jobclient</artifactId>
<version>2.6.0</version>
</dependency>
1. Create a Maven project
2. Create the Mapper
package com.baizhi.yarn;

import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Mapper;

import java.io.IOException;

public class MyMapper extends Mapper<LongWritable, Text, Text, IntWritable> {
    @Override
    protected void map(LongWritable key, Text value, Context context) throws IOException, InterruptedException {
        // key is the byte offset of the line in the file, value is the line itself
        String[] str = value.toString().split(" ");
        for (String s : str) {
            // emit (word, 1) for every token on the line
            context.write(new Text(s), new IntWritable(1));
        }
    }
}
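One detail worth knowing about the mapper: split(" ") treats every single space as a separator, so a line with two consecutive spaces yields an empty token, which would then be counted as a "word". The plain-Java sketch below shows the effect and one possible alternative; the class name TokenizeSketch and the tokenize helper are made up for illustration and are not part of the tutorial's code.

```java
import java.util.ArrayList;
import java.util.List;

// Compares split(" ") with a whitespace-tolerant tokenizer (hypothetical helper).
class TokenizeSketch {

    // Splits on any run of whitespace and drops empty tokens.
    static List<String> tokenize(String line) {
        List<String> tokens = new ArrayList<>();
        for (String token : line.trim().split("\\s+")) {
            if (!token.isEmpty()) { // a blank line still yields one empty string
                tokens.add(token);
            }
        }
        return tokens;
    }

    public static void main(String[] args) {
        String line = "hello  hadoop"; // note the two spaces
        System.out.println(line.split(" ").length); // prints 3: "hello", "", "hadoop"
        System.out.println(tokenize(line));         // prints [hello, hadoop]
    }
}
```

If the input is known to be separated by exactly one space, the original split(" ") is fine.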
3. Create the Reducer
package com.baizhi.yarn;

import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Reducer;

import java.io.IOException;

public class MyReduce extends Reducer<Text, IntWritable, Text, IntWritable> {
    @Override
    protected void reduce(Text key, Iterable<IntWritable> values, Context context) throws IOException, InterruptedException {
        int sum = 0;
        for (IntWritable value : values) {
            // accumulate the counts; a plain assignment would keep only the last value
            sum += value.get();
        }
        // emit (word, total count)
        context.write(key, new IntWritable(sum));
    }
}
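The reducer has to add up the values it receives rather than count them or keep the last one, because the values are not guaranteed to be all 1. If the same class is also registered as a combiner (Job offers setCombinerClass for this), each reducer call may receive partial sums. The plain-Java sketch below checks that summing partial sums gives the same total as summing everything at once; the class name SumReduceSketch is made up for illustration.

```java
import java.util.Arrays;
import java.util.List;

// Shows that a summing reduce gives the same result with or without pre-aggregation.
class SumReduceSketch {

    // Same logic as the reducer's loop, on plain integers.
    static int reduce(Iterable<Integer> values) {
        int sum = 0;
        for (int v : values) {
            sum += v;
        }
        return sum;
    }

    public static void main(String[] args) {
        List<Integer> ones = Arrays.asList(1, 1, 1, 1, 1);

        // Everything reduced in one go.
        int direct = reduce(ones);

        // Pre-aggregated on two "map sides", then the partial sums are reduced.
        int partialA = reduce(ones.subList(0, 2));
        int partialB = reduce(ones.subList(2, 5));
        int combined = reduce(Arrays.asList(partialA, partialB));

        System.out.println(direct + " " + combined); // prints 5 5
    }
}
```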
4. Write the entry class
package com.baizhi.yarn;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.lib.input.TextInputFormat;
import org.apache.hadoop.mapreduce.lib.output.TextOutputFormat;

import java.io.IOException;

public class InitMR {
    public static void main(String[] args) throws IOException, ClassNotFoundException, InterruptedException {
        // 1. Initialize the MR job object
        Configuration configuration = new Configuration();
        Job job = Job.getInstance(configuration, "Word COUNT");
        job.setJarByClass(InitMR.class);

        // 2. Set the input and output formats
        // The InputFormat decides how the data set is split and how each split is read
        // The OutputFormat decides how the results are written
        job.setInputFormatClass(TextInputFormat.class);
        job.setOutputFormatClass(TextOutputFormat.class);

        // 3. Set the source of the data set and the destination of the results
        /*
         * Method 1: run on the virtual machine (inside the Hadoop environment)
         */
        /*TextInputFormat.addInputPath(job, new Path("hdfs://hadoop:9000/WordCount.txt"));
        TextOutputFormat.setOutputPath(job, new Path("hdfs://hadoop:9000/result1"));*/
        /*
         * Method 2: run the main method from IDEA;
         * local computation (using the local Hadoop) + local files
         */
        /*TextInputFormat.addInputPath(job, new Path("file:///E://WordCount.txt"));
        TextOutputFormat.setOutputPath(job, new Path("file:///E://result"));
        */
        /*
         * Method 3: local computation + remote HDFS files
         */
        TextInputFormat.addInputPath(job, new Path("hdfs://hadoop:9000/WordCount.txt"));
        TextOutputFormat.setOutputPath(job, new Path("hdfs://hadoop:9000/result2"));

        // 4. Set the key and value types of the map output and of the final output
        job.setMapOutputKeyClass(Text.class);
        job.setMapOutputValueClass(IntWritable.class);
        job.setOutputKeyClass(Text.class);
        job.setOutputValueClass(IntWritable.class);

        // 5. Set the Mapper and Reducer implementation classes
        job.setMapperClass(MyMapper.class);
        job.setReducerClass(MyReduce.class);

        // 6. Submit the MR job and wait for it to finish
        job.waitForCompletion(true);
    }
}
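5. Run the job in one of the following four ways
(1) Run the MapReduce program as a jar inside the Hadoop environment
Package the code, copy the jar to the virtual machine, and run it there
// 3. Set the source of the data set and the destination of the results
/*
 * Method 1: run on the virtual machine
 */
TextInputFormat.addInputPath(job, new Path("hdfs://hadoop:9000/WordCount.txt"));
TextOutputFormat.setOutputPath(job, new Path("hdfs://hadoop:9000/result1"));
# Usage: bin/hadoop jar <path to the jar> <fully qualified main class>
[root@node1 hadoop-2.6.0]# bin/hadoop jar /mr_demo-1.0-SNAPSHOT.jar com.baizhi.yarn.InitMR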
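Local computation + local files
Add the following code to InitMR.java
/*
 * Method 2: local computation (using the local Hadoop) + local files
 */
TextInputFormat.addInputPath(job, new Path("file:///E://WordCount.txt"));
TextOutputFormat.setOutputPath(job, new Path("file:///E://result"));
This method requires patching one class from the Hadoop source
In the project, create the package org.apache.hadoop.io.nativeio and put a NativeIO class in it, so that it replaces Hadoop's own NativeIO class
On line 279 of the NativeIO source, change return access0(path, desiredAccess.accessRight()); to return true;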
//
// Source code recreated from a .class file by IntelliJ IDEA
// (powered by Fernflower decompiler)
//
package org.apache.hadoop.io.nativeio;
import com.google.common.annotations.VisibleForTesting;
import java.io.Closeable;
import java.io.File;
import java.io.FileDescriptor;
import java.io.FileInputStream;
import java.io.FileOutputStream;
import java.io.IOException;
import java.io.RandomAccessFile;
import java.lang.reflect.Field;
import java.nio.ByteBuffer;
import java.nio.MappedByteBuffer;
import java.nio.channels.FileChannel;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import org.apache.commons.logging.Log;
import org.apache.commons.logging.LogFactory;
import org.apache.hadoop.classification.InterfaceAudience.Private;
import org.apache.hadoop.classification.InterfaceStability.Unstable;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.HardLink;
import org.apache.hadoop.io.IOUtils;
import org.apache.hadoop.io.SecureIOUtils.AlreadyExistsException;
import org.apache.hadoop.util.NativeCodeLoader;
import org.apache.hadoop.util.PerformanceAdvisory;
import org.apache.hadoop.util.Shell;
import sun.misc.Cleaner;
import sun.misc.Unsafe;
import sun.nio.ch.DirectBuffer;
@Private
@Unstable
public class NativeIO {
private static boolean workaroundNonThreadSafePasswdCalls = false;
private static final Log LOG = LogFactory.getLog(NativeIO.class);
private static boolean nativeLoaded = false;
private static final Map<Long, NativeIO.CachedUid> uidCache;
private static long cacheTimeout;
private static boolean initialized;
public NativeIO() {
}
public static boolean isAvailable() {
return NativeCodeLoader.isNativeCodeLoaded() && nativeLoaded;
}
private static native void initNative();
static long getMemlockLimit() {
return isAvailable() ? getMemlockLimit0() : 0L;
}
private static native long getMemlockLimit0();
static long getOperatingSystemPageSize() {
try {
Field f = Unsafe.class.getDeclaredField("theUnsafe");
f.setAccessible(true);
Unsafe unsafe = (Unsafe)f.get((Object)null);
return (long)unsafe.pageSize();
} catch (Throwable var2) {
LOG.warn("Unable to get operating system page size. Guessing 4096.", var2);
return 4096L;
}
}
private static String stripDomain(String name) {
int i = name.indexOf(92);
if (i != -1) {
name = name.substring(i + 1);
}
return name;
}
public static String getOwner(FileDescriptor fd) throws IOException {
ensureInitialized();
if (Shell.WINDOWS) {
String owner = NativeIO.Windows.getOwner(fd);
owner = stripDomain(owner);
return owner;
} else {
long uid = NativeIO.POSIX.getUIDforFDOwnerforOwner(fd);
NativeIO.CachedUid cUid = (NativeIO.CachedUid)uidCache.get(uid);
long now = System.currentTimeMillis();
if (cUid != null && cUid.timestamp + cacheTimeout > now) {
return cUid.username;
} else {
String user = NativeIO.POSIX.getUserName(uid);
LOG.info("Got UserName " + user + " for UID " + uid + " from the native implementation");
cUid = new NativeIO.CachedUid(user, now);
uidCache.put(uid, cUid);
return user;
}
}
}
public static FileInputStream getShareDeleteFileInputStream(File f) throws IOException {
if (!Shell.WINDOWS) {
return new FileInputStream(f);
} else {
FileDescriptor fd = NativeIO.Windows.createFile(f.getAbsolutePath(), 2147483648L, 7L, 3L);
return new FileInputStream(fd);
}
}
public static FileInputStream getShareDeleteFileInputStream(File f, long seekOffset) throws IOException {
if (!Shell.WINDOWS) {
RandomAccessFile rf = new RandomAccessFile(f, "r");
if (seekOffset > 0L) {
rf.seek(seekOffset);
}
return new FileInputStream(rf.getFD());
} else {
FileDescriptor fd = NativeIO.Windows.createFile(f.getAbsolutePath(), 2147483648L, 7L, 3L);
if (seekOffset > 0L) {
NativeIO.Windows.setFilePointer(fd, seekOffset, 0L);
}
return new FileInputStream(fd);
}
}
public static FileOutputStream getCreateForWriteFileOutputStream(File f, int permissions) throws IOException {
FileDescriptor fd;
if (!Shell.WINDOWS) {
try {
fd = NativeIO.POSIX.open(f.getAbsolutePath(), 193, permissions);
return new FileOutputStream(fd);
} catch (NativeIOException var3) {
if (var3.getErrno() == Errno.EEXIST) {
throw new AlreadyExistsException(var3);
} else {
throw var3;
}
}
} else {
try {
fd = NativeIO.Windows.createFile(f.getCanonicalPath(), 1073741824L, 7L, 1L);
NativeIO.POSIX.chmod(f.getCanonicalPath(), permissions);
return new FileOutputStream(fd);
} catch (NativeIOException var4) {
if (var4.getErrorCode() == 80L) {
throw new AlreadyExistsException(var4);
} else {
throw var4;
}
}
}
}
private static synchronized void ensureInitialized() {
if (!initialized) {
cacheTimeout = (new Configuration()).getLong("hadoop.security.uid.cache.secs", 14400L) * 1000L;
LOG.info("Initialized cache for UID to User mapping with a cache timeout of " + cacheTimeout / 1000L + " seconds.");
initialized = true;
}
}
public static void renameTo(File src, File dst) throws IOException {
if (!nativeLoaded) {
if (!src.renameTo(dst)) {
throw new IOException("renameTo(src=" + src + ", dst=" + dst + ") failed.");
}
} else {
renameTo0(src.getAbsolutePath(), dst.getAbsolutePath());
}
}
public static void link(File src, File dst) throws IOException {
if (!nativeLoaded) {
HardLink.createHardLink(src, dst);
} else {
link0(src.getAbsolutePath(), dst.getAbsolutePath());
}
}
private static native void renameTo0(String var0, String var1) throws NativeIOException;
private static native void link0(String var0, String var1) throws NativeIOException;
public static void copyFileUnbuffered(File src, File dst) throws IOException {
if (nativeLoaded && Shell.WINDOWS) {
copyFileUnbuffered0(src.getAbsolutePath(), dst.getAbsolutePath());
} else {
FileInputStream fis = null;
FileOutputStream fos = null;
FileChannel input = null;
FileChannel output = null;
try {
fis = new FileInputStream(src);
fos = new FileOutputStream(dst);
input = fis.getChannel();
output = fos.getChannel();
long remaining = input.size();
long position = 0L;
for(long transferred = 0L; remaining > 0L; position += transferred) {
transferred = input.transferTo(position, remaining, output);
remaining -= transferred;
}
} finally {
IOUtils.cleanup(LOG, new Closeable[]{output});
IOUtils.cleanup(LOG, new Closeable[]{fos});
IOUtils.cleanup(LOG, new Closeable[]{input});
IOUtils.cleanup(LOG, new Closeable[]{fis});
}
}
}
private static native void copyFileUnbuffered0(String var0, String var1) throws NativeIOException;
static {
if (NativeCodeLoader.isNativeCodeLoaded()) {
try {
initNative();
nativeLoaded = true;
} catch (Throwable var1) {
PerformanceAdvisory.LOG.debug("Unable to initialize NativeIO libraries", var1);
}
}
uidCache = new ConcurrentHashMap();
initialized = false;
}
private static class CachedUid {
final long timestamp;
final String username;
public CachedUid(String username, long timestamp) {
this.timestamp = timestamp;
this.username = username;
}
}
public static class Windows {
public static final long GENERIC_READ = 2147483648L;
public static final long GENERIC_WRITE = 1073741824L;
public static final long FILE_SHARE_READ = 1L;
public static final long FILE_SHARE_WRITE = 2L;
public static final long FILE_SHARE_DELETE = 4L;
public static final long CREATE_NEW = 1L;
public static final long CREATE_ALWAYS = 2L;
public static final long OPEN_EXISTING = 3L;
public static final long OPEN_ALWAYS = 4L;
public static final long TRUNCATE_EXISTING = 5L;
public static final long FILE_BEGIN = 0L;
public static final long FILE_CURRENT = 1L;
public static final long FILE_END = 2L;
public static final long FILE_ATTRIBUTE_NORMAL = 128L;
public Windows() {
}
public static native FileDescriptor createFile(String var0, long var1, long var3, long var5) throws IOException;
public static native long setFilePointer(FileDescriptor var0, long var1, long var3) throws IOException;
private static native String getOwner(FileDescriptor var0) throws IOException;
private static native boolean access0(String var0, int var1);
public static boolean access(String path, NativeIO.Windows.AccessRight desiredAccess) throws IOException {
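// Patched to work around the error from the Hadoop source; originally: return access0(path, desiredAccess.accessRight());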
return true;
}
public static native void extendWorkingSetSize(long var0) throws IOException;
static {
if (NativeCodeLoader.isNativeCodeLoaded()) {
try {
NativeIO.initNative();
NativeIO.nativeLoaded = true;
} catch (Throwable var1) {
PerformanceAdvisory.LOG.debug("Unable to initialize NativeIO libraries", var1);
}
}
}
public static enum AccessRight {
ACCESS_READ(1),
ACCESS_WRITE(2),
ACCESS_EXECUTE(32);
private final int accessRight;
private AccessRight(int access) {
this.accessRight = access;
}
public int accessRight() {
return this.accessRight;
}
}
}
public static class POSIX {
public static final int O_RDONLY = 0;
public static final int O_WRONLY = 1;
public static final int O_RDWR = 2;
public static final int O_CREAT = 64;
public static final int O_EXCL = 128;
public static final int O_NOCTTY = 256;
public static final int O_TRUNC = 512;
public static final int O_APPEND = 1024;
public static final int O_NONBLOCK = 2048;
public static final int O_SYNC = 4096;
public static final int O_ASYNC = 8192;
public static final int O_FSYNC = 4096;
public static final int O_NDELAY = 2048;
public static final int POSIX_FADV_NORMAL = 0;
public static final int POSIX_FADV_RANDOM = 1;
public static final int POSIX_FADV_SEQUENTIAL = 2;
public static final int POSIX_FADV_WILLNEED = 3;
public static final int POSIX_FADV_DONTNEED = 4;
public static final int POSIX_FADV_NOREUSE = 5;
public static final int SYNC_FILE_RANGE_WAIT_BEFORE = 1;
public static final int SYNC_FILE_RANGE_WRITE = 2;
public static final int SYNC_FILE_RANGE_WAIT_AFTER = 4;
private static final Log LOG = LogFactory.getLog(NativeIO.class);
private static boolean nativeLoaded = false;
private static boolean fadvisePossible = true;
private static boolean syncFileRangePossible = true;
static final String WORKAROUND_NON_THREADSAFE_CALLS_KEY = "hadoop.workaround.non.threadsafe.getpwuid";
static final boolean WORKAROUND_NON_THREADSAFE_CALLS_DEFAULT = true;
private static long cacheTimeout = -1L;
private static NativeIO.POSIX.CacheManipulator cacheManipulator = new NativeIO.POSIX.CacheManipulator();
private static final Map<Integer, NativeIO.POSIX.CachedName> USER_ID_NAME_CACHE;
private static final Map<Integer, NativeIO.POSIX.CachedName> GROUP_ID_NAME_CACHE;
public static final int MMAP_PROT_READ = 1;
public static final int MMAP_PROT_WRITE = 2;
public static final int MMAP_PROT_EXEC = 4;
public POSIX() {
}
public static NativeIO.POSIX.CacheManipulator getCacheManipulator() {
return cacheManipulator;
}
public static void setCacheManipulator(NativeIO.POSIX.CacheManipulator cacheManipulator) {
NativeIO.POSIX.cacheManipulator = cacheManipulator;
}
public static boolean isAvailable() {
return NativeCodeLoader.isNativeCodeLoaded() && nativeLoaded;
}
private static void assertCodeLoaded() throws IOException {
if (!isAvailable()) {
throw new IOException("NativeIO was not loaded");
}
}
public static native FileDescriptor open(String var0, int var1, int var2) throws IOException;
private static native NativeIO.POSIX.Stat fstat(FileDescriptor var0) throws IOException;
private static native void chmodImpl(String var0, int var1) throws IOException;
public static void chmod(String path, int mode) throws IOException {
if (!Shell.WINDOWS) {
chmodImpl(path, mode);
} else {
try {
chmodImpl(path, mode);
} catch (NativeIOException var3) {
if (var3.getErrorCode() == 3L) {
throw new NativeIOException("No such file or directory", Errno.ENOENT);
}
LOG.warn(String.format("NativeIO.chmod error (%d): %s", var3.getErrorCode(), var3.getMessage()));
throw new NativeIOException("Unknown error", Errno.UNKNOWN);
}
}
}
static native void posix_fadvise(FileDescriptor var0, long var1, long var3, int var5) throws NativeIOException;
static native void sync_file_range(FileDescriptor var0, long var1, long var3, int var5) throws NativeIOException;
static void posixFadviseIfPossible(String identifier, FileDescriptor fd, long offset, long len, int flags) throws NativeIOException {
if (nativeLoaded && fadvisePossible) {
try {
posix_fadvise(fd, offset, len, flags);
} catch (UnsupportedOperationException var8) {
fadvisePossible = false;
} catch (UnsatisfiedLinkError var9) {
fadvisePossible = false;
}
}
}
public static void syncFileRangeIfPossible(FileDescriptor fd, long offset, long nbytes, int flags) throws NativeIOException {
if (nativeLoaded && syncFileRangePossible) {
try {
sync_file_range(fd, offset, nbytes, flags);
} catch (UnsupportedOperationException var7) {
syncFileRangePossible = false;
} catch (UnsatisfiedLinkError var8) {
syncFileRangePossible = false;
}
}
}
static native void mlock_native(ByteBuffer var0, long var1) throws NativeIOException;
static void mlock(ByteBuffer buffer, long len) throws IOException {
assertCodeLoaded();
if (!buffer.isDirect()) {
throw new IOException("Cannot mlock a non-direct ByteBuffer");
} else {
mlock_native(buffer, len);
}
}
public static void munmap(MappedByteBuffer buffer) {
if (buffer instanceof DirectBuffer) {
Cleaner cleaner = ((DirectBuffer)buffer).cleaner();
cleaner.clean();
}
}
private static native long getUIDforFDOwnerforOwner(FileDescriptor var0) throws IOException;
private static native String getUserName(long var0) throws IOException;
public static NativeIO.POSIX.Stat getFstat(FileDescriptor fd) throws IOException {
NativeIO.POSIX.Stat stat = null;
if (!Shell.WINDOWS) {
stat = fstat(fd);
stat.owner = getName(NativeIO.POSIX.IdCache.USER, stat.ownerId);
stat.group = getName(NativeIO.POSIX.IdCache.GROUP, stat.groupId);
} else {
try {
stat = fstat(fd);
} catch (NativeIOException var3) {
if (var3.getErrorCode() == 6L) {
throw new NativeIOException("The handle is invalid.", Errno.EBADF);
}
LOG.warn(String.format("NativeIO.getFstat error (%d): %s", var3.getErrorCode(), var3.getMessage()));
throw new NativeIOException("Unknown error", Errno.UNKNOWN);
}
}
return stat;
}
private static String getName(NativeIO.POSIX.IdCache domain, int id) throws IOException {
Map<Integer, NativeIO.POSIX.CachedName> idNameCache = domain == NativeIO.POSIX.IdCache.USER ? USER_ID_NAME_CACHE : GROUP_ID_NAME_CACHE;
NativeIO.POSIX.CachedName cachedName = (NativeIO.POSIX.CachedName)idNameCache.get(id);
long now = System.currentTimeMillis();
String name;
if (cachedName != null && cachedName.timestamp + cacheTimeout > now) {
name = cachedName.name;
} else {
name = domain == NativeIO.POSIX.IdCache.USER ? getUserName(id) : getGroupName(id);
if (LOG.isDebugEnabled()) {
String type = domain == NativeIO.POSIX.IdCache.USER ? "UserName" : "GroupName";
LOG.debug("Got " + type + " " + name + " for ID " + id + " from the native implementation");
}
cachedName = new NativeIO.POSIX.CachedName(name, now);
idNameCache.put(id, cachedName);
}
return name;
}
static native String getUserName(int var0) throws IOException;
static native String getGroupName(int var0) throws IOException;
public static native long mmap(FileDescriptor var0, int var1, boolean var2, long var3) throws IOException;
public static native void munmap(long var0, long var2) throws IOException;
static {
if (NativeCodeLoader.isNativeCodeLoaded()) {
try {
Configuration conf = new Configuration();
NativeIO.workaroundNonThreadSafePasswdCalls = conf.getBoolean("hadoop.workaround.non.threadsafe.getpwuid", true);
NativeIO.initNative();
nativeLoaded = true;
cacheTimeout = conf.getLong("hadoop.security.uid.cache.secs", 14400L) * 1000L;
LOG.debug("Initialized cache for IDs to User/Group mapping with a cache timeout of " + cacheTimeout / 1000L + " seconds.");
} catch (Throwable var1) {
PerformanceAdvisory.LOG.debug("Unable to initialize NativeIO libraries", var1);
}
}
USER_ID_NAME_CACHE = new ConcurrentHashMap();
GROUP_ID_NAME_CACHE = new ConcurrentHashMap();
}
private static enum IdCache {
USER,
GROUP;
private IdCache() {
}
}
private static class CachedName {
final long timestamp;
final String name;
public CachedName(String name, long timestamp) {
this.name = name;
this.timestamp = timestamp;
}
}
public static class Stat {
private int ownerId;
private int groupId;
private String owner;
private String group;
private int mode;
public static final int S_IFMT = 61440;
public static final int S_IFIFO = 4096;
public static final int S_IFCHR = 8192;
public static final int S_IFDIR = 16384;
public static final int S_IFBLK = 24576;
public static final int S_IFREG = 32768;
public static final int S_IFLNK = 40960;
public static final int S_IFSOCK = 49152;
public static final int S_IFWHT = 57344;
public static final int S_ISUID = 2048;
public static final int S_ISGID = 1024;
public static final int S_ISVTX = 512;
public static final int S_IRUSR = 256;
public static final int S_IWUSR = 128;
public static final int S_IXUSR = 64;
Stat(int ownerId, int groupId, int mode) {
this.ownerId = ownerId;
this.groupId = groupId;
this.mode = mode;
}
Stat(String owner, String group, int mode) {
if (!Shell.WINDOWS) {
this.owner = owner;
} else {
this.owner = NativeIO.stripDomain(owner);
}
if (!Shell.WINDOWS) {
this.group = group;
} else {
this.group = NativeIO.stripDomain(group);
}
this.mode = mode;
}
public String toString() {
return "Stat(owner='" + this.owner + "', group='" + this.group + "'" + ", mode=" + this.mode + ")";
}
public String getOwner() {
return this.owner;
}
public String getGroup() {
return this.group;
}
public int getMode() {
return this.mode;
}
}
@VisibleForTesting
public static class NoMlockCacheManipulator extends NativeIO.POSIX.CacheManipulator {
public NoMlockCacheManipulator() {
}
public void mlock(String identifier, ByteBuffer buffer, long len) throws IOException {
NativeIO.POSIX.LOG.info("mlocking " + identifier);
}
public long getMemlockLimit() {
return 1125899906842624L;
}
public long getOperatingSystemPageSize() {
return 4096L;
}
public boolean verifyCanMlock() {
return true;
}
}
@VisibleForTesting
public static class CacheManipulator {
public CacheManipulator() {
}
public void mlock(String identifier, ByteBuffer buffer, long len) throws IOException {
NativeIO.POSIX.mlock(buffer, len);
}
public long getMemlockLimit() {
return NativeIO.getMemlockLimit();
}
public long getOperatingSystemPageSize() {
return NativeIO.getOperatingSystemPageSize();
}
public void posixFadviseIfPossible(String identifier, FileDescriptor fd, long offset, long len, int flags) throws NativeIOException {
NativeIO.POSIX.posixFadviseIfPossible(identifier, fd, offset, len, flags);
}
public boolean verifyCanMlock() {
return NativeIO.isAvailable();
}
}
}
}
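Right-click and run the main method to test the MapReduce program
Local computation + remote HDFS files
Add the following code to InitMR.java
/*
 * Method 3: local computation + remote HDFS files
 */
TextInputFormat.addInputPath(job, new Path("hdfs://hadoop:9000/WordCount.txt"));
TextOutputFormat.setOutputPath(job, new Path("hdfs://hadoop:9000/result2"));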
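A permission problem may occur because the local user is not the HDFS user. Two possible fixes: disable permission checking in hdfs-site.xml,
<property>
<name>dfs.permissions.enabled</name>
<value>false</value>
</property>
or run the program as the HDFS user by passing this VM option:
-DHADOOP_USER_NAME=root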
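The same user name can also be set from code instead of a VM option. To my knowledge, Hadoop's simple authentication looks up HADOOP_USER_NAME as an environment variable and then as a Java system property, so setting the property at the very start of main, before any Configuration or Job is created, should have the same effect as the -D flag. The class and method names below are made up for illustration.

```java
// Sets the user Hadoop should act as; equivalent in intent to -DHADOOP_USER_NAME=root.
// Hypothetical helper: call it first thing in main, before touching any Hadoop class.
class HadoopUserSketch {

    static void useRemoteUser(String user) {
        System.setProperty("HADOOP_USER_NAME", user);
    }

    public static void main(String[] args) {
        useRemoteUser("root");
        System.out.println(System.getProperty("HADOOP_USER_NAME")); // prints root
    }
}
```

This only changes which user name the client presents; it is suitable for a test cluster without Kerberos, like the one in these notes.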
Remote computation + remote HDFS files
Add the following code to InitMR.java, package the project as a jar, and run the main method
public static void main(String[] args) throws Exception {
    Configuration conf = new Configuration();
    //===============================================================
    conf.set("fs.defaultFS", "hdfs://hadoop:9000/");
    // path of the locally built jar that is shipped to the cluster
    conf.set("mapreduce.job.jar", "file:///E:\\训练营备课\\20180313_hadoop\\mr_demo\\target\\mr_demo-1.0-SNAPSHOT.jar");
    conf.set("mapreduce.framework.name", "yarn");
    conf.set("yarn.resourcemanager.hostname", "hadoop");
    conf.set("yarn.nodemanager.aux-services", "mapreduce_shuffle");
    // needed when submitting from Windows to a Linux cluster
    conf.set("mapreduce.app-submission.cross-platform", "true");
    conf.set("dfs.replication", "1");
    //===============================================================
    // ......
    // Method 4: remote computation + remote HDFS files
    FileInputFormat.setInputPaths(job, "/user/word.txt");
    FileOutputFormat.setOutputPath(job, new Path("/user/result"));
}
Exercises
wordCount: count how many times each word occurs
flow: traffic statistics example
Custom Writable
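For the custom Writable exercise, the essential contract is a pair of methods, write(DataOutput) and readFields(DataInput), that serialize and deserialize the fields in the same order. The sketch below shows that contract in plain Java so it can be run without Hadoop on the classpath. FlowRecord and its fields (upFlow, downFlow) are hypothetical names chosen for the traffic example; in a real job the class would additionally declare "implements org.apache.hadoop.io.Writable" and keep the no-argument constructor, which the framework needs to instantiate it.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInput;
import java.io.DataInputStream;
import java.io.DataOutput;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;

// Hypothetical value type for the traffic exercise, following the Writable method contract.
class FlowRecord {
    long upFlow;
    long downFlow;

    FlowRecord() { }

    FlowRecord(long upFlow, long downFlow) {
        this.upFlow = upFlow;
        this.downFlow = downFlow;
    }

    // Serialize the fields; the order here defines the wire format.
    public void write(DataOutput out) throws IOException {
        out.writeLong(upFlow);
        out.writeLong(downFlow);
    }

    // Deserialize the fields in exactly the order they were written.
    public void readFields(DataInput in) throws IOException {
        upFlow = in.readLong();
        downFlow = in.readLong();
    }

    long total() {
        return upFlow + downFlow;
    }

    // Serializes a record to bytes and reads it back, as happens between map and reduce.
    static FlowRecord roundTrip(FlowRecord record) {
        try {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            record.write(new DataOutputStream(bytes));
            FlowRecord copy = new FlowRecord();
            copy.readFields(new DataInputStream(new ByteArrayInputStream(bytes.toByteArray())));
            return copy;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        FlowRecord copy = roundTrip(new FlowRecord(100L, 250L));
        System.out.println(copy.upFlow + " " + copy.downFlow + " " + copy.total()); // prints 100 250 350
    }
}
```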