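MapReduce is Hadoop's framework for distributed computation and aggregation over large data sets. Normally every run means packaging the program into a jar and uploading it to the cluster, which makes debugging very inconvenient. I am writing this post to share how to run and debug a MapReduce program from a local development machine.
My local development environment is Mac OS X 10.11.4 with Hadoop 2.6.4; the cluster runs CentOS 6.7.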
Configuration conf = new Configuration();
Job job = Job.getInstance(conf);
………………
FileInputFormat.setInputPaths(job, new Path("/Users/admin/Desktop/Telephone_Summary"));
FileOutputFormat.setOutputPath(job, new Path("/Users/admin/Desktop/mapreduceTestOutput3"));
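// Run the job in-process with the local job runner and read/write the local file system.
// Set these on conf before calling Job.getInstance(conf), which copies the configuration.
conf.set("mapreduce.framework.name", "local"); // Hadoop 2.x key; mapred.job.tracker is the deprecated MRv1 key
conf.set("fs.defaultFS", "file:///"); // fs.default.name and the value "local" are deprecated aliases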
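One practical note when re-running a job in local mode: Hadoop's FileOutputFormat refuses to start a job whose output directory already exists, so each debug run needs either a fresh output path or a cleanup step. Below is a minimal sketch of such a cleanup for the local file system, using only the JDK. The class name `LocalOutputCleaner` and the method `deleteRecursively` are hypothetical helpers introduced here for illustration; they are not part of Hadoop.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.Comparator;
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.Stream;

// Hypothetical helper (not part of Hadoop): clears a local output directory before a debug run.
final class LocalOutputCleaner {

    // Deletes dir and everything below it; does nothing if dir does not exist.
    static void deleteRecursively(Path dir) throws IOException {
        if (!Files.exists(dir)) {
            return;
        }
        List<Path> paths;
        try (Stream<Path> walk = Files.walk(dir)) {
            // Reverse order lists children before their parent directories.
            paths = walk.sorted(Comparator.reverseOrder()).collect(Collectors.toList());
        }
        for (Path path : paths) {
            Files.delete(path);
        }
    }

    public static void main(String[] args) throws IOException {
        if (args.length != 1) {
            System.err.println("usage: LocalOutputCleaner <local-output-dir>");
            return;
        }
        deleteRecursively(Paths.get(args[0]));
    }
}
```

Calling `LocalOutputCleaner.deleteRecursively(Paths.get(...))` with the local output path before submitting the job would let the same path be reused between runs. This only applies when the output lives on the local disk; a directory on HDFS would have to be removed through Hadoop's own file system API instead.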
………………
FileInputFormat.setInputPaths(job, new Path("/Users/admin/Desktop/Telephone_Summary"));
FileOutputFormat.setOutputPath(job, new Path("/Users/admin/Desktop/mapreduceTestOutput3"));
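// Still run in-process with the local job runner, but use the cluster's HDFS as the default file system,
// so the unqualified paths above are resolved on HDFS. Set both before calling Job.getInstance(conf).
conf.set("mapreduce.framework.name", "local"); // Hadoop 2.x key; mapred.job.tracker is the deprecated MRv1 key
conf.set("fs.defaultFS", "hdfs://Hadoop:9000");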
………………
FileInputFormat.setInputPaths(job, new Path("/Users/admin/Telephone_Summary")); // path on HDFS
FileOutputFormat.setOutputPath(job, new Path("/Users/admin/mapreduceTestOutput")); // path on HDFS
conf.set("mapred.job.tracker", "local"); // deprecated MRv1 key; Hadoop 2.x picks the runner from mapreduce.framework.name
conf.set("fs.defaultFS", "hdfs://Hadoop:9000");
// Submit the job from the local machine to the cluster's YARN ResourceManager.
conf.set("mapreduce.framework.name", "yarn");
conf.set("yarn.resourcemanager.hostname", "Hadoop");
conf.set("yarn.resourcemanager.address", "172.16.124.130:8032");
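System.setProperty("hadoop.home.dir", "/Users/admin/Downloads/systemSoftware/Linux/hadoop-2.6.4"); // local Hadoop installation
Job job = Job.getInstance(conf); // copies conf, so all conf.set calls must come before this line
job.setJar("/Users/admin/Desktop/hadoopBasic.jar"); // locally built jar that is shipped to the cluster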
………………
FileInputFormat.setInputPaths(job, new Path("/Users/admin/Telephone_Summary"));
FileOutputFormat.setOutputPath(job, new Path("/Users/admin/mapreduceTestOutput"));
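The configurations in this post differ only in a handful of keys, so it can be convenient to keep them in one place and switch by name instead of editing the driver each time. The following is a rough sketch of that idea using only the JDK. `RunModeSettings`, `Mode`, and `settingsFor` are hypothetical names introduced here; the sketch uses the Hadoop 2.x key `mapreduce.framework.name` and the URI `file:///` for the local cases, and takes the host names and addresses from the snippets above.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper (not part of Hadoop): the key/value pairs from this post, grouped by run mode.
final class RunModeSettings {

    enum Mode { LOCAL_FILES, LOCAL_WITH_HDFS, SUBMIT_TO_YARN }

    // Returns the settings to copy into a Hadoop Configuration with conf.set(key, value)
    // before Job.getInstance(conf) is called.
    static Map<String, String> settingsFor(Mode mode) {
        Map<String, String> settings = new LinkedHashMap<>();
        switch (mode) {
            case LOCAL_FILES:
                settings.put("mapreduce.framework.name", "local");
                settings.put("fs.defaultFS", "file:///");
                break;
            case LOCAL_WITH_HDFS:
                settings.put("mapreduce.framework.name", "local");
                settings.put("fs.defaultFS", "hdfs://Hadoop:9000");
                break;
            case SUBMIT_TO_YARN:
                settings.put("mapreduce.framework.name", "yarn");
                settings.put("fs.defaultFS", "hdfs://Hadoop:9000");
                settings.put("yarn.resourcemanager.hostname", "Hadoop");
                settings.put("yarn.resourcemanager.address", "172.16.124.130:8032");
                break;
        }
        return settings;
    }

    public static void main(String[] args) {
        // Pick the mode from the first argument, defaulting to purely local debugging.
        Mode mode = args.length > 0 ? Mode.valueOf(args[0]) : Mode.LOCAL_FILES;
        settingsFor(mode).forEach((key, value) -> System.out.println(key + " = " + value));
    }
}
```

In a real driver, the returned map would be applied with something like `settingsFor(mode).forEach(conf::set)` before the job is created. The YARN mode additionally needs the `hadoop.home.dir` property and the `job.setJar(...)` call shown above, which are left out of the map because they are not Configuration keys.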
# log4j.properties: print log output to the console at DEBUG level
hadoop.root.logger=DEBUG, console
log4j.rootLogger=DEBUG, console
log4j.appender.console=org.apache.log4j.ConsoleAppender
log4j.appender.console.target=System.out
log4j.appender.console.layout=org.apache.log4j.PatternLayout
log4j.appender.console.layout.ConversionPattern=%d{yy/MM/dd HH:mm:ss} %p %c{2}: %m%n