Java Program Examples Using the Hadoop API

Experiment Environment

Ubuntu 18.04, Hadoop 2, Eclipse

Before running the experiments, start Hadoop 2 and import the required Hadoop jar packages into the Eclipse project. For the detailed setup steps, see https://dblab.xmu.edu.cn/blog/290-2/
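If the setup is correct, a tiny probe program should already be able to connect to HDFS. The following is a minimal sketch (the class name HdfsProbe is our own, not part of the original setup; it assumes the same fs.defaultFS address used throughout this post):

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;

// Hypothetical helper: verifies that the code can reach HDFS
public class HdfsProbe {
  public static void main(String[] args) throws Exception {
    Configuration conf = new Configuration();
    conf.set("fs.defaultFS", "hdfs://localhost:9000");
    FileSystem fs = FileSystem.get(conf);
    // Should print hdfs://localhost:9000 when the NameNode is running
    System.out.println("Connected to: " + fs.getUri());
  }
}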

Experiment Content

FileSystemCat.java

FileSystemCat.java code:

// cc FileSystemCat Displays files from a Hadoop filesystem on standard output by using the FileSystem directly

import java.io.InputStream;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.IOUtils;

// vv FileSystemCat
public class FileSystemCat {

  public static void main(String[] args) throws Exception {
    String uri = "input/readme.txt";  // path relative to the HDFS working directory
    Configuration conf = new Configuration();
    conf.set("fs.defaultFS", "hdfs://localhost:9000");  // address of the local HDFS NameNode
    conf.set("fs.hdfs.impl", "org.apache.hadoop.hdfs.DistributedFileSystem");  // map the hdfs:// scheme explicitly
    FileSystem fs = FileSystem.get(conf);
    InputStream in = null;
    try {
      in = fs.open(new Path(uri));
      // close=false: we close the stream ourselves in the finally block below
      IOUtils.copyBytes(in, System.out, 4096, false);
    } finally {
      IOUtils.closeStream(in);
    }
  }
}
// ^^ FileSystemCat

Run result:

[Screenshot 1: FileSystemCat console output]

It successfully prints Hello World, the contents of input/readme.txt.
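Note that the last argument of IOUtils.copyBytes controls whether the streams are closed automatically; FileSystemCat passes false and closes the input stream itself in the finally block. Below is a minimal variant that lets IOUtils do the closing (a sketch; the class name FileSystemCatAutoClose is ours):

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.IOUtils;

// Hypothetical variant: passing close=true lets IOUtils close the stream for us
public class FileSystemCatAutoClose {
  public static void main(String[] args) throws Exception {
    Configuration conf = new Configuration();
    conf.set("fs.defaultFS", "hdfs://localhost:9000");
    FileSystem fs = FileSystem.get(conf);
    // close=true: the input stream is closed when the copy completes or fails
    IOUtils.copyBytes(fs.open(new Path("input/readme.txt")), System.out, 4096, true);
  }
}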

FileCopyWithProgress.java

FileCopyWithProgress.java code:

// cc FileCopyWithProgress Copies a local file to a Hadoop filesystem, and shows progress

import java.io.BufferedInputStream;
import java.io.FileInputStream;
import java.io.InputStream;
import java.io.OutputStream;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.IOUtils;
import org.apache.hadoop.util.Progressable;

// vv FileCopyWithProgress
public class FileCopyWithProgress {
  public static void main(String[] args) throws Exception {
    String localSrc = args[0];
    String dst = args[1];

    InputStream in = new BufferedInputStream(new FileInputStream(localSrc));

    Configuration conf = new Configuration();
    conf.set("fs.defaultFS","hdfs://localhost:9000");
    conf.set("fs.hdfs.impl","org.apache.hadoop.hdfs.DistributedFileSystem");
    FileSystem fs = FileSystem.get(conf);
    // progress() is called back periodically as data is written to HDFS;
    // each call prints a dot as simple visual feedback
    OutputStream out = fs.create(new Path(dst), new Progressable() {
      public void progress() {
        System.out.print(".");
      }
    });

    // close=true: both streams are closed automatically when the copy finishes
    IOUtils.copyBytes(in, out, 4096, true);
  }
}
// ^^ FileCopyWithProgress
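Each printed dot corresponds to one progress() callback, which Hadoop invokes periodically while the client writes data. Since Progressable declares only the single method progress(), on Java 8 and later the anonymous class above can be written as a lambda; the behavior is identical, this is purely a stylistic alternative:

// Equivalent to the anonymous Progressable class above (Java 8+)
OutputStream out = fs.create(new Path(dst), () -> System.out.print("."));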

The arguments passed in args are:

/usr/local/hadoop/README.txt input/readme.txt

Run result:

[Screenshot 2: FileCopyWithProgress console output]

Run HDFSFileExist.java to check whether input/readme.txt now exists:

[Screenshot 3: HDFSFileExist console output]

The file exists, so the copy succeeded.
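The source of HDFSFileExist.java is not listed in this post. A minimal reconstruction would look like the following (a sketch, assuming the same Configuration settings as the other examples):

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

// Sketch of HDFSFileExist: reports whether a given HDFS path exists
public class HDFSFileExist {
  public static void main(String[] args) throws Exception {
    Configuration conf = new Configuration();
    conf.set("fs.defaultFS", "hdfs://localhost:9000");
    conf.set("fs.hdfs.impl", "org.apache.hadoop.hdfs.DistributedFileSystem");
    FileSystem fs = FileSystem.get(conf);
    if (fs.exists(new Path("input/readme.txt"))) {
      System.out.println("File exists");
    } else {
      System.out.println("File does not exist");
    }
  }
}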

ListStatus.java

ListStatus.java code:

// cc ListStatus Shows the file statuses for a collection of paths in a Hadoop filesystem 

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileStatus;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.FileUtil;
import org.apache.hadoop.fs.Path;

// vv ListStatus
public class ListStatus {

  public static void main(String[] args) throws Exception {
    // Hard-code the input path instead of reading it from the command line
    String[] args1 = { "input/test" };
    Configuration conf = new Configuration();
    conf.set("fs.defaultFS","hdfs://localhost:9000");
    conf.set("fs.hdfs.impl","org.apache.hadoop.hdfs.DistributedFileSystem");
    FileSystem fs = FileSystem.get(conf);

    Path[] paths = new Path[args1.length];
    for (int i = 0; i < paths.length; i++) {
      paths[i] = new Path(args1[i]);
    }

    FileStatus[] status = fs.listStatus(paths);
    Path[] listedPaths = FileUtil.stat2Paths(status);
    for (Path p : listedPaths) {
      System.out.println(p);
    }
  }
}
// ^^ ListStatus

Run result:

[Screenshot 4: ListStatus console output]
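As an aside, listStatus prints only the immediate children of each given path. FileSystem also offers globStatus for wildcard matching; the following drop-in fragment (the pattern input/* is our own example, reusing the fs, FileStatus, and FileUtil names from ListStatus above) would list everything directly under input:

// Sketch: match paths with a glob pattern instead of listing one directory
FileStatus[] status = fs.globStatus(new Path("input/*"));
for (Path p : FileUtil.stat2Paths(status)) {
  System.out.println(p);
}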

Problems encountered

  • Running the program reported an error about an invalid file path

Solution

  • With fs.defaultFS set to hdfs://localhost:9000, the default HDFS location is under /user/hadoop/, so file paths should be written as relative paths, such as input/test (see the sketch below)
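To see exactly where a relative path resolves, you can print the filesystem's working directory and the fully qualified path (a minimal sketch; the class name ShowWorkingDir is ours):

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

// Hypothetical helper: shows how HDFS resolves relative paths
public class ShowWorkingDir {
  public static void main(String[] args) throws Exception {
    Configuration conf = new Configuration();
    conf.set("fs.defaultFS", "hdfs://localhost:9000");
    FileSystem fs = FileSystem.get(conf);
    // Typically /user/<username>, e.g. hdfs://localhost:9000/user/hadoop
    System.out.println(fs.getWorkingDirectory());
    // A relative path is resolved against the working directory
    System.out.println(fs.makeQualified(new Path("input/test")));
  }
}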
