0. Prerequisite
A hadoop-2.8.0 server has been started on a remote Ubuntu server:
$ netstat -lnpt | grep -i TCP | grep `jps | grep -w NameNode | awk '{print $1}'` | grep "LISTEN"
tcp 0 0 192.168.55.250:8020 0.0.0.0:* LISTEN 319922/java
tcp 0 0 192.168.55.250:50070 0.0.0.0:* LISTEN 319922/java
Note:
8020 is the port of the Hadoop default file system URI; the URI used by the client should be "hdfs://192.168.55.250:8020"
50070 is the port on which the DFS namenode web UI listens
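If you want to confirm the URI on the server side, the hdfs getconf command shipped with Hadoop can print the configured default file system (the exact value depends on core-site.xml and may show a hostname instead of the IP):
$ hdfs getconf -confKey fs.defaultFS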
1. Install hadoop-2.8.0 client
1) download hadoop client binary from https://jar-download.com/artifacts/org.apache.hadoop/hadoop-client?p=4
2) extract binary
$ cd ~/learn/java/java8/
$ tar xvf hadoop-client-2.8.0.tar -C lib/
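As a quick sanity check, confirm the jars were extracted (this assumes the archive expands to a flat directory of jars, which is what the classpath in the compile step expects):
$ ls lib/hadoop-client-2.8.0/*.jar | wc -l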
2. Write and test the Java Hdfs client
1) edit Java code
$ cat Hdfs.java
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;
import java.io.*;

public class Hdfs {
    FileSystem fileSystem;

    public Hdfs(String host, int port) throws IOException {
        // point the client at the remote namenode (same URI as fs.defaultFS on the server)
        Configuration conf = new Configuration();
        conf.set("fs.defaultFS", String.format("hdfs://%s:%d", host, port));
        fileSystem = FileSystem.get(conf);
    }

    public void close() throws IOException {
        fileSystem.close();
    }

    public void createFile(String filePath, String text) throws IOException {
        // create the parent directory first, then write the text into a new file
        java.nio.file.Path path = java.nio.file.Paths.get(filePath);
        Path dir = new Path(path.getParent().toString());
        fileSystem.mkdirs(dir);
        OutputStream os = fileSystem.create(new Path(dir, path.getFileName().toString()));
        BufferedWriter writer = new BufferedWriter(new OutputStreamWriter(os));
        writer.write(text);
        writer.close();     // also closes the underlying HDFS output stream
    }

    public String readFile(String filePath) throws IOException {
        // read at most 256 bytes; enough for this small test file
        InputStream in = fileSystem.open(new Path(filePath));
        byte[] buffer = new byte[256];
        int bytesRead = in.read(buffer);
        in.close();
        return new String(buffer, 0, bytesRead);
    }

    public boolean delFile(String filePath) throws IOException {
        // non-recursive delete; returns false if the file does not exist
        return fileSystem.delete(new Path(filePath), false);
    }

    public static void main(String[] args) throws IOException {
        Hdfs fs = new Hdfs("192.168.55.250", 8020);
        fs.createFile("/tmp/output/hello.txt", "Hello Hadoop");
        System.out.println(fs.readFile("/tmp/output/hello.txt"));
        // myfile.txt was never created, so this delete prints false
        System.out.println(fs.delFile("/tmp/output/myfile.txt"));
        fs.close();
    }
}
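Optionally, a directory-listing helper can be added to the same class to verify results from Java; this is a sketch using the standard FileSystem.listStatus API (the method name listDir is my own addition, not part of the original example):

    public void listDir(String dirPath) throws IOException {
        // print size and path for each entry, similar to "hdfs dfs -ls"
        for (org.apache.hadoop.fs.FileStatus status : fileSystem.listStatus(new Path(dirPath))) {
            System.out.println(status.getLen() + "\t" + status.getPath());
        }
    }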
2) compile class
$ javac -cp "lib/hadoop-client-2.8.0/*" Hdfs.java
3) run the test
$ java -cp "lib/hadoop-client-2.8.0/*:." Hdfs
...
Exception in thread "main" org.apache.hadoop.security.AccessControlException: Permission denied: user=sun_xo, access=WRITE, inode="/tmp":sunxo:supergroup:drwxr-xr-x
at org.apache.hadoop.hdfs.server.namenode.FSPermissionChecker.check(FSPermissionChecker.java:310)
...
4) The error occurs because the client-side user (sun_xo) differs from the server-side user (sunxo) that owns /tmp; the easiest fix is to change the directory mode on the server side
$ hdfs dfs -chmod 777 /tmp
$ hdfs dfs -ls /
drwxrwxrwx - sunxo supergroup 0 2023-05-12 09:11 /tmp
drwxr-xr-x - sunxo supergroup 0 2022-06-14 10:44 /user
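An alternative that avoids opening up /tmp is to run the client as the user the NameNode expects; with simple (non-Kerberos) authentication Hadoop honors the HADOOP_USER_NAME environment variable, so something like the following should work as well (sunxo is the server-side owner shown above):
$ HADOOP_USER_NAME=sunxo java -cp "lib/hadoop-client-2.8.0/*:." Hdfs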
5) rerun the test, it works as follows
$ java -cp "lib/hadoop-client-2.8.0/*:." Hdfs
Hello Hadoop
false
6) check the result from the server side
$ hdfs dfs -ls /tmp/output
-rw-r--r-- 3 sun_xo supergroup 12 2023-05-12 09:26 /tmp/output/hello.txt
$ hdfs dfs -cat /tmp/output/hello.txt
Hello Hadoop
The result can be checked at http://ubuntu:50070/explorer.html#/tmp/output as well
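If WebHDFS is enabled (dfs.webhdfs.enabled, true by default in Hadoop 2.x), the same directory can also be listed over HTTP on the namenode web port:
$ curl "http://ubuntu:50070/webhdfs/v1/tmp/output?op=LISTSTATUS"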