The logs showed the following error:
java.lang.IllegalStateException: Cannot close TFile in the middle of key-value insertion.
at org.apache.hadoop.io.file.tfile.TFile$Writer.close(TFile.java:310)
at org.apache.hadoop.yarn.logaggregation.AggregatedLogFormat$LogWriter.close(AggregatedLogFormat.java:456)
at org.apache.hadoop.yarn.server.nodemanager.containermanager.logaggregation.AppLogAggregatorImpl.uploadLogsForContainers(AppLogAggregatorImpl.java:326)
at org.apache.hadoop.yarn.server.nodemanager.containermanager.logaggregation.AppLogAggregatorImpl.doAppLogAggregation(AppLogAggregatorImpl.java:429)
at org.apache.hadoop.yarn.server.nodemanager.containermanager.logaggregation.AppLogAggregatorImpl.run(AppLogAggregatorImpl.java:388)
at org.apache.hadoop.yarn.server.nodemanager.containermanager.logaggregation.LogAggregationService$2.run(LogAggregationService.java:384)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
at java.lang.Thread.run(Thread.java:750)
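Stepping through the lower-level code in a debugger turned up the following error, although for some reason it was never printed to the log: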
java.lang.UnsatisfiedLinkError: org.apache.hadoop.io.nativeio.NativeIO$POSIX.fstat(Ljava/io/FileDescriptor;)Lorg/apache/hadoop/io/nativeio/NativeIO$POSIX$Stat;
My first guess was that Hadoop's native library had not been loaded.
The only native-related log entry was this one:
[2024-02-09 12:47:08] [WARN] (org.apache.hadoop.util.NativeCodeLoader:62) - Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
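So the native library really was not loaded. Could that alone be why the logs could not be generated?
I started with a quick search (Baidu) on how Hadoop's native library on macOS differs from the one on Linux.
The differences turned out to be substantial.
Hadoop does not ship a package for the Mac platform, so the only way to use Hadoop's native library on a Mac is to build Hadoop by hand.
Starting the build... (I am compiling Hadoop 2.7.3, which is a bit more troublesome.)
Download and install Maven
Install the required packages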
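One quick way to test that guess is to check whether a libhadoop file is visible on the JVM's library search path at all. The sketch below is only an illustration: `NativeLibCheck` and `find` are hypothetical names, not part of Hadoop, and the file name it looks for is whatever `System.mapLibraryName` returns on the current platform.

```java
import java.io.File;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.Optional;

// Hypothetical helper, not part of Hadoop: looks for a native library file
// in each directory of a java.library.path-style string.
class NativeLibCheck {

    static Optional<Path> find(String libName, String pathList) {
        // Platform-specific file name, e.g. "hadoop" -> "libhadoop.so" on Linux
        String fileName = System.mapLibraryName(libName);
        for (String dir : pathList.split(File.pathSeparator)) {
            if (dir.isEmpty()) {
                continue; // skip empty entries such as a trailing separator
            }
            Path candidate = Paths.get(dir, fileName);
            if (Files.isRegularFile(candidate)) {
                return Optional.of(candidate);
            }
        }
        return Optional.empty();
    }

    public static void main(String[] args) {
        String pathList = System.getProperty("java.library.path", "");
        Optional<Path> hit = find("hadoop", pathList);
        System.out.println(hit.map(p -> "found " + p)
                .orElse("libhadoop not found on java.library.path"));
    }
}
```

If nothing is found, the NativeCodeLoader warning above is expected, and the question becomes where to get a libhadoop build for this platform.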
brew install gcc autoconf automake libtool cmake snappy gzip bzip2 zlib
Install OpenSSL 1.0.2 (it has to be 1.0.2; see https://issues.apache.org/jira/browse/HADOOP-14597)
brew install rbenv/tap/openssl@1.0
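Install protobuf
Fetch the protobuf 2.5.0 source; after downloading it you need to change its permissions
## The version must match the one declared in the Hadoop source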
grep 'protobuf.version' */pom.xml
sudo wget https://github.com/protocolbuffers/protobuf/releases/download/v2.5.0/protobuf-2.5.0.tar.gz
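Once you have the protobuf 2.5.0 source, you must edit ./google/protobuf/stubs/atomicops_internals_macosx.h (see: https://github.com/protocolbuffers/protobuf/issues/8836). If the snippet below is not in that file, check platform_macros.h in the same directory, which is where this #error normally lives.
Find the following code: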
#else
#error Host architecture was not detected as supported by protobuf
Insert the following immediately above the #else line:
#elif defined(__arm64__)
#define GOOGLE_PROTOBUF_ARCH_ARM 1
#define GOOGLE_PROTOBUF_ARCH_64_BIT 1
Compile and install protoc
cd protobuf-2.5.0/
./configure
make && make install
protoc --version #libprotoc 2.5.0
Edit the pom.xml of the hadoop-common-project/hadoop-common module in the Hadoop source tree
Find the following code:
<exec executable="cmake" dir="${project.build.directory}/native" failonerror="true">
<arg line="${basedir}/src/ -DGENERATED_JAVAH=${project.build.directory}/native/javah -DJVM_ARCH_DATA_MODEL=${sun.arch.data.model} -DREQUIRE_BZIP2=${require.bzip2} -DREQUIRE_SNAPPY=${require.snappy} -DCUSTOM_SNAPPY_PREFIX=${snappy.prefix} -DCUSTOM_SNAPPY_LIB=${snappy.lib} -DCUSTOM_SNAPPY_INCLUDE=${snappy.include} -DREQUIRE_OPENSSL=${require.openssl} -DCUSTOM_OPENSSL_PREFIX=${openssl.prefix} -DCUSTOM_OPENSSL_LIB=${openssl.lib} -DCUSTOM_OPENSSL_INCLUDE=${openssl.include} -DEXTRA_LIBHADOOP_RPATH=${extra.libhadoop.rpath}"/>
</exec>
Append the zlib argument to the end of the arg element's line attribute:
-DZLIB_LIBRARY=/opt/homebrew/Cellar/zlib/1.3.1/lib
Compile Hadoop 2.7.3
# Put OpenSSL 1.0.2 on the PATH; newer OpenSSL versions cannot be used
export PATH=/opt/homebrew/Cellar/openssl@1.0/1.0.2u:$PATH
mvn clean package -DskipTests -Pdist,native -Dtar -Dopenssl.prefix=/opt/homebrew/Cellar/openssl@1.0/1.0.2u
Here -Dopenssl.prefix points to your own OpenSSL directory.
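With the steps above, the Hadoop compile and build should complete.
After the build, a hadoop-2.7.3 directory should appear under the hadoop-dist module (in its target directory). That directory is the Hadoop distribution itself. To check whether the native library took effect, start any Hadoop component and see whether it still prints the "Unable to load native-hadoop library" warning. (Note: hadoop checknative -a will still report errors, because the libraries installed through brew still cannot be loaded, but that does not stop Hadoop from loading its native library at startup.)
In the Java application, I set the java.library.path property with System.setProperty and started Hadoop again.
The warning about the native library not being loaded still appeared.
The locally built Hadoop does not print this warning when run on its own, so why can't the library location be set through System.setProperty?
It turns out that ClassLoader loads library files through the usr_paths variable, and that variable is populated only once, when the Java program initializes: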
static void initLibraryPaths() {
    usr_paths = initializePath("java.library.path");
    sys_paths = initializePath("sun.boot.library.path");
}

static void loadLibrary(Class<?> fromClass, String name,
                        boolean isAbsolute) {
    ClassLoader loader =
        (fromClass == null) ? null : fromClass.getClassLoader();
    if (sys_paths == null) {
        usr_paths = initializePath("java.library.path");
        sys_paths = initializePath("sun.boot.library.path");
    }
    assert sys_paths != null : "should be initialized at this point";
    assert usr_paths != null : "should be initialized at this point";
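The effect of that one-time initialization can be reproduced with a toy model. The sketch below is not JDK code: `CachedPathsDemo` and the `demo.library.path` property are made-up stand-ins for `ClassLoader` and `java.library.path`, and the static `cached` array plays the role of `usr_paths`.

```java
import java.io.File;

// Toy model with hypothetical names: a cache that reads a system property
// on first use and never looks at the property again.
class CachedPathsDemo {
    static final String PROP = "demo.library.path"; // stand-in for java.library.path
    private static String[] cached;                 // plays the role of usr_paths

    static String[] paths() {
        if (cached == null) {
            // Only the first call reads the property
            cached = System.getProperty(PROP, "").split(File.pathSeparator);
        }
        return cached;
    }

    public static void main(String[] args) {
        System.setProperty(PROP, "/first");
        System.out.println(String.join(",", paths()));
        // Changing the property now updates the property, not the cache
        System.setProperty(PROP, "/second");
        System.out.println(String.join(",", paths()));
    }
}
```

Both calls return the value captured on the first read, which mirrors why a later System.setProperty("java.library.path", ...) is not seen by an already initialized ClassLoader.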
To add a library directory dynamically at runtime, the only option is to modify that field through reflection:
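import java.io.IOException;
import java.lang.reflect.Field;

// Relies on the JDK-internal ClassLoader.usr_paths field (present in Java 8);
// call it before the first Hadoop class tries to load the native library.
public static void addLibraryDir(String libraryPath) throws IOException {
    try {
        Field field = ClassLoader.class.getDeclaredField("usr_paths");
        field.setAccessible(true);
        String[] paths = (String[]) field.get(null);
        // Nothing to do if the directory is already on the search path
        for (int i = 0; i < paths.length; i++) {
            if (libraryPath.equals(paths[i])) {
                return;
            }
        }
        // Append the new directory and write the array back
        String[] tmp = new String[paths.length + 1];
        System.arraycopy(paths, 0, tmp, 0, paths.length);
        tmp[paths.length] = libraryPath;
        field.set(null, tmp);
    } catch (IllegalAccessException e) {
        throw new IOException(
            "Failed to get permissions to set library path", e);
    } catch (NoSuchFieldException e) {
        throw new IOException(
            "Failed to get field handle to set library path", e);
    }
}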
After loading the library directory this way, the warning no longer appeared and the logs were generated normally!
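Because usr_paths is an internal field, the reflection trick depends on the JDK version. It matches the Java 8 sources, but as far as I know newer JDKs hide or no longer have this field. A more defensive variant might report failure instead of throwing, so the caller can fall back to starting the JVM with -Djava.library.path. This is only a sketch; `LibraryPathPatcher` and `tryAddLibraryDir` are hypothetical names.

```java
import java.lang.reflect.Field;
import java.util.Arrays;

// Hypothetical defensive variant: returns false instead of throwing when
// the JDK-internal field cannot be reached.
class LibraryPathPatcher {

    static boolean tryAddLibraryDir(String libraryPath) {
        try {
            Field field = ClassLoader.class.getDeclaredField("usr_paths");
            field.setAccessible(true);
            String[] paths = (String[]) field.get(null);
            if (paths == null) {
                // Search paths not initialized yet; a later init would overwrite us
                return false;
            }
            if (Arrays.asList(paths).contains(libraryPath)) {
                return true; // already on the search path
            }
            String[] extended = Arrays.copyOf(paths, paths.length + 1);
            extended[paths.length] = libraryPath;
            field.set(null, extended);
            return true;
        } catch (ReflectiveOperationException | RuntimeException e) {
            // Field missing, filtered from reflection, or access denied
            return false;
        }
    }

    public static void main(String[] args) {
        // The directory argument is an example; pass your own native lib directory
        String dir = args.length > 0 ? args[0] : ".";
        System.out.println("patched: " + tryAddLibraryDir(dir));
    }
}
```

When this returns false, the directory has to be supplied at JVM startup instead, since the search path can no longer be changed from inside the running program.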