Integrating Hadoop (on an Alibaba Cloud internal network) with Spring Boot, and some pitfalls

Add the dependencies

Both slf4j-log4j12 and servlet-api are excluded: the former conflicts with Spring Boot's default Logback binding, the latter with the embedded servlet container.

<dependency>
    <groupId>org.apache.hadoop</groupId>
    <artifactId>hadoop-client</artifactId>
    <version>2.7.3</version>
    <exclusions>
        <exclusion>
            <groupId>org.slf4j</groupId>
            <artifactId>slf4j-log4j12</artifactId>
        </exclusion>
        <exclusion>
            <groupId>javax.servlet</groupId>
            <artifactId>servlet-api</artifactId>
        </exclusion>
    </exclusions>
</dependency>

<dependency>
    <groupId>org.apache.hadoop</groupId>
    <artifactId>hadoop-common</artifactId>
    <version>2.7.3</version>
    <exclusions>
        <exclusion>
            <groupId>org.slf4j</groupId>
            <artifactId>slf4j-log4j12</artifactId>
        </exclusion>
        <exclusion>
            <groupId>javax.servlet</groupId>
            <artifactId>servlet-api</artifactId>
        </exclusion>
    </exclusions>
</dependency>

<dependency>
    <groupId>org.apache.hadoop</groupId>
    <artifactId>hadoop-hdfs</artifactId>
    <version>2.7.3</version>
    <exclusions>
        <exclusion>
            <groupId>org.slf4j</groupId>
            <artifactId>slf4j-log4j12</artifactId>
        </exclusion>
        <exclusion>
            <groupId>javax.servlet</groupId>
            <artifactId>servlet-api</artifactId>
        </exclusion>
    </exclusions>
</dependency>

Write the integration configuration

import java.net.URI;

import org.apache.hadoop.fs.FileSystem;
import org.springframework.beans.factory.annotation.Value;
import org.springframework.context.annotation.Bean;
import org.springframework.context.annotation.Configuration;

@Configuration
public class HadoopConfig {

    // The address of your Hadoop NameNode, e.g.
    // hadoop.node=hdfs://<Alibaba Cloud public IP>:9000
    @Value("${hadoop.node}")
    private String hadoopNode;

    @Bean("fileSystem")
    public FileSystem createFs() throws Exception {
        // Build the Hadoop client configuration
        org.apache.hadoop.conf.Configuration conf = new org.apache.hadoop.conf.Configuration();
        conf.set("fs.defaultFS", hadoopNode);
        // Ask the NameNode to return datanode hostnames instead of internal IPs (see pitfall 2 below)
        conf.set("dfs.client.use.datanode.hostname", "true");
        conf.set("dfs.replication", "1");
        // Return the file system for the given URI as user "root"; when testing
        // from a local machine, the file system must be obtained this way
        return FileSystem.get(new URI(hadoopNode), conf, "root");
    }

}
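
With the bean registered, the FileSystem can be injected into any component. Below is a minimal sketch of a wrapper service built on that bean; the class and method names (HdfsService, mkdir, upload) are illustrative, not from the original post.

import java.io.IOException;

import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;
import org.springframework.stereotype.Service;

// Hypothetical wrapper around the fileSystem bean; names are illustrative
@Service
public class HdfsService {

    private final FileSystem fileSystem;

    public HdfsService(FileSystem fileSystem) {
        this.fileSystem = fileSystem;
    }

    // Create a directory if it does not already exist
    public boolean mkdir(String path) throws IOException {
        Path dir = new Path(path);
        return fileSystem.exists(dir) || fileSystem.mkdirs(dir);
    }

    // Copy a local file into an HDFS directory
    public void upload(String localFile, String hdfsDir) throws IOException {
        fileSystem.copyFromLocalFile(new Path(localFile), new Path(hdfsDir));
    }
}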

Test code

import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;
import org.junit.jupiter.api.Test;
import org.springframework.beans.factory.annotation.Autowired;
import org.springframework.boot.test.context.SpringBootTest;

@SpringBootTest
class SpringbootIntegrationTestApplicationTests {

    @Autowired
    private FileSystem fileSystem;

    @Test
    void contextLoads() {
    }

    @Test
    void hadoopTest() throws Exception {
        // Upload a local file to HDFS
        fileSystem.copyFromLocalFile(
                new Path("/Users/wuxinxin/IdeaProjects/springboot-integration-test/src/main/resources/application.properties"),
                new Path("/user"));
    }

}
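
To confirm the upload actually landed, a read-back test can follow the same pattern. A minimal sketch, assuming /user already existed as a directory so the file ended up at /user/application.properties:

import java.io.BufferedReader;
import java.io.InputStreamReader;
import java.nio.charset.StandardCharsets;

// Additional test in the same class
@Test
void readBackTest() throws Exception {
    // Assumes /user existed as a directory, so copyFromLocalFile placed the
    // file at /user/application.properties
    Path path = new Path("/user/application.properties");
    try (BufferedReader reader = new BufferedReader(
            new InputStreamReader(fileSystem.open(path), StandardCharsets.UTF_8))) {
        reader.lines().forEach(System.out::println);
    }
}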

Pitfalls and solutions

1. Take the NameNode out of safe mode

Error: Name node is in safe mode
Fix: hadoop dfsadmin -safemode leave (on Hadoop 2.x, hdfs dfsadmin -safemode leave is the non-deprecated form)
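
The shell command above is the standard fix. If you want client code to detect the condition first, the state can be queried through the Hadoop 2.7 API; a sketch using the fileSystem bean from earlier (SAFEMODE_GET only reads the state, and actually leaving safe mode requires HDFS superuser rights):

import org.apache.hadoop.hdfs.DistributedFileSystem;
import org.apache.hadoop.hdfs.protocol.HdfsConstants;

// Query (not change) the NameNode's safe-mode state from client code
if (fileSystem instanceof DistributedFileSystem) {
    DistributedFileSystem dfs = (DistributedFileSystem) fileSystem;
    boolean inSafeMode = dfs.setSafeMode(HdfsConstants.SafeModeAction.SAFEMODE_GET);
    if (inSafeMode) {
        System.out.println("NameNode is in safe mode; run 'hadoop dfsadmin -safemode leave' first");
    }
}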

2. Expose the node under a hostname; otherwise the NameNode returns the Alibaba Cloud internal address and the datanode cannot be reached

Error: Connection refused
Error: Excluding datanode DatanodeInfoWithStorage[172.24.55.121:50010,DS-2b89f9a8-037a-499b-807b-7ea54bc99205,DISK]
Fix:
1. Configure fs.defaultFS with a hostname in etc/hadoop/core-site.xml:
<configuration>
    <property>
        <name>fs.defaultFS</name>
        <value>hdfs://hadoop:9000</value>
    </property>
</configuration>
2. On the Alibaba Cloud server, map that hostname to the internal IP in /etc/hosts:
172.24.55.121 hadoop    # the Alibaba Cloud internal IP
3. Ask the NameNode to return the datanodes' hostnames rather than their internal IPs, otherwise the client cannot reach them (a diagnostic sketch follows this list):
Configuration conf = new Configuration();
conf.set("dfs.client.use.datanode.hostname", "true");
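
To see exactly which address the NameNode hands out for each datanode, the client can dump the datanode report. A small diagnostic sketch against the Hadoop 2.7 API, again using the injected fileSystem bean (getDataNodeStats needs HDFS superuser rights, which the root user above normally has):

import org.apache.hadoop.hdfs.DistributedFileSystem;
import org.apache.hadoop.hdfs.protocol.DatanodeInfo;

// Print the hostname and transfer address the NameNode reports per datanode
if (fileSystem instanceof DistributedFileSystem) {
    DistributedFileSystem dfs = (DistributedFileSystem) fileSystem;
    for (DatanodeInfo dn : dfs.getDataNodeStats()) {
        System.out.println(dn.getHostName() + " -> " + dn.getXferAddr());
    }
}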

3. The datanode hostname returned by the NameNode cannot be resolved locally

Error: DataStreamer Exception
Error: UnresolvedAddressException
Fix: edit the local hosts file and map the datanode hostname to the public IP:
39.10.62.158 hadoop    # the Alibaba Cloud public IP
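
You can verify the mapping from the JVM's point of view before retrying the upload. A minimal sketch, using the hostname "hadoop" from the hosts entry above (note that the JVM may cache DNS results, so restart the application after editing hosts):

import java.net.InetAddress;

// Confirm the datanode hostname now resolves to the public IP
InetAddress addr = InetAddress.getByName("hadoop");
System.out.println("hadoop resolves to " + addr.getHostAddress());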

Check the result

(Screenshots of the result, omitted here.)
