Integrating Hadoop with Spring and Spring Boot

Steps for integrating Hadoop into a Spring application:

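(1) Add the spring-data-hadoop dependency to the Maven pom.xml


    <dependency>
        <groupId>org.springframework.data</groupId>
        <artifactId>spring-data-hadoop</artifactId>
        <version>2.5.0.RELEASE</version>
    </dependency>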

(2) Declare the namespaces in beans.xml under resources

     Copy the header from the official reference documentation:

   https://docs.spring.io/spring-hadoop/docs/2.5.0.RELEASE/reference/html/springandhadoop-config.html


    <?xml version="1.0" encoding="UTF-8"?>
    <beans xmlns="http://www.springframework.org/schema/beans"
           xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
           xmlns:hdp="http://www.springframework.org/schema/hadoop"
           xsi:schemaLocation="
            http://www.springframework.org/schema/beans http://www.springframework.org/schema/beans/spring-beans.xsd
            http://www.springframework.org/schema/hadoop http://www.springframework.org/schema/hadoop/spring-hadoop.xsd">

        <hdp:configuration>
            fs.defaultFS=hdfs://hadoop000:8020
        </hdp:configuration>

        <hdp:file-system id="fileSystem"/>

    </beans>


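Put values that vary between environments in application.properties, for example spring.hadoop.fsUri=hdfs://hadoop000:8020, and reference the key from beans.xml as fs.defaultFS=${spring.hadoop.fsUri} (this needs a property placeholder configured to load the file).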
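To make the meaning of such an entry concrete, here is a minimal sketch that loads a properties entry of the form spring.hadoop.fsUri=hdfs://hadoop000:8020 and splits the value into scheme, host and port using only the JDK. The class name FsUriDemo and its helper methods are hypothetical names chosen for illustration, and the inline text stands in for a real application.properties file; Spring itself resolves the placeholder differently.

```java
import java.io.IOException;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.net.URI;
import java.util.Properties;

class FsUriDemo {

    // Parses properties text; the string stands in for application.properties.
    static Properties load(String text) {
        Properties props = new Properties();
        try {
            props.load(new StringReader(text));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return props;
    }

    // Reads spring.hadoop.fsUri and parses it; fails fast if the key is missing.
    static URI fsUri(Properties props) {
        String value = props.getProperty("spring.hadoop.fsUri");
        if (value == null) {
            throw new IllegalStateException("spring.hadoop.fsUri is not set");
        }
        return URI.create(value);
    }

    public static void main(String[] args) {
        URI uri = fsUri(load("spring.hadoop.fsUri=hdfs://hadoop000:8020\n"));
        // Scheme selects the file system type; host and port locate the NameNode.
        System.out.println("scheme=" + uri.getScheme());
        System.out.println("host=" + uri.getHost());
        System.out.println("port=" + uri.getPort());
    }
}
```

Keeping the full URI under one key means only application.properties changes when the NameNode address does.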

(3) Let Spring inject the FileSystem; the rest of the HDFS code stays the same as without Spring

Accessing HDFS with Spring Hadoop
Under the test source directory, create a spring package and a SpringHadoopApp class in it

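import org.apache.hadoop.fs.FSDataInputStream;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.IOUtils;
import org.junit.After;
import org.junit.Before;
import org.junit.Test;
import org.springframework.context.ApplicationContext;
import org.springframework.context.support.ClassPathXmlApplicationContext;

// JUnit 4 annotations are assumed here; the original listing omitted them.
public class SpringHadoopApp {

    private ApplicationContext ctx;
    private FileSystem fileSystem;

    @Before
    public void setUp() {
        // Load beans.xml from the classpath and look up the bean named "fileSystem"
        ctx = new ClassPathXmlApplicationContext("beans.xml");
        fileSystem = (FileSystem) ctx.getBean("fileSystem");
    }

    @After
    public void tearDown() {
        ctx = null;
    }

    // Create a directory
    @Test
    public void testMkdir() throws Exception {
        fileSystem.mkdirs(new Path("/springhdfs/"));
    }

    // Print the contents of an HDFS file
    @Test
    public void testCat() throws Exception {
        FSDataInputStream in = fileSystem.open(new Path("/springhdfs/hello.txt"));
        IOUtils.copyBytes(in, System.out, 1024);
        in.close();
    }
}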
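The third argument of copyBytes in testCat is the buffer size in bytes. To show what that argument controls, here is a rough plain-JDK sketch of the same kind of copy loop. CopyBytesSketch and its copy method are hypothetical names, not part of the Hadoop API, and an in-memory stream stands in for the HDFS input stream.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;

class CopyBytesSketch {

    // Copies everything from in to out through a buffer of buffSize bytes
    // and returns the number of bytes copied.
    static long copy(InputStream in, OutputStream out, int buffSize) {
        byte[] buf = new byte[buffSize];
        long total = 0;
        try {
            int n;
            // read() returns -1 at end of stream; otherwise the bytes read this round
            while ((n = in.read(buf)) != -1) {
                out.write(buf, 0, n);
                total += n;
            }
            out.flush();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return total;
    }

    public static void main(String[] args) {
        byte[] data = "hello spring hadoop\n".getBytes(StandardCharsets.UTF_8);
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        // A deliberately tiny buffer forces several read/write rounds.
        long copied = copy(new ByteArrayInputStream(data), out, 4);
        System.out.print(new String(out.toByteArray(), StandardCharsets.UTF_8));
        System.out.println("copied " + copied + " bytes");
    }
}
```

A larger buffer means fewer read and write calls for the same data. In the HDFS test, opening the stream in a try-with-resources statement would also close it if the copy throws.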

Appendix: accessing HDFS from Spring Boot

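(1) Add the dependency


      <dependency>
          <groupId>org.springframework.data</groupId>
          <artifactId>spring-data-hadoop-boot</artifactId>
          <version>2.5.0.RELEASE-hadoop25</version>
      </dependency>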

(2) Inject FsShell

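import org.apache.hadoop.fs.FileStatus;
import org.springframework.beans.factory.annotation.Autowired;
import org.springframework.boot.CommandLineRunner;
import org.springframework.boot.SpringApplication;
import org.springframework.boot.autoconfigure.SpringBootApplication;
import org.springframework.data.hadoop.fs.FsShell;

@SpringBootApplication
public class SpringBootHDFSApp implements CommandLineRunner {

    @Autowired
    FsShell fsShell; // offers many methods that mirror the HDFS shell commands

    @Override
    public void run(String... strings) throws Exception {
        // Recursively list everything under /springhdfs
        for (FileStatus fileStatus : fsShell.lsr("/springhdfs")) {
            // Print each entry's path
            System.out.println(fileStatus.getPath());
        }
    }

    public static void main(String[] args) {
        SpringApplication.run(SpringBootHDFSApp.class, args);
    }
}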

The MapReduce and Hive integrations are left for you to explore on your own.
