Big Data Beginner Study Notes (9): Using Hadoop with Spring

Contents

    • Spring Hadoop overview
    • Setting up a Spring Hadoop development environment and accessing HDFS
    • Accessing HDFS with Spring Boot

Spring Hadoop overview

Project page:
http://spring.io/projects/spring-hadoop
Reference documentation:
https://docs.spring.io/spring-hadoop/docs/2.5.0.RELEASE/reference/html/
Hadoop configuration reference:
https://docs.spring.io/spring-hadoop/docs/2.5.0.RELEASE/reference/html/springandhadoop-config.html

Setting up a Spring Hadoop development environment and accessing HDFS

Add the dependency to pom.xml

 
        <dependency>
            <groupId>org.springframework.data</groupId>
            <artifactId>spring-data-hadoop</artifactId>
            <version>2.5.0.RELEASE</version>
        </dependency>

Add a beans.xml configuration file under the resources directory


<beans xmlns="http://www.springframework.org/schema/beans"
       xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
       xmlns:hdp="http://www.springframework.org/schema/hadoop"
       xmlns:context="http://www.springframework.org/schema/context"
       xmlns:util="http://www.springframework.org/schema/util"
       xsi:schemaLocation="http://www.springframework.org/schema/beans http://www.springframework.org/schema/beans/spring-beans.xsd
        http://www.springframework.org/schema/context http://www.springframework.org/schema/context/spring-context.xsd
        http://www.springframework.org/schema/util http://www.springframework.org/schema/util/spring-util.xsd
        http://www.springframework.org/schema/hadoop http://www.springframework.org/schema/hadoop/spring-hadoop.xsd">
    <!-- Hadoop configuration; the placeholder is resolved from application.properties -->
    <hdp:configuration id="hadoopConfiguration">
        fs.defaultFS=${spring.hadoop.fsUri}
    </hdp:configuration>

    <context:property-placeholder location="application.properties"/>

    <hdp:file-system id="fileSystem" configuration-ref="hadoopConfiguration" user="hadoop"/>
</beans>

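In this XML, the hdp:file-system element defines the fileSystem bean that the test code below retrieves. Its configuration-ref attribute points to the hdp:configuration element whose id is hadoopConfiguration, and user is the Hadoop user that has read and write permission on HDFS.

The format of these elements is described in the official documentation linked at the top of this post.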
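
To make the link between `${spring.hadoop.fsUri}` in the XML and the properties file more concrete, here is a minimal plain-Java sketch of placeholder substitution. It is only an illustration of the idea, not Spring's actual implementation; the class name `PlaceholderSketch` and the method `resolve` are hypothetical names introduced for this example.

```java
import java.io.IOException;
import java.io.StringReader;
import java.util.Properties;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical illustration: replaces ${key} tokens with values from a Properties object.
public class PlaceholderSketch {

    // Matches ${some.key} and captures the key name
    private static final Pattern PLACEHOLDER = Pattern.compile("\\$\\{([^}]+)\\}");

    public static String resolve(String text, Properties props) {
        Matcher m = PLACEHOLDER.matcher(text);
        StringBuffer out = new StringBuffer();
        while (m.find()) {
            String key = m.group(1);
            String value = props.getProperty(key);
            if (value == null) {
                // Fail loudly when a key is missing instead of leaving the token in place
                throw new IllegalArgumentException("Could not resolve placeholder: " + key);
            }
            m.appendReplacement(out, Matcher.quoteReplacement(value));
        }
        m.appendTail(out);
        return out.toString();
    }

    public static void main(String[] args) throws IOException {
        // Same key/value pair as the application.properties file in this post
        Properties props = new Properties();
        props.load(new StringReader("spring.hadoop.fsUri=hdfs://hadoop:9000"));
        System.out.println(resolve("fs.defaultFS=${spring.hadoop.fsUri}", props));
        // prints "fs.defaultFS=hdfs://hadoop:9000"
    }
}
```

In the real application this substitution is done by the context:property-placeholder element, so the Hadoop configuration ends up with fs.defaultFS set to the URI from the properties file.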

Create application.properties
This file holds the URI of the Hadoop file system (HDFS):

spring.hadoop.fsUri=hdfs://hadoop:9000
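
As a quick sanity check on the value above, the URI can be split into its parts with the JDK alone. This is a small sketch; the class name `FsUriSketch` is hypothetical. In this post's setup, `hadoop` is assumed to be a host name that resolves to the NameNode machine (for example through the hosts file) and `9000` the port the NameNode listens on; adjust both to match your cluster.

```java
import java.net.URI;

// Hypothetical helper that breaks an HDFS URI into scheme, host and port.
public class FsUriSketch {

    public static String describe(String fsUri) {
        URI uri = URI.create(fsUri);
        return "scheme=" + uri.getScheme()
                + ", host=" + uri.getHost()
                + ", port=" + uri.getPort();
    }

    public static void main(String[] args) {
        System.out.println(describe("hdfs://hadoop:9000"));
        // prints "scheme=hdfs, host=hadoop, port=9000"
    }
}
```

If the host part cannot be resolved from the machine running the code, the client will not be able to reach the NameNode, so this is worth checking first when a connection fails.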
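
Write a JUnit test class that loads the Spring context and accesses HDFS through the fileSystem bean:

package com.kun.hadoop.spring;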


import org.apache.hadoop.fs.FSDataInputStream;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.IOUtils;
import org.junit.After;
import org.junit.Before;
import org.junit.Test;
import org.springframework.context.ApplicationContext;
import org.springframework.context.support.ClassPathXmlApplicationContext;

public class SpringHadoopHDFSApp {
    private ApplicationContext ctx;
    private FileSystem fileSystem;

    /**
     * Create a directory on HDFS
     */
    @Test
    public void testMkdir() throws Exception{
        fileSystem.mkdirs(new Path("/springhdfs"));
    }

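    /**
     * Read the contents of a file on HDFS and print them to stdout
     */
    @Test
    public void testText() throws Exception {
        // try-with-resources closes the stream even if the copy fails
        try (FSDataInputStream in = fileSystem.open(new Path("/springhdfs/test.txt"))) {
            IOUtils.copyBytes(in, System.out, 1024);
        }
    }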


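    @Before
    public void setUp() {
        // Load the beans.xml file created above from the classpath
        ctx = new ClassPathXmlApplicationContext("beans.xml");
        // Look up the bean declared by <hdp:file-system id="fileSystem" .../>
        fileSystem = (FileSystem) ctx.getBean("fileSystem");
    }

    @After
    public void tearDown() {
        // Close the Spring context, then drop the references
        ((ClassPathXmlApplicationContext) ctx).close();
        ctx = null;
        fileSystem = null;
    }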
}

Accessing HDFS with Spring Boot

Add the dependency to pom.xml

     
      <dependency>
          <groupId>org.springframework.data</groupId>
          <artifactId>spring-data-hadoop-boot</artifactId>
          <version>2.5.0.RELEASE</version>
      </dependency>

Usage

package com.kun.hadoop.spring;

import org.apache.hadoop.fs.FileStatus;
import org.springframework.beans.factory.annotation.Autowired;
import org.springframework.boot.CommandLineRunner;
import org.springframework.boot.SpringApplication;
import org.springframework.boot.autoconfigure.SpringBootApplication;
import org.springframework.data.hadoop.fs.FsShell;

/**
 * Access HDFS the Spring Boot way
 */
@SpringBootApplication
public class SpringBootHDFSApp implements CommandLineRunner {

    @Autowired
    FsShell fsShell;

    @Override
    public void run(String... strings) throws Exception {
        // Recursively list everything under /springboothdfs
        for(FileStatus fileStatus:fsShell.lsr("/springboothdfs")){
            System.out.println(">"+ fileStatus.getPath());
        }
    }

    public static void main(String[] args) {
        SpringApplication.run(SpringBootHDFSApp.class,args);
    }
}

Integration with MapReduce, Hive and other components can be configured by following the documentation at the link below:
https://docs.spring.io/spring-hadoop/docs/2.5.0.RELEASE/reference/html/springandhadoop.html