Setting Up a Local Spark (Java) Development Environment on Windows

Install IntelliJ IDEA

Download: https://www.jetbrains.com/idea/download/#section=windows

Choose the Community edition to install.

Launch it after installation; I pick a UI theme here.

[Screenshot 1: UI theme selection]

Keep the default plugins.

[Screenshot 2: default plugin selection]

Install the Scala plugin.

[Screenshot 3: Scala plugin installation]


Configure the Hadoop environment variables

Download winutils.exe:

https://github.com/steveloughran/winutils

I use the hadoop-2.7.1 build here.

Create the directory D:\hadoop-2.7.1\bin and place winutils.exe in it, so the file sits at D:\hadoop-2.7.1\bin\winutils.exe.

Configure the Windows environment variables:

[Screenshot 4: Windows environment variable dialog]

User variables:
add HADOOP_HOME=D:\hadoop-2.7.1
System variables:
append %HADOOP_HOME%\bin to Path
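If editing system environment variables is inconvenient, Hadoop also honors the `hadoop.home.dir` JVM system property, which can be set in code before the first Spark/Hadoop call. A minimal sketch, assuming the D:\hadoop-2.7.1 layout created above (adjust the path to your machine):

```java
public class HadoopHome {
    public static void main(String[] args) {
        // Must run before any Hadoop/Spark class touches the local filesystem,
        // otherwise winutils.exe lookup has already happened.
        // Path is this guide's example location; change it to yours.
        System.setProperty("hadoop.home.dir", "D:\\hadoop-2.7.1");
        System.out.println(System.getProperty("hadoop.home.dir"));
    }
}
```

Setting the property programmatically only affects the current JVM, which is handy when running tests from the IDE.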

Create a new Maven project

[Screenshot 5: new Maven project wizard]

<project xmlns="http://maven.apache.org/POM/4.0.0" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
         xsi:schemaLocation="http://maven.apache.org/POM/4.0.0 http://maven.apache.org/xsd/maven-4.0.0.xsd">
    <modelVersion>4.0.0</modelVersion>

    <groupId>com.spark</groupId>
    <artifactId>sparktest</artifactId>
    <version>2.2.0</version>
    <packaging>jar</packaging>

    <name>sparktest</name>
    <url>http://maven.apache.org</url>

    <properties>
        <project.build.sourceEncoding>UTF-8</project.build.sourceEncoding>
        <spark.version>2.2.0</spark.version>
        <hadoop.version>2.7.1</hadoop.version>
    </properties>

    <dependencies>
        <dependency>
            <groupId>org.apache.spark</groupId>
            <artifactId>spark-sql_2.11</artifactId>
            <version>${spark.version}</version>
        </dependency>

        <dependency>
            <groupId>org.apache.spark</groupId>
            <artifactId>spark-core_2.11</artifactId>
            <version>${spark.version}</version>
        </dependency>

        <dependency>
            <groupId>org.apache.spark</groupId>
            <artifactId>spark-hive_2.11</artifactId>
            <version>${spark.version}</version>
        </dependency>

        <dependency>
            <groupId>org.apache.spark</groupId>
            <artifactId>spark-streaming-kafka-0-10_2.11</artifactId>
            <version>${spark.version}</version>
        </dependency>

        <dependency>
            <groupId>org.apache.spark</groupId>
            <artifactId>spark-streaming_2.11</artifactId>
            <version>${spark.version}</version>
        </dependency>

        <dependency>
            <groupId>org.apache.hadoop</groupId>
            <artifactId>hadoop-common</artifactId>
            <version>${hadoop.version}</version>
        </dependency>

        <dependency>
            <groupId>org.apache.spark</groupId>
            <artifactId>spark-sql-kafka-0-10_2.11</artifactId>
            <version>${spark.version}</version>
        </dependency>

        <dependency>
            <groupId>junit</groupId>
            <artifactId>junit</artifactId>
            <version>3.8.1</version>
            <scope>test</scope>
        </dependency>
    </dependencies>
</project>

Copy the file
https://github.com/apache/spark/blob/master/examples/src/main/resources/employees.json into the project.

[Screenshot 6: employees.json placed in the project root]
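If downloading from GitHub is awkward, an equivalent employees.json can be generated locally. The four rows below mirror the Spark examples file (verify against the repository if the exact contents matter); note that spark.read().json() expects JSON Lines, i.e. one JSON object per line:

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;

public class WriteSampleJson {
    public static void main(String[] args) throws IOException {
        // One JSON object per line (JSON Lines format), no enclosing array.
        String sample = String.join(System.lineSeparator(),
                "{\"name\":\"Michael\", \"salary\":3000}",
                "{\"name\":\"Andy\", \"salary\":4500}",
                "{\"name\":\"Justin\", \"salary\":3500}",
                "{\"name\":\"Berta\", \"salary\":4000}");
        Path target = Paths.get("employees.json");
        Files.write(target, sample.getBytes(StandardCharsets.UTF_8));
        System.out.println(Files.readAllLines(target).size()); // prints 4
    }
}
```

The file lands in the working directory, which for an IDE run defaults to the project root, so the relative path "employees.json" in the test code below resolves to it.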

Test code

package com.spark;

import org.apache.spark.sql.Dataset;
import org.apache.spark.sql.Row;
import org.apache.spark.sql.SparkSession;

/**
 * Hello world!
 *
 */
public class App 
{
    public static void main( String[] args )
    {
        SparkSession spark = SparkSession.builder().appName("spark-test").master("local[3]").getOrCreate();
        Dataset<Row> result = spark.read().json("employees.json");
        result.show();
        result.printSchema();
        spark.stop();
    }
}

Run result

[Screenshot 7: console output of show() and printSchema()]

Done!
