Configuring a Spark Environment with Maven in an IDE (Scala Version)

Create a new project


Configure the project information


Configure the pom file

Copy the pom configuration below (lines 5-8 need to be changed).

<?xml version="1.0" encoding="UTF-8"?>
<project xmlns="http://maven.apache.org/POM/4.0.0"
         xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
         xsi:schemaLocation="http://maven.apache.org/POM/4.0.0 http://maven.apache.org/xsd/maven-4.0.0.xsd">
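    <modelVersion>CHANGE ME (your model version; it is in the original pom file)</modelVersion>
    <groupId>CHANGE ME (your group ID; it is in the original pom file)</groupId>
    <artifactId>CHANGE ME (your artifact ID; it is in the original pom file)</artifactId>
    <version>CHANGE ME (your version; it is in the original pom file)</version>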

    <repositories>
        <repository>
            <id>Akka repository</id>
            <url>http://repo.akka.io/releases</url>
        </repository>
    </repositories>

    <build>
        <sourceDirectory>src/main/scala/</sourceDirectory>
        <testSourceDirectory>src/test/scala/</testSourceDirectory>

        <plugins>
            <plugin>
                <groupId>org.scala-tools</groupId>
                <artifactId>maven-scala-plugin</artifactId>
                <executions>
                    <execution>
                        <goals>
                            <goal>compile</goal>
                            <goal>testCompile</goal>
                        </goals>
                    </execution>
                </executions>
                <configuration>
                    <scalaVersion>2.11.4</scalaVersion>
                </configuration>
            </plugin>

            <plugin>
                <groupId>org.apache.maven.plugins</groupId>
                <artifactId>maven-shade-plugin</artifactId>
                <version>2.4.3</version>
                <executions>
                    <execution>
                        <phase>package</phase>
                        <goals>
                            <goal>shade</goal>
                        </goals>
                        <configuration>
                            <filters>
                                <filter>
                                    <artifact>*:*</artifact>
                                    <excludes>
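                                        <exclude>META-INF/*.SF</exclude>
                                        <exclude>META-INF/*.DSA</exclude>
                                        <exclude>META-INF/*.RSA</exclude>
                                    </excludes>
                                </filter>
                            </filters>
                            <transformers>
                                <transformer implementation="org.apache.maven.plugins.shade.resource.AppendingTransformer">
                                    <resource>reference.conf</resource>
                                </transformer>
                                <transformer implementation="org.apache.maven.plugins.shade.resource.ManifestResourceTransformer">
                                    <manifestEntries>
                                        <Main-Class></Main-Class>
                                    </manifestEntries>
                                </transformer>
                            </transformers>
                        </configuration>
                    </execution>
                </executions>
            </plugin>
            <plugin>
                <groupId>org.apache.maven.plugins</groupId>
                <artifactId>maven-compiler-plugin</artifactId>
                <configuration>
                    <source>1.6</source>
                    <target>1.6</target>
                </configuration>
            </plugin>
        </plugins>
    </build>

    <dependencies>
        <dependency>
            <groupId>org.apache.spark</groupId>
            <artifactId>spark-core_2.11</artifactId>
            <version>2.2.1</version>
        </dependency>

        <dependency>
            <groupId>org.apache.spark</groupId>
            <artifactId>spark-hive_2.11</artifactId>
            <version>2.2.1</version>
        </dependency>

        <dependency>
            <groupId>org.apache.hadoop</groupId>
            <artifactId>hadoop-client</artifactId>
            <version>2.7.2</version>
        </dependency>

        <dependency>
            <groupId>org.apache.spark</groupId>
            <artifactId>spark-streaming_2.11</artifactId>
            <version>2.2.1</version>
        </dependency>

        <dependency>
            <groupId>org.apache.spark</groupId>
            <artifactId>spark-sql_2.11</artifactId>
            <version>2.2.1</version>
        </dependency>

        <dependency>
            <groupId>org.apache.hive</groupId>
            <artifactId>hive-exec</artifactId>
            <version>1.2.1</version>
        </dependency>

        <dependency>
            <groupId>org.apache.hive</groupId>
            <artifactId>hive-jdbc</artifactId>
            <version>1.2.1</version>
        </dependency>

        <dependency>
            <groupId>redis.clients</groupId>
            <artifactId>jedis</artifactId>
            <version>2.2.1</version>
            <type>jar</type>
            <scope>compile</scope>
        </dependency>
        <dependency>
            <groupId>org.apache.hbase</groupId>
            <artifactId>hbase-client</artifactId>
            <version>1.2.1</version>
        </dependency>
        <dependency>
            <groupId>org.apache.hbase</groupId>
            <artifactId>hbase-common</artifactId>
            <version>1.2.1</version>
        </dependency>
        <dependency>
            <groupId>org.apache.kafka</groupId>
            <artifactId>kafka-clients</artifactId>
            <version>0.8.2.2</version>
        </dependency>

        <dependency>
            <groupId>mysql</groupId>
            <artifactId>mysql-connector-java</artifactId>
            <version>5.1.37</version>
        </dependency>

        <dependency>
            <groupId>org.apache.kafka</groupId>
            <artifactId>kafka_2.11</artifactId>
            <version>0.8.2.2</version>
        </dependency>

        <dependency>
            <groupId>org.apache.spark</groupId>
            <artifactId>spark-mllib_2.11</artifactId>
            <version>2.2.1</version>
        </dependency>

    </dependencies>
</project>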

Configure settings.xml (by default in the ".m2" folder under your user home directory)

Create settings.xml and copy the code below (line 11 needs to be changed).

<?xml version="1.0" encoding="UTF-8"?>
<settings xmlns="http://maven.apache.org/SETTINGS/1.0.0" 
    xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" 
    xsi:schemaLocation="http://maven.apache.org/SETTINGS/1.0.0 http://maven.apache.org/xsd/settings-1.0.0.xsd">
    
    <pluginGroups />
    <proxies />
    <servers />
    
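    <!-- Jar packages that Maven downloads automatically are stored in this directory -->
    <localRepository>CHANGE ME (where do you want the downloaded packages to be stored?)</localRepository>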
    
    <mirrors>

        <mirror>
            <id>alimaven</id>
            <mirrorOf>central</mirrorOf>
            <name>aliyun maven</name>
            <url>http://maven.aliyun.com/nexus/content/repositories/central/</url>
        </mirror>

        <mirror>
            <id>alimaven-public</id>
            <name>aliyun maven</name>
            <url>http://maven.aliyun.com/nexus/content/groups/public/</url>
            <mirrorOf>central</mirrorOf>
        </mirror>

        <mirror>
            <id>central</id>
            <name>Maven Repository Switchboard</name>
            <url>http://repo1.maven.org/maven2/</url>
            <mirrorOf>central</mirrorOf>
        </mirror>

        <mirror>
            <id>repo2</id>
            <mirrorOf>central</mirrorOf>
            <name>Human Readable Name for this Mirror.</name>
            <url>http://repo2.maven.org/maven2/</url>
        </mirror>

        <mirror>
            <id>ibiblio</id>
            <mirrorOf>central</mirrorOf>
            <name>Human Readable Name for this Mirror.</name>
            <url>http://mirrors.ibiblio.org/pub/mirrors/maven2/</url>
        </mirror>

        <mirror>
            <id>jboss-public-repository-group</id>
            <mirrorOf>central</mirrorOf>
            <name>JBoss Public Repository Group</name>
            <url>http://repository.jboss.org/nexus/content/groups/public</url>
        </mirror>

        <mirror>
            <id>google-maven-central</id>
            <name>Google Maven Central</name>
            <url>https://maven-central.storage.googleapis.com</url>
            <mirrorOf>central</mirrorOf>
        </mirror>
        
        <!-- Mirror of the central repository in China -->
        <mirror>
            <id>maven.net.cn</id>
            <name>one of the central mirrors in China</name>
            <url>http://maven.net.cn/content/groups/public/</url>
            <mirrorOf>central</mirrorOf>
        </mirror>
    </mirrors>
    
</settings>
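Maven reads the per-user file named settings.xml (with a trailing "s") from the .m2 folder in the user home directory, so a misnamed or misplaced file is silently ignored. The following is a minimal sketch for checking that the file is where Maven expects it. The object name SettingsLocation and the helper defaultSettingsPath are hypothetical names chosen for this illustration, and the check assumes the default location has not been overridden.

```scala
import java.nio.file.{Files, Path, Paths}

// Hypothetical helper, not part of Maven: builds the default
// per-user settings path and reports whether the file exists.
object SettingsLocation {
  /** Default per-user settings path: <user home>/.m2/settings.xml. */
  def defaultSettingsPath(userHome: String): Path =
    Paths.get(userHome, ".m2", "settings.xml")

  def main(args: Array[String]): Unit = {
    // "user.home" is the standard JVM system property for the home directory.
    val path = defaultSettingsPath(System.getProperty("user.home"))
    val status = if (Files.exists(path)) "found" else "missing"
    println(s"$path: $status")
  }
}
```

If the output says "missing", check the file name and folder before running Reimport.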

Configure the scala folder

Right-click to create the folder.


Reimport (sync the jar packages)

Import the Scala SDK

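The Spark artifacts in the pom carry a `_2.11` suffix, which means they were built for Scala 2.11, so the imported Scala SDK should be a 2.11.x release as well. A small sketch for confirming this from inside the project is shown here. The object name ScalaVersionCheck and the helper binaryVersion are hypothetical names used only for this example.

```scala
import scala.util.Properties

// Hypothetical helper: compares the running Scala version with the
// binary version implied by the `_2.11` suffix of the Spark artifacts.
object ScalaVersionCheck {
  // Assumption: the pom uses artifacts such as spark-core_2.11.
  val expectedBinaryVersion = "2.11"

  /** Reduces a full version such as "2.11.4" to its binary version "2.11". */
  def binaryVersion(full: String): String =
    full.split('.').take(2).mkString(".")

  def main(args: Array[String]): Unit = {
    val actual = binaryVersion(Properties.versionNumberString)
    if (actual == expectedBinaryVersion)
      println(s"Scala $actual matches the _$expectedBinaryVersion artifacts")
    else
      println(s"Scala $actual does not match the _$expectedBinaryVersion artifacts")
  }
}
```

A mismatch here usually shows up later as binary-compatibility errors at runtime, so it is cheaper to check before writing the first job.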

WordCount (the equivalent of "Hello World" in Java)


import java.text.SimpleDateFormat
import java.util.Date

import org.apache.log4j.{Level, Logger}
import org.apache.spark.{SparkConf, SparkContext}

object wordCount {
  def main(args: Array[String]): Unit = {
    // Set the log output level
    Logger.getLogger("org").setLevel(Level.WARN)
    // Configure the RDD environment; the current timestamp is used as the application name
    val dataNow = new SimpleDateFormat("yyyy-MM-dd-HH:mm ").format(new Date)
    val sparkconf = new SparkConf().setAppName(dataNow).setMaster("local[*]")
    val sparkcontext = new SparkContext(sparkconf)
    // Read the file
    val filePath = "/Users/apple/IdeaProjects/SparkInBigData/src/main/scala/wordCount.txt"
    val rdd1 = sparkcontext.textFile(filePath)
    rdd1.flatMap(t => t.split(" "))
      .map(word => (word, 1))
      .reduceByKey(_ + _) // sum the counts for each word
      .sortBy(_._2, ascending = false) // sort by the second element (the count), descending
      .collect().foreach(println) // collect gathers the results, foreach iterates over them, println prints each one
    sparkcontext.stop()
  }
}
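To see what each step of the pipeline computes without starting Spark, the same logic can be sketched with ordinary Scala collections. This is only an illustration under stated assumptions: LocalWordCount and countWords are hypothetical names, and groupBy followed by a sum stands in for Spark's reduceByKey, which does the same per-key aggregation across partitions.

```scala
// Plain-collections sketch of the word count; no Spark dependency.
object LocalWordCount {
  /** Splits on single spaces, tallies each word, sorts by count descending. */
  def countWords(lines: Seq[String]): Seq[(String, Int)] =
    lines
      .flatMap(_.split(" "))                 // one element per word
      .map(word => (word, 1))                // pair each word with 1
      .groupBy(_._1)                         // local stand-in for grouping by key
      .map { case (word, pairs) => (word, pairs.map(_._2).sum) } // sum the 1s per word
      .toSeq
      .sortBy { case (_, count) => -count }  // highest count first

  def main(args: Array[String]): Unit = {
    // Two short sample lines in the style of the data source.
    val sample = Seq("not a lawyer not a doctor", "not actors")
    countWords(sample).foreach(println)
  }
}
```

Words with equal counts may appear in any order, both here and in the Spark version, because only the count is used as the sort key.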

Data source

Everyone has their own dreams I am the same But my
dream is not a lawyer not a doctor not actors not
even an industry Perhaps my dream big people will
find it ridiculous but this has been my pursuit
My dream is to want to have a folk life I want it
to become a beautiful painting it is not only sharp
colors but also the colors are bleak I do not rule
out the painting is part of the black but I will
treasure these bleak colors Not yet how about a
colorful painting if not bleak add color how can
it more prominent American Life is like painting
painting the bright red color represents life beautiful
happy moments Painting a bleak color represents life
difficult unpleasant time You may find a flat with
a beautiful road is not very good yet  but I do not
think it will If a person lives flat then what is
the point Life is only a short few decades  I want
it to go Finally Each memory is a solid

Result
