Big Data IMF Legendary Action: Java Maven project (pom.xml configuration), running word count in local mode

1. Download Eclipse
    Go to www.eclipse.org/downloads
    Download the "Eclipse IDE for Java EE Developers" edition.

2. Install Java 1.8
   and Scala 2.10.4.

3. Unpack the Eclipse IDE for Java EE archive.

4. Create a new Maven project: File -> New -> Other... -> Maven Project

5. Select the maven-archetype-quickstart 1.1 archetype.

6. Enter:
   Group Id: com.dt.spark
   Artifact Id: SparkApps

7. Change the JRE System Library (defaults to J2SE-1.5):
   Build Path -> Configure Build Path -> Workspace default JRE (jre1.8.0_65)

8. Create a new package: com.dt.spark.SparkApps.cores

9. Create a new class: WordCount

10. Configure pom.xml
    Consult the documentation on the official site, spark.apache.org.
    Maven dependency coordinates are listed at
    http://maven.outofmemory.cn/org.apache.spark/

11. Saving the pom.xml configuration triggers the download of the dependency jars.

12. If you run out of memory when running:
    In Eclipse, open Window -> Preferences -> Java -> Installed JREs, click the
    Edit button on the right, and in the "Default VM Arguments" field of the
    edit dialog enter:
    -Xms128m -Xmx512m

13. Read the input file into an RDD:

    JavaRDD<String> lines = sc.textFile(
        "G://IMFBigDataSpark2016//Bigdata_Software//spark-1.6.0-bin-hadoop2.6//spark-1.6.0-bin-hadoop2.6//spark-1.6.0-bin-hadoop2.6//README.md");
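The notes only show the textFile line; a full driver around it might look like the following sketch for the Spark 1.6 Java API (the `local` master, the app name, and the anonymous-class style are assumptions consistent with the pre-Java-8 API; adjust the input path to your own copy of README.md):

```java
import java.util.Arrays;

import org.apache.spark.SparkConf;
import org.apache.spark.api.java.JavaPairRDD;
import org.apache.spark.api.java.JavaRDD;
import org.apache.spark.api.java.JavaSparkContext;
import org.apache.spark.api.java.function.FlatMapFunction;
import org.apache.spark.api.java.function.Function2;
import org.apache.spark.api.java.function.PairFunction;
import org.apache.spark.api.java.function.VoidFunction;

import scala.Tuple2;

public class WordCount {
    public static void main(String[] args) {
        // Local mode: driver and executor run inside this single JVM.
        SparkConf conf = new SparkConf().setAppName("WordCount").setMaster("local");
        JavaSparkContext sc = new JavaSparkContext(conf);

        // Path from the notes above; replace with your own README.md location.
        JavaRDD<String> lines = sc.textFile(
            "G://IMFBigDataSpark2016//Bigdata_Software//spark-1.6.0-bin-hadoop2.6//spark-1.6.0-bin-hadoop2.6//spark-1.6.0-bin-hadoop2.6//README.md");

        // Split each line into words.
        JavaRDD<String> words = lines.flatMap(new FlatMapFunction<String, String>() {
            public Iterable<String> call(String line) {
                return Arrays.asList(line.split(" "));
            }
        });

        // Map each word to a (word, 1) pair.
        JavaPairRDD<String, Integer> pairs = words.mapToPair(new PairFunction<String, String, Integer>() {
            public Tuple2<String, Integer> call(String word) {
                return new Tuple2<String, Integer>(word, 1);
            }
        });

        // Sum the counts per word.
        JavaPairRDD<String, Integer> counts = pairs.reduceByKey(new Function2<Integer, Integer, Integer>() {
            public Integer call(Integer a, Integer b) {
                return a + b;
            }
        });

        // Print results in the "word : count" form seen in the run output.
        counts.foreach(new VoidFunction<Tuple2<String, Integer>>() {
            public void call(Tuple2<String, Integer> pair) {
                System.out.println(pair._1 + " : " + pair._2);
            }
        });

        sc.stop();
    }
}
```

Note that in Spark 1.6 `FlatMapFunction.call` returns an `Iterable`; from Spark 2.0 onward it returns an `Iterator`, so this sketch is tied to the 1.6.0 dependency configured in the pom.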
 
Runs OK:

16/01/16 20:20:19 INFO ShuffleBlockFetcherIterator: Started 0 remote fetches in 129 ms
package : 1
For : 2
Programs : 1
processing. : 1
Because : 1
The : 1
cluster. : 1
its : 1
[run : 1
APIs : 1
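The numbers above are plain per-token tallies. For reference, the same counting logic in ordinary Java, without Spark (the class and helper names are hypothetical, for illustration only):

```java
import java.util.HashMap;
import java.util.Map;

public class LocalWordCount {
    // Tally whitespace-separated tokens, as the Spark job does in parallel.
    static Map<String, Integer> count(String text) {
        Map<String, Integer> counts = new HashMap<String, Integer>();
        for (String word : text.split("\\s+")) {
            if (word.isEmpty()) {
                continue;
            }
            Integer seen = counts.get(word);
            counts.put(word, seen == null ? 1 : seen + 1);
        }
        return counts;
    }

    public static void main(String[] args) {
        // "For" appears twice, matching the "For : 2" line in the sample output.
        System.out.println(count("For one For two").get("For"));
    }
}
```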

14. pom.xml configuration


<project xmlns="http://maven.apache.org/POM/4.0.0"
  xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
  xsi:schemaLocation="http://maven.apache.org/POM/4.0.0
                      http://maven.apache.org/xsd/maven-4.0.0.xsd">
  <modelVersion>4.0.0</modelVersion>

  <groupId>com.dt.spark</groupId>
  <artifactId>SparkApps</artifactId>
  <version>0.0.1-SNAPSHOT</version>
  <packaging>jar</packaging>

  <name>SparkApps</name>
  <url>http://maven.apache.org</url>

  <properties>
    <project.build.sourceEncoding>UTF-8</project.build.sourceEncoding>
  </properties>

  <dependencies>
    <dependency>
      <groupId>junit</groupId>
      <artifactId>junit</artifactId>
      <version>3.8.1</version>
      <scope>test</scope>
    </dependency>
    <dependency>
      <groupId>org.apache.spark</groupId>
      <artifactId>spark-core_2.10</artifactId>
      <version>1.6.0</version>
    </dependency>
    <dependency>
      <groupId>org.apache.spark</groupId>
      <artifactId>spark-sql_2.10</artifactId>
      <version>1.6.0</version>
    </dependency>
    <dependency>
      <groupId>org.apache.spark</groupId>
      <artifactId>spark-hive_2.10</artifactId>
      <version>1.6.0</version>
    </dependency>
    <dependency>
      <groupId>org.apache.spark</groupId>
      <artifactId>spark-streaming_2.10</artifactId>
      <version>1.6.0</version>
    </dependency>
    <dependency>
      <groupId>org.apache.hadoop</groupId>
      <artifactId>hadoop-client</artifactId>
      <version>2.6.0</version>
    </dependency>
    <dependency>
      <groupId>org.apache.spark</groupId>
      <artifactId>spark-streaming-kafka_2.10</artifactId>
      <version>1.6.0</version>
    </dependency>
    <dependency>
      <groupId>org.apache.spark</groupId>
      <artifactId>spark-graphx_2.10</artifactId>
      <version>1.6.0</version>
    </dependency>
   
  </dependencies>
 
   <build>
    <sourceDirectory>src/main/java</sourceDirectory>
    <testSourceDirectory>src/main/test</testSourceDirectory>

    <plugins>
      <plugin>
        <artifactId>maven-assembly-plugin</artifactId>
        <configuration>
          <descriptorRefs>
            <descriptorRef>jar-with-dependencies</descriptorRef>
          </descriptorRefs>
          <archive>
            <manifest>
              <mainClass></mainClass>
            </manifest>
          </archive>
        </configuration>
        <executions>
          <execution>
            <id>make-assembly</id>
            <phase>package</phase>
            <goals>
              <goal>single</goal>
            </goals>
          </execution>
        </executions>
      </plugin>

      <plugin>
        <groupId>org.codehaus.mojo</groupId>
        <artifactId>exec-maven-plugin</artifactId>
        <version>1.2.1</version>
        <executions>
          <execution>
            <goals>
              <goal>exec</goal>
            </goals>
          </execution>
        </executions>
        <configuration>
          <executable>java</executable>
          <includeProjectDependencies>true</includeProjectDependencies>
          <includePluginDependencies>false</includePluginDependencies>
          <classpathScope>compile</classpathScope>
          <mainClass>com.dt.spark.App</mainClass>
        </configuration>
      </plugin>

      <plugin>
        <groupId>org.apache.maven.plugins</groupId>
        <artifactId>maven-compiler-plugin</artifactId>
        <configuration>
          <source>1.6</source>
          <target>1.6</target>
        </configuration>
      </plugin>

    </plugins>
  </build>
</project>
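Since the assembly plugin's `single` goal is bound to the `package` phase, building and running a self-contained jar from the command line should look roughly like this (the jar name follows from the artifactId and version above; the main-class name assumes the WordCount class from step 9 and is passed explicitly because `<mainClass>` is left empty in the manifest configuration):

```shell
# Build; the assembly plugin runs automatically during the package phase
# and produces target/SparkApps-0.0.1-SNAPSHOT-jar-with-dependencies.jar.
mvn clean package

# Run the fat jar, naming the main class explicitly.
java -cp target/SparkApps-0.0.1-SNAPSHOT-jar-with-dependencies.jar \
     com.dt.spark.SparkApps.cores.WordCount
```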
