HBase in Action (6): Operating HBase 1.2.0 Directly with Spark 2.2.1

Topics already verified on the HBase system in earlier posts:

       HBase distributed cluster setup: (link)

  1. Connecting to HBase and operating on its data directly with the Python API. (link)
  2. Connecting to HBase and operating on its data directly with the Java API. (link)
  3. Operating on HBase data indirectly through Hive with the spark-sql tool. (link)
  4. Operating on HBase data with Hive SQL. (link)

This installment of the big-data lab covers:

     5. Operating on HBase 1.2.0 data directly with Spark 2.2.1.

The test code:

package HbaseTest.sparkconnectHbase;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.client.Result;
import org.apache.hadoop.hbase.io.ImmutableBytesWritable;
import org.apache.hadoop.hbase.mapreduce.TableInputFormat;
import org.apache.hadoop.hbase.util.Bytes;
import org.apache.spark.SparkConf;
import org.apache.spark.api.java.JavaPairRDD;
import org.apache.spark.api.java.JavaSparkContext;
import org.apache.spark.api.java.function.VoidFunction;
import scala.Tuple2;

/***
 * Connect to HBase 1.2.0 directly from Spark 2.2.1 via TableInputFormat.
 * */
public class SparkConnectHbaseTest {

    public static void main(String[] args) {
        // HBase client settings: ZooKeeper quorum/port and the HMaster address.
        Configuration confhbase = HBaseConfiguration.create();
        confhbase.set("hbase.zookeeper.property.clientPort", "2181");
        confhbase.set("hbase.zookeeper.quorum", "192.168.189.1,192.168.189.2,192.168.189.3");
        confhbase.set("hbase.master", "192.168.189.1:60000");

        // Table to scan: namespace db_res, table wtb_ow_operation.
        confhbase.set(TableInputFormat.INPUT_TABLE, "db_res:wtb_ow_operation");

        SparkConf conf = new SparkConf().setAppName("Spark_Connect_Hbase_Test");
        JavaSparkContext sc = new JavaSparkContext(conf);

        // Read the whole table as an RDD of (row key, Result) pairs.
        JavaPairRDD<ImmutableBytesWritable, Result> resultRDD =
                sc.newAPIHadoopRDD(confhbase, TableInputFormat.class,
                        ImmutableBytesWritable.class, Result.class);

        long count = resultRDD.count();
        System.out.print("************SPARK from hbase  count ***************      " + count + "                 ");

        // Print the row key and the info:operate_begin_time column of every record.
        resultRDD.foreach(new VoidFunction<Tuple2<ImmutableBytesWritable, Result>>() {
            @Override
            public void call(Tuple2<ImmutableBytesWritable, Result> v1) throws Exception {
                String key = Bytes.toString(v1._2().getRow());
                String operate_begin_time = Bytes.toString(v1._2().getValue(Bytes.toBytes("info"), Bytes.toBytes("operate_begin_time")));
                System.out.print("==================spark from hbase  record=========== :  " + key + "  " + operate_begin_time);
            }
        });

        // Keep the driver alive so the finished jobs can still be inspected in the Spark Web UI.
        while (true) {

        }
    }
}

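TableInputFormat builds its Scan from the Hadoop Configuration, so the full-table scan above can be narrowed without any other code change. A minimal sketch, assuming it is placed before the newAPIHadoopRDD() call in the class above (the row-key range and caching size are only illustrative values):

        // Restrict the scan to the info:operate_begin_time column and an illustrative row-key range.
        confhbase.set(TableInputFormat.SCAN_COLUMNS, "info:operate_begin_time");
        confhbase.set(TableInputFormat.SCAN_ROW_START, "row-0001");   // inclusive start key (example value)
        confhbase.set(TableInputFormat.SCAN_ROW_STOP, "row-1000");    // exclusive stop key (example value)
        confhbase.set(TableInputFormat.SCAN_CACHEDROWS, "500");       // rows fetched per scanner RPC

        JavaPairRDD<ImmutableBytesWritable, Result> narrowedRDD =
                sc.newAPIHadoopRDD(confhbase, TableInputFormat.class,
                        ImmutableBytesWritable.class, Result.class);

Narrowing the scan this way cuts both the number of HBase RPCs and the amount of data pulled into the RDD, which matters once the table holds more than a test record.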
The pom file:



<?xml version="1.0" encoding="UTF-8"?>
<!-- Reconstructed pom.xml: the XML tags were lost in the original post and have been restored
     around the surviving values; property and element names marked below are inferred. -->
<project xmlns="http://maven.apache.org/POM/4.0.0"
         xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
         xsi:schemaLocation="http://maven.apache.org/POM/4.0.0 http://maven.apache.org/xsd/maven-4.0.0.xsd">
    <modelVersion>4.0.0</modelVersion>

    <groupId>noc_hbase_test</groupId>
    <artifactId>noc_hbase_test</artifactId>
    <version>1.0-SNAPSHOT</version>

    <properties>
        <scala.version>2.11.8</scala.version>
        <spark.version>2.2.1</spark.version>
        <!-- The next four property names are inferred; only their values survive in the post. -->
        <log4j2.version>2.8.2</log4j2.version>
        <log4j.version>1.2.14</log4j.version>
        <jetty.version>9.2.5.v20141112</jetty.version>
        <checkstyle.version>2.17</checkstyle.version>
        <java.version>1.8</java.version>
        <hbase.version>1.2.0</hbase.version>
    </properties>

    <repositories>
        <repository>
            <id>scala-tools.org</id>
            <name>Scala-Tools Maven2 Repository</name>
            <url>http://scala-tools.org/repo-releases</url>
        </repository>
    </repositories>

    <pluginRepositories>
        <pluginRepository>
            <id>scala-tools.org</id>
            <name>Scala-Tools Maven2 Repository</name>
            <url>http://scala-tools.org/repo-releases</url>
        </pluginRepository>
    </pluginRepositories>

    <dependencies>
        <dependency>
            <groupId>org.scala-lang</groupId>
            <artifactId>scala-library</artifactId>
            <version>${scala.version}</version>
        </dependency>
        <dependency>
            <groupId>org.scala-lang</groupId>
            <artifactId>scala-compiler</artifactId>
            <version>${scala.version}</version>
        </dependency>
        <dependency>
            <groupId>org.scala-lang</groupId>
            <artifactId>scala-reflect</artifactId>
            <version>${scala.version}</version>
        </dependency>
        <dependency>
            <groupId>org.scala-lang</groupId>
            <artifactId>scalap</artifactId>
            <version>${scala.version}</version>
        </dependency>

        <dependency>
            <groupId>org.apache.spark</groupId>
            <artifactId>spark-core_2.10</artifactId>
            <version>${spark.version}</version>
        </dependency>
        <dependency>
            <groupId>org.apache.spark</groupId>
            <artifactId>spark-launcher_2.10</artifactId>
            <version>${spark.version}</version>
        </dependency>
        <dependency>
            <groupId>org.apache.spark</groupId>
            <artifactId>spark-network-shuffle_2.10</artifactId>
            <version>${spark.version}</version>
        </dependency>
        <dependency>
            <groupId>org.apache.spark</groupId>
            <artifactId>spark-sql_2.10</artifactId>
            <version>${spark.version}</version>
        </dependency>
        <dependency>
            <groupId>org.apache.spark</groupId>
            <artifactId>spark-hive_2.10</artifactId>
            <version>${spark.version}</version>
        </dependency>
        <dependency>
            <groupId>org.apache.spark</groupId>
            <artifactId>spark-catalyst_2.10</artifactId>
            <version>${spark.version}</version>
        </dependency>
        <dependency>
            <groupId>org.apache.spark</groupId>
            <artifactId>spark-repl_2.10</artifactId>
            <version>${spark.version}</version>
        </dependency>

        <dependency>
            <groupId>org.apache.hive</groupId>
            <artifactId>hive-jdbc</artifactId>
            <version>1.2.1</version>
        </dependency>

        <!-- HBase -->
        <dependency>
            <groupId>org.apache.hbase</groupId>
            <artifactId>hbase-client</artifactId>
            <version>${hbase.version}</version>
            <exclusions>
                <exclusion>
                    <groupId>org.slf4j</groupId>
                    <artifactId>slf4j-log4j12</artifactId>
                </exclusion>
            </exclusions>
        </dependency>
        <dependency>
            <groupId>org.apache.hbase</groupId>
            <artifactId>hbase-common</artifactId>
            <version>${hbase.version}</version>
            <exclusions>
                <exclusion>
                    <groupId>org.slf4j</groupId>
                    <artifactId>slf4j-log4j12</artifactId>
                </exclusion>
            </exclusions>
        </dependency>
        <dependency>
            <groupId>org.apache.hbase</groupId>
            <artifactId>hbase-server</artifactId>
            <version>${hbase.version}</version>
            <exclusions>
                <exclusion>
                    <groupId>org.slf4j</groupId>
                    <artifactId>slf4j-log4j12</artifactId>
                </exclusion>
            </exclusions>
        </dependency>

        <!-- Hadoop -->
        <dependency>
            <groupId>org.apache.hadoop</groupId>
            <artifactId>hadoop-common</artifactId>
            <version>2.6.0</version>
        </dependency>
        <dependency>
            <groupId>org.apache.hadoop</groupId>
            <artifactId>hadoop-client</artifactId>
            <version>2.6.0</version>
        </dependency>
        <dependency>
            <groupId>org.apache.hadoop</groupId>
            <artifactId>hadoop-hdfs</artifactId>
            <version>2.6.0</version>
        </dependency>
    </dependencies>

    <build>
        <plugins>
            <plugin>
                <artifactId>maven-assembly-plugin</artifactId>
                <configuration>
                    <!-- Element names in this configuration block are inferred from the surviving values. -->
                    <classifier>dist</classifier>
                    <appendAssemblyId>true</appendAssemblyId>
                    <descriptorRefs>
                        <descriptorRef>jar-with-dependencies</descriptorRef>
                    </descriptorRefs>
                </configuration>
                <executions>
                    <execution>
                        <id>make-assembly</id>
                        <phase>package</phase>
                        <goals>
                            <goal>single</goal>
                        </goals>
                    </execution>
                </executions>
            </plugin>

            <plugin>
                <artifactId>maven-compiler-plugin</artifactId>
                <configuration>
                    <source>1.7</source>
                    <target>1.7</target>
                </configuration>
            </plugin>

            <plugin>
                <groupId>net.alchim31.maven</groupId>
                <artifactId>scala-maven-plugin</artifactId>
                <version>3.2.2</version>
                <executions>
                    <execution>
                        <id>scala-compile-first</id>
                        <phase>process-resources</phase>
                        <goals>
                            <goal>compile</goal>
                        </goals>
                    </execution>
                </executions>
                <configuration>
                    <scalaVersion>${scala.version}</scalaVersion>
                    <recompileMode>incremental</recompileMode>
                    <useZincServer>true</useZincServer>
                    <args>
                        <arg>-unchecked</arg>
                        <arg>-deprecation</arg>
                        <arg>-feature</arg>
                    </args>
                    <jvmArgs>
                        <jvmArg>-Xms1024m</jvmArg>
                        <jvmArg>-Xmx1024m</jvmArg>
                    </jvmArgs>
                    <javacArgs>
                        <javacArg>-source</javacArg>
                        <javacArg>${java.version}</javacArg>
                        <javacArg>-target</javacArg>
                        <javacArg>${java.version}</javacArg>
                        <javacArg>-Xlint:all,-serial,-path</javacArg>
                    </javacArgs>
                </configuration>
            </plugin>

            <plugin>
                <groupId>org.antlr</groupId>
                <artifactId>antlr4-maven-plugin</artifactId>
                <version>4.3</version>
                <executions>
                    <execution>
                        <id>antlr</id>
                        <goals>
                            <goal>antlr4</goal>
                        </goals>
                        <phase>none</phase>
                    </execution>
                </executions>
                <configuration>
                    <!-- Element names in this configuration block are inferred from the surviving values. -->
                    <sourceDirectory>src/test/java</sourceDirectory>
                    <listener>true</listener>
                    <visitor>true</visitor>
                </configuration>
            </plugin>
        </plugins>
    </build>
</project>

Submit and run the job on the Spark cluster:

root@master:~# spark-submit --name noc_hbase_test   --class HbaseTest.sparkconnectHbase.SparkConnectHbaseTest --master  spark://master:7077  --jars /usr/local/apache-hive-1.2.1/lib/mysql-connector-java-5.1.13-bin.jar,/usr/local/apache-hive-1.2.1/lib/hive-hbase-handler-1.2.1.jar,/usr/local/hbase-1.2.0/lib/hbase-client-1.2.0.jar,/usr/local/hbase-1.2.0/lib/hbase-common-1.2.0.jar,/usr/local/hbase-1.2.0/lib/hbase-protocol-1.2.0.jar,/usr/local/hbase-1.2.0/lib/hbase-server-1.2.0.jar,/usr/local/hbase-1.2.0/lib/htrace-core-3.1.0-incubating.jar,/usr/local/hbase-1.2.0/lib/metrics-core-2.2.0.jar,/usr/local/hbase-1.2.0/lib/hbase-hadoop2-compat-1.2.0.jar,/usr/local/hbase-1.2.0/lib/guava-12.0.1.jar,/usr/local/hbase-1.2.0/lib/protobuf-java-2.5.0.jar    --executor-memory 512m  --total-executor-cores 2   /usr/local/setup_tools/noc_hbase_test.jar

The Spark job runs successfully; the output is as follows:
SLF4J: Class path contains multiple SLF4J bindings.
SLF4J: Found binding in [jar:file:/usr/local/alluxio-1.7.0-hadoop-2.6/client/alluxio-1.7.0-client.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: Found binding in [jar:file:/usr/local/spark-2.2.1-bin-hadoop2.6/jars/slf4j-log4j12-1.7.16.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an explanation.
SLF4J: Actual binding is of type [org.slf4j.impl.Log4jLoggerFactory]
18/06/15 14:52:57 INFO spark.SparkContext: Running Spark version 2.2.1
18/06/15 14:52:57 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
18/06/15 14:52:58 INFO spark.SparkContext: Submitted application: Spark_Connect_Hbase_Test
18/06/15 14:52:58 INFO spark.SecurityManager: Changing view acls to: root
18/06/15 14:52:58 INFO spark.SecurityManager: Changing modify acls to: root
18/06/15 14:52:58 INFO spark.SecurityManager: Changing view acls groups to: 
18/06/15 14:52:58 INFO spark.SecurityManager: Changing modify acls groups to: 
18/06/15 14:52:58 INFO spark.SecurityManager: SecurityManager: authentication disabled; ui acls disabled; users  with view permissions: Set(root); groups with view permissions: Set(); users  with modify permissions: Set(root); groups with modify permissions: Set()
18/06/15 14:52:58 INFO util.Utils: Successfully started service 'sparkDriver' on port 46964.
18/06/15 14:52:58 INFO spark.SparkEnv: Registering MapOutputTracker
18/06/15 14:52:58 INFO spark.SparkEnv: Registering BlockManagerMaster
18/06/15 14:52:58 INFO storage.BlockManagerMasterEndpoint: Using org.apache.spark.storage.DefaultTopologyMapper for getting topology information
18/06/15 14:52:58 INFO storage.BlockManagerMasterEndpoint: BlockManagerMasterEndpoint up
18/06/15 14:52:58 INFO storage.DiskBlockManager: Created local directory at /tmp/blockmgr-4ac96a51-bf1d-4c35-b9d7-53e481274c63
18/06/15 14:52:58 INFO memory.MemoryStore: MemoryStore started with capacity 413.9 MB
18/06/15 14:52:59 INFO spark.SparkEnv: Registering OutputCommitCoordinator
18/06/15 14:52:59 INFO util.log: Logging initialized @2617ms
18/06/15 14:52:59 INFO server.Server: jetty-9.3.z-SNAPSHOT
18/06/15 14:52:59 INFO server.Server: Started @2799ms
18/06/15 14:52:59 INFO server.AbstractConnector: Started ServerConnector@2ca308df{HTTP/1.1,[http/1.1]}{0.0.0.0:4040}
18/06/15 14:52:59 INFO util.Utils: Successfully started service 'SparkUI' on port 4040.
18/06/15 14:52:59 INFO handler.ContextHandler: Started o.s.j.s.ServletContextHandler@70e0accd{/jobs,null,AVAILABLE,@Spark}
18/06/15 14:52:59 INFO handler.ContextHandler: Started o.s.j.s.ServletContextHandler@65f87a2c{/jobs/json,null,AVAILABLE,@Spark}
18/06/15 14:52:59 INFO handler.ContextHandler: Started o.s.j.s.ServletContextHandler@6ce1f601{/jobs/job,null,AVAILABLE,@Spark}
18/06/15 14:52:59 INFO handler.ContextHandler: Started o.s.j.s.ServletContextHandler@d816dde{/jobs/job/json,null,AVAILABLE,@Spark}
18/06/15 14:52:59 INFO handler.ContextHandler: Started o.s.j.s.ServletContextHandler@6c451c9c{/stages,null,AVAILABLE,@Spark}
18/06/15 14:52:59 INFO handler.ContextHandler: Started o.s.j.s.ServletContextHandler@372b0d86{/stages/json,null,AVAILABLE,@Spark}
18/06/15 14:52:59 INFO handler.ContextHandler: Started o.s.j.s.ServletContextHandler@3113a37{/stages/stage,null,AVAILABLE,@Spark}
18/06/15 14:52:59 INFO handler.ContextHandler: Started o.s.j.s.ServletContextHandler@20312893{/stages/stage/json,null,AVAILABLE,@Spark}
18/06/15 14:52:59 INFO handler.ContextHandler: Started o.s.j.s.ServletContextHandler@c41709a{/stages/pool,null,AVAILABLE,@Spark}
18/06/15 14:52:59 INFO handler.ContextHandler: Started o.s.j.s.ServletContextHandler@54ec8cc9{/stages/pool/json,null,AVAILABLE,@Spark}
18/06/15 14:52:59 INFO handler.ContextHandler: Started o.s.j.s.ServletContextHandler@5528a42c{/storage,null,AVAILABLE,@Spark}
18/06/15 14:52:59 INFO handler.ContextHandler: Started o.s.j.s.ServletContextHandler@1a6f5124{/storage/json,null,AVAILABLE,@Spark}
18/06/15 14:52:59 INFO handler.ContextHandler: Started o.s.j.s.ServletContextHandler@ec2bf82{/storage/rdd,null,AVAILABLE,@Spark}
18/06/15 14:52:59 INFO handler.ContextHandler: Started o.s.j.s.ServletContextHandler@6cc0bcf6{/storage/rdd/json,null,AVAILABLE,@Spark}
18/06/15 14:52:59 INFO handler.ContextHandler: Started o.s.j.s.ServletContextHandler@32f61a31{/environment,null,AVAILABLE,@Spark}
18/06/15 14:52:59 INFO handler.ContextHandler: Started o.s.j.s.ServletContextHandler@669253b7{/environment/json,null,AVAILABLE,@Spark}
18/06/15 14:52:59 INFO handler.ContextHandler: Started o.s.j.s.ServletContextHandler@51a06cbe{/executors,null,AVAILABLE,@Spark}
18/06/15 14:52:59 INFO handler.ContextHandler: Started o.s.j.s.ServletContextHandler@49a64d82{/executors/json,null,AVAILABLE,@Spark}
18/06/15 14:52:59 INFO handler.ContextHandler: Started o.s.j.s.ServletContextHandler@66d23e4a{/executors/threadDump,null,AVAILABLE,@Spark}
18/06/15 14:52:59 INFO handler.ContextHandler: Started o.s.j.s.ServletContextHandler@4d9d1b69{/executors/threadDump/json,null,AVAILABLE,@Spark}
18/06/15 14:52:59 INFO handler.ContextHandler: Started o.s.j.s.ServletContextHandler@251f7d26{/static,null,AVAILABLE,@Spark}
18/06/15 14:52:59 INFO handler.ContextHandler: Started o.s.j.s.ServletContextHandler@37d3d232{/,null,AVAILABLE,@Spark}
18/06/15 14:52:59 INFO handler.ContextHandler: Started o.s.j.s.ServletContextHandler@581d969c{/api,null,AVAILABLE,@Spark}
18/06/15 14:52:59 INFO handler.ContextHandler: Started o.s.j.s.ServletContextHandler@5851bd4f{/jobs/job/kill,null,AVAILABLE,@Spark}
18/06/15 14:52:59 INFO handler.ContextHandler: Started o.s.j.s.ServletContextHandler@2f40a43{/stages/stage/kill,null,AVAILABLE,@Spark}
18/06/15 14:52:59 INFO ui.SparkUI: Bound SparkUI to 0.0.0.0, and started at http://master:4040
18/06/15 14:52:59 INFO spark.SparkContext: Added JAR file:/usr/local/apache-hive-1.2.1/lib/mysql-connector-java-5.1.13-bin.jar at spark://master:46964/jars/mysql-connector-java-5.1.13-bin.jar with timestamp 1529045579564
18/06/15 14:52:59 INFO spark.SparkContext: Added JAR file:/usr/local/apache-hive-1.2.1/lib/hive-hbase-handler-1.2.1.jar at spark://master:46964/jars/hive-hbase-handler-1.2.1.jar with timestamp 1529045579571
18/06/15 14:52:59 INFO spark.SparkContext: Added JAR file:/usr/local/hbase-1.2.0/lib/hbase-client-1.2.0.jar at spark://master:46964/jars/hbase-client-1.2.0.jar with timestamp 1529045579572
18/06/15 14:52:59 INFO spark.SparkContext: Added JAR file:/usr/local/hbase-1.2.0/lib/hbase-common-1.2.0.jar at spark://master:46964/jars/hbase-common-1.2.0.jar with timestamp 1529045579574
18/06/15 14:52:59 INFO spark.SparkContext: Added JAR file:/usr/local/hbase-1.2.0/lib/hbase-protocol-1.2.0.jar at spark://master:46964/jars/hbase-protocol-1.2.0.jar with timestamp 1529045579575
18/06/15 14:52:59 INFO spark.SparkContext: Added JAR file:/usr/local/hbase-1.2.0/lib/hbase-server-1.2.0.jar at spark://master:46964/jars/hbase-server-1.2.0.jar with timestamp 1529045579577
18/06/15 14:52:59 INFO spark.SparkContext: Added JAR file:/usr/local/hbase-1.2.0/lib/htrace-core-3.1.0-incubating.jar at spark://master:46964/jars/htrace-core-3.1.0-incubating.jar with timestamp 1529045579578
18/06/15 14:52:59 INFO spark.SparkContext: Added JAR file:/usr/local/hbase-1.2.0/lib/metrics-core-2.2.0.jar at spark://master:46964/jars/metrics-core-2.2.0.jar with timestamp 1529045579579
18/06/15 14:52:59 INFO spark.SparkContext: Added JAR file:/usr/local/hbase-1.2.0/lib/hbase-hadoop2-compat-1.2.0.jar at spark://master:46964/jars/hbase-hadoop2-compat-1.2.0.jar with timestamp 1529045579581
18/06/15 14:52:59 INFO spark.SparkContext: Added JAR file:/usr/local/hbase-1.2.0/lib/guava-12.0.1.jar at spark://master:46964/jars/guava-12.0.1.jar with timestamp 1529045579583
18/06/15 14:52:59 INFO spark.SparkContext: Added JAR file:/usr/local/hbase-1.2.0/lib/protobuf-java-2.5.0.jar at spark://master:46964/jars/protobuf-java-2.5.0.jar with timestamp 1529045579584
18/06/15 14:52:59 INFO spark.SparkContext: Added JAR file:/usr/local/setup_tools/noc_hbase_test.jar at spark://master:46964/jars/noc_hbase_test.jar with timestamp 1529045579585
18/06/15 14:52:59 INFO client.StandaloneAppClient$ClientEndpoint: Connecting to master spark://master:7077...
18/06/15 14:52:59 INFO client.TransportClientFactory: Successfully created connection to master/192.168.189.1:7077 after 40 ms (0 ms spent in bootstraps)
18/06/15 14:53:00 INFO cluster.StandaloneSchedulerBackend: Connected to Spark cluster with app ID app-20180615145300-0004
18/06/15 14:53:00 INFO client.StandaloneAppClient$ClientEndpoint: Executor added: app-20180615145300-0004/0 on worker-20180615140035-worker1-39457 (worker1:39457) with 1 cores
18/06/15 14:53:00 INFO cluster.StandaloneSchedulerBackend: Granted executor ID app-20180615145300-0004/0 on hostPort worker1:39457 with 1 cores, 512.0 MB RAM
18/06/15 14:53:00 INFO client.StandaloneAppClient$ClientEndpoint: Executor added: app-20180615145300-0004/1 on worker-20180615140043-worker3-56574 (worker3:56574) with 1 cores
18/06/15 14:53:00 INFO cluster.StandaloneSchedulerBackend: Granted executor ID app-20180615145300-0004/1 on hostPort worker3:56574 with 1 cores, 512.0 MB RAM
18/06/15 14:53:00 INFO util.Utils: Successfully started service 'org.apache.spark.network.netty.NettyBlockTransferService' on port 43464.
18/06/15 14:53:00 INFO netty.NettyBlockTransferService: Server created on master:43464
18/06/15 14:53:00 INFO storage.BlockManager: Using org.apache.spark.storage.RandomBlockReplicationPolicy for block replication policy
18/06/15 14:53:00 INFO storage.BlockManagerMaster: Registering BlockManager BlockManagerId(driver, master, 43464, None)
18/06/15 14:53:00 INFO storage.BlockManagerMasterEndpoint: Registering block manager master:43464 with 413.9 MB RAM, BlockManagerId(driver, master, 43464, None)
18/06/15 14:53:00 INFO storage.BlockManagerMaster: Registered BlockManager BlockManagerId(driver, master, 43464, None)
18/06/15 14:53:00 INFO storage.BlockManager: Initialized BlockManager: BlockManagerId(driver, master, 43464, None)
18/06/15 14:53:00 INFO client.StandaloneAppClient$ClientEndpoint: Executor updated: app-20180615145300-0004/0 is now RUNNING
18/06/15 14:53:00 INFO client.StandaloneAppClient$ClientEndpoint: Executor updated: app-20180615145300-0004/1 is now RUNNING
18/06/15 14:53:00 INFO handler.ContextHandler: Started o.s.j.s.ServletContextHandler@d02f8d{/metrics/json,null,AVAILABLE,@Spark}
18/06/15 14:53:00 INFO cluster.StandaloneSchedulerBackend: SchedulerBackend is ready for scheduling beginning after reached minRegisteredResourcesRatio: 0.0
18/06/15 14:53:01 INFO memory.MemoryStore: Block broadcast_0 stored as values in memory (estimated size 300.0 KB, free 413.6 MB)
18/06/15 14:53:01 INFO memory.MemoryStore: Block broadcast_0_piece0 stored as bytes in memory (estimated size 26.5 KB, free 413.6 MB)
18/06/15 14:53:01 INFO storage.BlockManagerInfo: Added broadcast_0_piece0 in memory on master:43464 (size: 26.5 KB, free: 413.9 MB)
18/06/15 14:53:01 INFO spark.SparkContext: Created broadcast 0 from newAPIHadoopRDD at SparkConnectHbaseTest.java:35
18/06/15 14:53:01 INFO zookeeper.RecoverableZooKeeper: Process identifier=hconnection-0x895416d connecting to ZooKeeper ensemble=192.168.189.1:2181,192.168.189.2:2181,192.168.189.3:2181
18/06/15 14:53:02 INFO zookeeper.ZooKeeper: Client environment:zookeeper.version=3.4.6-1569965, built on 02/20/2014 09:09 GMT
18/06/15 14:53:02 INFO zookeeper.ZooKeeper: Client environment:host.name=master
18/06/15 14:53:02 INFO zookeeper.ZooKeeper: Client environment:java.version=1.8.0_60
18/06/15 14:53:02 INFO zookeeper.ZooKeeper: Client environment:java.vendor=Oracle Corporation
18/06/15 14:53:02 INFO zookeeper.ZooKeeper: Client environment:java.home=/usr/local/jdk1.8.0_60/jre
18/06/15 14:53:02 INFO zookeeper.ZooKeeper: Client environment:java.class.path=/usr/local/alluxio-1.7.0-hadoop-2.6/client/alluxio-1.7.0-client.jar:/usr/local/spark-2.2.1-bin-hadoop2.6/conf/:。。。。。。。。。
18/06/15 14:53:02 INFO zookeeper.ZooKeeper: Client environment:java.library.path=/usr/java/packages/lib/amd64:/usr/lib64:/lib64:/lib:/usr/lib
18/06/15 14:53:02 INFO zookeeper.ZooKeeper: Client environment:java.io.tmpdir=/tmp
18/06/15 14:53:02 INFO zookeeper.ZooKeeper: Client environment:java.compiler=
18/06/15 14:53:02 INFO zookeeper.ZooKeeper: Client environment:os.name=Linux
18/06/15 14:53:02 INFO zookeeper.ZooKeeper: Client environment:os.arch=amd64
18/06/15 14:53:02 INFO zookeeper.ZooKeeper: Client environment:os.version=3.16.0-30-generic
18/06/15 14:53:02 INFO zookeeper.ZooKeeper: Client environment:user.name=root
18/06/15 14:53:02 INFO zookeeper.ZooKeeper: Client environment:user.home=/root
18/06/15 14:53:02 INFO zookeeper.ZooKeeper: Client environment:user.dir=/root
18/06/15 14:53:02 INFO zookeeper.ZooKeeper: Initiating client connection, connectString=192.168.189.1:2181,192.168.189.2:2181,192.168.189.3:2181 sessionTimeout=90000 watcher=hconnection-0x895416d0x0, quorum=192.168.189.1:2181,192.168.189.2:2181,192.168.189.3:2181, baseZNode=/hbase
18/06/15 14:53:02 INFO zookeeper.ClientCnxn: Opening socket connection to server 192.168.189.3/192.168.189.3:2181. Will not attempt to authenticate using SASL (unknown error)
18/06/15 14:53:02 INFO zookeeper.ClientCnxn: Socket connection established to 192.168.189.3/192.168.189.3:2181, initiating session
18/06/15 14:53:02 INFO zookeeper.ClientCnxn: Session establishment complete on server 192.168.189.3/192.168.189.3:2181, sessionid = 0x3640207247f0009, negotiated timeout = 40000
18/06/15 14:53:04 INFO util.RegionSizeCalculator: Calculating region sizes for table "db_res:wtb_ow_operation".
18/06/15 14:53:05 INFO client.ConnectionManager$HConnectionImplementation: Closing master protocol: MasterService
18/06/15 14:53:05 INFO client.ConnectionManager$HConnectionImplementation: Closing zookeeper sessionid=0x3640207247f0009
18/06/15 14:53:05 INFO zookeeper.ClientCnxn: EventThread shut down
18/06/15 14:53:05 INFO zookeeper.ZooKeeper: Session: 0x3640207247f0009 closed
18/06/15 14:53:05 INFO spark.SparkContext: Starting job: count at SparkConnectHbaseTest.java:37
18/06/15 14:53:05 INFO scheduler.DAGScheduler: Got job 0 (count at SparkConnectHbaseTest.java:37) with 1 output partitions
18/06/15 14:53:05 INFO scheduler.DAGScheduler: Final stage: ResultStage 0 (count at SparkConnectHbaseTest.java:37)
18/06/15 14:53:05 INFO scheduler.DAGScheduler: Parents of final stage: List()
18/06/15 14:53:05 INFO scheduler.DAGScheduler: Missing parents: List()
18/06/15 14:53:05 INFO scheduler.DAGScheduler: Submitting ResultStage 0 (NewHadoopRDD[0] at newAPIHadoopRDD at SparkConnectHbaseTest.java:35), which has no missing parents
18/06/15 14:53:05 INFO memory.MemoryStore: Block broadcast_1 stored as values in memory (estimated size 2040.0 B, free 413.6 MB)
18/06/15 14:53:05 INFO memory.MemoryStore: Block broadcast_1_piece0 stored as bytes in memory (estimated size 1278.0 B, free 413.6 MB)
18/06/15 14:53:05 INFO storage.BlockManagerInfo: Added broadcast_1_piece0 in memory on master:43464 (size: 1278.0 B, free: 413.9 MB)
18/06/15 14:53:05 INFO spark.SparkContext: Created broadcast 1 from broadcast at DAGScheduler.scala:1006
18/06/15 14:53:05 INFO scheduler.DAGScheduler: Submitting 1 missing tasks from ResultStage 0 (NewHadoopRDD[0] at newAPIHadoopRDD at SparkConnectHbaseTest.java:35) (first 15 tasks are for partitions Vector(0))
18/06/15 14:53:05 INFO scheduler.TaskSchedulerImpl: Adding task set 0.0 with 1 tasks
18/06/15 14:53:20 WARN scheduler.TaskSchedulerImpl: Initial job has not accepted any resources; check your cluster UI to ensure that workers are registered and have sufficient resources
18/06/15 14:53:33 INFO cluster.CoarseGrainedSchedulerBackend$DriverEndpoint: Registered executor NettyRpcEndpointRef(spark-client://Executor) (192.168.189.2:36455) with ID 0
18/06/15 14:53:33 INFO scheduler.TaskSetManager: Starting task 0.0 in stage 0.0 (TID 0, worker1, executor 0, partition 0, NODE_LOCAL, 4879 bytes)
18/06/15 14:53:34 INFO storage.BlockManagerMasterEndpoint: Registering block manager worker1:56820 with 117.0 MB RAM, BlockManagerId(0, worker1, 56820, None)
18/06/15 14:53:34 INFO cluster.CoarseGrainedSchedulerBackend$DriverEndpoint: Registered executor NettyRpcEndpointRef(spark-client://Executor) (192.168.189.4:45624) with ID 1
18/06/15 14:53:35 INFO storage.BlockManagerMasterEndpoint: Registering block manager worker3:38924 with 117.0 MB RAM, BlockManagerId(1, worker3, 38924, None)
18/06/15 14:53:42 INFO storage.BlockManagerInfo: Added broadcast_1_piece0 in memory on worker1:56820 (size: 1278.0 B, free: 117.0 MB)
18/06/15 14:53:43 INFO storage.BlockManagerInfo: Added broadcast_0_piece0 in memory on worker1:56820 (size: 26.5 KB, free: 116.9 MB)
18/06/15 14:53:57 INFO scheduler.TaskSetManager: Finished task 0.0 in stage 0.0 (TID 0) in 23730 ms on worker1 (executor 0) (1/1)
18/06/15 14:53:57 INFO scheduler.DAGScheduler: ResultStage 0 (count at SparkConnectHbaseTest.java:37) finished in 51.886 s
18/06/15 14:53:57 INFO scheduler.TaskSchedulerImpl: Removed TaskSet 0.0, whose tasks have all completed, from pool 
18/06/15 14:53:58 INFO scheduler.DAGScheduler: Job 0 finished: count at SparkConnectHbaseTest.java:37, took 52.827846 s
************SPARK from hbase  count ***************      1                 18/06/15 14:53:58 INFO spark.SparkContext: Starting job: foreach at SparkConnectHbaseTest.java:40
18/06/15 14:53:58 INFO scheduler.DAGScheduler: Got job 1 (foreach at SparkConnectHbaseTest.java:40) with 1 output partitions
18/06/15 14:53:58 INFO scheduler.DAGScheduler: Final stage: ResultStage 1 (foreach at SparkConnectHbaseTest.java:40)
18/06/15 14:53:58 INFO scheduler.DAGScheduler: Parents of final stage: List()
18/06/15 14:53:58 INFO scheduler.DAGScheduler: Missing parents: List()
18/06/15 14:53:58 INFO scheduler.DAGScheduler: Submitting ResultStage 1 (NewHadoopRDD[0] at newAPIHadoopRDD at SparkConnectHbaseTest.java:35), which has no missing parents
18/06/15 14:53:58 INFO memory.MemoryStore: Block broadcast_2 stored as values in memory (estimated size 2.2 KB, free 413.6 MB)
18/06/15 14:53:58 INFO memory.MemoryStore: Block broadcast_2_piece0 stored as bytes in memory (estimated size 1430.0 B, free 413.6 MB)
18/06/15 14:53:58 INFO storage.BlockManagerInfo: Added broadcast_2_piece0 in memory on master:43464 (size: 1430.0 B, free: 413.9 MB)
18/06/15 14:53:58 INFO spark.SparkContext: Created broadcast 2 from broadcast at DAGScheduler.scala:1006
18/06/15 14:53:58 INFO scheduler.DAGScheduler: Submitting 1 missing tasks from ResultStage 1 (NewHadoopRDD[0] at newAPIHadoopRDD at SparkConnectHbaseTest.java:35) (first 15 tasks are for partitions Vector(0))
18/06/15 14:53:58 INFO scheduler.TaskSchedulerImpl: Adding task set 1.0 with 1 tasks
18/06/15 14:53:58 INFO scheduler.TaskSetManager: Starting task 0.0 in stage 1.0 (TID 1, worker1, executor 0, partition 0, NODE_LOCAL, 4879 bytes)
18/06/15 14:53:58 INFO storage.BlockManagerInfo: Added broadcast_2_piece0 in memory on worker1:56820 (size: 1430.0 B, free: 116.9 MB)
18/06/15 14:53:59 INFO scheduler.TaskSetManager: Finished task 0.0 in stage 1.0 (TID 1) in 507 ms on worker1 (executor 0) (1/1)
18/06/15 14:53:59 INFO scheduler.TaskSchedulerImpl: Removed TaskSet 1.0, whose tasks have all completed, from pool 
18/06/15 14:53:59 INFO scheduler.DAGScheduler: ResultStage 1 (foreach at SparkConnectHbaseTest.java:40) finished in 0.508 s
18/06/15 14:53:59 INFO scheduler.DAGScheduler: Job 1 finished: foreach at SparkConnectHbaseTest.java:40, took 0.533378 s
18/06/15 14:58:02 INFO storage.BlockManagerInfo: Removed broadcast_2_piece0 on master:43464 in memory (size: 1430.0 B, free: 413.9 MB)
18/06/15 14:58:02 INFO storage.BlockManagerInfo: Removed broadcast_2_piece0 on worker1:56820 in memory (size: 1430.0 B, free: 116.9 MB) 

Console screenshots:

[Console screenshot 1]  [Console screenshot 2]


Spark Web UI screenshots:

[Spark Web UI screenshot 1]

[Spark Web UI screenshot 2]

[Spark Web UI screenshot 3]
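Going one step further: the pom above already pulls in spark-sql and spark-hive, so the same (row key, Result) pairs can also be exposed to Spark SQL. The following is only a hypothetical sketch, not part of the original test; the class name, DataFrame column names and temp-view name are illustrative choices of mine.

package HbaseTest.sparkconnectHbase;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.client.Result;
import org.apache.hadoop.hbase.io.ImmutableBytesWritable;
import org.apache.hadoop.hbase.mapreduce.TableInputFormat;
import org.apache.hadoop.hbase.util.Bytes;
import org.apache.spark.api.java.JavaPairRDD;
import org.apache.spark.api.java.JavaRDD;
import org.apache.spark.api.java.JavaSparkContext;
import org.apache.spark.api.java.function.Function;
import org.apache.spark.sql.Dataset;
import org.apache.spark.sql.Row;
import org.apache.spark.sql.RowFactory;
import org.apache.spark.sql.SparkSession;
import org.apache.spark.sql.types.DataTypes;
import org.apache.spark.sql.types.StructField;
import org.apache.spark.sql.types.StructType;
import scala.Tuple2;

/***
 * Hypothetical follow-up: expose the HBase rows to Spark SQL as a DataFrame.
 * */
public class SparkHbaseToDataFrameTest {

    public static void main(String[] args) {
        SparkSession spark = SparkSession.builder().appName("Spark_Hbase_To_DataFrame").getOrCreate();
        JavaSparkContext sc = new JavaSparkContext(spark.sparkContext());

        // Same HBase connection and table settings as in SparkConnectHbaseTest.
        Configuration confhbase = HBaseConfiguration.create();
        confhbase.set("hbase.zookeeper.property.clientPort", "2181");
        confhbase.set("hbase.zookeeper.quorum", "192.168.189.1,192.168.189.2,192.168.189.3");
        confhbase.set(TableInputFormat.INPUT_TABLE, "db_res:wtb_ow_operation");

        JavaPairRDD<ImmutableBytesWritable, Result> hbaseRDD =
                sc.newAPIHadoopRDD(confhbase, TableInputFormat.class,
                        ImmutableBytesWritable.class, Result.class);

        // Map every HBase Result to a Spark SQL Row: (row key, info:operate_begin_time).
        JavaRDD<Row> rows = hbaseRDD.map(
                new Function<Tuple2<ImmutableBytesWritable, Result>, Row>() {
                    @Override
                    public Row call(Tuple2<ImmutableBytesWritable, Result> v1) throws Exception {
                        String key = Bytes.toString(v1._2().getRow());
                        String operateBeginTime = Bytes.toString(
                                v1._2().getValue(Bytes.toBytes("info"), Bytes.toBytes("operate_begin_time")));
                        return RowFactory.create(key, operateBeginTime);
                    }
                });

        // Column names below are illustrative, not dictated by HBase.
        StructType schema = DataTypes.createStructType(new StructField[]{
                DataTypes.createStructField("row_key", DataTypes.StringType, false),
                DataTypes.createStructField("operate_begin_time", DataTypes.StringType, true)});

        Dataset<Row> df = spark.createDataFrame(rows, schema);
        df.createOrReplaceTempView("wtb_ow_operation");
        spark.sql("SELECT row_key, operate_begin_time FROM wtb_ow_operation").show(10, false);
    }
}

Such a class could be packaged into the same jar and submitted with the spark-submit command shown above, changing only the --class argument.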



