HBase coprocessors in practice: observers

HBase coprocessors come in two kinds, observers and endpoints. Their differences and typical uses are roughly as follows:

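An observer is similar to a trigger in a DBMS. It runs on the server side and can be used for access control, monitoring, building secondary indexes, and so on.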

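An endpoint is similar to a stored procedure. It involves both the server side and the client side, and can be used for aggregation work such as computing max, min, or total.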

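Coprocessors are also divided into system-level and table-level ones. A system-level coprocessor applies to every region of every table, while a table-level coprocessor applies only to the regions of one table.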

We wanted to build secondary index tables for some of our tables, with a different strategy for each table, so we used a table-level observer coprocessor.

Observer coprocessors are further divided into three types: RegionObserver, WALObserver, and MasterObserver.

    RegionObserver: Provides hooks for data manipulation events, Get, Put, Delete, Scan, and so on. There is an instance of a RegionObserver coprocessor for every table region, and the scope of the observations they can make is constrained to that region.

    WALObserver: Provides hooks for write-ahead log (WAL) related operations. This is a way to observe or intercept WAL writing and reconstruction events. A WALObserver runs in the context of WAL processing. There is one such context per region server.

    MasterObserver: Provides hooks for DDL-type operations, i.e., create, delete, modify table, etc. The MasterObserver runs within the context of the HBase master.



With the API documentation and related material at hand, writing an observer coprocessor is not too hard. The difficult part for us was loading the finished coprocessor.

1. How coprocessors are loaded

       There are two ways to load a coprocessor.

        One is through configuration:

    <property>
        <name>hbase.coprocessor.region.classes</name>
        <value>your coprocessor class</value>
    </property>
      This approach suits system-level coprocessors: it applies to all regions, and the regionserver must be restarted for it to take effect.

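        The other is to load it with a shell command. This does not require restarting the regionserver and lets you choose which table it applies to, which is quite convenient.

        Load command:

   alter 'tablename', METHOD => 'table_att', 'coprocessor' => 'hdfs:///foo.jar|com.foo.FooObserver|1001|arg1=1,arg2=2'

      Since there is a load command, there is also an unload command:

   alter 'tablename', METHOD => 'table_att_unset', NAME => 'coprocessor$1'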
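
The attribute value in the load command packs four fields separated by |: the jar path, the observer class name, a priority, and a comma-separated list of key=value arguments. The sketch below only illustrates that layout so it is easier to spot a malformed value before running alter. CoprocessorSpec and its parse method are hypothetical names invented for this example, not part of HBase, and HBase's own parsing may be more lenient (this sketch assumes the priority is always present).

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper, not an HBase class: splits a table coprocessor
// attribute value of the form jar|class|priority|k1=v1,k2=v2.
public class CoprocessorSpec {
    public final String jarPath;
    public final String className;
    public final int priority;
    public final Map<String, String> args;

    private CoprocessorSpec(String jarPath, String className, int priority,
                            Map<String, String> args) {
        this.jarPath = jarPath;
        this.className = className;
        this.priority = priority;
        this.args = args;
    }

    public static CoprocessorSpec parse(String spec) {
        // Limit -1 keeps trailing empty fields instead of silently dropping them.
        String[] fields = spec.split("\\|", -1);
        if (fields.length < 3 || fields.length > 4) {
            throw new IllegalArgumentException(
                "expected jar|class|priority[|args], got: " + spec);
        }
        String className = fields[1].trim();
        if (className.isEmpty()) {
            throw new IllegalArgumentException("missing class name in: " + spec);
        }
        // Assumption of this sketch: the priority field is required and numeric.
        int priority = Integer.parseInt(fields[2].trim());

        Map<String, String> args = new LinkedHashMap<>();
        if (fields.length == 4 && !fields[3].trim().isEmpty()) {
            for (String pair : fields[3].split(",")) {
                String[] kv = pair.split("=", 2);
                if (kv.length != 2) {
                    throw new IllegalArgumentException("bad argument: " + pair);
                }
                args.put(kv[0].trim(), kv[1].trim());
            }
        }
        return new CoprocessorSpec(fields[0].trim(), className, priority, args);
    }

    public static void main(String[] unused) {
        CoprocessorSpec s =
            parse("hdfs:///foo.jar|com.foo.FooObserver|1001|arg1=1,arg2=2");
        System.out.println(s.jarPath + " " + s.className + " "
            + s.priority + " " + s.args);
    }
}
```

Running a check like this on the client side before issuing alter is cheap, and a typo in the class name or a missing separator is much easier to fix here than after a region fails to open.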

2. Problems encountered while loading:

java.io.IOException: java.lang.InstantiationException: com.datateam.hbase.IndexCoprocessor
        at org.apache.hadoop.hbase.coprocessor.CoprocessorHost.loadInstance(CoprocessorHost.java:258)
        at org.apache.hadoop.hbase.coprocessor.CoprocessorHost.load(CoprocessorHost.java:218)
        at org.apache.hadoop.hbase.regionserver.RegionCoprocessorHost.loadTableCoprocessors(RegionCoprocessorHost.java:207)
        at org.apache.hadoop.hbase.regionserver.RegionCoprocessorHost.<init>(RegionCoprocessorHost.java:163)
        at org.apache.hadoop.hbase.regionserver.HRegion.<init>(HRegion.java:622)
        at org.apache.hadoop.hbase.regionserver.HRegion.<init>(HRegion.java:529)
        at sun.reflect.GeneratedConstructorAccessor10.newInstance(Unknown Source)
        at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
        at java.lang.reflect.Constructor.newInstance(Constructor.java:526)
        at org.apache.hadoop.hbase.regionserver.HRegion.newHRegion(HRegion.java:4183)
        at org.apache.hadoop.hbase.regionserver.HRegion.openHRegion(HRegion.java:4494)
        at org.apache.hadoop.hbase.regionserver.HRegion.openHRegion(HRegion.java:4467)
        at org.apache.hadoop.hbase.regionserver.HRegion.openHRegion(HRegion.java:4423)
        at org.apache.hadoop.hbase.regionserver.HRegion.openHRegion(HRegion.java:4374)
        at org.apache.hadoop.hbase.regionserver.handler.OpenRegionHandler.openRegion(OpenRegionHandler.java:465)
        at org.apache.hadoop.hbase.regionserver.handler.OpenRegionHandler.process(OpenRegionHandler.java:139)
        at org.apache.hadoop.hbase.executor.EventHandler.run(EventHandler.java:128)
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
        at java.lang.Thread.run(Thread.java:744)
Caused by: java.lang.InstantiationException: com.datateam.hbase.IndexCoprocessor
        at java.lang.Class.newInstance(Class.java:359)
        at org.apache.hadoop.hbase.coprocessor.CoprocessorHost.loadInstance(CoprocessorHost.java:255)
        ... 19 more
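When this happened it also brought down the regionserver, and restarting did not help. Luckily this was a test environment, otherwise we would have been in serious trouble. Things only recovered after we dropped the test table, cleared its data, and restarted.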

To keep a coprocessor exception from bringing down the regionserver, you can set the following:

        <property>
                <name>hbase.coprocessor.abortonerror</name>
                <value>false</value>
        </property>

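The error says that an instance of IndexCoprocessor could not be created. HBase loads a coprocessor by instantiating it through reflection. Looking through the source code and documentation, the relevant part is: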

java.lang.Class:

        try {
            Class[] empty = {};
            final Constructor<T> c = getConstructor0(empty, Member.DECLARED);
            // Disable accessibility checks on the constructor
            // since we have to do the security check here anyway
            // (the stack depth is wrong for the Constructor's
            // security check to work)
            java.security.AccessController.doPrivileged
                (new java.security.PrivilegedAction() {
                        public Object run() {
                            c.setAccessible(true);
                            return null;
                        }
                    });
            cachedConstructor = c;
        } catch (NoSuchMethodException e) {
            throw new InstantiationException(getName());
        }
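The root cause was that IndexCoprocessor had been written with a constructor that takes parameters, which left the class without a no-argument constructor. The coprocessor loader passes no arguments when it instantiates the class through reflection, so the class could never be loaded.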
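
The failure can be reproduced outside HBase with a few lines of plain Java. This is a minimal sketch, not HBase code: BrokenObserver, WorkingObserver, configure, and canInstantiate are hypothetical names made up for the demonstration. It calls Class.newInstance() directly, the same call that appears in the stack trace (the method is deprecated in newer JDKs but still behaves this way).

```java
public class NoArgConstructorDemo {

    // Hypothetical stand-in for an observer that only declares a
    // constructor with parameters, so it has no no-argument constructor.
    public static class BrokenObserver {
        private final String indexTable;

        public BrokenObserver(String indexTable) {
            this.indexTable = indexTable;
        }
    }

    // Hypothetical stand-in that keeps a public no-argument constructor
    // and receives its settings after construction instead.
    public static class WorkingObserver {
        private String indexTable = "unset";

        public WorkingObserver() {
        }

        public void configure(String indexTable) {
            this.indexTable = indexTable;
        }

        public String indexTable() {
            return indexTable;
        }
    }

    // Tries to create the class the way a reflective loader would:
    // with no constructor arguments at all.
    @SuppressWarnings("deprecation")
    public static boolean canInstantiate(Class<?> clazz) {
        try {
            clazz.newInstance();
            return true;
        } catch (InstantiationException | IllegalAccessException e) {
            System.out.println("cannot instantiate " + clazz.getName() + ": " + e);
            return false;
        }
    }

    public static void main(String[] args) {
        System.out.println(canInstantiate(BrokenObserver.class));
        System.out.println(canInstantiate(WorkingObserver.class));
    }
}
```

The first call fails with java.lang.InstantiationException, the second succeeds. The practical consequence for an observer is to keep a public no-argument constructor and pass any settings in some other way, for example through the arguments in the load command; the configure method here is only an illustrative placeholder for that.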

Finally, one more thing to watch out for:

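If you rebuild the jar (for example because the previous one had a bug, or to add a feature), give it a new name. The jar has already been loaded once, and if the name stays the same you will find that the new jar has no effect at all.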










