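HBase coprocessors come in two kinds, observer and endpoint. Their differences and typical use cases are roughly as follows:
An observer is similar to a trigger in a DBMS. It runs on the server side and can be used for access control, monitoring, building secondary indexes, and so on.
An endpoint is similar to a stored procedure. It involves both the server side and the client side, and can be used for aggregation-style work such as max, min, and total.
Coprocessors are also divided into system-level and table-level ones. A system-level coprocessor applies to every region of every table, while a table-level coprocessor applies only to the regions of one specific table.
We wanted to create secondary index tables for our tables, with a different strategy for each table, so we used a table-level observer coprocessor.
Observer coprocessors are further divided into three types: RegionObserver, WALObserver, and MasterObserver.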
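To make the trigger analogy concrete, here is a minimal sketch of a pre-put hook that mirrors one column into an index table. It is a toy in-memory model, not the HBase API: `PutObserver`, `ToyTable`, and `IndexingObserver` are hypothetical names invented for illustration, and the index row key layout (value, separator, original row key) is just one possible assumption.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical hook interface; it only mimics the idea of a "before put" observer.
interface PutObserver {
    void prePut(String rowKey, String column, String value);
}

// Toy stand-in for a table: row key -> (column -> value).
class ToyTable {
    private final Map<String, Map<String, String>> rows = new TreeMap<>();
    private final List<PutObserver> observers = new ArrayList<>();

    void register(PutObserver observer) {
        observers.add(observer);
    }

    void put(String rowKey, String column, String value) {
        // Like a trigger, every registered observer runs before the write lands.
        for (PutObserver observer : observers) {
            observer.prePut(rowKey, column, value);
        }
        rows.computeIfAbsent(rowKey, k -> new HashMap<>()).put(column, value);
    }

    String get(String rowKey, String column) {
        Map<String, String> row = rows.get(rowKey);
        return row == null ? null : row.get(column);
    }
}

// Mirrors writes to one column into a separate index table keyed by the value.
// This toy is constructed directly, so a constructor with arguments is fine here.
class IndexingObserver implements PutObserver {
    private final String indexedColumn;
    private final ToyTable indexTable;

    IndexingObserver(String indexedColumn, ToyTable indexTable) {
        this.indexedColumn = indexedColumn;
        this.indexTable = indexTable;
    }

    @Override
    public void prePut(String rowKey, String column, String value) {
        if (indexedColumn.equals(column)) {
            // Assumed index layout: "<value>|<original row key>" -> original row key.
            indexTable.put(value + "|" + rowKey, "ref", rowKey);
        }
    }
}

class ObserverSketch {
    public static void main(String[] args) {
        ToyTable index = new ToyTable();
        ToyTable users = new ToyTable();
        users.register(new IndexingObserver("email", index));

        users.put("u1", "email", "a@example.com");
        users.put("u1", "name", "Alice");

        // Only the indexed column produced an index row.
        System.out.println(index.get("a@example.com|u1", "ref"));
    }
}
```

Registering a different `IndexingObserver` per table corresponds to the per-table strategy described above; a real observer would do the same work inside the hooks HBase provides.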
RegionObserver: Provides hooks for data manipulation events, Get, Put, Delete, Scan, and so on. There is an instance of a RegionObserver coprocessor for every table region and the scope of the observations they can make is constrained to that region.
WALObserver: Provides hooks for write-ahead log (WAL) related operations. This is a way to observe or intercept WAL writing and reconstruction events. A WALObserver runs in the context of WAL processing. There is one such context per region server.
MasterObserver: Provides hooks for DDL-type operation, i.e., create, delete, modify table, etc. The MasterObserver runs within the context of the HBase master.
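With the relevant API docs and references at hand, writing an observer coprocessor is not too hard. The difficulty we ran into was loading the finished coprocessor.
1. Ways to load a coprocessor
There are two ways to load a coprocessor:
One is through configuration:
    <property>
      <name>hbase.coprocessor.region.classes</name>
      <value>your coprocessor class</value>
    </property>
This approach suits system-level coprocessors: it applies to all regions, and the regionserver must be restarted for it to take effect.
The other is through a shell command. This does not require a regionserver restart and lets you choose which table it applies to, which is quite convenient.
Load command: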
alter 'tablename',METHOD => 'table_att','coprocessor'=>'hdfs:///foo.jar|com.foo.FooObserver|1001|arg1=1,arg2=2'
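The attribute value in the load command packs four pipe-separated fields: the jar path, the coprocessor class name, a priority, and optional comma-separated key=value arguments. The following sketch splits such a string apart to show how the pieces fit together. `CoprocessorSpec` is a hypothetical helper written for this post, not an HBase class, and it assumes the priority field is always present.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper: splits "jar|class|priority|k1=v1,k2=v2" into its fields.
class CoprocessorSpec {
    final String jarPath;
    final String className;
    final int priority;
    final Map<String, String> args;

    private CoprocessorSpec(String jarPath, String className, int priority,
                            Map<String, String> args) {
        this.jarPath = jarPath;
        this.className = className;
        this.priority = priority;
        this.args = args;
    }

    static CoprocessorSpec parse(String spec) {
        // Limit -1 keeps empty trailing fields instead of silently dropping them.
        String[] parts = spec.split("\\|", -1);
        if (parts.length < 3 || parts.length > 4) {
            throw new IllegalArgumentException("expected jar|class|priority[|args]: " + spec);
        }
        String className = parts[1].trim();
        if (className.isEmpty()) {
            throw new IllegalArgumentException("missing class name: " + spec);
        }
        int priority = Integer.parseInt(parts[2].trim());

        Map<String, String> args = new LinkedHashMap<>();
        if (parts.length == 4 && !parts[3].trim().isEmpty()) {
            for (String pair : parts[3].split(",")) {
                String[] kv = pair.split("=", 2);
                if (kv.length != 2) {
                    throw new IllegalArgumentException("bad argument: " + pair);
                }
                args.put(kv[0].trim(), kv[1].trim());
            }
        }
        return new CoprocessorSpec(parts[0].trim(), className, priority, args);
    }
}

class SpecSketch {
    public static void main(String[] args) {
        CoprocessorSpec spec = CoprocessorSpec.parse(
                "hdfs:///foo.jar|com.foo.FooObserver|1001|arg1=1,arg2=2");
        System.out.println("jar      = " + spec.jarPath);
        System.out.println("class    = " + spec.className);
        System.out.println("priority = " + spec.priority);
        System.out.println("args     = " + spec.args);
    }
}
```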
Since there is a load command, there is also an unload command:
alter 'tablename',METHOD => 'table_att_unset',NAME => 'coprocessor$1'
2. A problem we hit when loading:
java.io.IOException: java.lang.InstantiationException: com.datateam.hbase.IndexCoprocessor
    at org.apache.hadoop.hbase.coprocessor.CoprocessorHost.loadInstance(CoprocessorHost.java:258)
    at org.apache.hadoop.hbase.coprocessor.CoprocessorHost.load(CoprocessorHost.java:218)
    at org.apache.hadoop.hbase.regionserver.RegionCoprocessorHost.loadTableCoprocessors(RegionCoprocessorHost.java:207)
    at org.apache.hadoop.hbase.regionserver.RegionCoprocessorHost.<init>(RegionCoprocessorHost.java:163)
    at org.apache.hadoop.hbase.regionserver.HRegion.<init>(HRegion.java:622)
    at org.apache.hadoop.hbase.regionserver.HRegion.<init>(HRegion.java:529)
    at sun.reflect.GeneratedConstructorAccessor10.newInstance(Unknown Source)
    at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
    at java.lang.reflect.Constructor.newInstance(Constructor.java:526)
    at org.apache.hadoop.hbase.regionserver.HRegion.newHRegion(HRegion.java:4183)
    at org.apache.hadoop.hbase.regionserver.HRegion.openHRegion(HRegion.java:4494)
    at org.apache.hadoop.hbase.regionserver.HRegion.openHRegion(HRegion.java:4467)
    at org.apache.hadoop.hbase.regionserver.HRegion.openHRegion(HRegion.java:4423)
    at org.apache.hadoop.hbase.regionserver.HRegion.openHRegion(HRegion.java:4374)
    at org.apache.hadoop.hbase.regionserver.handler.OpenRegionHandler.openRegion(OpenRegionHandler.java:465)
    at org.apache.hadoop.hbase.regionserver.handler.OpenRegionHandler.process(OpenRegionHandler.java:139)
    at org.apache.hadoop.hbase.executor.EventHandler.run(EventHandler.java:128)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
    at java.lang.Thread.run(Thread.java:744)
Caused by: java.lang.InstantiationException: com.datateam.hbase.IndexCoprocessor
    at java.lang.Class.newInstance(Class.java:359)
    at org.apache.hadoop.hbase.coprocessor.CoprocessorHost.loadInstance(CoprocessorHost.java:255)
    ... 19 more
When this error appeared it also brought the regionserver down, and restarting did not help. Luckily this happened in a test environment, otherwise we would have been in real trouble. In the end we dropped the test table, cleared its data, and restarted, and only then did things work again.
To keep a coprocessor exception from taking down the regionserver, you can use the following setting:
    <property>
      <name>hbase.coprocessor.abortonerror</name>
      <value>false</value>
    </property>
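The error means that no instance of IndexCoprocessor could be created. HBase actually instantiates a coprocessor through reflection. Digging through the source code and documentation, we found this in java.lang.Class: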
    try {
        Class[] empty = {};
        final Constructor<T> c = getConstructor0(empty, Member.DECLARED);
        // Disable accessibility checks on the constructor
        // since we have to do the security check here anyway
        // (the stack depth is wrong for the Constructor's
        // security check to work)
        java.security.AccessController.doPrivileged
            (new java.security.PrivilegedAction() {
                public Object run() {
                    c.setAccessible(true);
                    return null;
                }
            });
        cachedConstructor = c;
    } catch (NoSuchMethodException e) {
        throw new InstantiationException(getName());
    }
The root cause was that we had written a constructor with parameters when implementing IndexCoprocessor, which left the class without a no-argument constructor. The reflective call looks up a no-argument constructor and supplies no arguments, so the coprocessor could never be instantiated.
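The failure is easy to reproduce outside HBase. The sketch below calls `Class.newInstance()`, the same method that appears in the stack trace, on two hypothetical classes: one with only a parameterized constructor and one with a no-argument constructor. The class names are made up for illustration. Note that `Class.newInstance()` is deprecated since Java 9, so the sketch suppresses the warning.

```java
// Hypothetical class with only a parameterized constructor, like our IndexCoprocessor.
class WithArgsOnly {
    final String table;

    public WithArgsOnly(String table) {
        this.table = table;
    }
}

// Hypothetical class with an explicit no-argument constructor.
class WithNoArgCtor {
    public WithNoArgCtor() {
    }
}

class ReflectionSketch {
    // Returns "ok" on success, otherwise the simple name of the exception thrown.
    @SuppressWarnings("deprecation")
    static String tryInstantiate(Class<?> type) {
        try {
            type.newInstance(); // looks up the declared no-argument constructor
            return "ok";
        } catch (InstantiationException | IllegalAccessException e) {
            return e.getClass().getSimpleName();
        }
    }

    public static void main(String[] args) {
        System.out.println("WithArgsOnly:  " + tryInstantiate(WithArgsOnly.class));
        System.out.println("WithNoArgCtor: " + tryInstantiate(WithNoArgCtor.class));
    }
}
```

The fix for a coprocessor is the same as for `WithNoArgCtor`: keep a no-argument constructor on the class. One option for passing settings is the key=value arguments in the table attribute rather than constructor parameters.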
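Finally, there is one more thing to watch out for:
If you rebuild the jar (for example because the previous one had a bug, or to add new functionality), give it a new name. The old jar has already been loaded under its original name, and if the name stays the same you will find that the new jar has no effect at all.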