Fast, efficient queries in HBase using filters

This post covers ways to query HBase quickly and efficiently with filters. I will fill it in over time.

The main groups of filters
1. Comparison Filters
1.1 RowFilter
1.2 FamilyFilter
1.3 QualifierFilter
1.4 ValueFilter
1.5 DependentColumnFilter
2. Dedicated Filters
2.1 SingleColumnValueFilter
2.2 SingleColumnValueExcludeFilter
2.3 PrefixFilter
2.4 PageFilter
2.5 KeyOnlyFilter
2.6 FirstKeyOnlyFilter
2.7 TimestampsFilter
2.8 RandomRowFilter
3. Decorating Filters
3.1 SkipFilter
3.2 WhileMatchFilter

A simple example: SingleColumnValueFilter

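    // Imports assume the older HTable/KeyValue client API that this post uses.
    import java.io.IOException;
    import java.util.List;
    import org.apache.hadoop.hbase.KeyValue;
    import org.apache.hadoop.hbase.client.HTable;
    import org.apache.hadoop.hbase.client.Result;
    import org.apache.hadoop.hbase.client.ResultScanner;
    import org.apache.hadoop.hbase.client.Scan;
    import org.apache.hadoop.hbase.filter.CompareFilter.CompareOp;
    import org.apache.hadoop.hbase.filter.FilterList;
    import org.apache.hadoop.hbase.filter.SingleColumnValueFilter;
    import org.apache.hadoop.hbase.util.Bytes;

    // hbaseConfig is a configuration field defined elsewhere in the class.
    // Each element of arr has the form "family,qualifier,value".
    public static void selectByFilter(String tablename, List<String> arr) throws IOException {
        HTable table = new HTable(hbaseConfig, tablename);
        FilterList filterList = new FilterList(); // defaults to MUST_PASS_ALL
        Scan s1 = new Scan();
        for (String v : arr) { // the conditions are ANDed together
            String[] s = v.split(",");
            filterList.addFilter(new SingleColumnValueFilter(Bytes.toBytes(s[0]),
                    Bytes.toBytes(s[1]),
                    CompareOp.EQUAL, Bytes.toBytes(s[2])));
            // Uncomment the next line to return only the specified cell;
            // the other cells in the same row are then not returned.
            // s1.addColumn(Bytes.toBytes(s[0]), Bytes.toBytes(s[1]));
        }
        s1.setFilter(filterList);
        ResultScanner scanner = table.getScanner(s1);
        try {
            for (Result rr = scanner.next(); rr != null; rr = scanner.next()) {
                for (KeyValue kv : rr.list()) {
                    System.out.println("row:" + Bytes.toString(kv.getRow()));
                    // Printed as family:qualifier.
                    System.out.println("column:" + Bytes.toString(kv.getFamily())
                            + ":" + Bytes.toString(kv.getQualifier()));
                    System.out.println("value:" + Bytes.toString(kv.getValue()));
                }
            }
        } finally {
            // Release the scanner and the table when done.
            scanner.close();
            table.close();
        }
    }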
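
To make the "AND" relationship between the conditions concrete, here is a minimal plain-Java sketch that does not call HBase at all. It models a row as a map from "family:qualifier" to value and checks it against condition strings in the same "family,qualifier,value" form that the method above splits on commas. The class name, the method name and the sample data are all made up for illustration. One deliberate simplification: the sketch treats a missing column as a failed condition, which corresponds to calling setFilterIfMissing(true) on SingleColumnValueFilter; by default HBase lets rows that lack the column through.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Plain-Java model only; it does not use the HBase client.
class AndConditionsSketch {

    // row: "family:qualifier" -> value (hypothetical representation).
    // conditions: strings of the form "family,qualifier,value".
    static boolean matchesAll(Map<String, String> row, List<String> conditions) {
        for (String condition : conditions) {
            String[] s = condition.split(",");
            String actual = row.get(s[0] + ":" + s[1]);
            if (!s[2].equals(actual)) {
                return false; // a single failed condition rejects the row (AND)
            }
        }
        return true;
    }

    public static void main(String[] args) {
        Map<String, String> row = new HashMap<>();
        row.put("info:city", "beijing");
        row.put("info:level", "3");

        // Both conditions hold for this row.
        System.out.println(matchesAll(row, Arrays.asList("info,city,beijing", "info,level,3")));
        // The second condition fails, so the whole row is rejected.
        System.out.println(matchesAll(row, Arrays.asList("info,city,beijing", "info,level,4")));
    }
}
```

If "OR" semantics are wanted instead, the HBase FilterList can be constructed with FilterList.Operator.MUST_PASS_ONE.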


MultipleColumnPrefixFilter

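The API documentation describes it as follows:

  This filter is used for selecting only those keys with columns that matches a particular prefix. For example, if prefix is 'an', it will pass keys will columns like 'and', 'anti' but not keys with columns like 'ball', 'act'.

The constructor is:

    public MultipleColumnPrefixFilter(byte[][] prefixes)

It takes several prefixes at once; a column passes if its name starts with any one of them.
The source shows how the prefixes are stored: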

    public MultipleColumnPrefixFilter(final byte[][] prefixes) {
        if (prefixes != null) {
            for (int i = 0; i < prefixes.length; i++) {
                // add() returns false for a prefix that is already present
                if (!sortedPrefixes.add(prefixes[i]))
                    throw new IllegalArgumentException("prefixes must be distinct");
            }
        }
    }
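
The constructor above can be modeled in a few lines of plain Java to see both behaviors: duplicate prefixes are rejected, and a column name passes when it starts with any of the prefixes. This is only a sketch and not the HBase class. It assumes sortedPrefixes is a sorted set ordered by a byte-wise comparator, and the class name MultiPrefixSketch and the helper b() are invented here. It also does not try to reproduce edge cases of the real filter, such as an empty prefix list.

```java
import java.nio.charset.StandardCharsets;
import java.util.Comparator;
import java.util.TreeSet;

// Plain-Java model of the behavior described above; not the HBase class itself.
class MultiPrefixSketch {

    // Lexicographic ordering of byte arrays, comparing bytes as unsigned values.
    static final Comparator<byte[]> BYTES = (a, b) -> {
        int n = Math.min(a.length, b.length);
        for (int i = 0; i < n; i++) {
            int diff = (a[i] & 0xff) - (b[i] & 0xff);
            if (diff != 0) {
                return diff;
            }
        }
        return a.length - b.length;
    };

    private final TreeSet<byte[]> sortedPrefixes = new TreeSet<>(BYTES);

    MultiPrefixSketch(byte[][] prefixes) {
        if (prefixes != null) {
            for (byte[] prefix : prefixes) {
                // TreeSet.add returns false when an equal element is already present.
                if (!sortedPrefixes.add(prefix)) {
                    throw new IllegalArgumentException("prefixes must be distinct");
                }
            }
        }
    }

    // True when the column name starts with at least one stored prefix.
    boolean matches(byte[] qualifier) {
        for (byte[] prefix : sortedPrefixes) {
            if (startsWith(qualifier, prefix)) {
                return true;
            }
        }
        return false;
    }

    static boolean startsWith(byte[] value, byte[] prefix) {
        if (value.length < prefix.length) {
            return false;
        }
        for (int i = 0; i < prefix.length; i++) {
            if (value[i] != prefix[i]) {
                return false;
            }
        }
        return true;
    }

    // Hypothetical helper: UTF-8 bytes of a string.
    static byte[] b(String s) {
        return s.getBytes(StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        MultiPrefixSketch filter = new MultiPrefixSketch(new byte[][] { b("p"), b("q") });
        System.out.println(filter.matches(b("pcolumn"))); // starts with "p"
        System.out.println(filter.matches(b("scolumn"))); // starts with neither prefix
        try {
            new MultiPrefixSketch(new byte[][] { b("p"), b("p") });
        } catch (IllegalArgumentException e) {
            System.out.println(e.getMessage());
        }
    }
}
```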

Sample code follows. I found it online, and after reading it through there is nothing in it that is hard to understand.

    // Imports assume the HBase version this test was written for.
    import static org.junit.Assert.assertEquals;

    import java.io.IOException;
    import java.util.ArrayList;
    import java.util.HashMap;
    import java.util.HashSet;
    import java.util.List;
    import java.util.Map;
    import java.util.Set;

    import org.apache.hadoop.hbase.HBaseTestingUtility;
    import org.apache.hadoop.hbase.HColumnDescriptor;
    import org.apache.hadoop.hbase.HRegionInfo;
    import org.apache.hadoop.hbase.HTableDescriptor;
    import org.apache.hadoop.hbase.KeyValue;
    import org.apache.hadoop.hbase.KeyValueTestUtil;
    import org.apache.hadoop.hbase.client.Put;
    import org.apache.hadoop.hbase.client.Scan;
    import org.apache.hadoop.hbase.filter.ColumnPrefixFilter;
    import org.apache.hadoop.hbase.filter.MultipleColumnPrefixFilter;
    import org.apache.hadoop.hbase.regionserver.HRegion;
    import org.apache.hadoop.hbase.regionserver.InternalScanner;
    import org.apache.hadoop.hbase.util.Bytes;
    import org.junit.Test;

    public class TestMultipleColumnPrefixFilter {

        private final static HBaseTestingUtility TEST_UTIL = new
            HBaseTestingUtility();

        @Test
        public void testMultipleColumnPrefixFilter() throws IOException {
            String family = "Family";
            HTableDescriptor htd = new HTableDescriptor("TestMultipleColumnPrefixFilter");
            htd.addFamily(new HColumnDescriptor(family));
            // HRegionInfo info = new HRegionInfo(htd, null, null, false);
            HRegionInfo info = new HRegionInfo(htd.getName(), null, null, false);
            HRegion region = HRegion.createHRegion(info, HBaseTestingUtility.
                getTestDir(), TEST_UTIL.getConfiguration(), htd);

            List<String> rows = generateRandomWords(100, "row");
            List<String> columns = generateRandomWords(10000, "column");
            long maxTimestamp = 2;

            List<KeyValue> kvList = new ArrayList<KeyValue>();

            // Expected cells, grouped by the column prefix they start with.
            Map<String, List<KeyValue>> prefixMap = new HashMap<String,
                List<KeyValue>>();

            prefixMap.put("p", new ArrayList<KeyValue>());
            prefixMap.put("q", new ArrayList<KeyValue>());
            prefixMap.put("s", new ArrayList<KeyValue>());

            String valueString = "ValueString";

            for (String row : rows) {
                Put p = new Put(Bytes.toBytes(row));
                for (String column : columns) {
                    for (long timestamp = 1; timestamp <= maxTimestamp; timestamp++) {
                        KeyValue kv = KeyValueTestUtil.create(row, family, column, timestamp,
                            valueString);
                        p.add(kv);
                        kvList.add(kv);
                        for (String s : prefixMap.keySet()) {
                            if (column.startsWith(s)) {
                                prefixMap.get(s).add(kv);
                            }
                        }
                    }
                }
                region.put(p);
            }

            MultipleColumnPrefixFilter filter;
            Scan scan = new Scan();
            scan.setMaxVersions();
            byte[][] filter_prefix = new byte[2][];
            filter_prefix[0] = new byte[] {'p'};
            filter_prefix[1] = new byte[] {'q'};

            filter = new MultipleColumnPrefixFilter(filter_prefix);
            scan.setFilter(filter);
            List<KeyValue> results = new ArrayList<KeyValue>();
            InternalScanner scanner = region.getScanner(scan);
            // Drain the scanner; each call appends the next row's cells to results.
            while (scanner.next(results));
            assertEquals(prefixMap.get("p").size() + prefixMap.get("q").size(), results.size());
        }

        @Test
        public void testMultipleColumnPrefixFilterWithManyFamilies() throws IOException {
            String family1 = "Family1";
            String family2 = "Family2";
            HTableDescriptor htd = new HTableDescriptor("TestMultipleColumnPrefixFilter");
            htd.addFamily(new HColumnDescriptor(family1));
            htd.addFamily(new HColumnDescriptor(family2));
            HRegionInfo info = new HRegionInfo(htd.getName(), null, null, false);
            HRegion region = HRegion.createHRegion(info, HBaseTestingUtility.
                getTestDir(), TEST_UTIL.getConfiguration(), htd);

            List<String> rows = generateRandomWords(100, "row");
            List<String> columns = generateRandomWords(10000, "column");
            long maxTimestamp = 3;

            List<KeyValue> kvList = new ArrayList<KeyValue>();

            Map<String, List<KeyValue>> prefixMap = new HashMap<String,
                List<KeyValue>>();

            prefixMap.put("p", new ArrayList<KeyValue>());
            prefixMap.put("q", new ArrayList<KeyValue>());
            prefixMap.put("s", new ArrayList<KeyValue>());

            String valueString = "ValueString";

            for (String row : rows) {
                Put p = new Put(Bytes.toBytes(row));
                for (String column : columns) {
                    for (long timestamp = 1; timestamp <= maxTimestamp; timestamp++) {
                        // Put each cell into one of the two families at random.
                        double rand = Math.random();
                        KeyValue kv;
                        if (rand < 0.5)
                            kv = KeyValueTestUtil.create(row, family1, column, timestamp,
                                valueString);
                        else
                            kv = KeyValueTestUtil.create(row, family2, column, timestamp,
                                valueString);
                        p.add(kv);
                        kvList.add(kv);
                        for (String s : prefixMap.keySet()) {
                            if (column.startsWith(s)) {
                                prefixMap.get(s).add(kv);
                            }
                        }
                    }
                }
                region.put(p);
            }

            MultipleColumnPrefixFilter filter;
            Scan scan = new Scan();
            scan.setMaxVersions();
            byte[][] filter_prefix = new byte[2][];
            filter_prefix[0] = new byte[] {'p'};
            filter_prefix[1] = new byte[] {'q'};

            filter = new MultipleColumnPrefixFilter(filter_prefix);
            scan.setFilter(filter);
            List<KeyValue> results = new ArrayList<KeyValue>();
            InternalScanner scanner = region.getScanner(scan);
            while (scanner.next(results));
            assertEquals(prefixMap.get("p").size() + prefixMap.get("q").size(), results.size());
        }

        @Test
        public void testMultipleColumnPrefixFilterWithColumnPrefixFilter() throws IOException {
            String family = "Family";
            HTableDescriptor htd = new HTableDescriptor("TestMultipleColumnPrefixFilter");
            htd.addFamily(new HColumnDescriptor(family));
            HRegionInfo info = new HRegionInfo(htd.getName(), null, null, false);
            HRegion region = HRegion.createHRegion(info, HBaseTestingUtility.
                getTestDir(), TEST_UTIL.getConfiguration(), htd);

            List<String> rows = generateRandomWords(100, "row");
            List<String> columns = generateRandomWords(10000, "column");
            long maxTimestamp = 2;

            String valueString = "ValueString";

            for (String row : rows) {
                Put p = new Put(Bytes.toBytes(row));
                for (String column : columns) {
                    for (long timestamp = 1; timestamp <= maxTimestamp; timestamp++) {
                        KeyValue kv = KeyValueTestUtil.create(row, family, column, timestamp,
                            valueString);
                        p.add(kv);
                    }
                }
                region.put(p);
            }

            // Scan 1: MultipleColumnPrefixFilter with the single prefix "p".
            MultipleColumnPrefixFilter multiplePrefixFilter;
            Scan scan1 = new Scan();
            scan1.setMaxVersions();
            byte[][] filter_prefix = new byte[1][];
            filter_prefix[0] = new byte[] {'p'};

            multiplePrefixFilter = new MultipleColumnPrefixFilter(filter_prefix);
            scan1.setFilter(multiplePrefixFilter);
            List<KeyValue> results1 = new ArrayList<KeyValue>();
            InternalScanner scanner1 = region.getScanner(scan1);
            while (scanner1.next(results1));

            // Scan 2: ColumnPrefixFilter with the same prefix.
            ColumnPrefixFilter singlePrefixFilter;
            Scan scan2 = new Scan();
            scan2.setMaxVersions();
            singlePrefixFilter = new ColumnPrefixFilter(Bytes.toBytes("p"));

            scan2.setFilter(singlePrefixFilter);
            List<KeyValue> results2 = new ArrayList<KeyValue>();
            // The copy I found passed scan1 here, which compared scan1 with itself.
            InternalScanner scanner2 = region.getScanner(scan2);
            while (scanner2.next(results2));

            // Both filters should return the same number of cells.
            assertEquals(results1.size(), results2.size());
        }

        // Generates up to numberOfWords distinct random words of 1 or 2 lowercase
        // letters, each followed by suffix when suffix is not null.
        List<String> generateRandomWords(int numberOfWords, String suffix) {
            Set<String> wordSet = new HashSet<String>();
            for (int i = 0; i < numberOfWords; i++) {
                int lengthOfWords = (int) (Math.random() * 2) + 1;
                char[] wordChar = new char[lengthOfWords];
                for (int j = 0; j < wordChar.length; j++) {
                    wordChar[j] = (char) (Math.random() * 26 + 97);
                }
                String word;
                if (suffix == null) {
                    word = new String(wordChar);
                } else {
                    word = new String(wordChar) + suffix;
                }
                wordSet.add(word);
            }
            List<String> wordList = new ArrayList<String>(wordSet);
            return wordList;
        }
    }


ColumnPrefixFilter

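    public class ColumnPrefixFilter extends FilterBase

  This filter is used for selecting only those keys with columns that matches a particular prefix. For example, if prefix is 'an', it will pass keys will columns like 'and', 'anti' but not keys with columns like 'ball', 'act'.

That is the class description.
It has a single parameterized constructor, ColumnPrefixFilter(byte[] prefix).
Usage is simple, but pay attention to what it matches: as the description says, the prefix is compared against column names (qualifiers), not row keys. I first used it expecting it to match row keys that start with prefix, and it had no effect. If anyone has gotten it to work that way, please let me know.

So, to match on the row key, I had to fall back to PrefixFilter.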

PrefixFilter

Class description:

Pass results that have same row prefix.

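Its constructor has the same shape as that of ColumnPrefixFilter, namely PrefixFilter(byte[] prefix), and it is used the same way. The difference is that the prefix is compared against the row key.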
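
The difference between the two filters is easy to see in a small plain-Java model. This is only a sketch that does not call HBase: each cell is reduced to a {rowKey, qualifier} pair, and the class name, method names and sample cells are all invented for illustration. With row keys that start with "3274980668:" and ordinary column names, the row-prefix check finds the rows while the column-prefix check finds nothing, which is consistent with ColumnPrefixFilter appearing to "not work" when it is given a row-key prefix.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Plain-Java model only; cells are {rowKey, qualifier} pairs, not HBase objects.
class RowVsColumnPrefixSketch {

    // Models PrefixFilter: keep cells whose ROW KEY starts with the prefix.
    static List<String[]> byRowPrefix(List<String[]> cells, String prefix) {
        List<String[]> out = new ArrayList<>();
        for (String[] cell : cells) {
            if (cell[0].startsWith(prefix)) {
                out.add(cell);
            }
        }
        return out;
    }

    // Models ColumnPrefixFilter: keep cells whose COLUMN NAME starts with the prefix.
    static List<String[]> byColumnPrefix(List<String[]> cells, String prefix) {
        List<String[]> out = new ArrayList<>();
        for (String[] cell : cells) {
            if (cell[1].startsWith(prefix)) {
                out.add(cell);
            }
        }
        return out;
    }

    public static void main(String[] args) {
        // Hypothetical data: {rowKey, qualifier}.
        List<String[]> cells = Arrays.asList(
            new String[] {"3274980668:a", "name"},
            new String[] {"3274980668:b", "age"},
            new String[] {"999:c", "name"});

        // Two cells live in rows whose key starts with the prefix.
        System.out.println(byRowPrefix(cells, "3274980668:").size());
        // No column name starts with the prefix, so nothing is kept.
        System.out.println(byColumnPrefix(cells, "3274980668:").size());
    }
}
```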

Those are the main filters for now. I will keep updating this post over time.

Here is some code I wrote myself, taken from code that is in use:

    // tableKeyword (the table to scan) and log are fields defined elsewhere in the class.
    // Each filter string has the form "family,qualifier,value".
    public static String getKeywordTableRowkeyUseFilter(String filterString1, String filterString2) {
        FilterList filterList = new FilterList(); // MUST_PASS_ALL: every filter must pass
        StringBuilder rowkeyValue = new StringBuilder();
        Scan s1 = new Scan();
        String[] sf1 = filterString1.split(",");
        filterList.addFilter(new SingleColumnValueFilter(Bytes.toBytes(sf1[0]),
                Bytes.toBytes(sf1[1]),
                CompareOp.EQUAL, Bytes.toBytes(sf1[2])));
        String[] sf2 = filterString2.split(",");
        filterList.addFilter(new SingleColumnValueFilter(Bytes.toBytes(sf2[0]),
                Bytes.toBytes(sf2[1]),
                CompareOp.EQUAL, Bytes.toBytes(sf2[2])));
        // ColumnPrefixFilter compares the prefix with column names,
        // PrefixFilter compares it with row keys.
        filterList.addFilter(new ColumnPrefixFilter(Bytes.toBytes("3274980668:")));
        filterList.addFilter(new PrefixFilter(Bytes.toBytes("3274980668:")));
        s1.setFilter(filterList);
        ResultScanner scanner = null;
        try {
            scanner = tableKeyword.getScanner(s1);
            for (Result rr = scanner.next(); rr != null; rr = scanner.next()) {
                // Collect the matching row keys, each preceded by "##".
                rowkeyValue.append("##").append(Bytes.toString(rr.getRow()));
            }
        } catch (IOException e) {
            e.printStackTrace();
        } finally {
            if (scanner != null) {
                scanner.close();
            }
        }
        log.warn("rowkeyValue" + rowkeyValue);
        return rowkeyValue.toString();
    }


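PrefixFilter and ColumnPrefixFilter are used in almost the same way, but they match different things: PrefixFilter works on the row key and ColumnPrefixFilter on the column name. For selecting rows by key prefix in development, I recommend PrefixFilter.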

