hbase实现分页查询

[Author]: kwu 

hbase实现分页查询,实现按时间查询最新的15条,hbase的查询主要是通过rowkey来进行的,保证查询的高效。column的filter查询效率较低。


1、rowkey的设计

以时间的倒序进行查询,如20150818_152130来看,设计rowkey

20150818的hash64值+(999999-152130)。

1)MurmurHash的64的冲突非常小,并实现rowkey的散列。

2)日分秒与999999的差值,可以实现逆序


2、插入数据的操作

String[] splited = jsData.split("`");

SimpleDateFormat sdf = new SimpleDateFormat("yyyyMMddHHmmss");
Date date = new Date();
String format = sdf.format(date);

String dateday = format.substring(0, 8);
String datehour = format.substring(8, 10);
String dateMin = format.substring(10, 12);
String dateSec = format.substring(12, 14);

int datehourSub = 99 - new Integer(datehour);
int dateMinSub = 99 - new Integer(dateMin);
int dateSecSub = 99 - new Integer(dateSec);

String dateHash = MurmurHash.hash64(dateday) + "_" + datehourSub  + dateMinSub  + dateSecSub;

Put put = new Put((dateHash + "_" + splited[0]).getBytes());
put.add("cf1".getBytes(), "bdcCookieId".getBytes(), splited[0].getBytes());
put.add("cf1".getBytes(), "pcScreenRatio".getBytes(), splited[1].getBytes());
put.add("cf1".getBytes(), "objClassName".getBytes(), splited[2].getBytes());
put.add("cf1".getBytes(), "pageOpenTime".getBytes(), splited[3].getBytes());
put.add("cf1".getBytes(), "pageLoadCompleteTime".getBytes(), splited[4].getBytes());
put.add("cf1".getBytes(), "pageCloseTime".getBytes(), splited[5].getBytes());
put.add("cf1".getBytes(), "category".getBytes(), splited[6].getBytes());
put.add("cf1".getBytes(), "browserVersion".getBytes(), splited[7].getBytes());
put.add("cf1".getBytes(), "currentURL".getBytes(), splited[8].getBytes());
put.add("cf1".getBytes(), "targetParentURL".getBytes(), splited[9].getBytes());
put.add("cf1".getBytes(), "referrer".getBytes(), splited[10].getBytes());
put.add("cf1".getBytes(), "clickX".getBytes(), splited[11].getBytes());
put.add("cf1".getBytes(), "clickY".getBytes(), splited[12].getBytes());
put.add("cf1".getBytes(), "targetURL".getBytes(), splited[13].getBytes());
put.add("cf1".getBytes(), "objID".getBytes(), splited[14].getBytes());
put.add("cf1".getBytes(), "objName".getBytes(), splited[15].getBytes());
put.add("cf1".getBytes(), "objValue".getBytes(), objValue.getBytes());
put.add("cf1".getBytes(), "siteCookieId".getBytes(), splited[17].getBytes());
put.add("cf1".getBytes(), "scrollValue".getBytes(), splited[18].getBytes());
put.add("cf1".getBytes(), "sessionId".getBytes(), splited[19].getBytes());
put.add("cf1".getBytes(), "ip".getBytes(), splited[20].getBytes());
put.add("cf1".getBytes(), "macAdress".getBytes(), splited[21].getBytes());
put.add("cf1".getBytes(), "gaId".getBytes(), splited[22].getBytes());
put.add("cf1".getBytes(), "baiduId".getBytes(), splited[23].getBytes());
put.add("cf1".getBytes(), "activeTime".getBytes(), splited[24].getBytes());
put.add("cf1".getBytes(), "status".getBytes(), splited[25].getBytes());
put.add("cf1".getBytes(), "triggertime".getBytes(), splited[26].getBytes());

List list = new ArrayList();
list.add(put);

HTable table = new HTable(configuration, "jsActionPage");
table.put(list);

table.close();


3、查询 利用scan 时间倒序

HTable table = new HTable(configuration, "jsActionPage");
Scan scan = new Scan();
scan.setStartRow(startRow.getBytes());
scan.setStopRow(stopRow.getBytes());

JSONObject resultJson = new JSONObject();

JSONObject temp = null;
ResultScanner scanner = table.getScanner(scan);

for (Result result : scanner) {
	JSONArray jsonArray = new JSONArray();

	String rowkey = "";
	for (KeyValue keyValue : result.raw()) {
		byte[] row = keyValue.getRow();
		rowkey = new String(row);

		String qualifier = new String(keyValue.getQualifier());
		String value = new String(keyValue.getValue());

		temp = new JSONObject();
		temp.put(qualifier, value);
		jsonArray.add(temp);
	}
	resultJson.put(rowkey, jsonArray);
}

table.close();

最后返回json的数据格式


你可能感兴趣的:(hbase实现分页查询)