Case Study - Customer/Order

Assume that HBase is used to store customer and order information. Two core record types are ingested: a Customer record type and an Order record type.

The Customer record type would include all the things that you’d typically expect:

Customer number

Customer name

Address (e.g., city, state, zip)

Phone numbers, etc.

The Order record type would include things like:

Customer number

Order number

Sales date

A series of nested objects for shipping locations and line-items (see Order Object Design for details)

Assuming that the combination of customer number and order number uniquely identifies an order, these two attributes will compose the rowkey, specifically a composite key such as:

[customer number][order number]

for an ORDER table. However, there are more design decisions to make: are the raw values the best choices for rowkeys?
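One reason to question raw values is that variable-length fields concatenated without a delimiter can collide. The following is a minimal sketch of that problem, assuming string-valued customer and order numbers; the function name `raw_rowkey` and the sample values are hypothetical and not part of any HBase API.

```python
def raw_rowkey(customer_number: str, order_number: str) -> bytes:
    """Concatenate the raw values with no padding or delimiter."""
    return (customer_number + order_number).encode("utf-8")

# Two different (customer, order) pairs that produce the same bytes,
# because nothing marks where the customer number ends.
a = raw_rowkey("12", "345")
b = raw_rowkey("123", "45")
collision = (a == b)
```

Fixed-length components avoid this ambiguity, since the boundary between the two fields is always at the same byte offset.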

The same design questions in the Log Data use-case confront us here. What is the keyspace of the customer number, and what is its format (e.g., numeric or alphanumeric)? Because it is advantageous to use fixed-length keys in HBase, as well as keys that support a reasonable spread in the keyspace, similar options appear:

Composite Rowkey With Hashes:

[MD5 of customer number] = 16 bytes

[MD5 of order number] = 16 bytes
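As an illustration, the hashed composite key could be assembled roughly as follows. This is a sketch under assumptions: the identifiers are treated as UTF-8 strings, and the helper names (`md5_bytes`, `hashed_order_rowkey`) and sample values are hypothetical. MD5 is used here only to obtain fixed-length, well-spread bytes, not for security.

```python
import hashlib

def md5_bytes(value: str) -> bytes:
    """Return the 16-byte MD5 digest of a UTF-8 encoded string."""
    return hashlib.md5(value.encode("utf-8")).digest()

def hashed_order_rowkey(customer_number: str, order_number: str) -> bytes:
    """Build [MD5 of customer number][MD5 of order number]: always 32 bytes."""
    return md5_bytes(customer_number) + md5_bytes(order_number)

# Hypothetical sample identifiers of different lengths.
rowkey = hashed_order_rowkey("CUST-000123", "ORD-98765")
short_key = hashed_order_rowkey("7", "1")

# The first 16 bytes depend only on the customer number, so all orders
# for one customer share a common prefix.
customer_prefix = md5_bytes("CUST-000123")
```

Because the leading 16 bytes are the same for every order of a given customer, that digest could serve as the prefix when reading back all of a customer's orders.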

Composite Numeric/Hash Combo Rowkey:

[substituted long for customer number] = 8 bytes

[MD5 of order number] = 16 bytes
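A sketch of the numeric/hash combination might look like the following. It assumes that a separate lookup already maps each customer number to a substituted long; the in-memory dictionary `CUSTOMER_IDS`, the function name `combo_order_rowkey`, and the sample values are hypothetical stand-ins for whatever mechanism actually assigns those numbers.

```python
import hashlib
import struct

# Hypothetical stand-in for a lookup that assigns each customer number
# a substituted 64-bit integer.
CUSTOMER_IDS = {"CUST-000123": 1, "CUST-000456": 2}

def combo_order_rowkey(customer_number: str, order_number: str) -> bytes:
    """Build [8-byte long for customer][MD5 of order number]: always 24 bytes."""
    customer_id = CUSTOMER_IDS[customer_number]
    # ">q" packs a signed 64-bit integer in big-endian byte order.
    customer_part = struct.pack(">q", customer_id)
    order_part = hashlib.md5(order_number.encode("utf-8")).digest()
    return customer_part + order_part

combo_key = combo_order_rowkey("CUST-000123", "ORD-98765")

# The customer's substituted long can be recovered from the first 8 bytes.
recovered_id = struct.unpack(">q", combo_key[:8])[0]
```

Compared with the all-hash key, this layout is 8 bytes shorter per row and keeps the customer portion reversible, at the cost of maintaining the substitution lookup.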
