The DynamoDB service defines the following rules for primary keys:
1. A table may use a single column as its primary key; that column is the input to the hash function, i.e., it determines the partition.
2. A table may use two columns as a composite primary key; the first column is still the hash input used for partitioning, while the second column then supports range queries.
Simple Hash Key
The attribute this key refers to must exist in every item and must be unique.
Composite Hash Key with Range Keys
As the name suggests, on top of the hash key the user can add another attribute as a key for range queries. The hash key then no longer has to be unique; only the combination of the two keys must be unique. This is very useful for time-range queries, for example retrieving the web pages a user visited in the last 24 hours.
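The hash-plus-range layout can be sketched with a small in-memory model. This is illustrative only, not the real DynamoDB API; the user/page example and all names below are made up:

```python
import bisect
from collections import defaultdict

# Illustrative in-memory model of a composite-key table (not the DynamoDB API):
# items are partitioned by hash key, and within a partition they are kept
# sorted by range key so that range queries are efficient.
table = defaultdict(list)  # hash key -> sorted list of (range_key, item)

def put_item(hash_key, range_key, item):
    # The (hash_key, range_key) pair must be unique; the hash key alone need not be.
    bisect.insort(table[hash_key], (range_key, item))

def query(hash_key, range_start, range_end):
    # A range query touches only one partition and one contiguous key range.
    return [item for rk, item in table[hash_key] if range_start <= rk <= range_end]

# Example: pages visited by one user, with the visit timestamp as the range key.
put_item("alice", "2012-05-01T08:00", "/home")
put_item("alice", "2012-05-01T23:00", "/cart")
put_item("alice", "2012-05-02T09:00", "/checkout")

# Everything "alice" visited on May 1st:
may_first = query("alice", "2012-05-01T00:00", "2012-05-01T23:59")
print(may_first)
```

Because ISO-8601 timestamps sort lexicographically, a string range key is enough for time-window queries like the 24-hour example above.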
DynamoDB stores three copies of the data in facilities at different geographic locations. Synchronizing the three copies usually completes within one second.
Read consistency
- Eventually consistent reads (default): a read issued immediately after a write may not return the latest data; the copies are only guaranteed to converge eventually. An eventually consistent read costs half as much as a strongly consistent read.
- Strongly consistent reads: a read issued immediately after a write returns the latest data.
Supports atomic operations on a single item (atomic counters).
Supports conditional updates, and updates can return the old/new values of all attributes, or of only the updated attributes.
Supports queries on non-primary-key attributes: these use Scan over the entire table, which is inefficient.
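A rough sketch of what the two update modes above mean, written as plain Python rather than the DynamoDB API (the `Item` class and all names here are illustrative):

```python
import threading

# Illustrative sketch of atomic counters and conditional updates
# (plain Python, not the DynamoDB API).
class Item:
    def __init__(self, attrs):
        self.attrs = dict(attrs)
        self._lock = threading.Lock()

    def atomic_add(self, attr, delta):
        # Atomic counter: a read-modify-write that no other writer can interleave.
        with self._lock:
            self.attrs[attr] = self.attrs.get(attr, 0) + delta
            return self.attrs[attr]

    def conditional_update(self, attr, expected, new_value):
        # Conditional update: write only if the current value matches `expected`;
        # the old value is returned either way, mirroring "return old values".
        with self._lock:
            old = self.attrs.get(attr)
            if old != expected:
                return False, old
            self.attrs[attr] = new_value
            return True, old

item = Item({"views": 0, "status": "draft"})
item.atomic_add("views", 1)
ok, old = item.conditional_update("status", "draft", "published")
print(item.attrs, ok, old)
```

A second conditional update expecting "draft" would now fail, which is the property that makes conditional updates useful for concurrent writers.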
Provisioned Throughput: rate limits on reads and writes. For each table, the user specifies how many read and write operations per second it must support (measured against items of 1 KB).
Units of capacity required for writes = number of item writes per second × item size (rounded up to the nearest KB)
Units of capacity required for reads = number of item reads per second × item size (rounded up to the nearest KB). This is measured by the number of items actually read, independent of the number of API calls. For example, to read 500 items per second from a table, the throughput must be set to 500 whether you issue 50 BatchGetItem calls (each returning 10 items) or 500 GetItem calls.
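The two formulas can be written out directly (a sketch of the arithmetic only; the function names are ours, not part of any SDK):

```python
import math

def write_capacity_units(writes_per_second, item_size_kb):
    # Writes: one capacity unit per 1 KB (rounded up) per item written per second.
    return writes_per_second * math.ceil(item_size_kb)

def read_capacity_units(reads_per_second, item_size_kb, eventually_consistent=False):
    # Reads: same formula; eventually consistent reads cost half as much.
    units = reads_per_second * math.ceil(item_size_kb)
    return units / 2 if eventually_consistent else units

# 500 one-KB items per second costs the same whether fetched via
# 50 BatchGetItem calls or 500 GetItem calls:
strong = read_capacity_units(500, 1)
eventual = read_capacity_units(500, 1, eventually_consistent=True)
print(strong, eventual)
```

Note the rounding: a 1.5 KB item is billed as 2 KB, so ten such writes per second need 20 write capacity units, not 15.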
If the read or write rate exceeds the configured limit, the excess requests fail.
Limits:
1. Item size < 64 KB (including attribute names and values, measured as UTF-8 binary length)
2. Attribute values: attribute values cannot be null or empty.
3. Hash primary key attribute value < 2048 bytes
4. Range primary key attribute value < 1024 bytes
5. Query result < 1 MB per API call
6. Scan data set size < 1 MB per API call (each call scans at most a 1 MB slice of the data set)
In the case of a scan operation, it is not the size of the items returned by the scan, but rather the size of the items evaluated by Amazon DynamoDB. That is, for a scan request, Amazon DynamoDB evaluates up to 1 MB of items and returns only the items that satisfy the scan condition.
A single scan request can therefore consume up to (1 MB / 1 KB) / 2 = 500 capacity units (a scan returns only eventually consistent results, which take half the capacity units of a consistent read), which is a sudden burst against the table's configured capacity. This sudden use of capacity units by a scan can starve other, potentially more important, requests to the same table of the available capacity units. As a result, those requests are likely to receive a "ProvisionedThroughputExceeded" exception.
You should configure your application to retry any request that receives a response code indicating you have exceeded your provisioned throughput, or increase the provisioned throughput for your table using the UpdateTable API. If temporary spikes in your workload occasionally push your throughput beyond the provisioned level, retry the request with exponential backoff.
Error Retries and Exponential Backoff
Numerous components on a network, such as DNS servers, switches, load-balancers, and others can generate errors anywhere in the life of a given request.
The usual technique for dealing with these error responses in a networked environment is to implement retries in the client application. This technique increases the reliability of the application and reduces operational costs for the developer.
Each AWS SDK supporting Amazon DynamoDB implements retry logic automatically. The AWS SDK for Java automatically retries requests, and you can configure the retry settings using the ClientConfiguration class. In some cases, such as a web page making a request with minimal latency and no retries, you might want to turn off the retry logic: use the ClientConfiguration class and provide a maxErrorRetry value of 0 to turn off the retries. For more information, see Using the AWS SDKs with Amazon DynamoDB.
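The retry-with-exponential-backoff pattern described above looks roughly like this. This is a generic client-side sketch, not the SDK's actual implementation; `ThrottlingError` stands in for the "provisioned throughput exceeded" response:

```python
import random
import time

class ThrottlingError(Exception):
    """Stands in for a "provisioned throughput exceeded" response."""

def call_with_backoff(request, max_retries=5, base_delay=0.01):
    # On a throttling error, sleep base_delay * 2**attempt (with jitter)
    # before trying again; give up after max_retries retries.
    for attempt in range(max_retries + 1):
        try:
            return request()
        except ThrottlingError:
            if attempt == max_retries:
                raise
            time.sleep(base_delay * (2 ** attempt) * random.uniform(0.5, 1.0))

# Demo: a request that is throttled twice, then succeeds on the third try.
attempts = {"n": 0}
def flaky_request():
    attempts["n"] += 1
    if attempts["n"] < 3:
        raise ThrottlingError()
    return "ok"

result = call_with_backoff(flaky_request)
print(result, attempts["n"])
```

The jitter factor spreads out retries from many clients so they do not all hammer the table again at the same instant.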
http://docs.amazonwebservices.com/amazondynamodb/latest/developerguide/ErrorHandling.html#APIRetries
http://docs.amazonwebservices.com/amazondynamodb/latest/developerguide/WorkingWithDDTables.html#CapacityUnitCalculations