Introduction to DynamoDB

https://docs.aws.amazon.com/amazondynamodb/latest/developerguide/HowItWorks.CoreComponents.html

DynamoDB is a key/value and document database provided by AWS. Why it counts as both a key/value database and a document database is explained below.

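1. All data is stored on SSDs, which allows very high data throughput.

2. DynamoDB supports ACID transactions; encrypts all data at rest by default; supports backups of hundreds of TB, with point-in-time recovery to any moment within the preceding 35 days; and is serverless, with no servers to administer and automatic scaling based on throughput demand.

3. DynamoDB is a typical NoSQL database. It does not support joins or ad hoc SQL analytics (its SQL-compatible PartiQL interface covers only basic item reads and writes), so DynamoDB is usually used as an application data store rather than for interactive querying or analysis.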

DynamoDB data structure

Data in DynamoDB is document data with a JSON-like structure. Data is organized into tables; one row of data is called an item; each attribute is a field.
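As a rough illustration of these terms, the sketch below models a table as a plain Python list of items, where each item is a JSON-like document made of attributes. Only the PersonID attribute comes from the text; the other attribute names and all values are made up for illustration.

```python
import json

# A hypothetical "People" table: a table is a collection of items.
people_table = [
    # Each item is a document; each key/value pair inside it is an attribute.
    {"PersonID": 101, "LastName": "Smith", "FirstName": "Fred", "Phone": "555-4321"},
    # Items in the same table do not need to carry the same attributes.
    {"PersonID": 102, "LastName": "Jones", "FirstName": "Mary"},
]


def attribute_names(item):
    """Return the attribute names of one item, sorted for stable display."""
    return sorted(item.keys())


for item in people_table:
    # Items serialize naturally to JSON, which is why they are called documents.
    print(json.dumps(item), attribute_names(item))
```

Plain dictionaries are used here only to show the vocabulary (table, item, attribute); they are not how DynamoDB stores data internally.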

[Image 1 of the original post: an example table whose primary key is PersonID]

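1. Each item has a primary key; in the example above it is PersonID. The primary key uniquely identifies an item and consists of either one attribute or two attributes. Because every item is a document that is addressed by its primary key, DynamoDB is described as both a key/value database and a document database.

A primary key made of one attribute is called a partition key; a primary key made of two attributes is called a partition key and sort key. Both are explained below. The primary key is the key in the key-value model.

2. Apart from the primary key, all other attributes are schemaless.

3. Nested attributes are supported (up to 32 levels deep).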
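The three points above can be sketched in a few lines of Python. The `key_schema` helper below is hypothetical; it only builds a list in the shape that the CreateTable API uses for `KeySchema` (`HASH` for the partition key, `RANGE` for the sort key) and does not call AWS. The items, the `Address` attribute, and the `Artist`/`SongTitle` names are assumptions made for illustration.

```python
def key_schema(partition_key, sort_key=None):
    """Build a KeySchema list in the shape used by DynamoDB's CreateTable API."""
    schema = [{"AttributeName": partition_key, "KeyType": "HASH"}]
    if sort_key is not None:
        schema.append({"AttributeName": sort_key, "KeyType": "RANGE"})
    return schema


def primary_key_of(item, schema):
    """Return the item's primary key as a tuple of its key attribute values."""
    return tuple(item[entry["AttributeName"]] for entry in schema)


simple = key_schema("PersonID")                 # partition key only
composite = key_schema("Artist", "SongTitle")   # partition key and sort key

# Only the key attribute is mandatory; everything else is schemaless,
# and attribute values may themselves be nested documents.
items = [
    {"PersonID": 101, "FirstName": "Fred"},
    {"PersonID": 102, "Address": {"City": "Anytown", "Street": "123 Main"}},
]

# Key-value view: the primary key is the key, the whole document is the value.
key_value_view = {primary_key_of(item, simple): item for item in items}
print(key_value_view[(102,)]["Address"]["City"])
```

The dictionary at the end is the point of the sketch: looking up a primary key returns an entire document, which is the sense in which the same store is both key/value and document oriented.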

 

  • Partition key – A simple primary key, composed of one attribute known as the partition key.

    DynamoDB uses the partition key's value as input to an internal hash function. The output from the hash function determines the partition (physical storage internal to DynamoDB) in which the item will be stored.

    In a table that has only a partition key, no two items can have the same partition key value.

    The People table described in "Tables, Items, and Attributes" is an example of a table with a simple primary key (PersonID). You can access any item in the People table directly by providing the PersonID value for that item.

  • Partition key and sort key – Referred to as a composite primary key, this type of key is composed of two attributes. The first attribute is the partition key, and the second attribute is the sort key.

    DynamoDB uses the partition key value as input to an internal hash function. The output from the hash function determines the partition (physical storage internal to DynamoDB) in which the item will be stored. All items with the same partition key value are stored together, in sorted order by sort key value.

    In a table that has a partition key and a sort key, it's possible for two items to have the same partition key value. However, those two items must have different sort key values.
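The behaviour described in the two bullets above can be imitated with a small in-memory model. This is only a sketch under stated assumptions: DynamoDB's internal hash function and partition count are not given in the text, so SHA-256 and a fixed `NUM_PARTITIONS` stand in for them, and the `ToyTable` class and the `Artist`/`SongTitle` attributes are hypothetical.

```python
import bisect
import hashlib

NUM_PARTITIONS = 4  # arbitrary value chosen for illustration


def partition_for(partition_key):
    """Map a partition key value to a partition index via a hash function.

    SHA-256 is a stand-in; the real internal hash function is not specified.
    """
    digest = hashlib.sha256(str(partition_key).encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % NUM_PARTITIONS


class ToyTable:
    """Toy model: items grouped by partition key, ordered by sort key."""

    def __init__(self):
        self.partitions = [{} for _ in range(NUM_PARTITIONS)]

    def put(self, item, partition_key, sort_key=None):
        partition = self.partitions[partition_for(partition_key)]
        group = partition.setdefault(partition_key, {"sort_keys": [], "items": {}})
        if sort_key not in group["items"]:
            # Keep the sort keys ordered so reads come back sorted.
            bisect.insort(group["sort_keys"], sort_key)
        # The same primary key always refers to one item, so a second put replaces it.
        group["items"][sort_key] = item

    def query(self, partition_key):
        """Return all items sharing a partition key, in sort key order."""
        partition = self.partitions[partition_for(partition_key)]
        group = partition.get(partition_key, {"sort_keys": [], "items": {}})
        return [group["items"][key] for key in group["sort_keys"]]


music = ToyTable()
music.put({"Artist": "Artist A", "SongTitle": "Song Z"}, "Artist A", "Song Z")
music.put({"Artist": "Artist A", "SongTitle": "Song B"}, "Artist A", "Song B")
music.put({"Artist": "Artist B", "SongTitle": "Song M"}, "Artist B", "Song M")
print([item["SongTitle"] for item in music.query("Artist A")])
```

A table with only a partition key is the special case where `sort_key` stays `None`, so each partition key value holds at most one item.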

DynamoDB and Amazon EMR

Amazon EMR is a cloud-native big data platform that lets teams process vast amounts of data quickly and cost-effectively at scale. It combines open source tools such as Apache Spark, Apache Hive, Apache HBase, Apache Flink, and Presto with the dynamic scalability of Amazon EC2 and the scalable storage of Amazon S3.

Using EMR involves running a Hadoop cluster on EC2 instances.

Through EMR, DynamoDB can connect to and exchange data with Spark, Hive, and HBase.

DynamoDB and data sharing

https://docs.aws.amazon.com/amazondynamodb/latest/developerguide/OtherServices.html

Sharing data between different components is important. DynamoDB supports the following:

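1. Data in HDFS can be defined as a Hive table and then copied into DynamoDB.

2. A single command copies the data in a DynamoDB table to S3.

3. Define an external table in Hive that points to the location of the data in S3, then load the data from that S3-backed table into DynamoDB.

4. Putting DynamoDB data into Redshift enables interactive queries on it. Running a COPY statement directly in Redshift loads the data from a DynamoDB table into Redshift.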
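The statements behind these items can be sketched as strings built in Python. This is a sketch under assumptions, not a tested pipeline: the helper functions are hypothetical, and so are the table names, the S3 path, and the IAM role ARN. The statement shapes follow the Hive DynamoDB storage handler on EMR and the Redshift COPY command, and should be checked against the AWS documentation linked above before use.

```python
def hive_external_dynamodb_table(hive_table, dynamodb_table, columns):
    """Hive DDL for an external table backed by a DynamoDB table.

    columns: list of (hive_column, hive_type, dynamodb_attribute) tuples.
    """
    column_defs = ", ".join(f"{name} {hive_type}" for name, hive_type, _ in columns)
    mapping = ",".join(f"{name}:{attribute}" for name, _, attribute in columns)
    return (
        f"CREATE EXTERNAL TABLE {hive_table} ({column_defs}) "
        "STORED BY 'org.apache.hadoop.hive.dynamodb.DynamoDBStorageHandler' "
        f'TBLPROPERTIES ("dynamodb.table.name" = "{dynamodb_table}", '
        f'"dynamodb.column.mapping" = "{mapping}");'
    )


def hive_export_to_s3(hive_table, s3_path):
    """Hive statement that writes a table's rows to an S3 location."""
    return f"INSERT OVERWRITE DIRECTORY '{s3_path}' SELECT * FROM {hive_table};"


def redshift_copy_from_dynamodb(redshift_table, dynamodb_table, iam_role_arn, read_ratio=50):
    """Redshift COPY statement that loads a DynamoDB table into a Redshift table."""
    return (
        f"COPY {redshift_table} FROM 'dynamodb://{dynamodb_table}' "
        f"IAM_ROLE '{iam_role_arn}' READRATIO {read_ratio};"
    )


# Hypothetical names throughout.
print(hive_external_dynamodb_table(
    "hive_people", "People",
    [("person_id", "bigint", "PersonID"), ("last_name", "string", "LastName")],
))
print(hive_export_to_s3("hive_people", "s3://example-bucket/people-export/"))
print(redshift_copy_from_dynamodb(
    "people", "People", "arn:aws:iam::123456789012:role/ExampleRedshiftRole",
))
```

In each case the statement copies data out of or into DynamoDB; none of them queries the DynamoDB data where it lives.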

Overall, the approaches above physically move data between services rather than querying it in place.
