Amazon S3 provides developers and IT teams with secure, durable, and highly scalable cloud storage.
Common use cases for Amazon S3 storage include:
Backup and archive for on-premises or cloud data
Content, media, and software storage and distribution
Big data analytics
Static website hosting
Cloud-native mobile and Internet application hosting
Disaster recovery
Each Amazon S3 object contains both data and metadata.
Objects reside in containers called buckets, and each object is identified by a unique user-specified key (filename). A bucket is a simple flat container with no file system hierarchy.
Each bucket is created in a specific region, and data in an Amazon S3 bucket is stored in that region unless you explicitly copy it to another bucket located in a different region.
Each object consists of data (the file itself) and metadata (data about the file).
Objects can range in size from 0 bytes up to 5 TB, and a single bucket can store an unlimited number of objects.
This means that Amazon S3 can store a virtually unlimited amount of data.
The metadata associated with an Amazon S3 object is a set of name/value pairs that describe the object.
Every object stored in an S3 bucket is identified by a unique identifier called a key.
The combination of bucket, key, and optional version ID uniquely identifies an Amazon S3 object.
Amazon S3 is storage for the Internet, and every Amazon S3 object can be addressed by a unique URL formed using the web services endpoint, the bucket name, and the object key.
For example, with the URL:
http://mybucket.s3.amazonaws.com/jack.doc
mybucket is the bucket name and jack.doc is the object key.
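The way a bucket name and key combine into a URL can be sketched in a few lines of Python. This is a minimal illustration, not an official client: the function name object_url and the default endpoint argument are assumptions made for the example.

```python
from urllib.parse import quote


def object_url(bucket: str, key: str, endpoint: str = "s3.amazonaws.com") -> str:
    """Build a virtual-hosted-style URL: http://<bucket>.<endpoint>/<key>."""
    # Percent-encode the key, but keep "/" so key prefixes stay readable.
    return f"http://{bucket}.{endpoint}/{quote(key, safe='/')}"


print(object_url("mybucket", "jack.doc"))
print(object_url("mybucket", "reports/q1 2016.pdf"))
```

The first call reproduces the URL from the notes; the second shows that characters such as spaces in a key are percent-encoded.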
Amazon S3 standard storage is designed for 99.999999999% durability and 99.99% availability of objects over a given year.
If you need to store non-critical or easily reproducible derived data (such as image thumbnails) that doesn’t require this high level of durability, you can choose to use Reduced Redundancy Storage (RRS) at a lower cost. RRS offers 99.99% durability.
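To get a feel for what these durability figures mean, the short calculation below treats the design durability as an annual per-object survival probability and computes the expected number of objects lost per year. This is a simplified model assumed for illustration; the function name and the 10,000-object example are not from the notes.

```python
from decimal import Decimal


def expected_annual_loss(object_count: int, durability_percent: str) -> Decimal:
    """Expected number of objects lost per year at a given design durability."""
    # Decimal avoids the rounding noise that floats show at eleven nines.
    loss_rate = 1 - Decimal(durability_percent) / 100
    return object_count * loss_rate


standard = expected_annual_loss(10_000, "99.999999999")  # Standard: eleven nines
rrs = expected_annual_loss(10_000, "99.99")              # RRS: four nines

print(f"Standard: one object lost every {int(1 / standard):,} years on average")
print(f"RRS: {rrs.normalize()} object(s) lost per year on average")
```

Under this model, 10,000 objects in Standard storage lose on average one object every 10,000,000 years, while the same objects in RRS lose on average one object per year.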
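Amazon S3 was originally an eventually consistent system: it provided read-after-write consistency for PUTs of new objects, and eventual consistency for overwrite PUTs and DELETEs.
Since December 2020, Amazon S3 provides strong read-after-write consistency for all PUT and DELETE operations.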
Amazon S3 provides both coarse-grained access controls (Amazon S3 Access Control Lists [ACLs]), and fine-grained access controls (Amazon S3 bucket policies, AWS Identity and Access Management [IAM] policies, and query-string authentication).
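As an illustration of a fine-grained control, the sketch below builds a bucket policy document that lets one IAM user read objects under a single key prefix. The bucket name, prefix, account ID, and user name are hypothetical placeholders, and the helper read_only_policy exists only for this example.

```python
import json


def read_only_policy(bucket: str, prefix: str, principal_arn: str) -> dict:
    """Return a bucket policy allowing s3:GetObject under one key prefix."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Sid": "AllowReadUnderPrefix",
                "Effect": "Allow",
                "Principal": {"AWS": principal_arn},
                "Action": "s3:GetObject",
                # Object ARNs have the form arn:aws:s3:::<bucket>/<key>.
                "Resource": f"arn:aws:s3:::{bucket}/{prefix}*",
            }
        ],
    }


# Hypothetical bucket, prefix, and IAM user.
policy = read_only_policy(
    "mybucket", "reports/", "arn:aws:iam::123456789012:user/alice"
)
print(json.dumps(policy, indent=2))
```

The printed JSON is the kind of document that would be attached to the bucket; an ACL, by contrast, can only grant a small fixed set of permissions on the whole bucket or object.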
A very common use case for Amazon S3 storage is static website hosting.
Storage Classes
Amazon S3 Standard offers high durability, high availability, low latency, and high performance object storage for general purpose use.
Amazon S3 Standard – Infrequent Access (Standard-IA) is designed for long-lived, less frequently accessed data.
Amazon S3 Reduced Redundancy Storage (RRS) offers slightly lower durability (99.99%, or four nines).
The Amazon Glacier storage class offers secure, durable, and extremely low-cost cloud storage for data that does not require real-time access, such as archives and long-term backups.
Object Lifecycle Management
Amazon S3 Object Lifecycle Management is roughly equivalent to automated storage tiering in traditional IT storage infrastructures.
Store backup data initially in Amazon S3 Standard.
After 30 days, transition to Amazon S3 Standard-IA.
After 90 days, transition to Amazon Glacier.
After 3 years, delete.
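The four steps above can be written down as a lifecycle configuration. The sketch below expresses them as a Python dictionary in the shape that, to the best of my understanding, the S3 lifecycle API expects (for example, as the LifecycleConfiguration argument of boto3's put_bucket_lifecycle_configuration). The rule ID and the backups/ prefix are assumptions made for the example, and nothing is sent to AWS.

```python
import json

# New objects land in Amazon S3 Standard by default, so only the later
# steps need to be spelled out in the rule.
lifecycle_configuration = {
    "Rules": [
        {
            "ID": "backup-tiering",              # hypothetical rule name
            "Status": "Enabled",
            "Filter": {"Prefix": "backups/"},    # hypothetical key prefix
            "Transitions": [
                {"Days": 30, "StorageClass": "STANDARD_IA"},
                {"Days": 90, "StorageClass": "GLACIER"},
            ],
            "Expiration": {"Days": 3 * 365},     # delete after 3 years
        }
    ]
}

print(json.dumps(lifecycle_configuration, indent=2))
```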
Encryption
SSE-S3 (AWS-Managed Keys)
SSE-KMS (AWS KMS Keys)
SSE-C (Customer-Provided Keys)
Client-Side Encryption
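The three server-side options differ mainly in which request headers accompany an upload. The sketch below builds those headers using the names I understand the S3 REST API to use; the helper function names are invented for this example, and no request is actually sent. Client-side encryption is not shown, because there the data is encrypted before it ever reaches Amazon S3.

```python
import base64
import hashlib
import os
from typing import Dict, Optional


def sse_s3_headers() -> Dict[str, str]:
    """SSE-S3: AWS manages the keys."""
    return {"x-amz-server-side-encryption": "AES256"}


def sse_kms_headers(kms_key_id: Optional[str] = None) -> Dict[str, str]:
    """SSE-KMS: keys are managed in AWS KMS; the key ID is optional."""
    headers = {"x-amz-server-side-encryption": "aws:kms"}
    if kms_key_id is not None:
        headers["x-amz-server-side-encryption-aws-kms-key-id"] = kms_key_id
    return headers


def sse_c_headers(key: bytes) -> Dict[str, str]:
    """SSE-C: the caller supplies a 256-bit key with every request."""
    if len(key) != 32:
        raise ValueError("SSE-C expects a 32-byte (256-bit) key")
    return {
        "x-amz-server-side-encryption-customer-algorithm": "AES256",
        "x-amz-server-side-encryption-customer-key":
            base64.b64encode(key).decode("ascii"),
        # The MD5 digest lets the service check the key arrived intact.
        "x-amz-server-side-encryption-customer-key-MD5":
            base64.b64encode(hashlib.md5(key).digest()).decode("ascii"),
    }


customer_key = os.urandom(32)  # in SSE-C, storing this key safely is your job
for name in sse_c_headers(customer_key):
    print(name)
```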
Versioning
Versioning is turned on at the bucket level. Once enabled, versioning cannot be removed from a bucket; it can only be suspended.
MFA Delete
MFA Delete adds another layer of data protection on top of bucket versioning.
Pre-Signed URLs
All Amazon S3 objects by default are private, meaning that only the owner has access.
However, the object owner can optionally share objects with others by creating a pre-signed URL, using their own security credentials to grant time-limited permission to download the objects.
The pre-signed URLs are valid only for the specified duration.
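To show how the owner's credentials and the expiry time end up inside the URL, here is a minimal sketch of query-string signing following my understanding of AWS Signature Version 4. It is an illustration only: in practice an AWS SDK would generate the URL, and the function name presign_get, the fixed s3.amazonaws.com host, and the placeholder credentials are assumptions made for the example.

```python
import hashlib
import hmac
from datetime import datetime, timezone
from urllib.parse import quote


def _hmac(key: bytes, msg: str) -> bytes:
    return hmac.new(key, msg.encode("utf-8"), hashlib.sha256).digest()


def presign_get(bucket: str, key: str, access_key: str, secret_key: str,
                expires_in: int, now: datetime, region: str = "us-east-1") -> str:
    """Return a time-limited GET URL signed with the owner's credentials."""
    host = f"{bucket}.s3.amazonaws.com"
    amz_date = now.strftime("%Y%m%dT%H%M%SZ")
    date_stamp = now.strftime("%Y%m%d")
    scope = f"{date_stamp}/{region}/s3/aws4_request"
    params = {
        "X-Amz-Algorithm": "AWS4-HMAC-SHA256",
        "X-Amz-Credential": f"{access_key}/{scope}",
        "X-Amz-Date": amz_date,
        "X-Amz-Expires": str(expires_in),  # validity window in seconds
        "X-Amz-SignedHeaders": "host",
    }
    query = "&".join(
        f"{quote(k, safe='')}={quote(v, safe='')}" for k, v in sorted(params.items())
    )
    path = "/" + quote(key, safe="/")
    # Canonical request: method, path, query, headers, signed headers, payload.
    canonical_request = "\n".join(
        ["GET", path, query, f"host:{host}", "", "host", "UNSIGNED-PAYLOAD"]
    )
    string_to_sign = "\n".join([
        "AWS4-HMAC-SHA256", amz_date, scope,
        hashlib.sha256(canonical_request.encode("utf-8")).hexdigest(),
    ])
    # Derive the signing key from the secret key, date, region, and service.
    signing_key = _hmac(("AWS4" + secret_key).encode("utf-8"), date_stamp)
    for part in (region, "s3", "aws4_request"):
        signing_key = _hmac(signing_key, part)
    signature = hmac.new(
        signing_key, string_to_sign.encode("utf-8"), hashlib.sha256
    ).hexdigest()
    return f"https://{host}{path}?{query}&X-Amz-Signature={signature}"


# Placeholder credentials, not real ones.
url = presign_get(
    "examplebucket", "test.txt",
    "AKIAIOSFODNN7EXAMPLE", "wJalrXUtnFEMI/K7MDENG/bPxRfiCYEXAMPLEKEY",
    expires_in=3600, now=datetime.now(timezone.utc),
)
print(url)
```

Because the expiry is part of the signed query string, a recipient cannot extend the validity period without invalidating the signature.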
Multipart Upload
Amazon S3 provides the Multipart Upload API. This allows you to upload large objects as a set of parts, which generally gives better network utilization (through parallel transfers), the ability to pause and resume, and the ability to upload objects where the size is initially unknown.
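The splitting step can be sketched as follows. The example plans the parts for an object of known size and stops short of any upload; the function name plan_parts and the 8 MiB default part size are assumptions made for the example. The limits in the constants (at most 10,000 parts, each at least 5 MiB except the last) reflect my understanding of the service limits.

```python
from typing import List, Tuple

MIN_PART_SIZE = 5 * 1024 ** 2   # every part except the last must be >= 5 MiB
MAX_PARTS = 10_000              # a single upload may have at most 10,000 parts


def plan_parts(total_size: int, part_size: int = 8 * 1024 ** 2) -> List[Tuple[int, int, int]]:
    """Return (part_number, offset, length) for each part of an upload."""
    if part_size < MIN_PART_SIZE:
        raise ValueError("part_size is below the 5 MiB minimum")
    parts = []
    offset = 0
    while offset < total_size:
        length = min(part_size, total_size - offset)
        parts.append((len(parts) + 1, offset, length))  # part numbers start at 1
        offset += length
    if len(parts) > MAX_PARTS:
        raise ValueError("too many parts; choose a larger part_size")
    return parts


for number, offset, length in plan_parts(20 * 1024 ** 2):
    print(f"part {number}: bytes {offset}-{offset + length - 1}")
```

Each planned part can then be uploaded independently and in parallel, and a failed part can be retried without resending the others, which is where the resume capability comes from.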
Range GETs
It is possible to download (GET) only a portion of an object in both Amazon S3 and Amazon Glacier by using something called a Range GET.
This can be useful in dealing with large objects when you have poor connectivity or to download only a known portion of a large Amazon Glacier backup.
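For Amazon S3, a Range GET is an ordinary HTTP GET carrying a Range header. The sketch below only builds such a request with the Python standard library and does not send it; the function name range_get_request is an assumption made for the example, and the URL is the one used earlier in these notes.

```python
from urllib.request import Request


def range_get_request(url: str, first_byte: int, last_byte: int) -> Request:
    """Build a GET request for bytes first_byte..last_byte (inclusive)."""
    if first_byte < 0 or last_byte < first_byte:
        raise ValueError("invalid byte range")
    return Request(
        url,
        headers={"Range": f"bytes={first_byte}-{last_byte}"},
        method="GET",
    )


# Ask for the first kilobyte of the object only.
request = range_get_request("http://mybucket.s3.amazonaws.com/jack.doc", 0, 1023)
print(request.get_method(), request.full_url)
print("Range:", request.get_header("Range"))
```

Passing the request to urllib.request.urlopen would perform the download; a server that honors the range replies with status 206 (Partial Content).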
Cross-Region Replication
Cross-region replication is a feature of Amazon S3 that allows you to asynchronously replicate all new objects in the source bucket in one AWS region to a target bucket in another region.
Logging
In order to track requests to your Amazon S3 bucket, you can enable Amazon S3 server access logs.
Event Notifications
Amazon S3 event notifications can be sent in response to actions taken on objects uploaded or stored in Amazon S3.
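A notification arrives as a JSON message containing one or more records. The sketch below parses a hand-written sample in the shape I understand S3 event messages to have (real messages carry more fields); the bucket name, key, and the function name summarize_event are assumptions made for the example.

```python
import json
from typing import List, Tuple
from urllib.parse import unquote_plus

# Trimmed, hand-written sample message.
SAMPLE_EVENT = json.dumps({
    "Records": [
        {
            "eventSource": "aws:s3",
            "eventName": "ObjectCreated:Put",
            "s3": {
                "bucket": {"name": "mybucket"},
                "object": {"key": "photos/summer+trip.jpg", "size": 1024},
            },
        }
    ]
})


def summarize_event(event_json: str) -> List[Tuple[str, str, str]]:
    """Return (event name, bucket, key) for each record in a notification."""
    event = json.loads(event_json)
    summary = []
    for record in event.get("Records", []):
        s3 = record["s3"]
        # Object keys are URL-encoded in the message (a space arrives as "+").
        key = unquote_plus(s3["object"]["key"])
        summary.append((record["eventName"], s3["bucket"]["name"], key))
    return summary


for event_name, bucket, key in summarize_event(SAMPLE_EVENT):
    print(f"{event_name}: {bucket}/{key}")
```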
Amazon Glacier
Amazon Glacier is designed for infrequently accessed data where a retrieval time of three to five hours is acceptable.
Common use cases for Amazon Glacier include replacement of traditional tape solutions for long-term backup and archive and storage of data required for compliance purposes. In most cases, the data stored in Amazon Glacier consists of large TAR (Tape Archive) or ZIP files.
In Amazon Glacier, data is stored in archives. An archive can contain up to 40 TB of data. Each archive is assigned a unique, system-generated archive ID when it is created; you cannot specify a user-friendly archive name.
Vaults
Vaults are containers for archives. Each AWS account can have up to 1,000 vaults. You can control access to your vaults and the actions allowed using IAM policies or vault access policies.
Vault Locks
You can easily deploy and enforce compliance controls for individual Amazon Glacier vaults with a vault lock policy.
Data Retrieval
You can retrieve up to 5% of your data stored in Amazon Glacier for free each month.
Amazon Glacier versus Amazon Simple Storage Service (Amazon S3)
Amazon Glacier is similar to Amazon S3, but it differs in several key aspects. Amazon Glacier supports 40 TB archives versus 5 TB objects in Amazon S3. Archives in Amazon Glacier are identified by system-generated archive IDs, while Amazon S3 lets you use “friendly” key names. Amazon Glacier archives are automatically encrypted, while encryption at rest is optional in Amazon S3. However, by using Amazon Glacier as an Amazon S3 storage class together with object lifecycle policies, you can use the Amazon S3 interface to get most of the benefits of Amazon Glacier without learning a new interface.