Amazon S3

Amazon S3 or Amazon Simple Storage Service is a service offered by Amazon Web Services that provides object storage through a web service interface.

Buckets and Objects

Amazon S3 allows people to store objects (files) in buckets (directories). Buckets must have a globally unique name and are defined at the region level.

An object consists of data, a key (its assigned name), and metadata. A bucket is used to store objects. When an object is added to a bucket that has versioning enabled, Amazon S3 creates a unique version ID and assigns it to the object.

The object value is the content of the body. The maximum object size is 5 TB; uploads larger than 5 GB must use the multipart upload feature.
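The size limits above can be turned into a small planning helper. The sketch below is illustrative only: `plan_upload`, the 100 MiB preferred part size, and the choice of binary units are assumptions, while the 5 MiB minimum part size and the 10,000-part cap are the multipart limits as AWS documents them. It decides whether a single PUT is enough and otherwise splits the object into numbered parts.

```python
import math

MIB = 1024 ** 2
GIB = 1024 ** 3
TIB = 1024 ** 4

MAX_OBJECT_SIZE = 5 * TIB   # largest object, per the limit stated above
MAX_SINGLE_PUT = 5 * GIB    # largest upload in a single PUT request
MIN_PART_SIZE = 5 * MIB     # smallest multipart part (the last part may be smaller)
MAX_PARTS = 10_000          # most parts allowed in one multipart upload


def plan_upload(size_bytes, preferred_part_size=100 * MIB):
    """Return (part_number, offset, length) tuples describing the upload.

    A single tuple means the object fits in one PUT request.
    """
    if size_bytes < 0 or size_bytes > MAX_OBJECT_SIZE:
        raise ValueError("object size must be between 0 and 5 TiB")
    if size_bytes <= MAX_SINGLE_PUT:
        return [(1, 0, size_bytes)]

    # Grow the part size if the preferred size would need more than 10,000 parts.
    part_size = max(
        preferred_part_size,
        MIN_PART_SIZE,
        math.ceil(size_bytes / MAX_PARTS),
    )

    parts = []
    offset = 0
    part_number = 1
    while offset < size_bytes:
        length = min(part_size, size_bytes - offset)
        parts.append((part_number, offset, length))
        offset += length
        part_number += 1
    return parts


if __name__ == "__main__":
    parts = plan_upload(6 * GIB)
    print(len(parts), "parts; last part is", parts[-1][2] // MIB, "MiB")
```

Each tuple corresponds to one part of a multipart upload; an SDK would normally do this splitting for you.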

S3 Versioning

Versioning in Amazon S3 is a means of keeping multiple variants of an object in the same bucket. You can use the S3 Versioning feature to preserve, retrieve, and restore every version of every object stored in your buckets.

Versioning is enabled at the bucket level.

Please refer to the AWS documentation on S3 Versioning to understand how versioning works.
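To make the idea of keeping, retrieving, and restoring versions concrete, here is a toy in-memory model of a versioning-enabled bucket. It is not the S3 API: the `VersionedBucket` class and its method names are hypothetical, and the random hex IDs only stand in for the version IDs S3 generates. The delete-marker behaviour mirrors how S3 handles a simple delete on a versioned bucket.

```python
import uuid

_DELETE_MARKER = object()  # sentinel standing in for an S3 delete marker


class VersionedBucket:
    """Toy model of a bucket with versioning enabled (illustration only)."""

    def __init__(self):
        # key -> list of (version_id, data); the newest version is last
        self._versions = {}

    def put(self, key, data):
        """Store a new version and return its version ID."""
        version_id = uuid.uuid4().hex
        self._versions.setdefault(key, []).append((version_id, data))
        return version_id

    def get(self, key, version_id=None):
        """Return the latest version, or a specific one if version_id is given."""
        versions = self._versions.get(key, [])
        if version_id is None:
            if not versions or versions[-1][1] is _DELETE_MARKER:
                raise KeyError(key)
            return versions[-1][1]
        for vid, data in versions:
            if vid == version_id and data is not _DELETE_MARKER:
                return data
        raise KeyError((key, version_id))

    def delete(self, key):
        """A simple delete only adds a delete marker; older versions remain."""
        return self.put(key, _DELETE_MARKER)


if __name__ == "__main__":
    bucket = VersionedBucket()
    first = bucket.put("report.txt", b"draft")
    bucket.put("report.txt", b"final")
    bucket.delete("report.txt")
    print(bucket.get("report.txt", first))  # the first version is still retrievable
```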

S3 Encryption for Objects

There are four methods of encrypting objects in S3:

SSE-S3: encrypts S3 objects using keys handled and managed by Amazon S3
Object is encrypted server side
AES-256 encryption type
Must set header: "x-amz-server-side-encryption":"AES256"
SSE-KMS: leverages AWS Key Management Service to manage encryption keys
Object is encrypted server side
Must set header: "x-amz-server-side-encryption":"aws:kms"
SSE-C: for when you want to manage your own encryption keys
Server-side encryption using data keys fully managed by the customer outside AWS
AWS does not store the encryption key
HTTPS must be used
Encryption key must be provided in HTTP headers, for every HTTP request made
Client-side encryption
Uses a client library such as the Amazon S3 Encryption Client
Clients must encrypt data themselves before sending it to S3
Clients must decrypt data themselves when retrieving it from S3
Customer fully manages the keys and encryption cycle
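The three server-side options differ mainly in the request headers they require. The sketch below builds those headers as plain dictionaries without contacting AWS. The function names are hypothetical, and the optional KMS key ID header and the SSE-C header names go beyond what the list above spells out; they follow the S3 REST API as I understand it, so check them against the current documentation before relying on them.

```python
import base64
import hashlib
import os


def sse_s3_headers():
    """Headers for SSE-S3: S3 manages the keys and encrypts with AES-256."""
    return {"x-amz-server-side-encryption": "AES256"}


def sse_kms_headers(kms_key_id=None):
    """Headers for SSE-KMS; kms_key_id is optional and hypothetical here."""
    headers = {"x-amz-server-side-encryption": "aws:kms"}
    if kms_key_id is not None:
        headers["x-amz-server-side-encryption-aws-kms-key-id"] = kms_key_id
    return headers


def sse_c_headers(key):
    """Headers for SSE-C: the customer supplies the key on every request.

    These headers must only ever be sent over HTTPS.
    """
    if len(key) != 32:
        raise ValueError("SSE-C expects a 256-bit (32-byte) key")
    key_md5 = hashlib.md5(key).digest()  # integrity check value for the key
    return {
        "x-amz-server-side-encryption-customer-algorithm": "AES256",
        "x-amz-server-side-encryption-customer-key": base64.b64encode(key).decode("ascii"),
        "x-amz-server-side-encryption-customer-key-MD5": base64.b64encode(key_md5).decode("ascii"),
    }


if __name__ == "__main__":
    customer_key = os.urandom(32)  # in practice, load this from your own key store
    for name in sse_c_headers(customer_key):
        print(name)
```

Client-side encryption needs no special headers, because the data is already encrypted before it reaches S3.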

S3 Security

User based
IAM policies: which API calls should be allowed for a specific user, set from the IAM console
Resource based
Bucket policies: bucket-wide rules set from the S3 console; they can allow cross-account access
Object Access Control List (ACL): finer grained
Bucket Access Control List (ACL): less common

Note: an IAM principal can access an S3 object if the user's IAM permissions allow it OR the resource policy allows it, AND there is no explicit DENY.
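The rule in the note can be written as a few lines of logic. This is a deliberately simplified sketch, not the real AWS policy evaluation engine: it ignores principals, conditions, ACLs, and the stricter rules for cross-account access, and the `is_allowed` helper and the example bucket name are hypothetical.

```python
from fnmatch import fnmatchcase


def _as_list(value):
    """Policy fields may be a single string or a list of strings."""
    return [value] if isinstance(value, str) else list(value)


def _matches(statement, action, resource):
    """True if the statement covers this action and resource ('*' is a wildcard)."""
    return any(fnmatchcase(action, a) for a in _as_list(statement["Action"])) and any(
        fnmatchcase(resource, r) for r in _as_list(statement["Resource"])
    )


def is_allowed(action, resource, iam_statements, bucket_statements):
    """(IAM allows OR bucket policy allows) AND no explicit Deny anywhere."""
    matching = [
        s for s in list(iam_statements) + list(bucket_statements)
        if _matches(s, action, resource)
    ]
    if any(s["Effect"] == "Deny" for s in matching):
        return False  # an explicit Deny always wins
    return any(s["Effect"] == "Allow" for s in matching)


if __name__ == "__main__":
    iam = [{"Effect": "Allow", "Action": "s3:GetObject",
            "Resource": "arn:aws:s3:::example-bucket/*"}]
    bucket = [{"Effect": "Deny", "Action": "s3:*",
               "Resource": "arn:aws:s3:::example-bucket/private/*"}]
    print(is_allowed("s3:GetObject", "arn:aws:s3:::example-bucket/a.txt", iam, bucket))
    print(is_allowed("s3:GetObject", "arn:aws:s3:::example-bucket/private/a.txt", iam, bucket))
```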

S3 CORS

CORS stands for Cross-Origin Resource Sharing, a browser mechanism that lets a web page loaded from one origin request resources from a different origin.

If a client makes a cross-origin request to our S3 bucket, we need to enable the correct CORS headers.

How to enable CORS on S3

Open the Amazon S3 console.
Select the bucket that contains your resources.
Select Permissions.
Scroll down to Cross-origin resource sharing (CORS) and select Edit.
Insert the CORS configuration in JSON format. The AWS documentation provides example configurations.
Select Save changes to save your configuration.
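As an illustration of what could be pasted into the CORS editor in the steps above, the sketch below builds one rule and prints it as JSON. The origin, methods, and max age are placeholder assumptions, not values from this article, and `validate_cors_rules` is a hypothetical sanity check rather than anything S3 provides.

```python
import json

# HTTP methods that an S3 CORS rule accepts.
SUPPORTED_METHODS = {"GET", "PUT", "POST", "DELETE", "HEAD"}

# One example rule; every value here is a placeholder to adapt.
cors_rules = [
    {
        "AllowedHeaders": ["*"],
        "AllowedMethods": ["GET", "PUT"],
        "AllowedOrigins": ["https://www.example.com"],
        "ExposeHeaders": ["ETag"],
        "MaxAgeSeconds": 3000,
    }
]


def validate_cors_rules(rules):
    """Raise ValueError if a rule lacks origins or uses an unsupported method."""
    for rule in rules:
        if not rule.get("AllowedOrigins"):
            raise ValueError("each rule needs at least one allowed origin")
        methods = rule.get("AllowedMethods", [])
        unsupported = set(methods) - SUPPORTED_METHODS
        if not methods or unsupported:
            raise ValueError(f"missing or unsupported methods: {sorted(unsupported)}")


validate_cors_rules(cors_rules)
cors_json = json.dumps(cors_rules, indent=4)
print(cors_json)  # paste this output into the CORS editor
```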

S3 Consistency Model

Amazon S3 delivers strong read-after-write consistency automatically for all applications. For more information, see the AWS documentation on the Amazon S3 data consistency model.

S3 Replication

Versioning must be enabled on both the source and destination buckets
Cross-Region Replication (CRR)
Same-Region Replication (SRR)
Buckets can be in different accounts
Copying is asynchronous
Must give proper IAM permissions to S3
There is no chaining of replication. For example, if bucket1 replicates into bucket2, which replicates into bucket3, then objects created in bucket1 are not replicated to bucket3
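The no-chaining rule is easy to misread, so here is a toy simulation of it. The `Bucket` class is hypothetical and copies synchronously for simplicity, whereas real S3 replication is asynchronous; the only point is that an object created by replication is not replicated onward.

```python
class Bucket:
    """Toy bucket with an optional replication destination (illustration only)."""

    def __init__(self, name):
        self.name = name
        self.objects = {}
        self.replicate_to = None  # destination Bucket, if a rule is configured

    def put(self, key, data, is_replica=False):
        self.objects[key] = data
        # Only objects written directly by a client are replicated;
        # replicas created by replication are not forwarded again.
        if self.replicate_to is not None and not is_replica:
            self.replicate_to.put(key, data, is_replica=True)


if __name__ == "__main__":
    bucket1, bucket2, bucket3 = Bucket("bucket1"), Bucket("bucket2"), Bucket("bucket3")
    bucket1.replicate_to = bucket2
    bucket2.replicate_to = bucket3

    bucket1.put("photo.png", b"...")
    print("photo.png" in bucket2.objects)  # replicated from bucket1
    print("photo.png" in bucket3.objects)  # not chained onward
```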

S3 Pre-Signed URLs

A presigned URL is a URL that you can provide to your users to grant temporary access to a specific S3 object.

Using the URL, a user can either READ the object or WRITE an object (or update an existing object).

The URL contains specific parameters which are set by your application. A presigned URL uses three parameters to limit the user's access:

Bucket: The bucket that the object is in (or will be in)
Key: The name of the object
Expires: The amount of time that the URL is valid

The URL itself is constructed using various parameters, which are created automatically through the AWS SDK. These include:

X-Amz-Algorithm
X-Amz-Credential
X-Amz-Date
X-Amz-Expires
X-Amz-Signature
X-Amz-SignedHeaders

https://presignedurldemo.s3.eu-west-2.amazonaws.com/image.png?X-Amz-Algorithm=AWS4-HMAC-SHA256&X-Amz-Credential=AKIAJJWZ7B6WCRGMKFGQ%2F20180210%2Feu-west-2%2Fs3%2Faws4_request&X-Amz-Date=20180210T171315Z&X-Amz-Expires=1800&X-Amz-Signature=12b74b0788aa036bc7c3d03b3f20c61f1f91cc9ad8873e3314255dc479a25351&X-Amz-SignedHeaders=host
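To see how the parameters fit together, the sketch below takes the example URL apart using only the Python standard library. It does not generate or verify a signature; in practice an SDK call such as boto3's `generate_presigned_url` produces the URL. The `describe_presigned_url` helper is hypothetical and assumes a virtual-hosted-style URL like the one above, where the bucket name is the first part of the host name.

```python
from datetime import datetime, timedelta, timezone
from urllib.parse import parse_qs, urlsplit

# The example URL from the text, split across lines for readability.
EXAMPLE_URL = (
    "https://presignedurldemo.s3.eu-west-2.amazonaws.com/image.png"
    "?X-Amz-Algorithm=AWS4-HMAC-SHA256"
    "&X-Amz-Credential=AKIAJJWZ7B6WCRGMKFGQ%2F20180210%2Feu-west-2%2Fs3%2Faws4_request"
    "&X-Amz-Date=20180210T171315Z"
    "&X-Amz-Expires=1800"
    "&X-Amz-Signature=12b74b0788aa036bc7c3d03b3f20c61f1f91cc9ad8873e3314255dc479a25351"
    "&X-Amz-SignedHeaders=host"
)


def describe_presigned_url(url):
    """Extract bucket, key, and validity window from a presigned URL."""
    parts = urlsplit(url)
    # parse_qs also decodes percent-escapes such as %2F.
    params = {name: values[0] for name, values in parse_qs(parts.query).items()}

    signed_at = datetime.strptime(params["X-Amz-Date"], "%Y%m%dT%H%M%SZ")
    signed_at = signed_at.replace(tzinfo=timezone.utc)
    lifetime = timedelta(seconds=int(params["X-Amz-Expires"]))

    return {
        "bucket": parts.hostname.split(".s3.")[0],
        "key": parts.path.lstrip("/"),
        "algorithm": params["X-Amz-Algorithm"],
        "signed_at": signed_at,
        "expires_at": signed_at + lifetime,
    }


if __name__ == "__main__":
    for field, value in describe_presigned_url(EXAMPLE_URL).items():
        print(f"{field}: {value}")
```

Here `X-Amz-Expires=1800` means the URL stays valid for 30 minutes after the time given in `X-Amz-Date`.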

S3 Storage Classes

Amazon S3 Standard for frequent data access: Suitable for use cases where latency should be low. Example: Students’ attendance data is accessed frequently and should be retrieved quickly.
Amazon S3 Standard-Infrequent Access (Standard-IA): Can be used where the data is long-lived and less frequently accessed. Example: Students’ academic records are not needed daily, but when they are required, they should be retrieved quickly.
Amazon Glacier: Can be used where the data has to be archived, and high performance is not required. Example: An ex-student’s old records (like admission fees) are not needed daily, and even when they are needed, low latency is not required.
One Zone-IA Storage Class: Can be used where the data is infrequently accessed and stored in a single Availability Zone. Example: A student’s report card is not used daily and is stored in a single Availability Zone (i.e., the school).
Amazon S3 Reduced Redundancy Storage: Suitable for use cases where the data is non-critical and can be reproduced quickly. Example: Books in the library are non-critical data and can be replaced if lost.
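When uploading, the storage class is chosen per object through the `x-amz-storage-class` request header. The lookup below is a rough sketch: the access-pattern labels on the left are hypothetical names invented for this example, while the values on the right are the storage class identifiers S3 uses for the classes described above.

```python
# Hypothetical access-pattern labels mapped to S3 storage class identifiers.
STORAGE_CLASSES = {
    "frequent": "STANDARD",
    "infrequent": "STANDARD_IA",
    "infrequent_single_zone": "ONEZONE_IA",
    "archive": "GLACIER",
    "non_critical_reproducible": "REDUCED_REDUNDANCY",
}


def storage_class_header(access_pattern):
    """Return the upload header that selects a storage class for an object."""
    try:
        storage_class = STORAGE_CLASSES[access_pattern]
    except KeyError:
        raise ValueError(f"unknown access pattern: {access_pattern!r}") from None
    return {"x-amz-storage-class": storage_class}


if __name__ == "__main__":
    # Academic records: long-lived, rarely read, but needed quickly when requested.
    print(storage_class_header("infrequent"))
```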