Amazon S3
Amazon S3 or Amazon Simple Storage Service is a service offered by Amazon Web Services that provides object storage through a web service interface.
Buckets and Objects
Amazon S3 lets you store objects (files) in buckets (top-level containers, similar to directories). Bucket names must be globally unique, and each bucket is created in a specific region.
An object consists of data, a key (its assigned name), and metadata. A bucket is used to store objects. When versioning is enabled and data is added to a bucket, Amazon S3 generates a unique version ID and assigns it to the object.
The object value is the content of the body. The maximum object size is 5 TB. A single upload request is limited to 5 GB, so anything larger must be uploaded with the multipart upload feature.
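The size limits above can be sketched as a small planning helper. This is a minimal illustration, not an SDK call: the function names `needs_multipart` and `plan_parts` are hypothetical, the sizes are treated as binary units, and the part limits (5 MiB minimum except for the last part, 5 GiB maximum, at most 10,000 parts) are the documented multipart upload constraints.

```python
import math

MIB = 1024 ** 2
GIB = 1024 ** 3

MAX_SINGLE_PUT = 5 * GIB   # largest object a single upload request can carry
MIN_PART_SIZE = 5 * MIB    # every part except the last must be at least this big
MAX_PART_SIZE = 5 * GIB
MAX_PARTS = 10_000


def needs_multipart(object_size: int) -> bool:
    """Return True when the object is too large for a single upload request."""
    return object_size > MAX_SINGLE_PUT


def plan_parts(object_size: int, part_size: int = 100 * MIB):
    """Split an object into (part_number, offset, length) tuples.

    Hypothetical helper: it only does the arithmetic, it does not talk to S3.
    """
    if object_size <= 0:
        raise ValueError("object_size must be positive")
    if not MIN_PART_SIZE <= part_size <= MAX_PART_SIZE:
        raise ValueError("part_size must be between 5 MiB and 5 GiB")
    count = math.ceil(object_size / part_size)
    if count > MAX_PARTS:
        raise ValueError("too many parts; choose a larger part_size")
    parts = []
    for number in range(1, count + 1):
        offset = (number - 1) * part_size
        length = min(part_size, object_size - offset)
        parts.append((number, offset, length))
    return parts
```

For example, `plan_parts(250 * MIB)` yields three parts, the last of which is 50 MiB. In practice an SDK usually handles this splitting for you.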
S3 Versioning
Versioning in Amazon S3 is a means of keeping multiple variants of an object in the same bucket. You can use the S3 Versioning feature to preserve, retrieve, and restore every version of every object stored in your buckets.
Versioning is enabled at the bucket level.
Refer to this link to understand how versioning works.
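As a rough sketch of enabling versioning at the bucket level from code, the helper below builds the request for boto3's `put_bucket_versioning` call. It assumes boto3 is installed and credentials are configured; the helper names and the bucket name `my-example-bucket` are placeholders, not part of any API.

```python
def versioning_request(bucket: str, enabled: bool = True) -> dict:
    """Build the keyword arguments for put_bucket_versioning.

    A bucket that has had versioning enabled can only be suspended afterwards,
    so the two states modelled here are "Enabled" and "Suspended".
    """
    status = "Enabled" if enabled else "Suspended"
    return {
        "Bucket": bucket,
        "VersioningConfiguration": {"Status": status},
    }


def set_versioning(bucket: str, enabled: bool = True) -> None:
    """Apply the setting with boto3 (third-party, imported lazily)."""
    import boto3  # assumption: boto3 is installed and credentials are set up

    client = boto3.client("s3")
    client.put_bucket_versioning(**versioning_request(bucket, enabled))


# Inspect the request without calling AWS.
print(versioning_request("my-example-bucket"))
```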
S3 Encryption for Objects
There are four methods of encrypting objects in S3:
- SSE-S3: encrypts S3 objects using keys handled and managed by Amazon S3
  - Objects are encrypted server side
  - AES-256 encryption type
  - Must set header: "x-amz-server-side-encryption":"AES256"
- SSE-KMS: leverages AWS Key Management Service to manage encryption keys
  - Objects are encrypted server side
  - Must set header: "x-amz-server-side-encryption":"aws:kms"
- SSE-C: for when you want to manage your own encryption keys
  - Server-side encryption using data keys fully managed by the customer outside AWS
  - AWS does not store the encryption key
  - HTTPS must be used
  - The encryption key must be provided in HTTP headers for every HTTP request made
- Client-side encryption
  - Uses a client library such as the Amazon S3 Encryption Client
  - Clients must encrypt data themselves before sending it to S3
  - Clients must decrypt data themselves when retrieving it from S3
  - The customer fully manages the keys and the encryption cycle
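The request headers for the three server-side options can be sketched as follows. This is an illustration under stated assumptions: the function names are hypothetical, the optional KMS key ID header and the three SSE-C headers (algorithm, Base64 key, Base64 MD5 of the key) are the header names S3 documents for these modes, and the key used in the example is a throwaway random value.

```python
import base64
import hashlib
import os


def sse_s3_headers() -> dict:
    """SSE-S3: S3 manages the keys, the client only asks for AES-256."""
    return {"x-amz-server-side-encryption": "AES256"}


def sse_kms_headers(kms_key_id=None) -> dict:
    """SSE-KMS: keys live in AWS KMS; a specific key ID is optional."""
    headers = {"x-amz-server-side-encryption": "aws:kms"}
    if kms_key_id:
        headers["x-amz-server-side-encryption-aws-kms-key-id"] = kms_key_id
    return headers


def sse_c_headers(key: bytes) -> dict:
    """SSE-C: the customer supplies a 256-bit key with every request."""
    if len(key) != 32:
        raise ValueError("SSE-C expects a 32-byte (256-bit) key")
    return {
        "x-amz-server-side-encryption-customer-algorithm": "AES256",
        "x-amz-server-side-encryption-customer-key": base64.b64encode(key).decode("ascii"),
        # The MD5 digest lets S3 check the key was not corrupted in transit.
        "x-amz-server-side-encryption-customer-key-MD5": base64.b64encode(
            hashlib.md5(key).digest()
        ).decode("ascii"),
    }


example_key = os.urandom(32)  # throwaway key for illustration only
print(sorted(sse_c_headers(example_key)))
```

Because SSE-C sends the key itself in a header, these headers should only ever travel over HTTPS, which matches the requirement in the list above.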
S3 Security
- User based
  - IAM policies: which API calls should be allowed for a specific user, managed from the IAM console
- Resource based
  - Bucket policies: bucket-wide rules set from the S3 console; allow cross-account access
  - Object Access Control List (ACL): finer grained
  - Bucket Access Control List (ACL): less common
Note: an IAM principal can access an S3 object if the user's IAM permissions allow it OR the resource policy allows it, AND there is no explicit DENY.
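The rule in the note can be written as a tiny decision function. This is a deliberately simplified sketch: `is_allowed` is a hypothetical name, the inputs are the effects of only those policy statements that already match the request, and it models the single-account case described in the note rather than the full AWS policy evaluation logic.

```python
def is_allowed(identity_effects, resource_effects) -> bool:
    """Decide access from the effects of the matching policy statements.

    identity_effects: "Allow"/"Deny" values from the principal's IAM policies.
    resource_effects: "Allow"/"Deny" values from the bucket policy.
    """
    effects = list(identity_effects) + list(resource_effects)
    if "Deny" in effects:
        return False           # an explicit DENY always wins
    return "Allow" in effects  # otherwise one ALLOW from either side is enough


print(is_allowed(["Allow"], []))        # allowed by IAM policy only
print(is_allowed([], ["Allow"]))        # allowed by bucket policy only
print(is_allowed(["Allow"], ["Deny"]))  # explicit DENY overrides the ALLOW
print(is_allowed([], []))               # nothing allows it, so it is denied
```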
S3 CORS
CORS means Cross-Origin Resource Sharing. You can find more details about CORS here
If a client makes a cross-origin request to our S3 bucket, we need to enable the correct CORS headers.
How to enable CORS on S3
- Open the Amazon S3 console.
- Select the bucket that contains your resources.
- Select Permissions.
- Scroll down to Cross-origin resource sharing (CORS) and select Edit.
- Insert the CORS configuration in JSON format. You can find an example configuration here
- Select Save changes to save your configuration.
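For the JSON step above, one way to produce a configuration to paste into the console is to build it in code. This is a sketch with assumed values: `cors_rules` is a hypothetical helper, `https://www.example.com` is a placeholder origin, and the rule keys follow the JSON layout the S3 console accepts (`AllowedHeaders`, `AllowedMethods`, `AllowedOrigins`, `ExposeHeaders`, `MaxAgeSeconds`).

```python
import json


def cors_rules(origins, methods=("GET",), headers=("*",), max_age_seconds=3000):
    """Return a one-rule CORS configuration as a list of dicts."""
    return [
        {
            "AllowedHeaders": list(headers),
            "AllowedMethods": list(methods),
            "AllowedOrigins": list(origins),
            "ExposeHeaders": [],
            "MaxAgeSeconds": max_age_seconds,
        }
    ]


# Placeholder origin; replace it with the site that will call the bucket.
config = cors_rules(["https://www.example.com"], methods=["GET", "PUT"])
print(json.dumps(config, indent=2))
```

The same list can also be sent programmatically, for instance wrapped as `{"CORSRules": config}` for boto3's `put_bucket_cors`, assuming boto3 is available.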
S3 Consistency Model
Amazon S3 delivers strong read-after-write consistency automatically for all applications. You can find more information here
S3 Replication
- Must enable versioning in source and destination
- Cross Region Replication
- Same Region Replication
- Buckets can be in different accounts
- Copying is asynchronous
- Must give proper IAM permissions to S3
- There is no chaining of replication. For example, if bucket1 replicates into bucket2, which replicates into bucket3, then objects created in bucket1 are not replicated to bucket3
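The no-chaining rule can be illustrated with a toy in-memory model. This is not how S3 is implemented: `ToyS3` is a hypothetical class, replication here happens immediately rather than asynchronously, and versioning and permissions are ignored. It only shows that a replica written by replication does not trigger the destination bucket's own rule.

```python
from collections import defaultdict


class ToyS3:
    """Toy model: buckets are sets of keys, rules map source -> destination."""

    def __init__(self):
        self.buckets = defaultdict(set)
        self.rules = {}

    def add_rule(self, source: str, destination: str) -> None:
        self.rules[source] = destination

    def put(self, bucket: str, key: str) -> None:
        """A user-initiated write: stored, then replicated exactly one hop."""
        self.buckets[bucket].add(key)
        destination = self.rules.get(bucket)
        if destination is not None:
            # The replica is stored directly; it is NOT treated as a new
            # user write, so the destination's own rule is not applied.
            self.buckets[destination].add(key)


s3 = ToyS3()
s3.add_rule("bucket1", "bucket2")
s3.add_rule("bucket2", "bucket3")
s3.put("bucket1", "report.csv")

print("report.csv" in s3.buckets["bucket2"])  # True: direct replication
print("report.csv" in s3.buckets["bucket3"])  # False: no chaining
```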
S3 Pre-Signed URLs
A pre-signed URL is a URL that you can provide to your users to grant temporary access to a specific S3 object.
Using the URL, a user can either read the object or write an object (or update an existing object).
The URL contains specific parameters which are set by your application. A pre-signed URL uses three parameters to limit the access to the user:
- Bucket: The bucket that the object is in (or will be in)
- Key: The name of the object
- Expires: The amount of time that the URL is valid
The URL itself is constructed using various query parameters, which are created automatically through the AWS SDK. These include:
- X-Amz-Algorithm
- X-Amz-Credential
- X-Amz-Date
- X-Amz-Expires
- X-Amz-Signature
- X-Amz-SignedHeaders
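To show where the parameters listed above come from, here is a minimal sketch of Signature Version 4 query signing for a GET request, using only the standard library. In practice the SDK does this for you (for example boto3's `generate_presigned_url`). The function name `presign_get_url`, the bucket, key, and region are placeholders, and the credentials are the well-known example values from the AWS documentation, not real keys. Session tokens and extra signed headers are left out.

```python
import datetime
import hashlib
import hmac
from urllib.parse import quote


def _sign(key: bytes, message: str) -> bytes:
    return hmac.new(key, message.encode("utf-8"), hashlib.sha256).digest()


def presign_get_url(bucket, key, region, access_key, secret_key,
                    expires=3600, now=None):
    """Build a pre-signed GET URL valid for `expires` seconds."""
    now = now or datetime.datetime.now(datetime.timezone.utc)
    amz_date = now.strftime("%Y%m%dT%H%M%SZ")
    datestamp = amz_date[:8]
    host = f"{bucket}.s3.{region}.amazonaws.com"
    scope = f"{datestamp}/{region}/s3/aws4_request"

    params = {
        "X-Amz-Algorithm": "AWS4-HMAC-SHA256",
        "X-Amz-Credential": f"{access_key}/{scope}",
        "X-Amz-Date": amz_date,
        "X-Amz-Expires": str(expires),
        "X-Amz-SignedHeaders": "host",
    }
    # Canonical query string: URL-encoded and sorted by parameter name.
    query = "&".join(
        f"{quote(k, safe='')}={quote(v, safe='')}" for k, v in sorted(params.items())
    )
    path = "/" + quote(key, safe="/")

    canonical_request = "\n".join(
        ["GET", path, query, f"host:{host}", "", "host", "UNSIGNED-PAYLOAD"]
    )
    string_to_sign = "\n".join([
        "AWS4-HMAC-SHA256",
        amz_date,
        scope,
        hashlib.sha256(canonical_request.encode("utf-8")).hexdigest(),
    ])

    # Derive the signing key: date -> region -> service -> "aws4_request".
    signing_key = _sign(("AWS4" + secret_key).encode("utf-8"), datestamp)
    for part in (region, "s3", "aws4_request"):
        signing_key = _sign(signing_key, part)
    signature = hmac.new(
        signing_key, string_to_sign.encode("utf-8"), hashlib.sha256
    ).hexdigest()

    return f"https://{host}{path}?{query}&X-Amz-Signature={signature}"


url = presign_get_url(
    bucket="examplebucket",
    key="test.txt",
    region="us-east-1",
    access_key="AKIAIOSFODNN7EXAMPLE",                      # documentation example key
    secret_key="wJalrXUtnFEMI/K7MDENG/bPxRfiCYEXAMPLEKEY",  # documentation example secret
    expires=86400,
)
print(url)
```

The Bucket and Key end up in the host and path, Expires becomes `X-Amz-Expires`, and the remaining parameters carry the signing metadata and the signature itself.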
S3 storage classes
- Amazon S3 Standard for frequent data access: Suitable for use cases where latency should be low. Example: Students' attendance data is accessed frequently and should be retrieved quickly.
- Amazon S3 Standard-Infrequent Access (Standard-IA): Can be used where the data is long-lived and less frequently accessed. Example: Students' academic records are not needed daily, but when they are required, the details should be retrieved quickly.
- Amazon S3 Glacier: Can be used where the data has to be archived and high performance is not required. Example: An ex-student's old records (like admission fees) are not needed daily, and even when they are needed, low latency is not required.
- One Zone-IA Storage Class: Can be used where the data is infrequently accessed and stored in a single Availability Zone. Example: A student's report card is not used daily and is stored in a single location (i.e., the school).
- Amazon S3 Reduced Redundancy Storage (RRS): Suitable for use cases where the data is non-critical and can be reproduced quickly. Example: Books in the library are non-critical data and can be replaced if lost.
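When uploading, the storage class is selected per object, for example through the `x-amz-storage-class` request header. The sketch below maps the five classes described above to the values S3 expects for that header. The use-case labels (`"frequent"`, `"archive"`, and so on) and the function name are made up for this illustration.

```python
# Hypothetical use-case labels mapped to real S3 storage class values.
STORAGE_CLASS_BY_USE_CASE = {
    "frequent": "STANDARD",
    "infrequent": "STANDARD_IA",
    "archive": "GLACIER",
    "infrequent_single_az": "ONEZONE_IA",
    "non_critical": "REDUCED_REDUNDANCY",
}


def storage_class_header(use_case: str) -> dict:
    """Return the upload header that selects the class for a use case."""
    try:
        storage_class = STORAGE_CLASS_BY_USE_CASE[use_case]
    except KeyError:
        raise ValueError(f"unknown use case: {use_case!r}") from None
    return {"x-amz-storage-class": storage_class}


print(storage_class_header("infrequent"))
```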