Buckets
Overview
- Buckets allow you to persistently store files.
- A Bucket is a collection of "objects". An object is a file and any metadata (headers, tags,...) that describes that file.
- Buckets are easy to use, reliable, durable and highly available. They are powered by the AWS S3 object storage service.
- Buckets are serverless. They scale automatically to meet your needs, and you pay only for the files stored, not for unused capacity.
- Buckets have a flat structure instead of a hierarchy like you would find in a file system. However, you can simulate a "folder hierarchy" by using a common prefix. For example, all objects stored with a name that starts with the `photos/` prefix will be shown in the same "folder" in the AWS console.
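To list such a "folder" with the AWS SDK, you can combine a key prefix with a delimiter. A minimal sketch (the `BUCKET_NAME` environment variable is an assumption here, set as in the basic usage example below):

```ts
import { S3 } from '@aws-sdk/client-s3';

const s3 = new S3({});

// Lists only objects whose keys start with 'photos/', as if it were a folder.
// The '/' delimiter groups deeper "subfolders" into CommonPrefixes instead of Contents.
const listPhotos = async () => {
  const res = await s3.listObjectsV2({
    Bucket: process.env.BUCKET_NAME,
    Prefix: 'photos/',
    Delimiter: '/'
  });
  console.log(res.Contents?.map((obj) => obj.Key)); // objects directly in "photos/"
  console.log(res.CommonPrefixes?.map((p) => p.Prefix)); // nested "folders"
};
```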
Under the hood
Buckets are an abstraction of AWS S3 buckets.
When to use
Buckets can be used to host websites, store user-generated content, store data for big data analytics, serve as data lakes, or hold backups and archives.
For performance reasons, object storage in general is not optimal for use cases that require very low read/write latency.
Advantages
- Easy to use - AWS S3 has a simple, HTTP-based API. You can also easily manipulate your objects using an AWS SDK.
- Serverless - the bucket automatically scales to meet your needs. You don't have to worry about scaling your storage size, and you pay only for the files stored.
- Reliable & highly available - S3 buckets are designed for 99.999999999% (11 nines) data durability. Files (objects) are stored across multiple physical locations (availability zones).
- Storage flexibility - you can store your files in multiple storage classes. Different storage classes have different latencies, pricing and durability.
- Access control - you can easily control which compute resources can access your bucket.
- Encryption - supports server-side encryption.
- Integrations - you can easily trigger a function or a batch job in reaction to a bucket event.
Disadvantages
- Performance - compared to block storage (a physical disk attached to a machine), read/write operations are significantly slower.
Basic usage
```yaml
resources:
  myBucket:
    type: bucket

  myFunction:
    type: function
    properties:
      packaging:
        type: stacktape-lambda-buildpack
        properties:
          entryfilePath: path/to/my/lambda.ts
      environment:
        - name: BUCKET_NAME
          value: $ResourceParam('myBucket', 'name')
      connectTo:
        - myBucket
```
Lambda function connected to a bucket
```ts
import { S3 } from '@aws-sdk/client-s3';

const s3 = new S3({});

// getObject returns a readable stream, so we need to transform it to a string
const streamToString = (stream) => {
  const chunks = [];
  return new Promise((resolve, reject) => {
    stream.on('data', (chunk) => chunks.push(Buffer.from(chunk)));
    stream.on('error', (err) => reject(err));
    stream.on('end', () => resolve(Buffer.concat(chunks).toString('utf8')));
  });
};

const handler = async (event, context) => {
  // upload an object
  await s3.putObject({
    Bucket: process.env.BUCKET_NAME,
    Key: 'my-file.json',
    Body: JSON.stringify({ message: 'hello' }) // or fs.createReadStream('my-source-file.json')
  });

  // download the object and read its body
  const res = await s3.getObject({
    Bucket: process.env.BUCKET_NAME,
    Key: 'my-file.json'
  });
  const body = await streamToString(res.Body);
};

export default handler;
```
Lambda function that uploads a file to and downloads it from a bucket
Directory upload
- Allows you to upload a specified directory to the bucket on every deployment
- After the upload is finished, your bucket will contain the contents of the local folder.
- Files are uploaded using parallel, multipart uploads.
Existing contents of the bucket will be deleted and replaced with the contents of the local directory. You should not use `directoryUpload` for buckets with application-generated or user-generated content.
```yaml
resources:
  myBucket:
    type: bucket
    properties:
      directoryUpload:
        directoryPath: ../public
```
Uploads the `public` folder to the bucket on every deployment
Adding metadata
- You can add metadata to uploaded files. You can configure `headers` and `tags`.
- Filters allow you to configure properties of files (objects) that match the filter pattern.
- Configurable properties:
  - `headers`: when a client makes a request for an object, these headers are added to the HTTP response. If you are using a CDN, all headers are forwarded. If you are using the bucket to host a website, this can be used, for example, to set cache-control headers on a per-file basis (different cache behavior for different file types).
  - `tags`: can be used to filter specific objects, for example in lifecycle rules.
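As a rough sketch of what this can look like (the exact filter schema, including the property names `filters` and `includePattern` used here, is an assumption; consult the Stacktape configuration reference for the authoritative shape):

```yaml
# Hypothetical sketch - the filter property names are assumptions, not the confirmed schema
resources:
  myBucket:
    type: bucket
    properties:
      directoryUpload:
        directoryPath: ../public
        filters:
          - includePattern: '**/*.js'
            headers:
              - key: Cache-Control
                value: 'max-age=604800'
            tags:
              - key: file-kind
                value: script
```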
Encryption
- If enabled, all objects uploaded to the bucket will be server-side encrypted using the AES256 algorithm.
```yaml
resources:
  myBucket:
    type: bucket
    properties:
      encryption: true
```
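Encryption and decryption are transparent to clients. If you want to verify which algorithm was used for a stored object, a `headObject` call returns it. A minimal sketch (the `BUCKET_NAME` environment variable is assumed to be set as in the basic usage example):

```ts
import { S3 } from '@aws-sdk/client-s3';

const s3 = new S3({});

// With bucket encryption enabled, headObject reports the server-side
// encryption algorithm used for the stored object (e.g. 'AES256')
const checkEncryption = async () => {
  const res = await s3.headObject({
    Bucket: process.env.BUCKET_NAME,
    Key: 'my-file.json'
  });
  console.log(res.ServerSideEncryption);
};
```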
CORS
Web browsers use CORS (Cross-Origin Resource Sharing) to block a website from making requests to a different origin (server) than the one the website is served from. This means that if you make a request from a website served from `https://my-website.s3.eu-west-1.amazonaws.com/` to `https://my-api.my-domain.com`, the request will fail.

If you enable CORS and do not specify any CORS rules, a default rule with the following configuration is used:
- AllowedMethods: `GET`, `PUT`, `HEAD`, `POST`, `DELETE`
- AllowedOrigins: `*`
- AllowedHeaders: `Authorization`, `Content-Length`, `Content-Type`, `Content-MD5`, `Date`, `Expect`, `Host`, `x-amz-content-sha256`, `x-amz-date`, `x-amz-security-token`
When the bucket receives a preflight request from a browser, it evaluates the CORS configuration for the bucket and uses the first CORS rule that matches the incoming browser request to enable a cross-origin request. For a rule to match, the following conditions must be met:
- The request's `Origin` header must match one of the allowedOrigins elements.
- The request method (for example, `GET` or `PUT`), or the `Access-Control-Request-Method` header in case of a preflight `OPTIONS` request, must be one of the allowedMethods.
- Every header listed in the preflight request's `Access-Control-Request-Headers` header must match one of the allowedHeaders.
```yaml
resources:
  myBucket:
    type: bucket
    properties:
      cors:
        enabled: true
```
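With CORS enabled as above, a cross-origin browser request such as the following passes the preflight check. A sketch (it assumes the object is writable by the client, e.g. via the `public-read-write` accessibility mode or a presigned URL; the bucket URL is illustrative):

```ts
// This PUT is a "non-simple" request (method PUT + a Content-Type header),
// so the browser first sends a preflight OPTIONS request. The default rule
// allows the PUT method, the '*' origin and the Content-Type header, so the
// preflight succeeds and the actual request is sent.
const upload = async () => {
  const res = await fetch('https://my-bucket.s3.eu-west-1.amazonaws.com/my-file.json', {
    method: 'PUT', // checked against allowedMethods (via Access-Control-Request-Method)
    headers: { 'Content-Type': 'application/json' }, // checked against allowedHeaders
    body: JSON.stringify({ message: 'hello' })
  });
  console.log(res.status);
};
```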
Versioning
- If enabled, the bucket keeps multiple variants of an object.
- This can help you recover objects from accidental deletion or overwrite, or store multiple objects with the same name.
```yaml
resources:
  myBucket:
    type: bucket
    properties:
      versioning: true
```
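To recover an older variant of an object with the AWS SDK, list its versions and fetch a non-current one by its version id. A minimal sketch (the `BUCKET_NAME` environment variable is an assumption, set as in the basic usage example):

```ts
import { S3 } from '@aws-sdk/client-s3';

const s3 = new S3({});

// Fetch the previous (non-current) version of an accidentally overwritten object
const restorePreviousVersion = async () => {
  const { Versions } = await s3.listObjectVersions({
    Bucket: process.env.BUCKET_NAME,
    Prefix: 'my-file.json'
  });
  // Versions of the same key are returned newest first; [0] is the current one
  const previous = Versions?.[1];
  if (previous?.VersionId) {
    const res = await s3.getObject({
      Bucket: process.env.BUCKET_NAME,
      Key: 'my-file.json',
      VersionId: previous.VersionId
    });
    // res.Body is a readable stream with the old contents
  }
};
```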
CDN
You can use a CDN with your bucket.
By putting a CDN in front of a bucket, you can distribute the bucket's contents to hundreds of edge locations all around the world.
This is useful, for example, if you want to serve a static website from your bucket.
Stacktape can:
- automatically upload your content (e.g. static website) to the bucket
- configure CDN and deliver your content with minimal latency across the world
For information about using a CDN, refer to our CDN docs.
```yaml
resources:
  myBucket:
    type: bucket
    properties:
      directoryUpload:
        directoryPath: my-web/build
        headersPreset: static-website
      cdn:
        enabled: true
```
Bucket with CDN and directory upload enabled
If you wish to deploy/serve static content (a frontend or website), see also the hosting bucket resource, which is a bucket pre-configured for these use cases.
Object lifecycle rules
- Lifecycle rules allow you to configure what happens to your objects after a configured period of time.
- They can be deleted, transitioned to another storage class, etc.
- These rules can be applied to only a subset of objects in the bucket using a path prefix and object tags.
Storage class transition
- By default, all objects are in the standard (general purpose) class.
- Depending on your access patterns, you can transition your objects to a different storage class to save costs.
- To better understand differences between storage classes, refer to AWS Docs
- To learn more about storage class transitions, refer to AWS Docs
```yaml
resources:
  myBucket:
    type: bucket
    properties:
      lifecycleRules:
        - type: class-transition
          properties:
            daysAfterUpload: 90
            storageClass: 'GLACIER'
```
Bucket configured to transition all objects to the GLACIER storage class 90 days after upload
Expiration
Allows you to delete objects from the bucket a specified number of days after upload.
This can be useful if objects become irrelevant to you after some time. Deleting them saves you storage costs.
The following example shows:
- All uploaded objects are transitioned to the GLACIER storage class 90 days after upload.
- After 365 days, objects are completely deleted.
```yaml
resources:
  myBucket:
    type: bucket
    properties:
      lifecycleRules:
        - type: class-transition
          properties:
            daysAfterUpload: 90
            storageClass: 'GLACIER'
        - type: expiration
          properties:
            daysAfterUpload: 365
```
Non-current version class transition
Allows you to transition versioned objects into a different storage class.
Same as the class-transition rule, but applied to old (non-current) versions of objects. This can be useful when you want to archive old versions of an object after some time to save costs.
The following example shows:
- all versioned objects are transitioned to the DEEP_ARCHIVE storage class 10 days after they become a non-current version.
```yaml
resources:
  myBucket:
    type: bucket
    properties:
      versioning: true
      lifecycleRules:
        - type: non-current-version-class-transition
          properties:
            daysAfterVersioned: 10
            storageClass: 'DEEP_ARCHIVE'
```
Non-current version expiration
Allows you to delete old versions of objects from the bucket a specified number of days after they become non-current.
This can be useful if you only need to keep older versions of objects for some time; deleting them afterwards saves you storage costs.
The following example shows:
- all versioned objects are deleted 10 days after they become a non-current version.
```yaml
resources:
  myBucket:
    type: bucket
    properties:
      versioning: true
      lifecycleRules:
        - type: non-current-version-expiration
          properties:
            daysAfterVersioned: 10
```
Abort incomplete multipart upload
Allows you to stop multipart uploads that do not complete within a specified number of days after being initiated.
When a multipart upload is not completed within the time frame, it becomes eligible for an abort operation, and Amazon S3 stops the multipart upload and deletes its associated parts, saving you storage costs.
The following example shows:
- all incomplete multipart uploads are stopped (and their parts deleted) 5 days after initiation.
```yaml
resources:
  myBucket:
    type: bucket
    properties:
      lifecycleRules:
        - type: abort-incomplete-multipart-upload
          properties:
            daysAfterInitiation: 5
```
Accessibility
- Configures who can access the bucket.
Accessibility modes
- Allows you to easily configure the most commonly used access patterns.
- Available modes:
  - `public-read-write` - everyone can read from and write to the bucket.
  - `public-read` - everyone can read from the bucket. Only compute resources and entities with sufficient IAM permissions can write to the bucket.
  - `private` - (default) only compute resources and entities with sufficient IAM permissions can read from or write to the bucket.
- For functions, batch jobs and container workloads, you can grant the required IAM permissions to read from/write to the bucket using `allowsAccessTo` or `iamRoleStatements` in their configuration.
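For example, a bucket that serves publicly readable content while keeping writes restricted to permitted compute resources:

```yaml
resources:
  myBucket:
    type: bucket
    properties:
      accessibility:
        accessibilityMode: public-read
```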
Access policy statements
- For more fine-grained access control, you can configure `accessPolicyStatements`.
- Using these requires knowledge of AWS IAM; refer to the AWS IAM Docs.
- You can find a list of example bucket policies in AWS Docs.
```yaml
resources:
  myBucket:
    type: bucket
    properties:
      accessibility:
        accessibilityMode: private
        accessPolicyStatements:
          - Resource:
              - $ResourceParam('myBucket', 'arn')
            Action:
              - 's3:ListBucket'
            Principal: '*'
```
Referenceable parameters
The following parameters can be easily referenced using the $ResourceParam directive.
To learn more about referencing parameters, refer to referencing parameters.
AWS (physical) name of the bucket
- Usage: `$ResourceParam('<<resource-name>>', 'name')`

ARN of the bucket
- Usage: `$ResourceParam('<<resource-name>>', 'arn')`

Default domain of the CDN distribution (only available if you DO NOT configure custom domain names for the CDN)
- Usage: `$ResourceParam('<<resource-name>>', 'cdnDomain')`

Default URL of the CDN distribution (only available if you DO NOT configure custom domain names for the CDN)
- Usage: `$ResourceParam('<<resource-name>>', 'cdnUrl')`

Comma-separated list of custom domain names assigned to the CDN (only available if you configure custom domain names for the CDN)
- Usage: `$ResourceParam('<<resource-name>>', 'cdnCustomDomains')`

Comma-separated list of custom domain name URLs of the CDN (only available if you configure custom domain names for the CDN)
- Usage: `$ResourceParam('<<resource-name>>', 'cdnCustomDomainUrls')`
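These parameters are typically passed to compute resources through environment variables. For example, combining the patterns shown above (the CDN parameters require the bucket's CDN to be enabled):

```yaml
resources:
  myBucket:
    type: bucket
    properties:
      cdn:
        enabled: true

  myFunction:
    type: function
    properties:
      packaging:
        type: stacktape-lambda-buildpack
        properties:
          entryfilePath: path/to/my/lambda.ts
      environment:
        - name: BUCKET_NAME
          value: $ResourceParam('myBucket', 'name')
        - name: CDN_URL
          value: $ResourceParam('myBucket', 'cdnUrl')
```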