Batch Jobs

Overview and basic concepts

  • Batch job is a computing resource - it runs your code. The job runs until your code finishes processing.

  • The execution of a batch job is initiated by an event (such as an incoming request to an HTTP API Gateway, a message arriving to an SQS queue, or an object being created in an S3 bucket)

  • Batch jobs can be configured to use spot instances, which can help you save up to 90% of computing costs.

  • Similarly to functions and container workloads, batch jobs are serverless and fully managed. This means you don't have to worry about administration tasks such as provisioning and managing servers, scaling, VM security, OS security & much more.

  • Stacktape batch job consists of:

    • User-defined Docker container (runs your code)
    • Lambda function & State-machine (stacktape-managed, used to manage the lifecycle, integrations and execution of the batch job)
  • In addition to CPU and RAM, you can also configure a GPU for your batch job's environment.

When to use

Batch jobs are ideal for long-running and resource-demanding tasks, such as data-processing and ETL pipelines, training a machine-learning model, etc.

Advantages

  • Pay-per-use - You only pay for the compute resources your jobs use.
  • Resource flexibility - Whether your job requires 1 CPU or 50 CPUs, 1GiB or 128GiB of memory, the self-managed compute environment always meets your needs by spinning up the optimal instance to run your job.
  • Time flexibility - Unlike functions, batch jobs can run indefinitely.
  • Secure by default - Underlying environment is securely managed by AWS.
  • Easy integration - batch-job can be invoked by events from a wide variety of services.

Disadvantages

  • Slow start time - After a job execution is triggered, the job instance is put into an execution queue and can take anywhere from a few seconds up to a few minutes to start.

Basic usage

resources:
  myBatchJob:
    type: batch-job
    properties:
      container:
        packaging:
          type: stacktape-image-buildpack
          properties:
            entryfilePath: path/to/my/batch-job.ts
      resources:
        cpu: 2
        memory: 1800
      events:
        - type: schedule
          properties:
            scheduleRate: cron(0 14 * * ? *) # every day at 14:00 UTC

(async () => {
  const event = JSON.parse(process.env.STP_TRIGGER_EVENT_DATA);
  // process the event
})();

Container

  • Every batch job execution runs a Docker container inside a fully managed batch environment.
  • You can configure the following properties of the container:
    • packaging (Required)
    • environment

Image

  • A Docker container is a running instance of a Docker image.
  • The image for your container can be supplied in 4 different ways:

Environment variables

Most commonly used types of environment variables:

environment:
  - name: STATIC_ENV_VAR
    value: my-env-var
  - name: DYNAMICALLY_SET_ENV_VAR
    value: $MyCustomDirective('input-for-my-directive')
  - name: DB_HOST
    value: $ResourceParam('myDatabase', 'host')
  - name: DB_PASSWORD
    value: $Secret('dbSecret.password')


Pre-set environment variables

Stacktape pre-sets the following environment variables:

Name                      Value
STP_TRIGGER_EVENT_DATA    Contains the JSON stringified event from the event integration that triggered this batch job.
STP_MAXIMUM_ATTEMPTS      Absolute amount of attempts this batch job gets before it is considered failed.
STP_CURRENT_ATTEMPT       Serial number of this attempt.
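
As an illustration, a batch job written in TypeScript could read these variables as follows (a minimal sketch; the variable names are taken from the table above):

const triggerEvent = JSON.parse(process.env.STP_TRIGGER_EVENT_DATA ?? '{}');
const maxAttempts = Number(process.env.STP_MAXIMUM_ATTEMPTS ?? '1');
const currentAttempt = Number(process.env.STP_CURRENT_ATTEMPT ?? '1');

// Example: only emit an alert on the final attempt to avoid duplicate notifications
if (currentAttempt === maxAttempts) {
  console.log('Final attempt, event:', triggerEvent);
}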

Logging

  • Every time your code outputs (prints) something to stdout or stderr, your log will be captured and stored in an AWS CloudWatch log group.
  • You can browse your logs in 2 ways:
    • go to your batch job's log-group in the AWS CloudWatch console. You can use stacktape stack-info command to get a direct link.
    • use stacktape logs command that will print logs to the console.
  • You can optionally disable logging or set retention for keeping logs.
BatchJobLogging  API reference
Parent API reference: BatchJob
disabled
retentionDays
Default: 90

Computing resources

  • You can configure the amount of resources your batch job will have access to.
  • In addition to CPU and RAM, batch jobs also allow you to configure GPU. To learn more about GPU instances, refer to AWS Docs.
  • Behind the scenes, AWS Batch selects an instance type (from the C4, M4, and R4 instance families) that best fits the needs of the jobs with a preference for the lowest-cost instance type (BEST_FIT strategy).

If you define the memory required for your batch job in multiples of 1024, be aware: your self-managed environment might spin up instances that are much bigger than expected. This can happen because the instances in your environment need memory to handle the management processes (managed by AWS) associated with running the batch job.

Example: If you define 8192 memory for your batch job, you might expect that the self-managed environment will primarily try to spin up one of the instances from the used families with 8GiB (8192MB) of memory. However, the self-managed environment knows that an instance with that much memory would not be sufficient for both the batch job and the management processes. As a result, it will try to spin up a bigger instance. To learn more about this issue, refer to AWS Docs.

Due to this behaviour, we advise you to specify the memory for your batch jobs carefully. For example, instead of specifying 8192, consider specifying a lower value such as 7680. This way, the self-managed environment will be able to use instances with 8GiB (8192MB) of memory, which can lead to cost savings.

If you define GPUs, instances are chosen according to your needs from the GPU-accelerated families:

  • p2 family: NVIDIA K80 GPU. More in AWS Docs
  • p3 family: NVIDIA V100 Tensor Core. More in AWS Docs
  • g3 family and g3s family: Tesla M60 GPU. More in AWS Docs
  • g4 family: T4 Tensor Core GPU. More in AWS Docs
BatchJobResources  API reference
Parent API reference: BatchJob
cpu
Required
memory
Required
gpu

resources:
  myBatchJob:
    type: batch-job
    properties:
      container:
        packaging:
          type: stacktape-image-buildpack
          properties:
            entryfilePath: batch-jobs/js-batch-job.js
      resources:
        cpu: 2
        memory: 1800
      events:
        - type: schedule
          properties:
            scheduleRate: 'cron(0 14 * * ? *)' # every day at 14:00 UTC

Spot instances

  • Batch jobs can be configured to use spot instances.
  • Spot instances leverage AWS's spare computing capacity and can cost up to 90% less than "onDemand" (normal) instances.
  • However, your batch job can be interrupted at any time, if AWS needs the capacity back. When this happens, your batch job receives a SIGTERM signal and you then have 120 seconds to save your progress or clean up.
  • Interruptions are usually infrequent as can be seen in the AWS Spot instance advisor.
  • To learn more about spot instances, refer to AWS Docs.
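
If you use spot instances, your code can listen for the SIGTERM signal to save progress before the instance is reclaimed. A minimal sketch (Node.js runtime assumed; the work items are placeholders):

let interrupted = false;

// AWS sends SIGTERM roughly 120 seconds before reclaiming a spot instance
process.on('SIGTERM', () => {
  interrupted = true;
  console.log('Spot interruption received - saving progress before shutdown...');
});

const workItems = ['item-1', 'item-2', 'item-3']; // placeholder work items

(async () => {
  for (const item of workItems) {
    if (interrupted) {
      // persist a checkpoint here so the retried job can resume
      break;
    }
    console.log(`processing ${item}`);
  }
})();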

resources:
  myBatchJob:
    type: batch-job
    properties:
      container:
        packaging:
          type: stacktape-image-buildpack
          properties:
            entryfilePath: path/to/my/batch-job.ts
      resources:
        cpu: 2
        memory: 1800
      useSpotInstances: true

Retries

  • If the batch job exits with a non-zero exit code (due to internal failure, timeout, spot instance interruption from AWS, etc.) and attempts are not exhausted, it can be retried.
BatchJobRetryConfiguration  API reference
Parent API reference: BatchJob
attempts
Default: 1
retryIntervalSeconds
retryIntervalMultiplier
Default: 1
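
For illustration, if the wait before each retry grows by the multiplier (an assumption about the scheduling model, not confirmed on this page), the delay could be computed like this:

// Assumed model (not confirmed on this page):
// the k-th retry waits retryIntervalSeconds * retryIntervalMultiplier^(k - 1) seconds
const retryDelaySeconds = (k: number, retryIntervalSeconds = 30, retryIntervalMultiplier = 2): number =>
  retryIntervalSeconds * Math.pow(retryIntervalMultiplier, k - 1);

// With retryIntervalSeconds: 30 and retryIntervalMultiplier: 2:
// retryDelaySeconds(1) === 30, retryDelaySeconds(2) === 60, retryDelaySeconds(3) === 120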

Timeout

  • When the timeout is reached, the batch job will be stopped.
  • If the batch job fails and maximum attempts are not yet exhausted, it will be retried.

resources:
  myBatchJob:
    type: batch-job
    properties:
      container:
        packaging:
          type: stacktape-image-buildpack
          properties:
            entryfilePath: path/to/my/batch-job.ts
      resources:
        cpu: 2
        memory: 1800
      timeout: 1200

Storage

  • Each batch job instance has access to its own ephemeral storage. It's removed after the batch job finishes processing or fails.
  • It has a fixed size of 20GB.
  • To store data persistently, consider using Buckets.

Lifecycle process

The lifecycle of your batch job is fully managed. Stacktape leverages 2 extra resources to achieve this:

  • Trigger function

    • Stacktape-managed AWS Lambda function used to connect the event integrations to the batch job and to start the execution of the batch job state machine
  • Batch job state machine

    • Stacktape-managed AWS State machine used to control the lifecycle of the batch job container.

Batch job execution flow:

  1. Trigger function receives the event from one of its integrations
  2. Trigger function starts the execution of the batch job state machine
  3. Batch job state machine spawns the batch job instance (Docker container) and controls its lifecycle.

Trigger events

  • Batch jobs are invoked ("triggered") in a reaction to an event.
  • Each batch job can have multiple event integrations.
  • Payload (data) received by the batch job depends on the event integration. It is accessible using the STP_TRIGGER_EVENT_DATA environment variable as a JSON stringified value.
  • Be careful when connecting your batch jobs to event integrations that can trigger them at a high rate. Your batch job can get triggered many times a second, and this can get very costly.
  • Example: connecting your batch job to an HTTP API Gateway and receiving 1000 HTTP requests will result in 1000 invocations.

HTTP Api event

  • The batch job is triggered in a reaction to an incoming request to the specified HTTP API Gateway.
  • HTTP API Gateway selects the route with the most-specific match. To learn more about how paths are evaluated, refer to AWS Docs

resources:
  myHttpApi:
    type: http-api-gateway
  myBatchJob:
    type: batch-job
    properties:
      container:
        packaging:
          type: stacktape-image-buildpack
          properties:
            entryfilePath: path/to/my/batch-job.ts
      resources:
        cpu: 2
        memory: 1800
      events:
        - type: http-api-gateway
          properties:
            httpApiGatewayName: myHttpApi
            path: /hello
            method: GET

Batch job connected to an HTTP API Gateway "myHttpApi"
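
The request data is delivered to the batch job via STP_TRIGGER_EVENT_DATA. The sketch below assumes the payload follows the standard HTTP API Gateway payload format 1.0 (the default payloadFormat, see the reference below); the exact field names are an assumption based on the AWS payload format, not confirmed on this page:

interface HttpApiEventV1 {
  path?: string;
  httpMethod?: string;
  headers?: Record<string, string>;
  queryStringParameters?: Record<string, string> | null;
  body?: string | null;
}

(async () => {
  const event: HttpApiEventV1 = JSON.parse(process.env.STP_TRIGGER_EVENT_DATA ?? '{}');
  console.log(`Received ${event.httpMethod} ${event.path}`);
  const requestBody = event.body ? JSON.parse(event.body) : undefined;
  // process the request body here
})();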

HttpApiIntegration  API reference
Parent API reference: BatchJob
type
Required
properties.httpApiGatewayName
Required
properties.method
Required
properties.path
Required
properties.authorizer
properties.payloadFormat
Default: '1.0'

Cognito authorizer

  • Using a Cognito authorizer allows only users authenticated with your User pool to access your batch job.
  • The request must include an access token (specified as a bearer token, { Authorization: "<<your-access-token>>" })
  • If the request is successfully authorized, your batch job will receive some authorization claims in its payload. To get more information about the user, you can use the getUser API Method
  • HTTP API uses JWT (JSON Web Token)-based authorization. To learn more about how requests are authorized, refer to AWS Docs.

resources:
  myGateway:
    type: http-api-gateway
  myUserPool:
    type: user-auth-pool
    properties:
      userVerificationType: email-code
  myBatchJob:
    type: batch-job
    properties:
      container:
        packaging:
          type: stacktape-image-buildpack
          properties:
            entryfilePath: path/to/my/batch-job.ts
      resources:
        cpu: 2
        memory: 1800
      events:
        - type: http-api-gateway
          properties:
            httpApiGatewayName: myGateway
            path: /some-path
            method: '*'
            authorizer:
              type: cognito
              properties:
                userPoolName: myUserPool

Example cognito authorizer

import { CognitoIdentityProvider } from '@aws-sdk/client-cognito-identity-provider';

const cognito = new CognitoIdentityProvider({});

(async () => {
  const event = JSON.parse(process.env.STP_TRIGGER_EVENT_DATA);
  const userData = await cognito.getUser({ AccessToken: event.headers.authorization });
  // do something with your user data
})();

Example batch job code that fetches user data from Cognito

CognitoAuthorizer  API reference
Parent API reference: HttpApiIntegration
type
Required
properties.userPoolName
Required
properties.identitySources

Lambda authorizer

  • When using a Lambda authorizer, a special Lambda function determines if the client can access your batch job.
  • When a request arrives to the HTTP API Gateway, the Lambda authorizer function is invoked. It must return either a simple response indicating if the client is authorized

{
  "isAuthorized": true,
  "context": {
    "exampleKey": "exampleValue"
  }
}

Simple lambda authorizer response format

or an IAM Policy document (when the iamResponse property is set to true, you can further configure permissions of the target batch job)

{
  "principalId": "abcdef", // The principal user identification associated with the token sent by the client.
  "policyDocument": {
    "Version": "2012-10-17",
    "Statement": [
      {
        "Action": "execute-api:Invoke",
        "Effect": "Allow|Deny",
        "Resource": "arn:aws:execute-api:{regionId}:{accountId}:{apiId}/{stage}/{httpVerb}/[{resource}/[{child-resources}]]"
      }
    ]
  },
  "context": {
    "exampleKey": "exampleValue"
  }
}

IAM Policy document lambda authorizer response format

  • Data returned in the context property will be available to the batch job.
  • You can configure identitySources that specify the location of data that's required to authorize a request. If they are not included in the request, the Lambda authorizer won't be invoked, and the client receives a 401 error. The following identity sources are supported: $request.header.name, $request.querystring.name and $context.variableName.
  • When caching is enabled for an authorizer, API Gateway uses the authorizer's identity sources as the cache key. If a client specifies the same parameters in identity sources within the configured TTL, API Gateway uses the cached authorizer result, rather than invoking the authorizer Lambda function.
  • By default, API Gateway uses the cached authorizer response for all routes of an API that use the authorizer. To cache responses per route, add $context.routeKey to your authorizer's identity sources.
  • To learn more about Lambda authorizers, refer to AWS Docs
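
A minimal sketch of a Lambda authorizer function that returns the simple response format shown above. The header name and the EXPECTED_API_KEY environment variable are hypothetical; in a real setup you would validate a token:

export const handler = async (event: { headers?: Record<string, string> }) => {
  // Hypothetical check - replace with real token validation
  const apiKey = event.headers?.['x-api-key'];
  const isAuthorized = apiKey !== undefined && apiKey === process.env.EXPECTED_API_KEY;

  // Simple response format described above; "context" is passed on to the batch job
  return {
    isAuthorized,
    context: {
      exampleKey: 'exampleValue'
    }
  };
};
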
LambdaAuthorizer  API reference
Parent API reference: HttpApiIntegration
type
Required
properties.functionName
Required
properties.iamResponse
properties.identitySources
properties.cacheResultSeconds

Schedule event

The batch job is triggered on a specified schedule. You can use 2 different schedule types: a fixed rate expression (e.g. rate(2 hours)) or a cron expression (e.g. cron(0 10 * * ? *)):

resources:
  myBatchJob:
    type: batch-job
    properties:
      container:
        packaging:
          type: stacktape-image-buildpack
          properties:
            entryfilePath: path/to/my/batch-job.ts
      resources:
        cpu: 2
        memory: 1800
      events:
        # invoke batch job every two hours
        - type: schedule
          properties:
            scheduleRate: rate(2 hours)
        # invoke batch job at 10:00 UTC every day
        - type: schedule
          properties:
            scheduleRate: cron(0 10 * * ? *)

ScheduleIntegration  API reference
Parent API reference: BatchJob
type
Required
properties.scheduleRate
Required
properties.input
properties.inputPath
properties.inputTransformer

Event Bus event

The batch job is triggered when the specified event bus receives an event matching the specified pattern.

2 types of event buses can be used:


  • Default event bus

    • Default event bus is pre-created by AWS and shared by the whole AWS account.
    • Can receive events from multiple AWS services. Full list of supported services.
    • To use the default event bus, set the useDefaultBus property.

resources:
  myBatchJob:
    type: batch-job
    properties:
      container:
        packaging:
          type: stacktape-image-buildpack
          properties:
            entryfilePath: path/to/my/batch-job.ts
      resources:
        cpu: 2
        memory: 1800
      events:
        - type: event-bus
          properties:
            useDefaultBus: true
            eventPattern:
              source:
                - 'aws.autoscaling'
              region:
                - 'us-west-2'

Batch job connected to the default event bus

  • Custom event bus
    • Your own, custom Event bus.
    • This event bus can receive your own, custom events.
    • To use custom event bus, specify either eventBusArn or eventBusName property.

resources:
  myEventBus:
    type: event-bus
  myBatchJob:
    type: batch-job
    properties:
      container:
        packaging:
          type: stacktape-image-buildpack
          properties:
            entryfilePath: path/to/my/batch-job.ts
      resources:
        cpu: 2
        memory: 1800
      events:
        - type: event-bus
          properties:
            eventBusName: myEventBus
            eventPattern:
              source:
                - 'mycustomsource'

Batch job connected to a custom event bus
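
To deliver a custom event that matches the pattern above, a producer (for example another function or workload) can publish to the bus using the AWS SDK. A sketch using @aws-sdk/client-eventbridge; the EVENT_BUS_NAME environment variable and the detail fields are hypothetical ways of wiring the producer:

import { EventBridgeClient, PutEventsCommand } from '@aws-sdk/client-eventbridge';

const eventBridge = new EventBridgeClient({});

(async () => {
  await eventBridge.send(
    new PutEventsCommand({
      Entries: [
        {
          EventBusName: process.env.EVENT_BUS_NAME, // hypothetical env var holding the bus name
          Source: 'mycustomsource', // must match the "source" in the eventPattern above
          DetailType: 'order.created', // example detail type
          Detail: JSON.stringify({ orderId: '123' })
        }
      ]
    })
  );
})();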

EventBusIntegration  API reference
Parent API reference: BatchJob
type
Required
properties.eventPattern
Required
properties.eventBusArn
properties.eventBusName
properties.useDefaultBus
properties.input
properties.inputPath
properties.inputTransformer

SNS event

The batch job is triggered every time a specified SNS topic receives a new message.

  • Amazon SNS is a fully managed messaging service for both application-to-application (A2A) and application-to-person (A2P) communication.
  • Messages (notifications) are published to topics
  • To add your custom SNS topic to your stack, add a Cloudformation resource to the cloudformationResources section of your config.

resources:
  myBatchJob:
    type: batch-job
    properties:
      container:
        packaging:
          type: stacktape-image-buildpack
          properties:
            entryfilePath: path/to/my/batch-job.ts
      resources:
        cpu: 2
        memory: 1800
      events:
        - type: sns
          properties:
            topicArn: $CfResourceParam('mySnsTopic', 'Arn')
            onDeliveryFailure:
              sqsQueueArn: $CfResourceParam('mySqsQueue', 'Arn')
              sqsQueueUrl: $CfResourceParam('mySqsQueue', 'QueueURL')

cloudformationResources:
  mySnsTopic:
    Type: AWS::SNS::Topic
  mySqsQueue:
    Type: AWS::SQS::Queue

SnsIntegration  API reference
Parent API reference: BatchJob
type
Required
properties.topicArn
Required
properties.filterPolicy
properties.onDeliveryFailure

SQS event

The batch job is triggered whenever there are messages in the specified SQS queue.

  • Messages are processed in batches
  • If the SQS queue contains multiple messages, the batch job is invoked with multiple messages in its payload
  • A single queue should always be "consumed" by a single workload. An SQS message can only be read once from the queue and, while it's being processed, it's invisible to other workloads. If multiple different workloads are processing messages from the same queue, each will get their share of the messages, but one message won't be delivered to more than one workload at a time. If you need the same message to be consumed by multiple consumers (fan-out pattern), consider using the EventBus integration or the SNS integration.
  • To add your custom SQS queue to your stack, simply add a Cloudformation resource to the cloudformationResources section of your config.

Batching behavior can be configured. The batch job is triggered when any of the following things happen:

  • Batch window expires. The batch window can be configured using the maxBatchWindowSeconds property.
  • Maximum batch size (number of messages in the batch) is reached. The batch size can be configured using the batchSize property.
  • Maximum payload limit is reached. The maximum payload size is 6 MB.

resources:
  myBatchJob:
    type: batch-job
    properties:
      container:
        packaging:
          type: stacktape-image-buildpack
          properties:
            entryfilePath: path/to/my/batch-job.ts
      resources:
        cpu: 2
        memory: 1800
      events:
        - type: sqs
          properties:
            queueArn: $CfResourceParam('mySqsQueue', 'Arn')

cloudformationResources:
  mySqsQueue:
    Type: AWS::SQS::Queue
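
Inside the batch job, the batch of messages can be read from STP_TRIGGER_EVENT_DATA. The sketch below assumes the payload mirrors the standard Lambda SQS event shape (a Records array with a body per message), which is not confirmed on this page:

interface SqsBatchEvent {
  Records?: { messageId: string; body: string }[];
}

(async () => {
  const event: SqsBatchEvent = JSON.parse(process.env.STP_TRIGGER_EVENT_DATA ?? '{}');
  for (const record of event.Records ?? []) {
    console.log(`processing message ${record.messageId}`);
    const message = JSON.parse(record.body);
    // process the message here
  }
})();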

SqsIntegration  API reference
Parent API reference: BatchJob
type
Required
properties.queueArn
Required
properties.batchSize
Default: 10
properties.maxBatchWindowSeconds

Kinesis event

The batch job is triggered whenever there are messages in the specified Kinesis Stream.

  • Messages are processed in batches.
  • If the stream contains multiple messages, the batch job is invoked with multiple messages in its payload.
  • To add a custom Kinesis stream to your stack, simply add a Cloudformation resource to the cloudformationResources section of your config.
  • Similarly to SQS, Kinesis is used to process messages in batches. To learn the differences, refer to AWS Docs

Batching behavior can be configured. The batch job is triggered when any of the following things happen:

  • Batch window expires. The batch window can be configured using the maxBatchWindowSeconds property.
  • Maximum batch size (number of records in the batch) is reached. The batch size can be configured using the batchSize property.
  • Maximum payload limit is reached. The maximum payload size is 6 MB.

Consuming messages from a Kinesis stream can be done in 2 ways:

  • Consuming directly from the stream - polling each shard in your Kinesis stream for records once per second. Read throughput of the kinesis shard is shared with other stream consumers.
  • Consuming using a stream consumer - To minimize latency and maximize read throughput, use a "stream consumer" with enhanced fan-out. Enhanced fan-out consumers get a dedicated connection to each shard that doesn't impact other applications reading from the stream. You can either pass a reference to the consumer using the consumerArn property, or you can let Stacktape auto-create the consumer using the autoCreateConsumer property.

resources:
  myBatchJob:
    type: batch-job
    properties:
      container:
        packaging:
          type: stacktape-image-buildpack
          properties:
            entryfilePath: path/to/my/batch-job.ts
      resources:
        cpu: 2
        memory: 1800
      events:
        - type: kinesis-stream
          properties:
            autoCreateConsumer: true
            maxBatchWindowSeconds: 30
            batchSize: 200
            streamArn: $CfResourceParam('myKinesisStream', 'Arn')
            onFailure:
              arn: $CfResourceParam('myOnFailureSqsQueue', 'Arn')
              type: sqs

cloudformationResources:
  myKinesisStream:
    Type: AWS::Kinesis::Stream
    Properties:
      ShardCount: 1
  myOnFailureSqsQueue:
    Type: AWS::SQS::Queue

KinesisIntegration  API reference
Parent API reference: BatchJob
type
Required
properties.streamArn
Required
properties.consumerArn
properties.autoCreateConsumer
properties.maxBatchWindowSeconds
properties.batchSize
Default: 10
properties.startingPosition
Default: TRIM_HORIZON
properties.maximumRetryAttempts
properties.onFailure
properties.parallelizationFactor
properties.bisectBatchOnFunctionError

DynamoDb event

The batch job is triggered whenever there are processable records in the specified DynamoDB stream.

  • DynamoDB stream captures a time-ordered sequence of item-level modifications in a DynamoDB table and durably stores the information for up to 24 hours.
  • Records from the stream are processed in batches. This means that multiple records are included in a single batch job invocation.
  • DynamoDB stream must be enabled in a DynamoDB table definition. Learn how to enable streams in dynamo-table docs

resources:
  myDynamoDbTable:
    type: dynamo-db-table
    properties:
      primaryKey:
        partitionKey:
          name: id
          type: string
      streamType: NEW_AND_OLD_IMAGES
  myBatchJob:
    type: batch-job
    properties:
      container:
        packaging:
          type: stacktape-image-buildpack
          properties:
            entryfilePath: path/to/my/batch-job.ts
      resources:
        cpu: 2
        memory: 1800
      events:
        - type: dynamo-db-stream
          properties:
            streamArn: $ResourceParam('myDynamoDbTable', 'streamArn')
            batchSize: 200

DynamoDbIntegration  API reference
Parent API reference: BatchJob
type
Required
properties.streamArn
Required
properties.maxBatchWindowSeconds
properties.batchSize
Default: 100
properties.startingPosition
Default: TRIM_HORIZON
properties.maximumRetryAttempts
properties.onFailure
properties.parallelizationFactor
properties.bisectBatchOnFunctionError

S3 event

The batch job is triggered when a specified event occurs in your bucket.

  • Supported events are listed in the s3EventType API Reference.

  • To learn more about the event types, refer to AWS Docs.

resources:
  myBucket:
    type: bucket
  myBatchJob:
    type: batch-job
    properties:
      container:
        packaging:
          type: stacktape-image-buildpack
          properties:
            entryfilePath: path/to/my/batch-job.ts
      resources:
        cpu: 2
        memory: 1800
      events:
        - type: s3
          properties:
            bucketArn: $ResourceParam('myBucket', 'arn')
            s3EventType: 's3:ObjectCreated:*'
            filterRule:
              prefix: order-
              suffix: .jpg
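
A sketch of reading the affected object inside the batch job, assuming the trigger payload follows the standard S3 notification shape (Records[].s3.bucket / object), which is not confirmed on this page:

interface S3NotificationEvent {
  Records?: { s3: { bucket: { name: string }; object: { key: string } } }[];
}

(async () => {
  const event: S3NotificationEvent = JSON.parse(process.env.STP_TRIGGER_EVENT_DATA ?? '{}');
  for (const record of event.Records ?? []) {
    const { bucket, object } = record.s3;
    console.log(`new object: s3://${bucket.name}/${decodeURIComponent(object.key)}`);
    // download and process the object here
  }
})();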

S3Integration  API reference
Parent API reference: BatchJob
type
Required
properties.bucketArn
Required
properties.s3EventType
Required
properties.filterRule
S3FilterRule  API reference
Parent API reference: S3Integration
prefix
suffix

Cloudwatch Log event

The batch job is triggered when a log record arrives to the specified log group.

  • Event payload arriving to the batch job is BASE64 encoded and has the following format: { "awslogs": { "data": "BASE64ENCODED_GZIP_COMPRESSED_DATA" } }
  • To access the log data, the event payload needs to be decoded and decompressed first.

resources:
  myLogProducingLambda:
    type: function
    properties:
      packaging:
        type: stacktape-lambda-buildpack
        properties:
          entryfilePath: lambdas/log-producer.ts
  myBatchJob:
    type: batch-job
    properties:
      container:
        packaging:
          type: stacktape-image-buildpack
          properties:
            entryfilePath: path/to/my/batch-job.ts
      resources:
        cpu: 2
        memory: 1800
      events:
        - type: cloudwatch-log
          properties:
            logGroupArn: $ResourceParam('myLogProducingLambda', 'logGroupArn')
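
A minimal sketch of decoding the payload described above inside the batch job (Node.js runtime assumed): the data is base64-decoded, gunzipped and then parsed as JSON.

import { gunzipSync } from 'zlib';

(async () => {
  const event = JSON.parse(process.env.STP_TRIGGER_EVENT_DATA ?? '{}');
  // payload format: { "awslogs": { "data": "BASE64ENCODED_GZIP_COMPRESSED_DATA" } }
  const compressed = Buffer.from(event.awslogs.data, 'base64');
  const logPayload = JSON.parse(gunzipSync(compressed).toString('utf8'));
  for (const logEvent of logPayload.logEvents ?? []) {
    console.log(logEvent.message);
  }
})();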

CloudwatchLogIntegration  API reference
Parent API reference: BatchJob
type
Required
properties.logGroupArn
Required
properties.filter

Application Load Balancer event

The batch job is triggered when a specified Application Load Balancer receives an HTTP request that matches the integration's conditions.

  • You can filter requests based on HTTP Method, Path, Headers, Query parameters, and IP Address.

resources:
  # load balancer which routes traffic to the batch job
  myLoadBalancer:
    type: application-load-balancer
  myBatchJob:
    type: batch-job
    properties:
      container:
        packaging:
          type: stacktape-image-buildpack
          properties:
            entryfilePath: path/to/my/batch-job.ts
      resources:
        cpu: 2
        memory: 1800
      events:
        - type: application-load-balancer
          properties:
            # referencing load balancer defined above
            loadBalancerName: myLoadBalancer
            priority: 1
            paths:
              - /invoke-my-job
              - /another-path

ApplicationLoadBalancerIntegration  API reference
Parent API reference: BatchJob
type
Required
properties.loadBalancerName
Required
properties.priority
Required
properties.listenerPort
properties.paths
properties.methods
properties.hosts
properties.headers
properties.queryParams
properties.sourceIps

Accessing other resources

  • For most of the AWS resources, resource-to-resource communication is not allowed by default. This helps to enforce security and resource isolation. Access must be explicitly granted using IAM (Identity and Access Management) permissions.

  • Access control of Relational Databases is not managed by IAM. These resources are not "cloud-native" by design and have their own access control mechanism (connection string with username and password). They are accessible by default, and you don't need to grant any extra IAM permissions. You can further restrict the access to your relational databases by configuring their access control mode.

  • Stacktape automatically handles IAM permissions for the underlying AWS services that it creates (i.e. granting batch job permission to write logs to Cloudwatch, allowing trigger functions to communicate with their event source and many others).

If your workload needs to communicate with other infrastructure components, you need to add permissions manually. You can do this in 2 ways:

Using allowAccessTo

  • List of resource names that this batch job will be able to access (basic IAM permissions will be granted automatically). Granted permissions differ based on the resource.
  • Works only for resources managed by Stacktape (not arbitrary Cloudformation resources)
  • This is useful if you don't want to deal with IAM permissions yourself. Handling permissions using raw IAM role statements can be cumbersome, time-consuming and error-prone.

resources:
  photosBucket:
    type: bucket
  myBatchJob:
    type: batch-job
    properties:
      container:
        packaging:
          type: stacktape-image-buildpack
          properties:
            entryfilePath: path/to/my/batch-job.ts
      resources:
        cpu: 2
        memory: 1800
      accessControl:
        allowAccessTo:
          - photosBucket
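
Inside the batch job, you can then use the granted permissions with the AWS SDK. A sketch that lists objects in the bucket; the BUCKET_NAME environment variable is a hypothetical way of passing the bucket name to the container (for example via a $ResourceParam directive):

import { S3Client, ListObjectsV2Command } from '@aws-sdk/client-s3';

const s3 = new S3Client({});

(async () => {
  // BUCKET_NAME is a hypothetical env var you would set in the container's environment
  const result = await s3.send(new ListObjectsV2Command({ Bucket: process.env.BUCKET_NAME }));
  for (const object of result.Contents ?? []) {
    console.log(object.Key);
  }
})();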


Granted permissions:

Bucket

  • list objects in a bucket
  • create / get / delete / tag object in a bucket

DynamoDb table

  • get / put / update / delete item in a table
  • scan / query a table
  • describe table stream

MongoDb Atlas cluster

  • Allows connection to a cluster with accessibilityMode set to scoping-workloads-in-vpc. To learn more about MongoDb Atlas clusters accessibility modes, refer to MongoDB Atlas cluster docs.

Relational database

  • Allows connection to a relational database with accessibilityMode set to scoping-workloads-in-vpc. To learn more about relational database accessibility modes, refer to Relational databases docs.

Redis cluster

  • Allows connection to a redis cluster with accessibilityMode set to scoping-workloads-in-vpc. To learn more about redis cluster accessibility modes, refer to Redis clusters docs.

Event bus

  • publish events to the specified Event bus

Function

  • invoke the specified function

Batch job

  • submit batch-job instance into batch-job queue
  • list submitted job instances in a batch-job queue
  • describe / terminate a batch-job instance
  • list executions of the state machine which executes the batch-job according to its strategy
  • start / terminate execution of the state machine which executes the batch-job according to its strategy

Using iamRoleStatements

  • List of raw IAM role statement objects. These will be appended to the batch job's role.
  • Allows you to set granular control over your batch job's permissions.
  • Can be used to give access to any Cloudformation resource.

resources:
  myBatchJob:
    type: batch-job
    properties:
      resources:
        cpu: 2
        memory: 1800
      container:
        packaging:
          type: stacktape-image-buildpack
          properties:
            entryfilePath: path/to/my/batch-job.ts
      accessControl:
        iamRoleStatements:
          - Resource:
              - $CfResourceParam('NotificationTopic', 'Arn')
            Effect: Allow
            Action:
              - 'sns:Publish'

cloudformationResources:
  NotificationTopic:
    Type: AWS::SNS::Topic

Default VPC connection

Referenceable parameters

The following parameters can be easily referenced using the $ResourceParam directive.

To learn more about referencing parameters, refer to referencing parameters.

jobDefinitionArn
  • Arn of the job definition resource

  • Usage: $ResourceParam('<<resource-name>>', 'jobDefinitionArn')
stateMachineArn
  • Arn of the state machine controlling the execution flow of the batch job

  • Usage: $ResourceParam('<<resource-name>>', 'stateMachineArn')
logGroupArn
  • Arn of the log group aggregating logs from the batch job

  • Usage: $ResourceParam('<<resource-name>>', 'logGroupArn')

Pricing

  • You are charged for the instances running in your batch job compute environment.
  • Instance sizes are automatically chosen to best suit the needs of your batch jobs.
  • You are charged only for the time your batch job runs. After your batch job finishes processing, the instances are automatically killed.
  • Price depends on region and instance used. (https://aws.amazon.com/ec2/pricing/on-demand/)
  • You can use spot instances to save costs. These instances can be up to 90% cheaper. (https://aws.amazon.com/ec2/spot/pricing/)
  • You also pay a very negligible price for the Lambda functions and state machines used to manage the execution and integrations of your batch job.

API reference

BatchJob  API reference
type
Required
properties.container
Required
properties.resources
Required
properties.timeout
properties.useSpotInstances
properties.logging
properties.retryConfig
properties.events
properties.accessControl
overrides
CognitoAuthorizer  API reference
Parent API reference: HttpApiIntegration
type
Required
properties.userPoolName
Required
properties.identitySources
LambdaAuthorizer  API reference
Parent API reference: HttpApiIntegration
type
Required
properties.functionName
Required
properties.iamResponse
properties.identitySources
properties.cacheResultSeconds
EventInputTransformer  API reference
Parent API reference: (EventBusIntegration or ScheduleIntegration)
inputTemplate
Required
inputPathsMap
EventBusIntegrationPattern  API reference
Parent API reference: EventBusIntegration
version
detail-type
source
account
region
resources
detail
replay-name
SnsOnDeliveryFailure  API reference
Parent API reference: SnsIntegration
sqsQueueArn
Required
sqsQueueUrl
Required
DestinationOnFailure  API reference
Parent API reference: (DynamoDbIntegration or KinesisIntegration)
arn
Required
type
Required
S3FilterRule  API reference
Parent API reference: S3Integration
prefix
suffix
LbHeaderCondition  API reference
headerName
Required
values
Required
LbQueryParamCondition  API reference
paramName
Required
values
Required
KinesisIntegration  API reference
Parent API reference: BatchJob
type
Required
properties.streamArn
Required
properties.consumerArn
properties.autoCreateConsumer
properties.maxBatchWindowSeconds
properties.batchSize
Default: 10
properties.startingPosition
Default: TRIM_HORIZON
properties.maximumRetryAttempts
properties.onFailure
properties.parallelizationFactor
properties.bisectBatchOnFunctionError
EnvironmentVar  API reference
Parent API reference: BatchJobContainer
name
Required
value
Required
AccessControl  API reference
Parent API reference: BatchJob
iamRoleStatements
allowAccessTo
StpIamRoleStatement  API reference
Parent API reference: AccessControl
Resource
Required
Sid
Effect
Action
Condition
Need help? Ask a question on Slack, Discord, or via info@stacktape.com.