State Machines

Overview and basic concepts

State machines allow you to design and automate business processes and data pipelines by composing workloads (functions, batch-jobs) or other AWS services into workflows. State machines manage failures, retries, parallelization, service integrations, and observability so developers can focus on higher-value business logic.

When to use

  • Extract, Transform, and Load (ETL) processes - state machines ensure that multiple long-running ETL jobs execute in order and complete successfully, so you don't have to orchestrate those jobs manually or maintain a separate application.

  • Orchestrate microservices - use state machines to combine multiple functions into responsive serverless applications and microservices.

Define states

State machine definitions are written in the Amazon States Language.

The Amazon States Language syntax lets you specify any workflow, from the simplest to the most complex.

The following example shows an order-payment flow made up of lambda functions:

resources:
  checkAndHoldProduct:
    type: function
    properties:
      packageConfig:
        filePath: 'check-and-hold-product.ts'
  billCustomer:
    type: function
    properties:
      packageConfig:
        filePath: 'bill-customer.ts'
  shipmentNotification:
    type: function
    properties:
      packageConfig:
        filePath: 'shipment-notification.ts'
  buyProcessStateMachine:
    type: 'state-machine'
    properties:
      definition:
        StartAt: 'checkAndHold'
        States:
          checkAndHold:
            Type: Task
            Resource: $Param('checkAndHoldProduct', 'LambdaFunction::Arn')
            Next: bill
          bill:
            Type: Task
            Resource: $Param('billCustomer', 'LambdaFunction::Arn')
            Next: notify
          notify:
            Type: Task
            Resource: $Param('shipmentNotification', 'LambdaFunction::Arn')
            Next: succeed
          succeed:
            Type: Succeed
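
Failure handling in a flow like this can be expressed with the Amazon States Language Catch field. The fragment below is a minimal sketch of how the bill state might route any error to a terminal Fail state instead of continuing to notify. The state name paymentFailed and its Error and Cause values are hypothetical and are not part of the original example.

```yaml
# Fragment of the States map of buyProcessStateMachine (sketch only).
bill:
  Type: Task
  Resource: $Param('billCustomer', 'LambdaFunction::Arn')
  Catch:
    # States.ALL is the Amazon States Language wildcard that matches any error name.
    - ErrorEquals:
        - 'States.ALL'
      Next: paymentFailed
  Next: notify
# Hypothetical terminal state that marks the execution as failed.
paymentFailed:
  Type: Fail
  Error: 'PaymentFailed'
  Cause: 'Billing the customer failed'
```

With this in place, an error in billCustomer ends the execution in a failed state, and shipmentNotification is never invoked.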

Retry example

The following example shows:

  • generateReport - a batch-job that generates a report.
  • uploadReport - a function that uploads the generated report.
  • reportStateMachine - a state machine that ties the two workloads together.

State machine definitions let you control retries per state. In this case, reportStateMachine retries only the upload step of the workflow, because regenerating the report after an upload failure would be costly and redundant.

resources:
  uploadReport:
    type: function
    properties:
      packageConfig:
        filePath: 'upload-report.ts'
  generateReport:
    type: 'batch-job'
    properties:
      container:
        imageConfig:
          filePath: generate-report.ts
      resources:
        cpu: 2
        memory: 7800
  reportStateMachine:
    type: 'state-machine'
    properties:
      definition:
        StartAt: 'generate'
        States:
          generate:
            Type: Task
            Resource: 'arn:aws:states:::batch:submitJob.sync'
            Parameters:
              JobDefinition: $Param('generateReport', 'JobDefinition::Arn')
              JobName: report-job
              JobQueue: $Param('SHARED_GLOBAL', 'BatchOnDemandJobQueue::Arn')
            Next: upload
          upload:
            Type: Task
            Resource: $Param('uploadReport', 'LambdaFunction::Arn')
            Next: succeed
            Retry:
              - ErrorEquals:
                  - 'States.ALL'
                IntervalSeconds: 10
          succeed:
            Type: Succeed
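
The retrier on the upload state relies on Amazon States Language defaults for the fields it leaves out. As a sketch, the same state could spell those fields out explicitly. The values shown for MaxAttempts and BackoffRate are the defaults defined by the Amazon States Language, not settings taken from the original example.

```yaml
# Fragment of the States map of reportStateMachine (sketch only).
upload:
  Type: Task
  Resource: $Param('uploadReport', 'LambdaFunction::Arn')
  Retry:
    - ErrorEquals:
        - 'States.ALL'
      IntervalSeconds: 10 # wait 10 seconds before the first retry
      MaxAttempts: 3 # default when omitted: at most 3 retries
      BackoffRate: 2.0 # default when omitted: each wait is twice the previous one
  Next: succeed
```

With these values, a failing upload is retried after roughly 10, 20, and 40 seconds. If the last retry also fails, the execution fails unless the state defines a Catch field.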