AWS Step Functions Error Handling
- Errors can happen in a variety of ways
- State machine definition error (example: no matching name for a state)
- Task failures (example: exception in lambda)
- Transient issues (example: network partition event)
- Use Retry to retry a failed state and Catch to transition the state machine to a failure path
- Note: try not to handle errors in the application layer, because that increases the complexity of the application
- Some of the predefined error codes:
- States.ALL: matches any error name
- States.Timeout: task ran longer than TimeoutSeconds, or no heartbeat was received
- States.TaskFailed: execution failure
- States.Permissions: insufficient privileges to execute code
- The state may report its own errors, and you can catch them in Step Functions
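As a minimal sketch of a state reporting its own error, the Python Lambda handler below raises a custom exception, and a catcher matches it by class name. The names OrderValidationError, order_id, and HandleInvalidOrder are hypothetical and chosen only for illustration.

```python
class OrderValidationError(Exception):
    """Hypothetical custom error; its class name is what ErrorEquals would match on."""


def lambda_handler(event, context):
    # Raise instead of handling the problem in application code,
    # so the state machine's Retry/Catch decides what happens next.
    if "order_id" not in event:
        raise OrderValidationError("order_id is missing")
    return {"order_id": event["order_id"], "status": "validated"}


# Catcher fragment in Amazon States Language shape, written as a Python dict.
CATCH_CUSTOM_ERROR = {
    "ErrorEquals": ["OrderValidationError"],
    "Next": "HandleInvalidOrder",  # hypothetical state name
}
```

Keeping the handler this thin leaves the recovery decision in the state machine definition rather than in the function body.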
Retry (for Task State or Parallel State)

- ErrorEquals: specify the error type to match
- IntervalSeconds: how long to wait before the first retry
- BackoffRate: multiplier applied to the delay after each retry (exponential backoff, the usual pattern when calling any AWS service)
- MaxAttempts: defaults to 3. Set to 0 to never retry
- When max attempts are reached, the Catch block kicks in
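The Retry fields above can be sketched as a Task state written as a Python dict in Amazon States Language shape, with a small helper that works out the resulting waits. The ARN is a placeholder and retry_delays is a hypothetical helper, not part of any AWS SDK; the helper also ignores any optional delay cap or jitter settings.

```python
import json

# A Task state with a Retry block (Amazon States Language shape as a Python dict).
TASK_WITH_RETRY = {
    "Type": "Task",
    "Resource": "arn:aws:lambda:us-east-1:123456789012:function:ProcessOrder",  # placeholder ARN
    "Retry": [
        {
            "ErrorEquals": ["States.Timeout", "States.TaskFailed"],
            "IntervalSeconds": 3,
            "BackoffRate": 2.0,
            "MaxAttempts": 3,
        }
    ],
    "End": True,
}


def retry_delays(interval_seconds, backoff_rate, max_attempts):
    """Return the wait in seconds before each retry attempt (exponential backoff)."""
    return [interval_seconds * backoff_rate ** attempt for attempt in range(max_attempts)]


retrier = TASK_WITH_RETRY["Retry"][0]
delays = retry_delays(retrier["IntervalSeconds"], retrier["BackoffRate"], retrier["MaxAttempts"])
print(delays)  # [3.0, 6.0, 12.0]
print(json.dumps(TASK_WITH_RETRY, indent=2))
```

With these values the first retry waits 3 seconds, the second 6, and the third 12; setting MaxAttempts to 0 yields no retries at all.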
Catch (for Task State or Parallel State)

- ErrorEquals: match a specific kind of error
- Next: state to send to
- ResultPath: a path that determines what input is sent to the state specified in the Next field
- the $.error puts the error inside the output. For example
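A minimal sketch of how a ResultPath of $.error behaves: the error output (an object with Error and Cause fields) is placed under the error key of the original input, and the combined object is passed to the Next state. The state name NotifyFailure, the sample input, and the helper apply_result_path are assumptions for illustration; the helper only handles top-level paths of the form $.key.

```python
import copy

# A Catch block (Amazon States Language shape as a Python dict).
CATCH_BLOCK = [
    {
        "ErrorEquals": ["States.ALL"],
        "Next": "NotifyFailure",  # hypothetical state name
        "ResultPath": "$.error",
    }
]


def apply_result_path(state_input, error_output, result_path):
    """Merge the error output into a copy of the state input at a top-level path."""
    key = result_path[2:]
    if not result_path.startswith("$.") or not key or "." in key:
        raise ValueError("this sketch only supports paths of the form $.key")
    merged = copy.deepcopy(state_input)
    merged[key] = error_output
    return merged


state_input = {"order_id": 42}
error_output = {"Error": "States.TaskFailed", "Cause": "Lambda raised an exception"}
next_input = apply_result_path(state_input, error_output, CATCH_BLOCK[0]["ResultPath"])
print(next_input)
# {'order_id': 42, 'error': {'Error': 'States.TaskFailed', 'Cause': 'Lambda raised an exception'}}
```

Because the original input is preserved alongside the error, the failure-path state can still see which order failed.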
- the