

  • can do real-time stuff, WebSocket, MQTT on WebSocket
  • GraphQL



  • Has traffic shifting feature, automated rollback since it's using CodeDeploy in the background
  • To step through and debug the code, use SAM CLI + AWS Toolkits
  • AWS Policy Templates: to give permission to lambda function (yes only lambda)
    • Some of the commands
      • sam build: fetch dependencies and create local deployment artifacts
      • sam package: package and upload to S3
      • sam deploy: deploy through CloudFormation
      • sam publish: publish to AWS SAR


  • Serverless application repository to store SAM applications


  • AWS manages encryption keys for us
  • Has 2 main key types
    • AWS Managed Key: Pre-defined key. Free of charge. For example aws/rds, aws/ebs
    • Customer Managed Keys (CMK): has 2 types of keys, both $1 per month per key
      • Create your own
      • Import your own
  • Encrypt and Decrypt calls are limited to 4KB of data; for anything larger we need to use the Envelope Encryption technique
  • Key rotation
    • AWS Managed Key: automatic every 1 year
    • Customer Managed Key: must be enabled
      • For created key, automatic every 1 year
      • For imported key, must manually rotate using alias
        • (makes sense cus you import the key)
  • Key policy
    • Without a key policy, nobody can access the key at all
    • The default key policy gives the entire account access to the key (IAM policies then decide who can actually use it)
    • Good for cross-account access
  • Envelope encryption / decryption
    • Technique to encrypt / decrypt large files (more than 4KB)
    • Uses a Data Key: the data is encrypted locally with the data key, and KMS only encrypts the data key itself
    • Data Key Caching:
      • Cache the data key and re-use it instead of calling KMS every time, which reduces quota consumption
      • For example: S3 Bucket Key
  • Throttling
    • Every service that makes requests to KMS on your behalf shares the same quota, per account and per region
    • Solution
      • Request Quotas increase
      • Data Key Caching
      • Exponential Backoff
  • Some of the API
    • Encrypt: 4KB encrypt
    • Decrypt: 4KB decrypt
    • GenerateRandom: returns a random byte string
    • GenerateDataKey: generates a unique data key, returns a plaintext copy and a copy encrypted under the CMK
    • GenerateDataKeyWithoutPlaintext: generates a data key but returns only the encrypted copy, for use at some later point
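
The envelope encryption flow above can be sketched without any AWS calls. This is only an illustration under loud assumptions: `FakeKms` is a hypothetical in-memory stand-in for KMS, and `_keystream_xor` is a toy cipher standing in for AES (it is not secure). The point is the shape of the flow: the payload is encrypted locally with a data key, and only the small data key goes through "KMS".

```python
import hashlib
import secrets


def _keystream_xor(key: bytes, data: bytes) -> bytes:
    """Toy cipher: XOR data with a SHA-256 counter keystream.

    NOT secure (no nonce, no authentication); it only stands in for AES here.
    Applying it twice with the same key returns the original data.
    """
    out = bytearray()
    for offset in range(0, len(data), 32):
        block = hashlib.sha256(key + offset.to_bytes(8, "big")).digest()
        chunk = data[offset:offset + 32]
        out.extend(b ^ k for b, k in zip(chunk, block))
    return bytes(out)


class FakeKms:
    """Hypothetical in-memory stand-in for KMS holding one master key."""

    def __init__(self) -> None:
        self._master_key = secrets.token_bytes(32)

    def generate_data_key(self) -> tuple[bytes, bytes]:
        # Mirrors the idea of GenerateDataKey: a plaintext data key plus
        # a copy of that data key encrypted under the master key.
        plaintext_key = secrets.token_bytes(32)
        return plaintext_key, _keystream_xor(self._master_key, plaintext_key)

    def decrypt(self, encrypted_key: bytes) -> bytes:
        return _keystream_xor(self._master_key, encrypted_key)


def envelope_encrypt(kms: FakeKms, payload: bytes) -> tuple[bytes, bytes]:
    data_key, encrypted_key = kms.generate_data_key()
    ciphertext = _keystream_xor(data_key, payload)  # local encryption, any size
    return encrypted_key, ciphertext  # store both; the plaintext key is discarded


def envelope_decrypt(kms: FakeKms, encrypted_key: bytes, ciphertext: bytes) -> bytes:
    data_key = kms.decrypt(encrypted_key)  # one small call to "KMS"
    return _keystream_xor(data_key, ciphertext)
```

Data key caching would simply keep `data_key` around for several payloads instead of asking for a new one each time.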

AWS SSM Parameter Store

  • Older way to store parameters and secrets, using the Parameter Store of Systems Manager
  • Can store in hierarchy as well
  • No charge for the standard tier, which allows
    • Maximum size of parameter: 4KB
    • Total number of parameters: up to 10,000
  • Advanced tier can have parameter policies such as a TTL (expiration) to force an update or delete
  • No automatic rotation
  • KMS encryption is optional (used for SecureString parameters)

AWS Secrets Manager

  • Newer service dedicated to storing secrets
  • Can force rotation every X days
  • Seamless integration with RDS
  • Can store secrets hierarchically (folder-like names)
  • Automatic rotation
  • KMS encryption is mandatory

Step Functions

  • Organise workflows as state machine
  • Standard vs express
    • Standard if you need a maximum duration of up to 1 year
    • Express if 5 minutes is enough
    • Notes: Express supports far more executions per second, Standard is capped at around 2,000
  • Error handling:
    • We can use Retry (with an exponential backoff rate) or Catch
  • State types
    • Task state: (Lambda job, Batch job, SQS, EC2, ...) can invoke or run 1 activity
    • Choice state: make a choice
    • Fail or Succeed state: stop an execution with failure or success
    • Pass state: pass through input as output, don't do anything
    • Wait state: delay
    • Map state: dynamically iterate (loop)
    • Parallel state: begin parallel execution
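
As a rough sketch of how Task, Fail, and Succeed states fit together with Retry and Catch, here is a small state machine definition built as a Python dict and dumped to JSON. The state names and the Lambda ARN are made up for illustration; the field names follow the Amazon States Language.

```python
import json

state_machine = {
    "StartAt": "ProcessOrder",
    "States": {
        "ProcessOrder": {
            "Type": "Task",
            # Placeholder ARN (hypothetical); use a real Lambda function ARN.
            "Resource": "arn:aws:lambda:us-east-1:123456789012:function:process-order",
            "Retry": [{
                "ErrorEquals": ["States.TaskFailed"],
                "IntervalSeconds": 2,   # wait before the first retry
                "MaxAttempts": 3,       # number of retries
                "BackoffRate": 2.0,     # multiplier applied after each retry
            }],
            # When retries are exhausted, Catch routes to the Fail state.
            "Catch": [{"ErrorEquals": ["States.ALL"], "Next": "OrderFailed"}],
            "Next": "Done",
        },
        "OrderFailed": {"Type": "Fail", "Error": "OrderError", "Cause": "Retries exhausted"},
        "Done": {"Type": "Succeed"},
    },
}


def retry_delays(retrier: dict) -> list[float]:
    """Wait before each retry: IntervalSeconds * BackoffRate ** retry_index."""
    return [retrier["IntervalSeconds"] * retrier["BackoffRate"] ** n
            for n in range(retrier["MaxAttempts"])]


definition_json = json.dumps(state_machine, indent=2)
delays = retry_delays(state_machine["States"]["ProcessOrder"]["Retry"][0])
```

With these values the retries wait 2, 4, and 8 seconds before the Catch takes over.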


  • ChangeSet: See what changes before updating the stack. This won't say if the update is successful.
  • CloudFormation Drift: detects manual configuration changes made outside of CloudFormation
  • Can be edited using
    • YAML (or JSON)
    • CloudFormation Designer
  • Building blocks
    • Resources (mandatory)
    • Mappings: constants to declare
    • Parameters: for users to put inputs in
    • Outputs: what to export
    • Conditions: resources are created only if these conditions are satisfied
  • Can't edit a previous template; upload a new version of the template to update the stack
  • Rollback behaviour:
    • Stack creation fails:
      • Everything roll back (removed)
    • Stack update fails:
      • roll back to previous working state
  • Nested stack
    • for reusing components
  • Cross stack:
    • to share the outputs of one stack with another, export some values
  • StackSets:
    • Create, update, or delete stacks across multiple accounts and regions
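
To make the building blocks concrete, here is a minimal template sketch written as a Python dict and serialised to JSON (CloudFormation accepts JSON as well as YAML). The logical names (`EnvName`, `DataBucket`, `AuditBucket`, `IsProd`) are invented for this example.

```python
import json

template = {
    "AWSTemplateFormatVersion": "2010-09-09",
    # Parameters: inputs supplied by the user at stack creation
    "Parameters": {
        "EnvName": {"Type": "String", "AllowedValues": ["dev", "prod"], "Default": "dev"},
    },
    # Mappings: constants looked up by key
    "Mappings": {
        "EnvSettings": {
            "dev": {"Versioning": "Suspended"},
            "prod": {"Versioning": "Enabled"},
        },
    },
    # Conditions: control whether a resource is created
    "Conditions": {
        "IsProd": {"Fn::Equals": [{"Ref": "EnvName"}, "prod"]},
    },
    # Resources: the only mandatory section
    "Resources": {
        "DataBucket": {
            "Type": "AWS::S3::Bucket",
            "Properties": {
                "VersioningConfiguration": {
                    "Status": {"Fn::FindInMap": ["EnvSettings", {"Ref": "EnvName"}, "Versioning"]},
                },
            },
        },
        "AuditBucket": {"Type": "AWS::S3::Bucket", "Condition": "IsProd"},
    },
    # Outputs: values exported for cross-stack references
    "Outputs": {
        "DataBucketName": {
            "Value": {"Ref": "DataBucket"},
            "Export": {"Name": {"Fn::Sub": "${EnvName}-data-bucket"}},
        },
    },
}

template_json = json.dumps(template, indent=2)
```

Another stack could then read the exported name with `Fn::ImportValue`.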


  • Pricing classes:
    • All: all regions (best performance)
    • 200: exclude most expensive
    • 100: only the least expensive
  • Multiple origins:
    • Comes with ability to route to different origins based on the path
  • Origin group:
    • Consist of 1 primary and 1 secondary: if primary fail, we use secondary
    • always route to primary first before routing to secondary
  • Field level encryption:
    • Sensitive information is encrypted at the edge, close to the user
    • Specify the specific field in POST request you want to encrypt
  • CloudFront caching:
    • Can cache based on header, session cookies, query string parameters.
    • Cache lives at CloudFront Edge Locations
    • Control TTL of the cache
    • Invalidate cache using CreateInvalidation API
    • Have the ability to cache between static and dynamic content
      • For dynamic cached based on headers, and cookies
      • For static cache normally.
    • Security
      • Viewer Protocol Policy
        • Between the client (viewer) and CloudFront
          • redirect client HTTP to HTTPS
          • force client to use HTTPS only
      • Origin protocol policy
        • Between CloudFront and the origin
          • Specify HTTPS only
          • Match the viewer protocol
    • Note: S3 bucket website endpoints don't support HTTPS
    • CloudFront Signed URL:
      • Distribute premium content
      • any origin
      • account-wide
      • different than S3 Signed URL:
        • S3 signed URL only has a limited lifetime
        • Uses the IAM key of the signing IAM principal => the URL carries the same permissions as that principal
        • no caching, can only sign S3 bucket requests
      • When signing, it's recommended to use a trusted key group
        • Keys can be created and rotated through APIs
        • Otherwise you need CloudFront key pairs created with the root account, which is not recommended


  • To store software dependencies (Maven, npm)
  • For CodeBuild and developers to retrieve artifacts


  • Like Jenkins CI, supports Build and Test the application
  • Cannot access resources in a VPC by default; to do so you need to give the build project a VPC configuration


  • Managed Git repository hosting, similar to GitHub


  • CD to deploy. Need appspec.yml in the root directory
  • Components
    • Application: application's name
    • Compute platform: EC2, Lambda
    • Deployment Configuration:
      • One at a time
      • Custom (min healthy instances)
      • All at once
      • Half at a time
    • Deployment group
      • Group of EC2 or ASG
    • Deployment type
      • in-place deployment
      • blue/green deployment
    • IAM Instance Profile: gives the EC2 instances permission to access S3 and GitHub
    • Application revision
    • Service Role: IAM Role for CodeDeploy to perform operations on EC2
    • Target Revision: most recent revision you want to deploy
  • Hooks (will be called in this order):
    • ApplicationStop
    • DownloadBundle
    • BeforeInstall
    • Install
    • AfterInstall
    • ApplicationStart
    • ValidateService:
      • Make sure the service is working correctly
  • Rollback:
    • Automatic or manual; a rollback redeploys the last known good revision as a new deployment
      • For an in-place deployment, this means re-deploying the previous revision onto the same instances
      • Rollbacks are available for both in-place and blue/green deployments


ML powered service

  • CodeGuru reviewer: Automated code review
  • CodeGuru profiler: application performance recommendations


  • Visual workflow for CI CD pipeline
  • Flow manager that integrates CodeCommit, CodeBuild, and CodeDeploy
  • Each step reads the artifacts produced by the previous step (CodePipeline Artifacts)


  • All-in-one central UI for CodeCommit, CodeBuild, CodeDeploy, CodePipeline
  • Has Cloud9 web IDE development


  • Has two modes
    • On-demand:
      • Automatic increase scale for read, write
      • Unlimited WRUs (write request units) and RRUs (read request units)
      • Charged based on the RRUs and WRUs consumed
      • More expensive
    • Provisioned:
      • Have to provision RCUs and WCUs
        • RCUs calculation:
          • 1 Read request = 1 RCU
          • 1 RCU = 1 Strongly Consistent Read = 2 Eventually consistent read
          • Up to 4 KB read per item
        • WCUs
          • 1 Write request = 1 WCU
          • Write up to 1 KB
          • Has 2 mode
            • Standard
            • Transactional: all or nothing
      • Cheaper, with the option to set up auto scaling of RCUs and WCUs to meet demand by providing min, max, and target utilization (%)
      • Throughput can be exceeded temporarily using Burst Capacity
  • DynamoDB transaction:
    • Consumes 2x WCUs and RCUs since it performs 2 operations for every item (prepare & commit)
      • TransactGetItems
      • TransactWriteItems
  • Write types:
    • Concurrent writes (1 overwrites the other)
    • Conditional writes
    • Atomic writes (take both)
    • Batch writes
  • Primary key (required) can be either
    • hash (partition key only)
    • hash + range (partition key + sort key)
  • Sort Key (optional): determines the order in which items with the same partition key are sorted
  • APIS
    • PutItem: create or replace
    • UpdateItem: edit item's attributes
    • Conditional Writes
      • No performance impact
    • GetItem:
      • ProjectionExpression: use it to retrieve only certain attributes
      • Eventually consistent by default (strongly consistent reads are optional)
    • Query: query items using
      • For the partition key, use: KeyConditionExpression
        • the partition key can only use =
      • For the sort key (optional), also use: KeyConditionExpression
        • Sort key can use =, <, <=, >, >=, between, begins_with
      • For other attributes, use FilterExpression
        • Applied by DynamoDB after the Query has read the items but before the results are returned to you, so it does not reduce the RCUs consumed
    • Scan:
      • Read entire table then filter out the data (inefficient)
      • For faster performance, use Parallel Scan
      • can use ProjectionExpression & FilterExpression
        • FilterExpression drops items after they have been read (RCUs are still consumed for everything scanned)
        • ProjectionExpression limits which attributes are returned for each item
    • DeleteItem
      • Delete individual item
      • can perform conditional delete
    • DeleteTable
      • Drop the whole table, quicker than calling DeleteItem on all items
    • BatchWriteItem:
      • Up to 25 PutItem and/or DeleteItem
      • maximum 16MB of data written with 400 KB of data per item
      • can't update item
    • BatchGetItem
      • returns items from one or more tables, retrieved in parallel
      • up to 100 items or 16MB of data
    • --page-size: default 1000 items (AWS CLI option)
      • how many items are fetched per API call; the CLI still retrieves the full result
      • for example, if page size = 100 and we have 1000 items, the CLI makes 10 smaller API calls, which helps avoid timeouts
    • --max-items:
      • max items to show for the current query, returns NextToken for --starting-token
    • --starting-token:
      • Given a NextToken, resume querying from there (pagination)
  • Local Secondary Index (LSI)
    • additional sort key
    • Can have up to 5 LSI
    • Must be defined at creation time.
    • Attribute projections:
      • Can get all attributes in the base table
      • Can get these attributes even if they are not projected into the index (they are fetched from the base table)
    • Use WCUs and RCUs of the main table
  • Global Secondary Index
    • alternative primary key (hash, or hash + range)
    • To speed up queries on attributes that are not part of the primary key
    • Attribute projections:
      • Can also get attributes in the base table
      • However, if an attribute is not projected into the index, we can't get it
    • Must provision RCUs and WCUs separately for the index (autoscalable)
  • Limitations:
    • Each table has infinite number of rows
    • Maximum size of item is 400 KB
      • Supported type includes: String, number, binary, boolean, null, List, Map, Set
  • DynamoDB Accelerator (DAX)
    • Improve read by using memory cache.
    • 5 minutes TTL. Up to 10 nodes
    • Multi-AZ support
    • Comparison to ElastiCache:
      • DAX is good for individual object cache
      • Elasticache is more like caching the result as the whole
  • Optimistic locking
    • Uses conditional writes (for example on a version number attribute) so an item is only updated if it hasn't changed since it was read
  • PartiQL
    • Query DynamoDB using SQL like syntax
    • Support Batch operations
  • Partition
    • Internally, DynamoDB store data in partition
      • WCUs and RCUs are spread evenly across partitions
        • So if you have 10 partition and 10 RCUs and 10 WCUs, each partition has 1 RCU and 1 WCU
    • Partition strategies
      • If the partition is too hot, we can
        • Shard using random suffix
        • Shard using calculated suffix
  • Security
  • DynamoDB Global tables
    • Multi-region, multi-active, fully replicated, high performance
  • DynamoDB local
    • Develop and test locally without accessing the DynamoDB web service
  • AWS Database Migration Service
    • To migrate other databases into DynamoDB (MongoDB, Oracle, Mysql)
  • For user to interact with DynamoDB directly
    • use Cognito, specify LeadingKeys and Attributes to limit specific attributes user can see
  • DynamoDB stream:
    • Ordered stream of item-level modifications
    • retention up to 24 hours
    • Only receive updates after you enable stream
  • Throttling
    • If exceeded provisioned RCU and WCU, we're gonna get ProvisionedThroughputExceededException
    • Solutions
      • Exponential Backoff
      • Redistribute (re-shard)
      • If it's RCU can try DAX
    • For DynamoDB Global Secondary Index if WCU throttled, the main table will be throttled as well
    • For DynamoDB Local Secondary Index, since we're using the same WCUs and RCUs of the main table, no special throttling considerations
  • TTL
    • Time to live for each item, automatically deleted within 48 hours of expiration
      • Will be deleted from both LSI and GSI
      • Expired items that haven't been deleted yet still appear in reads, queries, and scans
    • Doesn't consume WCUs, delete operation goes to DynamoDB stream
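
The RCU / WCU rules from the provisioned mode notes above (4 KB per read unit, 1 KB per write unit, half for eventually consistent, double for transactional) can be turned into a small calculator. This is just a sketch of the arithmetic; the function names are made up.

```python
import math


def read_capacity_units(item_size_kb: float, reads_per_second: int,
                        strongly_consistent: bool = True,
                        transactional: bool = False) -> int:
    """RCUs needed: item size is rounded up to the next 4 KB per read."""
    units = math.ceil(item_size_kb / 4) * reads_per_second
    if transactional:
        return units * 2             # transactional reads cost double
    if not strongly_consistent:
        return math.ceil(units / 2)  # eventually consistent reads cost half
    return units


def write_capacity_units(item_size_kb: float, writes_per_second: int,
                         transactional: bool = False) -> int:
    """WCUs needed: item size is rounded up to the next 1 KB per write."""
    units = math.ceil(item_size_kb) * writes_per_second
    return units * 2 if transactional else units
```

For example, 10 strongly consistent reads per second of 6 KB items need 2 units each (6 KB rounds up to 8 KB), so 20 RCUs; the same load with eventually consistent reads needs 10, and 10 writes per second of 2.5 KB items need 30 WCUs.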


  • ECR: container image registry used by ECS
    • to push/pull from ECR use the native docker push/pull
  • Rolling update
    • Can specify
      • Minimum healthy percent
        • minimum share of tasks that must stay healthy during the update (0 - 100%)
      • Maximum percent
        • maximum share of tasks allowed to run during the update (100% - 200%)
  • Task placement
    • Strategies
      • Type
        • BinPack: try to fill in one EC2 as much as possible
        • Random: randomly placed in EC2
        • Spread: spread across specific value (availability zone, instance id, ...)
      • We can mix the types together, for example spread across availability zones and binpack on memory
    • Constraints
      • distinctInstance: each task is placed in a different container instance
      • memberOf: place task on instance that satisfy an expression
  • Task Definition:
    • JSON form tell ECS how to run a docker container, has image name, port binding, ...
    • For EC2 launch type:
      • Get dynamic host port mapping if you define only the container port in the task definition
    • For Fargate launch type:
      • Each task has a unique private IP

Elastic Beanstalk

  • Extensions
    • To configure Elastic Beanstalk using code
    • Add stuff inside .ebextensions/
    • Files have to be either YAML or JSON, and the file name needs to end with .config
  • Can integrate with HTTPS
    • By loading cert into .ebextensions/securelistener-alb.config
    • or set up using the Console or ACM (AWS Certificate Manager)
  • Can also redirect from HTTP to HTTPS
  • Cloning: Allow to clone the exact same Elastic Beanstalk environment. Good for testing
  • Components
    • Application: collection of elastic beanstalks components (like a folder)
    • Application Version: the version of your code
    • Environment: Collection of AWS resources running the application
      • Tiers:
        • Web Server Tier
          • Have ELB with EC2 running web server with ASG
        • Worker tier
          • SQS Queue and EC2 with ASG
      • Can create multi environment (dev, test, prod)
  • Custom platform:
    • Define your own platform with a custom OS, software, and scripts
    • Define an AMI in platform.yaml file
  • Deployment
    • process
      1. Describe dependencies (in package.json or requirements.txt)
      2. Upload the code
        • Through console: Using zip
        • Through CLI: just type the command it will zip for you
      3. It will deploy the thing on EC2, resolve dependencies and start the application
    • options:
      • Single Instance
      • High Availability with Load Balancer
  • Docker Integration
    • Single Docker
      • Does not use ECS
      • Provide Dockerfile or
    • Multi Docker
      • Use ECS, will create ECS cluster
      • Provide
      • Your docker image must be prebuilt and stored in a registry (ECR, for example)
  • Lifecycle policy
    • To remove the old version
    • There is an option to keep the source bundle in S3 to prevent data loss
  • Migration
    • When we want to change the config of things we can't change, we need to do migration
      • For example, you can't change the network load balancer to application load balancer. If we want to change, we have to:
        1. Create a new environment with the same configuration except the load balancer
        2. Deploy the application in the new environment
        3. Put the new load balancer in
      • For RDS separation (to prevent database gets deleted if we delete the elastic beanstalk stack):
        1. Create snapshot of RDS
        2. go to RDS console and protect RDS database from deletion
        3. Create a new beanstalk environment with the same config but without the RDS. Point the application to the existing RDS.
        4. Delete the old environment
  • Update options
    • All at once
    • Rolling
      • Update a few instances at a time
    • Rolling with additional batch
      • Same as rolling but go over extra capacity
    • Immutable
      • Use another Auto Scaling Group for the update
      • zero downtime
    • Blue/Green deployment
      • Zero downtime, create new environment and shift traffic gradually over
    • Traffic splitting
      • Good for testing split between different auto scaling group


  • Redis Cluster Mode enabled
    • Good for scale write
    • Primary node is spread across multiple shards
      • So we can do write on multiple primary nodes
    • Have multi-az
    • 1 Primary has 0-5 replica node
  • Redis Cluster mode disabled
    • Cheaper
    • 1 Primary has 0-5 replica node
    • Write on primary and read on secondary
    • Have multi-az
  • Caching strategies
    • Lazy Loading / Cache-aside / Lazy population
      • Only write to the cache on a cache miss
      • Cons: cache-miss latency, and cached data can become stale when the database changes
    • Write through (usually combined with lazy-loading)
      • Update the cache when the database is updated
      • Cons:
        • Data is missing from the cache until it is written to the database
          • Can implement Lazy Loading to avoid this
        • Cache churn: wastes cache space (a lot of the data might never be read)
  • Cache eviction and Time-to-live
    • remove old cache entries to save space.
      • LRU
    • If too many evictions happen, we might need to increase cache size
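
The two caching strategies above can be sketched with plain dictionaries. `CachedStore` is a made-up class: one dict stands in for the database and another for the ElastiCache cluster, so this shows only the read/write logic, not a real client.

```python
class CachedStore:
    """Toy model: `database` stands in for RDS/DynamoDB, `cache` for ElastiCache."""

    def __init__(self, database: dict):
        self.database = database
        self.cache = {}
        self.db_reads = 0  # counts how often we had to hit the database

    def get_lazy(self, key):
        """Lazy loading: read from the cache, fall back to the database on a miss."""
        if key in self.cache:          # cache hit
            return self.cache[key]
        self.db_reads += 1             # cache miss: extra round trip
        value = self.database[key]
        self.cache[key] = value        # populate the cache for next time
        return value

    def put_write_through(self, key, value):
        """Write through: update the database and the cache together."""
        self.database[key] = value
        self.cache[key] = value


store = CachedStore({"user:1": "Ann"})
first = store.get_lazy("user:1")       # miss, loads from the database
second = store.get_lazy("user:1")      # hit, served from the cache
store.put_write_through("user:1", "Bob")
third = store.get_lazy("user:1")       # hit, already fresh in the cache
```

If the database were updated directly (bypassing `put_write_through`), `get_lazy` would keep returning the old value, which is the stale-data downside of lazy loading.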


  • Dead-letter queue:
    • messages that are not processed will go to this queue
  • Fanout pattern
    • Combination of SNS and SQS
  • Queue Type
    • Standard
      • Can have duplicated messages, but guarantees at-least-once delivery
      • can have out of order message
    • FIFO
      • name needs to end with .fifo
      • Guaranteed ordering and exactly-once processing
        • Using deduplication
          • Default interval is 5 minutes: if you send the same message again within 5 minutes, the duplicate is not delivered
          • Duplicates are detected by content-based deduplication or by providing a message deduplication ID
  • Access policy
    • Cross-account access
    • Allow other services to send messages to SQS
  • APIs
    • CreateQueue
      • MessageRetentionPeriod
    • DeleteQueue: Delete the whole queue
    • PurgeQueue: delete all messages only
    • SendMessage: (batch support)
      • DelaySeconds
    • DeleteMessage (batch support)
    • ReceiveMessage
      • MaxNumberOfMessages: from 1 -> 10
    • ReceiveMessageWaitTimeSeconds: Long Polling
      • Wait for messages to arrive if there are none in the queue
      • can wait from 1-20 seconds (preferably 20 seconds)
    • ChangeMessageVisibility (batch support): message timeout
    • Note: Batch API can help reducing the cost
  • Consumer:
    • Receive up to 10 messages at a time
  • Producer
    • Unlimited throughput
    • message persists until the consumer deletes
  • Delay Queue
    • Set delay messages in the queue up to 15 minutes
      • default is 0 seconds
    • Can overwrite using DelaySeconds
  • Extended Client
    • If the message is too big ( > 256 KB), use this: it stores the payload in S3 and sends a pointer to it through the queue
    • Only for Java
  • FIFO Message grouping
    • Only for FIFO
    • Use Group ID to group all the message.
    • Each Group ID can only have 1 consumer
    • Ordering across group is not guaranteed
  • Visibility Timeout
    • Invisible to other consumers when polled by a consumer
      • Default is 30 seconds
    • If the message hasn't been processed and deleted within the visibility timeout, it becomes visible in the queue again for other consumers to process


  • pub/sub can have many receivers
    • Up to 12,500,000 subscriptions per topics
    • up to 100,000 topics
  • Publish type
    • Topic Publish (SDK)
      • Publish straight to the subscriptions
    • Direct Publish
      • Publish to a platform endpoint (work with 3rd party)
  • Access Policies
    • For cross-account access
    • Control which services write to your SNS
    • For FIFO topics, the only subscribers can be SQS FIFO
    • FIFO topic name has to end with .fifo
      • ordering by Message Group ID
      • Deduplication using Deduplication ID or content based deduplication
  • Message filtering
    • Allow subscriber to choose what to receive
    • If doesn't specify, receives all messages


  • Kinesis Data Stream
    • Real-time stream data into your application
    • Provisioned:
      • select the number of shard, pay per shard per hour
      • each shard gets 1 MB/s (or 1000 records per second) in and 2MB/s out
    • On-demand
      • Pay per stream/hour, data in/out in GB
      • 4MB/s in or 4000 records per second
      • Scale automatically
    • Retention:
      • 1-365 days (1 day by default)
    • Ability to replay data
    • To increase read throughput per consumer, enable Enhanced Fan-out
    • Doesn't work with SNS
    • Consumers: subscribe to data
      • AWS Lambda
        • Support both classic and enhanced fanout consumers
      • Kinesis Data Analytics
      • Kinesis Data Firehose
      • Kinesis Client Library
        • Java library that act as a Kinesis consumer
        • Each shard is read by 1 Kinesis Client Library instance
          • 4 shards = max 4 KCL instances
          • 6 shards = max 6 KCL instances
        • if needed Enhanced Fan-out consumer, use v2
      • Kinesis Custom Consumers
        • Classic Fan-out Consumer
          • 2 MB/sec per shard across all consumers
          • Use GetRecords() API
        • Enhanced Fan-out Consumer
          • Use SubscribeToShard() API
          • 2 MB/sec for each shard for each consumer
          • lower latency but higher cost
    • Producer: produce data
      • Data consist of
        • Sequence number: unique per partition key within shard
        • Partition key: (must specified)
          • Hash the data to relevant shard
          • Can be used to achieve ordering
      • Producers
        • AWS SDK: Simple
        • Kinesis Producer Library (KPL): advanced with compression, retries
        • 1MB/sec or 1000 records / sec per shard
  • Kinesis Data Firehose
    • Similar to DataStream but fully serverless
    • Destinations
      • S3
      • RedShift
      • ElasticSearch
    • Near-real time
      • 60 seconds latency for non full batches
      • 1 MB of data at a time
    • Works with SNS
  • Kinesis Data Analytics
    • Real time query both Data Stream and Data FireHose using SQL
    • Fully managed Serverless
    • Apache Flink:
      • Use Amazon MSK to work with Apache Flink
  • Throttle
    • Happens when we have more throughput than provisioned
    • Retries with exponential backoff
    • Reshard / increase shard
  • Common operations
    • Shard Splitting
      • increases stream capacity
      • used to divide a "hot shard"
      • Old shard is closed and deleted once its data has expired
    • Merge shard
      • decrease stream capacity to save cost
      • Old shards are closed and deleted once their data has expired
      • can't merge more than 2 shards in a single operation
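
Two of the ideas above, partition keys being hashed to a shard and per-shard throughput limits, can be sketched as follows. Kinesis hashes the partition key with MD5 into a 128-bit hash key; this sketch assumes the shards split that hash range evenly, which is an assumption of the example (after splits and merges the ranges can be uneven). The function names are invented.

```python
import hashlib
import math

HASH_SPACE = 2 ** 128  # MD5 yields a 128-bit hash key


def shard_for_partition_key(partition_key: str, shard_count: int) -> int:
    """Index of the shard whose (evenly split) hash range contains the key."""
    digest = hashlib.md5(partition_key.encode("utf-8")).digest()
    hash_key = int.from_bytes(digest, "big")
    return hash_key * shard_count // HASH_SPACE


def shards_needed(write_mb_per_s: float, records_per_s: int, read_mb_per_s: float) -> int:
    """Provisioned mode: 1 MB/s or 1000 records/s in, 2 MB/s out, per shard."""
    return max(
        math.ceil(write_mb_per_s / 1),
        math.ceil(records_per_s / 1000),
        math.ceil(read_mb_per_s / 2),
        1,
    )
```

The same partition key always lands on the same shard, which is why it can be used to keep related records in order. A stream ingesting 5 MB/s at 3000 records/s with 6 MB/s of classic reads would need 5 shards under these limits.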


  • Get history of events, api calls within AWS
  • Governance, compliance, auditing
  • Three types of events
    • Management events (default)
      • Operations that are related to management to your AWS
    • Data events
      • S3 Object level activity
      • AWS Lambda function execution
    • CloudTrail Insights event
  • CloudTrail Insights
    • Automatically detect unusual activities using Machine Learning
    • analyse normal events to create a base line


  • CloudWatch Metrics
    • Provide metrics to your AWS service
    • default: every 5 minutes
      • with detailed monitoring, can get for every 1 minute
      • If want less than 1 minute, need to define Custom Metrics
    • Note: EC2 memory usage is not pushed, if you want to push, you have to create Custom Metrics
    • Custom metrics:
      • Define your own metrics
      • Resolution (you can define how often the metrics get pushed)
        • Standard: 1 minute
        • High resolution: 1/5/10/30 seconds
          • CloudWatch alarms on high-resolution metrics can be evaluated every 10 seconds, 30 seconds, or multiples of 60 seconds
  • CloudWatch Alarms
    • Trigger notifications based on metrics
    • Can specify an action for each alarm state
    • Test the alarm in the CLI using set-alarm-state
  • CloudWatch Events
    • Intercepts events from AWS services
      • Creates a JSON payload and sends it to the target
    • Schedule Cron Job
  • CloudWatch Logs
    • log groups:
      • Some name to represent the application
    • log stream:
      • log files,
    • Can have expiration
      • But never expire by default
    • Can have encryption at log group level
      • Using AWS KMS
      • If you wanna use CMK, need to specify in the CLI
        • associate-kms-key if log group exists
        • create-log-group if it doesn't exist
    • Can export the logs to S3, but the export is not real time. For real time, use a CloudWatch Logs Subscription
    • CloudWatch Logs Subscription
      • Filter that applied on top of cloudwatch
      • Good for multi-account logs
    • CloudWatch Logs Agent
      • By default, no logs are sent from EC2. We need the CloudWatch Logs agent on EC2 to push logs
      • The old agent only sends logs; the default EC2 metrics cover just CPU, disk, and high-level network
      • For RAM, processes, or more detailed measurements, use the CloudWatch Unified Agent
    • CloudWatch Unified Agent
      • Updated version of CloudWatch Logs Agent
      • Collects RAM, process, Netstat
      • Integration with SSM Parameter Store
    • CloudWatch Logs Insight
      • To query cloudwatch logs

Event Bridge

  • Newer CloudWatch Events
  • Default Event Bus:
    • generated by AWS (similar to CloudWatch Events)
  • Partner Event Bus:
    • Receive event from third party
  • Custom Event Bus:
    • Your own thingy
  • Ability to replay archived events
  • Resource-based policy management access


  • Debug, visualise analysis of the applications
  • Troubleshoot bottle necks
  • X-Ray Sampling:
    • Control the amount of data to send to X-Ray
    • Default rule:
      • reservoir: 1
        • at least 1 request is sent to XRay each second as long as service is serving request
      • rate: 5%
        • 5% of additional requests beyond the reservoir are sent as well
  • Integration
    • ECS, a few options:
      1. Run X-Ray daemon container in each EC2
      2. X-Ray as a side car:
        • X-Ray in each application container (1 EC2 can have multiple containers)
        • If running Fargate, we have to use this one because we dont have access to EC2
    • Elastic BeanStalk
      • Enable in .ebextensions/xray-daemon.config
      • Not provided for multi-container docker
  • APIS
    • GetSamplingRules: retrieve all sampling rules

    • For write access

      • PutTraceSegments: upload segment to XRay
      • PutTelemetryRecords: used by daemon
    • For read access

      • GetTraceGraph
      • GetServiceGraph
      • GetTraceSummaries
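
Going back to the default sampling rule in this section (reservoir of 1 per second plus 5% of the rest), the number of traced requests per second can be estimated with a one-liner. The function name is made up, and this models only the steady-state arithmetic, not how the SDK actually picks requests.

```python
def expected_traces_per_second(requests_per_second: float,
                               reservoir: int = 1,
                               rate: float = 0.05) -> float:
    """Reservoir requests are always traced; the rate applies to the remainder."""
    from_reservoir = min(requests_per_second, reservoir)
    return from_reservoir + (requests_per_second - from_reservoir) * rate
```

At 101 requests per second this gives 1 from the reservoir plus 5% of the remaining 100, so about 6 traces per second.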



  • Create a map of backend services for visualisation
  • Register application locations and health status check


  • Synchronise data in file storage system (not database)
    • From on-premises to cloud (need DataSync agent)
    • From one AWS to AWS (no need agent)

AWS FIS (Fault Injection Simulator)

  • Run fault injection experiments (chaos engineering)

AWS SES (Simple Email Service)

  • Send email

Exponential Backoff

  • Keep retrying, doubling the wait time after each failed attempt
  • Should only retry on 5xx errors and 429 (throttling)
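
A minimal sketch of the retry loop, assuming a hypothetical `HttpError` exception that carries a status code. The jitter (a random extra wait) is an addition that is common practice but not mentioned in the notes above. The AWS SDKs already do this for you; this only shows the idea.

```python
import random
import time


class HttpError(Exception):
    """Hypothetical error type carrying an HTTP status code."""

    def __init__(self, status: int):
        super().__init__(f"HTTP {status}")
        self.status = status


def is_retryable(status: int) -> bool:
    # Retry only throttling (429) and server-side (5xx) errors.
    return status == 429 or 500 <= status <= 599


def call_with_backoff(operation, max_attempts: int = 5,
                      base_delay: float = 0.1, sleep=time.sleep):
    """Call operation(), doubling the wait after each retryable failure."""
    for attempt in range(max_attempts):
        try:
            return operation()
        except HttpError as err:
            last_attempt = attempt == max_attempts - 1
            if not is_retryable(err.status) or last_attempt:
                raise
            delay = base_delay * 2 ** attempt          # 0.1, 0.2, 0.4, ...
            sleep(delay + random.uniform(0, delay))    # add jitter on top
```

The `sleep` parameter is injectable so the function can be exercised without actually waiting.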


  • Athena
    • Serverless Query on S3 Object
    • Support compression or columnar for cost saving
  • Bucket Key:
    • Optimisation to reduce number of calls to AWS KMS for encryption
  • 4 encryption methods:
    • SSE-S3: encryption using key handled by AWS
    • SSE-KMS: use KMS managed key to encrypt objects
      • can be optimised using S3 Bucket Key
    • SSE-C: server-side encryption with customer-provided keys
      • The only one that mandates you to use HTTPS
      • You use your own key but AWS do the encryption
    • client side encryption
      • You encrypt yourself
  • You can force SSL with an explicit DENY when
"Condition": {
	"Bool": {
		"aws:SecureTransport": "false"
	}
}
  • Force KMS encryption with an explicit DENY when
"Condition": {
	"StringNotEquals": {
		"s3:x-amz-server-side-encryption": "aws:kms"
	}
}
  • Strong consistency: As of 2020, write and read is consistent
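
The two condition snippets above only make sense inside a full bucket policy. Here is a sketch of what that policy could look like, built as a Python dict; the bucket name and the `Sid` values are placeholders chosen for this example.

```python
import json

BUCKET = "example-bucket"  # hypothetical bucket name

policy = {
    "Version": "2012-10-17",
    "Statement": [
        {
            # Deny every request that does not arrive over HTTPS
            "Sid": "DenyInsecureTransport",
            "Effect": "Deny",
            "Principal": "*",
            "Action": "s3:*",
            "Resource": [f"arn:aws:s3:::{BUCKET}", f"arn:aws:s3:::{BUCKET}/*"],
            "Condition": {"Bool": {"aws:SecureTransport": "false"}},
        },
        {
            # Deny uploads that do not request SSE-KMS encryption
            "Sid": "DenyNonKmsUploads",
            "Effect": "Deny",
            "Principal": "*",
            "Action": "s3:PutObject",
            "Resource": f"arn:aws:s3:::{BUCKET}/*",
            "Condition": {
                "StringNotEquals": {"s3:x-amz-server-side-encryption": "aws:kms"},
            },
        },
    ],
}

policy_json = json.dumps(policy, indent=2)
```

The resulting JSON is what would be pasted into the bucket policy editor or passed to the API.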

NACL (Network ACL - Access Control List)

  • Networking/Firewall controls traffic from and to
  • Can have ALLOW and DENY
  • works at subnet level
  • Stateless: have to specify in and out traffic

Security group

  • Work at instance level
  • Can only have ALLOW rule
  • Stateful: traffic returns automatically allowed

VPC Flow Logs

  • Capture information about
    • VPC
    • Subnet
    • Elastic Network Interface

VPC Peering

  • Connect 2 VPC together
  • Not transitive: to fully connect A, B, and C you have to peer A to B, B to C, and A to C

VPC Endpoints

  • Allow AWS services to talk to each other via the private network
  • Gateway endpoint: S3 and DynamoDB
  • Interface endpoint: others

Site-Site VPN

  • On-premise VPN to AWS
  • Does not allow you to connect to VPC gateway endpoints (interface endpoints can be reached)

Direct Connect (DX)

  • Physical connection
  • Does not allow you to connect to VPC gateway endpoints (interface endpoints can be reached)


  • is within a VPC
  • Public: accessible from internet
  • Private: not accessible from internet

Internet Gateway

  • Provide Internet to a VPC

NAT Gateway or Nat Instance

  • NAT Gateway is fully managed by AWS
  • NAT Instance is created from an AMI on EC2
  • Provide connection to internet gateway for instances in private subnet

Data Limitations

  • Read write throughput
    • DynamoDB capacity units:
      • 1 RCU = 1 strongly consistent read per second of up to 4 KB
        • eventually consistent takes 1/2 RCU
      • 1 WCU = 1 write per second of up to 1 KB
        • Transactional takes 2 WCUs
    • Kinesis DataStream
      • Provisioned:
        • Write = 1MB/s or 1000 records/s per shard
          • Shared by all producers
        • Read = 2MB/s per shard
          • Classic consumer: 2MB/s across all consumers
          • Enhanced consumer: 2MB/s each consumer
      • On-demand
        • Write = 4MB/s or 4000 records/s
        • Read = 8MB/s
  • Key rotation:
    • KMS:
      • AWS Managed: rotation every 1 year
      • CMK:
        • created: must enable auto rotation every 1 year
        • imported: manually
    • SSM parameter store: No rotations
    • Secret manager: can force rotate after X days
  • GP2 IOPS to GBs in the rate of 3:1
    • 16000 IOPS / 3 = 5334 GB
  • IO2 IOPS to GBs in the rate of 500:1 (IO1 is 50:1)
    • 100000 IOPS / 500 = 200GB
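
The gp2 ratio above can be checked with a few lines. This sketch assumes the usual gp2 baseline rules: 3 IOPS per GiB, a floor of 100 IOPS, and a cap of 16,000 IOPS. The function names are invented.

```python
import math

GP2_IOPS_PER_GIB = 3
GP2_MIN_IOPS = 100
GP2_MAX_IOPS = 16_000


def gp2_baseline_iops(size_gib: int) -> int:
    """Baseline IOPS for a gp2 volume: 3 per GiB, clamped to [100, 16000]."""
    return min(max(GP2_IOPS_PER_GIB * size_gib, GP2_MIN_IOPS), GP2_MAX_IOPS)


def gp2_size_for_max_iops() -> int:
    """Smallest volume size (GiB) that reaches the 16,000 IOPS cap."""
    return math.ceil(GP2_MAX_IOPS / GP2_IOPS_PER_GIB)
```

A 1,000 GiB volume gets 3,000 IOPS, and the cap is reached at 5,334 GiB, which matches the figure in the notes.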

SAA Stuff

  • Placement Groups
    • Ways to controls EC2 Placement
    • Strategies
      • Cluster
        • All EC2 into a single rack within an AZ. If it fails, you die
        • Good for application that needs extremely low latency
      • Spread
        • Spread out, 1 rack has 1 instance.
        • Maximum 7 instances per AZ
        • Good for critical application
      • Partition
        • Spreads instances across different partitions (sets of racks). 1 partition has 1+ instances.
        • Maximum 7 partitions per AZ
        • Good for Hadoop, Kafka,...
  • Hibernation
    • Stop→Data on disk (EBS) is kept
    • Terminate→Data on disk (EBS) is lost
    • Hibernate
      • RAM state is written to the root EBS volume
      • Root EBS must be encrypted
      • Cannot hibernate for more than→60 days
      • Instance RAM must be→smaller than 150GB
      • Root volume must be→EBS
  • EBS
    • encryption is regional
    • the volume itself is locked to an az
  • HDD cannot be boot volume
  • Geoproximity Routing: location-based routing with a bias (1 to 99 to expand a region, -1 to -99 to shrink it)
  • Minimum days before transitioning from S3 Standard / S3 Standard-IA to S3 Standard-IA or S3 One Zone-IA→30 days
  • Minimum storage duration charge for S3 Standard-IA and S3 One Zone-IA→30 days
  • ETL -> Glue
  • Amazon Rekognition→find objects, people, facial analysis, text, images, video
  • Amazon Transcribe→Speech to text
  • Amazon Polly→Text to speech
  • Amazon Translate→Translate duh
  • Amazon Lex→Same technology as Alexa, converts speech to text. Understands natural language. To build chatbots and stuff ‒ like dialogflow
  • Amazon Connect→Receive calls, create contact flows, cloud-based virtual contact center. 80% cheaper than traditional contact center solutions
  • Amazon Comprehend→Natural Language Processing. Analyse customer interaction (emails)
  • Amazon SageMaker→Jupyter notebook
  • Amazon Kendra→Document search service (text, pdf, html, powerpoint, words,...) to extract answers from within a documents
  • Amazon Personalize→Personal recommendation real time
  • Amazon Textract→automatically extract text and data from documents
  • Site-site VPN
    • Customer side: Customer Gateway
    • VPC side: Virtual Private Gateway
  • AWS Transit Gateway connects your Amazon Virtual Private Clouds (VPCs) and on-premises networks through a central hub.