Notes
AppSync
- can do real-time stuff, WebSocket, MQTT on WebSocket
- GraphQL
Lambda
- Lambda Event Source Mapping polls a service and invokes the Lambda function synchronously with the records it finds.
    - When polling from DynamoDB Streams or Kinesis Data Streams, it doesn't delete the items from the stream
    - When polling from SQS, it deletes the messages after successful processing (SNS is push-based, not an event source mapping)
- If we want in-order processing, we need to use FIFO
- The more RAM, the more vCPU power the Lambda function gets (better performance)
- Don't need to run the X-Ray daemon yourself (Lambda runs it for you)
- Permissions
    - Lambda to invoke other services: Lambda Execution Role (IAM Role)
    - Other services to invoke Lambda: Lambda Resource-Based Policies
- Invocation on CLI
    - async: `--invocation-type Event`
    - sync: `--invocation-type RequestResponse`
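A minimal boto3 sketch of the two invocation types (function name and payload are hypothetical):

```python
import json
import boto3

lam = boto3.client("lambda")
payload = json.dumps({"orderId": 42}).encode()  # hypothetical payload

# Sync: waits for the function to finish and returns its response
resp = lam.invoke(FunctionName="my-fn", InvocationType="RequestResponse", Payload=payload)
print(json.load(resp["Payload"]))

# Async: Lambda queues the event and returns immediately (HTTP 202)
lam.invoke(FunctionName="my-fn", InvocationType="Event", Payload=payload)
```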
SAM
- Has traffic shifting feature, automated rollback since it's using CodeDeploy in the background
- To step through and debug the code, use SAM CLI + AWS Toolkits
- SAM Policy Templates: to give permissions to Lambda functions (yes, only Lambda)
- Some of the commands
    - `sam build`: fetch dependencies and create local deployment artifacts
    - `sam package`: package and upload to S3
    - `sam deploy`: deploy to CloudFormation
    - `sam publish`: publish to AWS SAR
SAR
- Serverless application repository to store SAM applications
KMS
- AWS manages the encryption keys for us
- Has 2 main types of keys
    - AWS Managed Keys: pre-defined keys, free of charge. For example `aws/rds`, `aws/ebs`
    - Customer Managed Keys (CMK): 2 kinds, both $1 per month per key
        - Create your own
        - Import your own
- Encryption and Decryption limited to 4KB, if more than that we need to use Envelope Encryption technique
- Key rotation
- AWS Managed Key: automatic every 1 year
- Customer Managed Key: must be enabled
- For created key, automatic every 1 year
- For imported key, must manually rotate using alias
- (makes sense cus you import the key)
- Key policy
- You need a key policy to access the key at all
- Default will allow the entire account to access the key if you don't specify one.
- Good for cross-account access
- Envelope encryption / decryption
- Technique to encrypt / decrypt large file (more than 4KB)
- Use something called a Data Key.
- Data Key Caching:
- Cache the data key and re-use it to avoid calling KMS (reduces quota consumption)
- For example: S3 Bucket Key
- Throttle and stuff
- Every service that makes requests to KMS shares the same quota per region
- Solution
- Request Quotas increase
- Data Key Caching
- Exponential Backoff
- Some of the APIs
    - `Encrypt`: encrypt up to 4KB
    - `Decrypt`: decrypt up to 4KB
    - `GenerateRandom`: return a random byte string
    - `GenerateDataKey`: generate a unique data key; returns a plaintext copy and a copy encrypted under the CMK
    - `GenerateDataKeyWithoutPlaintext`: generate a data key (encrypted copy only) for use at some later point
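A minimal envelope-encryption sketch with boto3 and `GenerateDataKey` (the key alias is hypothetical, and Fernet stands in for whatever local cipher you use):

```python
import base64
import boto3
from cryptography.fernet import Fernet  # assumes the `cryptography` package

kms = boto3.client("kms")

# Ask KMS for a data key: we get a plaintext copy + a copy encrypted under the CMK
resp = kms.generate_data_key(KeyId="alias/my-app-key", KeySpec="AES_256")  # hypothetical alias
fernet = Fernet(base64.urlsafe_b64encode(resp["Plaintext"]))

ciphertext = fernet.encrypt(b"a payload far bigger than 4KB...")  # local encrypt, no 4KB limit
encrypted_key = resp["CiphertextBlob"]                            # store next to the ciphertext

# Decrypt later: KMS decrypts the data key (under 4KB), then we decrypt locally
plain_key = kms.decrypt(CiphertextBlob=encrypted_key)["Plaintext"]
data = Fernet(base64.urlsafe_b64encode(plain_key)).decrypt(ciphertext)
```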
AWS SSM Parameter Store
- Older way to store parameters and secrets; we use the Parameter Store of Systems Manager to store stuff
- Can store in hierarchy as well
- No charge if we use the standard tier, provided
- Maximum size of parameter: 4KB
- Total number of parameters: < 10,000
- Advanced tier can have TTL (expiration policies) to force update or delete
- No automatic rotation
- KMS is optional (only used for SecureString parameters)
AWS Secrets Manager
- Newer service dedicated to storing secrets and stuff
- Capacity to force rotation every X days
- Seamless integration with RDS
- Can do hierarchy storing (folder like structure)
- Automatic rotation
- KMS encryption is mandatory
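A minimal boto3 sketch for reading a secret (the secret name is hypothetical):

```python
import json
import boto3

sm = boto3.client("secretsmanager")
resp = sm.get_secret_value(SecretId="prod/db-credentials")  # hypothetical secret name
creds = json.loads(resp["SecretString"])  # e.g. {"username": ..., "password": ...}
```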
Step Function
- Organise workflows as state machine
- Standard vs express
    - Standard if you need up to 1 year maximum duration
    - Express if 5 minutes is enough
    - Note: Express supports a much higher execution rate; Standard is capped around 2,000 new executions per second
- Error handling:
    - We can use `Retry` or `Catch` in the state definition, with an exponential backoff rate (see the sketch at the end of this section)
- State types
- Task state: (Lambda job, Batch job, SQS, EC2, ...) can invoke or run 1 activity
- Choice state: make a choice
    - Fail or Succeed state: stop an execution with failure or success
- Pass state: pass through input as output, don't do anything
- Wait state: delay
- Map state: dynamically iterate (loop)
- Parallel state: begin parallel execution
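The Retry/Catch sketch mentioned above, written as a Python dict mirroring the Amazon States Language (the ARN and state names are hypothetical):

```python
# A Task state that retries with exponential backoff, then falls back to a Catch route
task_state = {
    "Type": "Task",
    "Resource": "arn:aws:lambda:us-east-1:123456789012:function:my-fn",  # hypothetical
    "Retry": [{
        "ErrorEquals": ["States.TaskFailed"],
        "IntervalSeconds": 2,   # first retry waits 2s
        "BackoffRate": 2.0,     # each subsequent retry waits 2x longer
        "MaxAttempts": 3,
    }],
    "Catch": [{
        "ErrorEquals": ["States.ALL"],  # anything still failing after the retries
        "Next": "NotifyFailure",        # hypothetical failure-handling state
    }],
    "Next": "NextState",
}
```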
CloudFormation
- ChangeSet: See what changes before updating the stack. This won't say if the update is successful.
- CloudFormation Drift: check whether resources have been manually changed (configuration drift)
- Can be edited using
- YAML
- CloudFormation Designer
- Building blocks
    - `Resources` (mandatory)
    - `Mappings`: constants to declare
    - `Parameters`: for users to put inputs in
    - `Outputs`: what to export
    - `Conditions`: create resources only when these conditions are satisfied
- Can't edit a previous template; you have to upload a new version to overwrite it
- Rollback behaviour:
    - Stack creation fails: everything rolls back (gets removed)
    - Stack update fails: rolls back to the previous working state
- Nested stack
- for reusing components
- Cross stack:
- to share the output of one stacks to another, export some values
- Stacksets:
    - Create, update, delete stacks across multiple regions and accounts
CloudFront
- Pricing classes:
- All: all regions (best performance)
- 200: exclude most expensive
- 100: only the least expensive
- Multiple origins:
    - Ability to route to different origins based on the path
- Origin group:
    - Consists of 1 primary and 1 secondary origin: if the primary fails, we use the secondary
    - Always routes to the primary first before failing over to the secondary
- Field-level encryption:
    - Sensitive information is encrypted at the edge, close to the user
    - Specify the specific fields in POST requests you want to encrypt
- CloudFront caching:
    - Can cache based on headers, session cookies, query string parameters
    - Cache lives at CloudFront Edge Locations
    - Control the TTL of the cache
    - Invalidate the cache using the `CreateInvalidation` API
    - Has the ability to cache static and dynamic content separately
        - Dynamic content: cache based on headers and cookies
        - Static content: cache normally
- Security
    - Viewer Protocol Policy
        - Works on the client (viewer) side
        - Redirect client HTTP to HTTPS
        - Or force clients to use HTTPS only
    - Origin Protocol Policy
        - Works on the origin side
        - Specify HTTPS only
        - Or match the viewer protocol
- Note: S3 Bucket website doesn't support HTTPS
- CloudFront Signed URL:
    - Distribute premium content
    - Works with any origin
    - Key is account-wide
    - Different from S3 Pre-Signed URLs:
        - S3 pre-signed URLs only have a limited lifetime
        - They use the IAM key of the signing IAM principal => the user gets the same permissions as the signer
        - No caching; they can only sign S3 objects
    - When signing, it's recommended to use a trusted key group
        - Supports automatic key rotation
        - Otherwise you'd need to use the root account, which is not recommended
CodeArtifact
- To store software dependencies (maven, npm)
- For CodeBuild and developers to retrieve artifacts
CodeBuild
- Like Jenkins CI, supports Build and Test the application
- By default it cannot access resources in a VPC; if you want that, you need to configure VPC access
CodeCommit
- Github thingy
CodeDeploy
- CD service to deploy. Needs `appspec.yml` in the root directory
- Components
- Application: application's name
    - Compute platform: EC2, Lambda
- Deployment Configuration:
- One at a time
- Custom (min healthy instances)
- All at once
- Half at a time
- Deployment group
- Group of EC2 or ASG
- Deployment type
- in-place deployment
- blue/green deployment
- IAM Instance Profile: to get permission to access to S3 and Github
- Application revision
- Service Role: IAM Role for code deploy to perform operations on EC2
    - Target Revision: the most recent revision you want to deploy
    - Hooks (will be called in this order):
        - `ApplicationStop`
        - `DownloadBundle`
        - `BeforeInstall`
        - `Install`
        - `AfterInstall`
        - `ApplicationStart`
        - `ValidateService`: make sure the service is working correctly
- Rollback:
    - Automatic or manual; when rolling back, CodeDeploy redeploys the previously known good revision as a new deployment
    - For an in-place deployment, that means re-deploying the original version onto the instances
    - A blue/green deployment lets you shift traffic back to the old environment instead
CodeGuru
- ML-powered service
- CodeGuru reviewer: Automated code review
- CodeGuru profiler: application performance recommendations
CodePipeline
- Visual workflow for CI CD pipeline
- Manages the flow: CodeCommit, CodeBuild, CodeDeploy integration
- Each stage passes its output to the next stage as CodePipeline Artifacts (stored in S3)
CodeStar
- All-in-one central UI for CodeCommit, CodeBuild, CodeDeploy, CodePipeline
- Has Cloud9, a web IDE for development
DynamoDB
- Has two modes
    - On-demand:
        - Automatically scales for reads and writes
        - Unlimited RRUs (read request units) and WRUs (write request units)
        - Charged based on RRUs and WRUs consumed
        - More expensive
    - Provisioned:
        - Have to provision RCUs and WCUs
        - RCU calculation:
            - 1 read request = 1 RCU
            - 1 RCU = 1 strongly consistent read = 2 eventually consistent reads
            - Up to 4 KB read per item
        - WCU calculation:
            - 1 write request = 1 WCU
            - Write up to 1 KB
            - Has 2 write modes
                - Standard
                - Transactional: all or nothing
        - Cheaper; has the option to set up auto-scaling of RCUs and WCUs to meet demand by providing min, max, and target utilisation (%)
        - Throughput can be exceeded temporarily using Burst Capacity
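A quick worked example of the provisioned-capacity math above (the workload numbers are hypothetical):

```python
import math

# 10 strongly consistent reads/sec of 6 KB items: reads are billed in 4 KB chunks
rcu = 10 * math.ceil(6 / 4)      # ceil(1.5) = 2 -> 20 RCUs
rcu_eventual = rcu / 2           # eventually consistent reads cost half -> 10 RCUs

# 20 writes/sec of 2.5 KB items: writes are billed in 1 KB chunks
wcu = 20 * math.ceil(2.5 / 1)    # ceil(2.5) = 3 -> 60 WCUs
print(rcu, rcu_eventual, wcu)    # 20 10.0 60
```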
- DynamoDB transactions:
    - Consume 2x WCUs and RCUs, since DynamoDB performs 2 operations for every item (prepare & commit)
    - `TransactGetItems`
    - `TransactWriteItems`
- Write types:
- Concurrent writes (1 overwrites the other)
- Conditional writes
- Atomic writes (take both)
- Batch writes
- Primary key (required) can be either
- hash (partition key only)
- hash + range (partition key + sort key)
- Sort Key (optional): determines the order of how the data is sorted
- APIs
    - `PutItem`: create or replace
    - `UpdateItem`: edit an item's attributes
    - Conditional Writes
        - No performance impact
    - `GetItem`
        - To retrieve only certain attributes, use `ProjectionExpression`
        - Eventually consistent reads by default
    - `Query`: query items (see the boto3 sketch after this API list)
        - For the partition key, use `KeyConditionExpression`
            - The partition key can only use `=`
        - For the sort key, also in `KeyConditionExpression`
            - The sort key can use `=, >=, <=, ...`
        - For other attributes, use `FilterExpression`
            - Applied server side after the `Query` executes, but before results are returned to you (RCUs are still consumed for the full read)
    - `Scan`:
        - Reads the entire table, then filters out the data (inefficient)
        - For faster performance, use Parallel Scan
        - Can use `ProjectionExpression` & `FilterExpression`
            - `FilterExpression` filters the results after the read (the full scan's RCUs are still consumed)
            - `ProjectionExpression` limits which attributes come back; functionally they're applied much the same way
    - `DeleteItem`
        - Delete an individual item
        - Can perform a conditional delete
    - `DeleteTable`
        - Drops the whole table; quicker than calling `DeleteItem` on all items
    - `BatchWriteItem`:
        - Up to 25 `PutItem` and/or `DeleteItem`
        - Maximum 16 MB of data written, with 400 KB of data per item
        - Can't update items
    - `BatchGetItem`
        - Returns items from one or more tables in parallel
        - Up to 100 items or 16 MB of data
    - `--page-size`: default 1000 items
        - How many items each underlying API call fetches
        - For example, with page size = 100 and 1000 items, the CLI makes 10 API calls (avoids timeouts)
    - `--max-items`:
        - Max items to show for the current query; returns a `NextToken` for `--starting-token`
    - `--starting-token`:
        - Given a `NextToken`, start querying from there (pagination stuff)
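The `Query` sketch referenced above, using boto3 (table and attribute names are hypothetical):

```python
import boto3
from boto3.dynamodb.conditions import Key, Attr

table = boto3.resource("dynamodb").Table("Orders")  # hypothetical table

resp = table.query(
    # Partition key must use =; the sort key can use ranges
    KeyConditionExpression=Key("userId").eq("u-123") & Key("createdAt").gte("2024-01-01"),
    # Applied after the read: RCUs are consumed for everything the key condition matched
    FilterExpression=Attr("orderStatus").eq("SHIPPED"),
    # Only return these attributes
    ProjectionExpression="orderId, totalAmount",
)
items = resp["Items"]
```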
- Local Secondary Index (LSI)
- additional sort key
- Can have up to 5 LSI
- Must be defined at creation time.
    - Attribute projections:
        - Can get all attributes from the base table
        - Can fetch these attributes even if they're not projected onto the index
- Use WCUs and RCUs of the main table
- Global Secondary Index (GSI)
    - Additional primary key (can be hash, or hash + range)
    - To speed up queries on attributes that are not the primary key
    - Attribute projections:
        - Can also project attributes from the base table
        - However, if an attribute is not projected onto the index, we can't get it from the index
    - Must provision RCUs and WCUs separately for the index (auto-scalable)
- Limitations:
    - Each table can have an infinite number of items (rows)
- Maximum size of item is 400 KB
- Supported type includes: String, number, binary, boolean, null, List, Map, Set
- DynamoDB Accelerator (DAX)
    - Improves reads by using an in-memory cache
- 5 minutes TTL. Up to 10 nodes
- Multi-AZ support
- Comparison to ElastiCache:
- DAX is good for individual object cache
        - ElastiCache is more like caching the result as a whole (e.g. an aggregation)
- Optimistic locking
    - A strategy to make sure an item hasn't changed before you write it; implemented with conditional writes (see the sketch below)
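A minimal optimistic-locking sketch using a conditional write (the table and the `version` attribute are hypothetical):

```python
import boto3
from boto3.dynamodb.conditions import Attr
from botocore.exceptions import ClientError

table = boto3.resource("dynamodb").Table("Orders")  # hypothetical table

try:
    table.update_item(
        Key={"orderId": "o-1"},
        UpdateExpression="SET price = :p, version = version + :one",
        ConditionExpression=Attr("version").eq(3),  # only write if nobody changed it since we read it
        ExpressionAttributeValues={":p": 42, ":one": 1},
    )
except ClientError as e:
    if e.response["Error"]["Code"] == "ConditionalCheckFailedException":
        pass  # someone else won the race: re-read the item and retry
    else:
        raise
```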
- PartiQL
- Query DynamoDB using SQL like syntax
- Support Batch operations
- Partitions
    - Internally, DynamoDB stores data in partitions
    - WCUs and RCUs are spread evenly across partitions
        - So if you have 10 partitions, 10 RCUs and 10 WCUs, each partition gets 1 RCU and 1 WCU
    - Partition strategies
        - If a partition is too hot, we can
            - Shard using a random suffix
            - Shard using a calculated suffix
- Security
- VPC endpoints, IAM, SSL (Secure Sockets Layer), TLS (Transport Layer Security)
- DynamoDB Global tables
- Multi-region, multi-active, fully replicated, high performance
- DynamoDB local
- Test locally without accessing DynamoDB web for local environment
- AWS Database Migration Service
- To migrate other databases into DynamoDB (MongoDB, Oracle, Mysql)
- For users to interact with DynamoDB directly
    - Use Cognito; specify `LeadingKeys` to limit row access and `Attributes` to limit the specific attributes users can see
- DynamoDB stream:
    - Ordered stream of item-level modifications
- retention up to 24 hours
- Only receive updates after you enable stream
- Throttling
    - If we exceed provisioned RCUs and WCUs, we're gonna get `ProvisionedThroughputExceededException`
    - Solutions
        - Exponential backoff
        - Redistribute the partition keys (re-shard)
        - If it's RCUs, can try DAX
    - For a DynamoDB Global Secondary Index: if the index's WCUs are throttled, the main table will be throttled as well
    - For a DynamoDB Local Secondary Index: it uses the same WCUs and RCUs as the main table, so no special throttling considerations
- TTL
    - Time to live per item; items are automatically deleted within 48 hours of expiration
    - Will be deleted from both LSIs and GSIs
    - Expired items that haven't been deleted yet still appear in reads, queries, and scans
    - Doesn't consume WCUs; the delete operation goes into DynamoDB Streams
ECS
- ECR: container image repository for ECS
    - To push/pull from ECR, use the native `docker push` / `docker pull`
- Rolling update
    - Can specify
        - Minimum healthy percent: minimum tasks that must stay healthy (0-100%)
        - Maximum percent: maximum tasks running during the update (100-200%)
- Task placement
    - Strategies
        - BinPack: try to fill one EC2 instance as much as possible
        - Random: randomly placed on EC2 instances
        - Spread: spread across a specific value (availability zone, instance ID, ...)
        - We can mix the types together, for example spread on availability zone and binpack on memory
    - Constraints
        - `distinctInstance`: each task is placed on a different container instance
        - `memberOf`: place tasks on instances that satisfy an expression
- Task Definition:
- JSON form tell ECS how to run a docker container, has image name, port binding, ...
- For EC2 launch type:
- Get dynamic host port mapping if you define only the container port in the task definition
- For Fargate launch type:
- Each task has an unique private IP
Elastic Beanstalk
- Extensions
    - To configure Elastic Beanstalk using code
    - Add stuff inside `.ebextensions/`
    - Files have to be either `yaml` or `json`, and the file name needs to end with `.config`
- Can integrate with HTTPS
    - By loading a cert via `.ebextensions/securelistener-alb.config`
    - Or by setting it up using the Console or ACM (AWS Certificate Manager)
- Can also redirect from HTTP to HTTPS
- Cloning: Allow to clone the exact same Elastic Beanstalk environment. Good for testing
- Components
- Application: collection of elastic beanstalks components (like a folder)
- Application Version: the version of your code
- Environment: Collection of AWS resources running the application
    - Tiers:
        - Web Server Tier
            - ELB with EC2 instances running a web server, managed by an ASG
        - Worker Tier
            - SQS queue and EC2 instances with an ASG
    - Can create multiple environments (dev, test, prod)
- Custom platform:
    - Define your own platform with a custom OS, software, and scripts
    - Define an AMI in a `platform.yaml` file
- Deployment
    - Process
        - Describe dependencies (in `package.json` or `requirements.txt`)
        - Upload the code
            - Through the console: using a zip
            - Through the CLI: just type the command, it will zip for you
        - It will deploy the thing on EC2, resolve dependencies, and start the application
    - Options:
        - Single Instance
        - High Availability with Load Balancer
- Docker Integration
    - Single Docker
        - Does not use ECS
        - Provide a `Dockerfile` or `Dockerrun.aws.json`
    - Multi Docker
        - Uses ECS; will create an ECS cluster
        - Provide a `Dockerrun.aws.json`
        - Your Docker images must be prebuilt and stored in ECR (for example)
- Lifecycle policy
- To remove the old version
    - There are options to not delete the source bundle in S3, to prevent data loss
- Migration
    - When we want to change config that can't be changed in place, we need to do a migration
    - For example, you can't change a Network Load Balancer to an Application Load Balancer. If we want to change it, we have to:
        - Create a new environment with the same configuration except the load balancer
        - Deploy the application in the new environment
        - Swap the URLs (CNAME swap) to point traffic at the new environment
    - For RDS separation (to prevent the database getting deleted if we delete the Elastic Beanstalk stack):
        - Create a snapshot of RDS
        - Go to the RDS console and protect the RDS database from deletion
        - Create a new Beanstalk environment with the same config but without RDS. Point the application to the existing RDS.
        - Delete the old environment
- Update options
    - All at once
    - Rolling
        - Update a few instances at a time
    - Rolling with additional batch
        - Same as rolling, but spins up extra capacity first
    - Immutable
        - Uses another Auto Scaling Group for the update
        - Zero downtime
    - Blue/Green deployment
        - Zero downtime; create a new environment and shift traffic gradually over
    - Traffic splitting
        - Good for canary testing: split traffic between different auto scaling groups
ElastiCache
- Redis Cluster Mode enabled
    - Good for scaling writes
    - Primary nodes are spread across multiple shards
        - So we can write on multiple primary nodes
    - Has multi-AZ
    - 1 primary has 0-5 replica nodes
- Redis Cluster Mode disabled
    - Cheaper
    - 1 primary has 0-5 replica nodes
    - Write on the primary, read on the replicas
    - Has multi-AZ
- Caching strategies (see the sketch at the end of this section)
    - Lazy Loading / Cache-aside / Lazy population
        - Only write to the cache when the requested data isn't in it (cache miss)
        - Cons: cache-miss latency; data can go stale
    - Write-through (usually combined with lazy loading)
        - Update the cache whenever the database is updated
        - Cons:
            - Data is missing from the cache until it's written to the database
                - Can implement lazy loading alongside to avoid this
            - Cache churn: wasted cache space (a lot of it might never be read)
- Cache eviction and Time-to-live (TTL)
    - Remove old cache entries to save space
    - e.g. LRU (least recently used)
    - If too many evictions happen, we might need to increase the cache size
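A minimal sketch of the two caching strategies, with plain dicts standing in for the cache (e.g. a Redis client) and the database:

```python
cache, db = {}, {}  # stand-ins for a Redis client and a real database

def get_user(user_id):
    """Lazy loading / cache-aside: populate the cache only on a miss."""
    if user_id in cache:            # cache hit
        return cache[user_id]
    user = db.get(user_id)          # cache miss: read from the database
    cache[user_id] = user           # populate the cache for next time
    return user

def save_user(user_id, user):
    """Write-through: update the cache whenever the database is updated."""
    db[user_id] = user
    cache[user_id] = user
```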
SQS
- Dead-letter queue:
- messages that are not processed will go to this queue
- Fanout pattern
- Combination of SNS and SQS
- Queue Type
- Standard
- Can have duplicated message but guarantee at least 1 delivery
- can have out of order message
    - FIFO
        - Name needs to end with `.fifo`
        - Guaranteed in-order, exactly-once message delivery
            - Using deduplication
                - Default window is 5 minutes: if you send the same message twice within 5 minutes, the second one will be refused
                - Detects the same message by content-based deduplication or by providing a message deduplication ID
- Access policy
- Cross-account access
- Allow other services to send messages to SQS
- APIs
    - `CreateQueue` (`MessageRetentionPeriod`)
    - `DeleteQueue`: delete the whole queue
    - `PurgeQueue`: delete all messages only
    - `SendMessage` (batch support, `DelaySeconds`)
    - `DeleteMessage` (batch support)
    - `ReceiveMessage` (see the sketch after this list)
        - `MaxNumberOfMessages`: from 1 to 10
        - `ReceiveMessageWaitTimeSeconds`: Long Polling
            - Wait for messages to arrive if none are in the queue
            - Can wait 1-20 seconds (preferably 20 seconds)
    - `ChangeMessageVisibility` (batch support): change the message visibility timeout
    - Note: the Batch APIs can help reduce cost
- Consumer:
- Receive up to 10 messages at a time
- Producer
- Unlimited throughput
- message persists until the consumer deletes
- Delay Queue
    - Delay messages in the queue by up to 15 minutes
    - Default is 0 seconds
    - Can override per message using `DelaySeconds`
- Extended Client
    - If the message is too big (> 256 KB), use this; it stores the payload in S3 and sends a reference message
    - Only for Java
- FIFO Message grouping
    - Only for FIFO
    - Use `Group ID` to group messages
    - Each `Group ID` can only have 1 consumer
    - Ordering across groups is not guaranteed
- Visibility Timeout
    - A message is invisible to other consumers once polled by a consumer
    - Default is 30 seconds
    - If you haven't processed (deleted) the message within the visibility timeout, it goes back to the queue for other consumers to process
SNS
- pub/sub can have many receivers
- Up to 12,500,000 subscriptions per topics
- up to 100,000 topics
- Publish types
    - Topic Publish (SDK)
        - Publish straight to the topic's subscriptions
    - Direct Publish
        - Publish to a platform endpoint (works with 3rd-party mobile push services)
- Access Policies
- For cross-account access
- Control which services write to your SNS
- SNS FIFO
    - The only possible subscribers are SQS FIFO queues
    - Name has to end with `.fifo`
    - Ordering by `Message Group ID`
    - Deduplication using a Deduplication ID or content-based deduplication
- Message filtering
    - Allows subscribers to choose what to receive
    - If a subscriber doesn't specify a filter policy, it receives all messages
Kinesis
- Kinesis Data Stream
    - Real-time streaming of data into your applications
    - Provisioned mode:
        - Select the number of shards; pay per shard per hour
        - Each shard gets 1 MB/s (or 1000 records per second) in and 2 MB/s out
    - On-demand capacity mode:
        - Pay per stream per hour, plus data in/out per GB
        - 4 MB/s in or 4000 records per second
        - Scales automatically
    - Retention:
        - 1-365 days (1 day by default)
        - Ability to replay data
    - To increase consumer throughput, use enhanced fan-out
    - Doesn't work with SNS
- Consumers: subscribe to data
    - AWS Lambda
        - Supports both classic and enhanced fan-out consumers
    - Kinesis Data Analytics
    - Kinesis Data Firehose
    - Kinesis Client Library (KCL)
        - Java library that acts as a Kinesis consumer
        - Each shard is read by at most 1 KCL instance
            - 4 shards = max 4 KCL instances
            - 6 shards = max 6 KCL instances
        - If you need enhanced fan-out consumers, use KCL `v2`
    - Kinesis Custom Consumers
        - Classic fan-out consumer
            - 2 MB/sec per shard shared across all consumers
            - Uses the `GetRecords()` API
        - Enhanced fan-out consumer
            - Uses the `SubscribeToShard()` API
            - 2 MB/sec per shard for each consumer
            - Lower latency but higher cost
- Producers: produce data
    - A record consists of
        - Sequence number: unique per partition key within a shard
        - Partition key: must be specified
            - Hashed to route the record to a shard
            - Can be used to achieve ordering
        - The data blob
    - Producers
        - AWS SDK: simple
        - Kinesis Producer Library (KPL): advanced, with compression and retries
    - 1 MB/sec or 1000 records/sec per shard
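A minimal producer sketch with the AWS SDK (the stream name is hypothetical):

```python
import json
import boto3

kinesis = boto3.client("kinesis")

# The partition key is hashed to pick the shard; records sharing a key
# land on the same shard, which is what gives per-key ordering
kinesis.put_record(
    StreamName="orders-stream",  # hypothetical stream
    PartitionKey="user-123",
    Data=json.dumps({"event": "order_created"}).encode(),
)
```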
- Kinesis Data Firehose
- Similar to DataStream but fully serverless
- Destinations
- S3
- RedShift
- ElasticSearch
    - Near real time
        - 60 seconds latency minimum for non-full batches
        - Or minimum 1 MB of data at a time
- Works with SNS
- Kinesis Data Analytics
- Real time query both Data Stream and Data FireHose using SQL
- Fully managed Serverless
    - Apache Flink:
        - Kinesis Data Analytics for Apache Flink can read from Kinesis Data Streams or Amazon MSK
- Throttle
- Happens when we have more throughput than provisioned
- Retries with exponential backoff
- Reshard / increase shard
- Common operations
    - Shard splitting
        - Divides a "hot shard"; increases stream capacity
        - The old shard is closed and deleted once its data expires
    - Shard merging
        - Decreases stream capacity to save cost
        - The old shards are closed and deleted once their data expires
        - Can't merge more than 2 shards in a single operation
CloudTrail
- Get history of events, api calls within AWS
- Governance, compliance, auditing
- Three types of events
    - Management events (default)
        - Operations related to managing your AWS resources
    - Data events
        - S3 object-level activity
        - AWS Lambda function executions
    - CloudTrail Insights events
- CloudTrail Insights
    - Automatically detects unusual activity using machine learning
    - Analyses normal events to create a baseline
CloudWatch
- CloudWatch Metrics
    - Provides metrics for your AWS services
        - Default: every 5 minutes
        - With detailed monitoring, you can get them every 1 minute
        - If you want finer than 1 minute, you need to define Custom Metrics
    - Note: EC2 memory usage is not pushed by default; if you want it, you have to create a Custom Metric
- Custom metrics:
- Define your own metrics
        - Resolution (you can define how often the metrics get pushed)
            - Standard: 1 minute
            - High resolution: 1/5/10/30 seconds
                - CloudWatch Alarms on high-resolution metrics can only use 10, 30, or 60 second periods
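A minimal custom-metric sketch with boto3 (the namespace and metric name are hypothetical):

```python
import boto3

cw = boto3.client("cloudwatch")

cw.put_metric_data(
    Namespace="MyApp",  # hypothetical namespace
    MetricData=[{
        "MetricName": "MemoryUsedPercent",  # e.g. the EC2 RAM metric CloudWatch lacks by default
        "Value": 72.5,
        "Unit": "Percent",
        "StorageResolution": 1,  # 1 = high-resolution (sub-minute) metric
    }],
)
```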
- CloudWatch Alarms
    - Trigger notifications based on metrics
    - Can specify an action for each alarm state
    - Test an alarm in the CLI using `set-alarm-state`
- CloudWatch Events
    - Intercepts events from AWS services
    - Creates a JSON payload and sends it to the target
    - Can schedule cron jobs
- CloudWatch Logs
    - Log groups:
        - A name to represent the application
    - Log streams:
        - The log files
    - Logs can have an expiration policy
        - But they never expire by default
    - Can have encryption at the log group level
        - Using AWS KMS
        - If you wanna use a CMK, you need to specify it in the CLI:
            - `associate-kms-key` if the log group exists
            - `create-log-group` if it doesn't exist yet
    - Can export the logs to S3, but it's not real time; if we want real time, we can use CloudWatch Logs Subscriptions
    - CloudWatch Logs Subscription
        - A filter applied on top of CloudWatch Logs
        - Good for multi-account logs
    - CloudWatch Logs Agent
        - By default, no logs come from EC2; we need the CloudWatch Logs agent on EC2 to get logs
        - Can only monitor CPU, disk, and high-level network
        - If you need RAM and more detailed measurements, you need the CloudWatch Unified Agent
    - CloudWatch Unified Agent
        - Updated version of the CloudWatch Logs Agent
        - Collects RAM, processes, netstat
        - Integrates with SSM Parameter Store
    - CloudWatch Logs Insights
        - To query CloudWatch Logs
EventBridge
- Newer CloudWatch Events
- Default Event Bus:
- generated by AWS (similar to CloudWatch Events)
- Partner Event Bus:
- Receive event from third party
- Custom Event Bus:
- Your own thingy
- Ability to replay archived events
- Resource-based policy management access
X-Ray
- Debug, visualise analysis of the applications
- Troubleshoot bottle necks
- X-Ray Sampling:
- Control the amount of data to send to X-Ray
    - Default rule:
        - Reservoir: 1
            - At least 1 request per second is sent to X-Ray, as long as the service is serving requests
        - Rate: 5%
            - 5% of additional requests beyond the reservoir are sampled as well
- Integration
    - ECS, a few options:
        - Run an X-Ray daemon container on each EC2 instance
        - X-Ray as a sidecar:
            - X-Ray daemon container alongside each application container (1 EC2 instance can have multiple containers)
            - If running Fargate, we have to use the sidecar because we don't have access to the underlying EC2
    - Elastic Beanstalk
        - Enable in `.ebextensions/xray-daemon.config`
        - Not provided for Multicontainer Docker
- APIs
    - `GetSamplingRules`: retrieve all sampling rules
    - For writing exclusively:
        - `PutTraceSegments`: upload segments to X-Ray
        - `PutTelemetryRecords`: used by the daemon
    - For reading exclusively:
        - `GetTraceGraph`
        - `GetServiceGraph`
        - `GetTraceSummaries`
ACM
- Let you provision and manage SSL (Secure Sockets Layer) certs
- Free of charge for public TLS cert
CloudMap
- Create a map of backend services for visualisation
- Register application locations and health status check
Datasync
- Synchronise data in file storage system (not database)
- From on-premises to cloud (need DataSync agent)
- From one AWS to AWS (no need agent)
AWS FIS (Fault Injection Simulator)
- Run fault injection experiments (chaos engineering)
AWS SES (Simple Email Service)
- Send email
Exponential Backoff
- Keep retrying; each retry waits exponentially longer (double the previous wait)
- Should only retry on 5xx (server) errors and 429 (throttling)
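A minimal sketch of the mechanics (the error type is a hypothetical stand-in for a 429/5xx response):

```python
import random
import time

class ThrottlingError(Exception):
    """Hypothetical stand-in for a 429 or 5xx error from an AWS call."""

def call_with_backoff(fn, max_attempts=5):
    for attempt in range(max_attempts):
        try:
            return fn()
        except ThrottlingError:
            if attempt == max_attempts - 1:
                raise
            # 1s, 2s, 4s, 8s... plus jitter so retries don't stampede together
            time.sleep(2 ** attempt + random.random())
```

Note that the AWS SDKs already implement retries with exponential backoff for you; the sketch just shows the idea.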
S3
- Athena
- Serverless Query on S3 Object
    - Supports compressed or columnar formats for cost savings
- Bucket Key:
- Optimisation to reduce number of calls to AWS KMS for encryption
- 4 encryption methods:
    - SSE-S3: encryption using keys handled by AWS
    - SSE-KMS: use a KMS-managed key to encrypt objects
        - Can be optimised using the S3 Bucket Key
    - SSE-C: customer-provided key
        - The only one that mandates you to use HTTPS
        - You use your own key, but AWS does the encryption
    - Client-side encryption
        - You encrypt it yourself
- You can force SSL with an explicit `DENY` if:
"Condition": {
"Bool": {
"aws:SecureTransport": "false"
}
}
- Force KMS encryption with a `DENY` if:
"Condition": {
"StringNotEquals": {
"s3:x-amz-server-side-encryption": "aws:kms"
}
}
- Strong consistency: as of 2020, S3 reads after writes are strongly consistent
NACL (Network ACL - Access Control List)
- Network firewall that controls traffic to and from subnets
- Can have `ALLOW` and `DENY` rules
- Works at the subnet level
- Stateless: have to specify both inbound and outbound rules
Security group
- Work at instance level
- Can only have `ALLOW` rules
- Stateful: return traffic is automatically allowed
VPC Flow Logs
- Capture information about
- VPC
- Subnet
- Elastic Network Interface
VPC Peering
- Connect 2 VPC together
- Not transitive: have to connect A to B, B to C, and A to C separately
VPC Endpoints
- Allow AWS services to talk to eachother via private network
- Endpoint Gateway: S3 and DynamoDB
- Endpoint Interface: others
Site-to-Site VPN
- On-premise VPN to AWS
- Does not allow you to connect to VPC endpoint
Direct Connect (DX)
- Physical connection
- Does not allow you to connect to VPC endpoint
Subnet
- is within a VPC
- Public: accessible from internet
- Private: not accessible from internet
Internet Gateway
- Provide Internet to a VPC
NAT Gateway or NAT Instance
- NAT Gateway is serverless
- NAT Instance is created from an AMI on EC2
- Provide connection to internet gateway for instances in private subnet
Data Limitations
- Read/write throughput (DynamoDB)
    - RCU: 1 read = 4 KB
        - Eventually consistent reads take 1/2 RCU
    - WCU: 1 write = 1 KB
        - Transactional writes take 2 WCUs
- Kinesis Data Stream
    - Provisioned:
        - Write = 1 MB/s or 1000 records/s per shard
            - Shared across all producers
        - Read = 2 MB/s per shard
            - Classic consumer: 2 MB/s shared across all consumers
            - Enhanced consumer: 2 MB/s for each consumer
    - On-demand capacity mode:
        - Write = 4 MB/s or 4000 records/s
        - Read = 8 MB/s
- RCU:
- Key rotation:
- KMS:
- AWS Managed: rotation every 1 year
- CMK:
- created: must enable auto rotation every 1 year
- imported: manually
- SSM parameter store: No rotations
    - Secrets Manager: can force rotation every X days
- KMS:
- GP2: 3 IOPS per GB provisioned
    - For 16,000 IOPS you need 16,000 / 3 ≈ 5,334 GB
- IO2: 50 IOPS per GB provisioned
    - For 100,000 IOPS you need 100,000 / 50 = 2,000 GB
SAA Stuff
- Placement Groups
    - Ways to control EC2 placement
- Strategies
- Cluster
            - All EC2 instances go into a single rack within an AZ. If the rack fails, you die
- Good for application that needs extremely low latency
- Spread
- Spread out, 1 rack has 1 instance.
- Maximum 7 instances per AZ
- Good for critical application
        - Partition
            - Spreads out across partitions (racks); 1 rack can have 1+ instances
            - Maximum 7 partitions per AZ
            - Good for Hadoop, Kafka, ...
- Cluster
- Hibernation
    - Stop: data on disk (EBS) is kept
    - Terminate: data on disk (EBS) is lost
    - Hibernate
        - Machine state (RAM) is written to the root EBS volume
        - Root EBS must be encrypted
        - Cannot hibernate for more than 60 days
        - Instance RAM must be smaller than 150 GB
        - Root volume must be EBS
- EBS
- encryption is regional
- the volume itself is locked to an az
- HDD cannot be boot volume
- Geoproximity Routing: location-based routing but with a bias (1 to 99 to expand traffic, -1 to -99 to shrink it)
- Minimum days before transitioning from S3 Standard to S3 Standard-IA or S3 One Zone-IA: 30 days
- Minimum storage charge duration for S3 Standard-IA and S3 One Zone-IA: 30 days
- ETL -> Glue
- Amazon Rekognition: find objects, people, and text in images and videos; facial analysis
- Amazon Transcribe: speech to text
- Amazon Polly: text to speech
- Amazon Translate: translate, duh
- Amazon Lex: same technology as Alexa; converts speech to text and understands natural language. To build chatbots and stuff ‒ like Dialogflow
- Amazon Connect: receive calls, create contact flows; cloud-based virtual contact center. 80% cheaper than traditional contact center solutions
- Amazon Comprehend: Natural Language Processing. Analyse customer interactions (emails)
- Amazon SageMaker: managed Jupyter notebooks to build and train ML models
- Amazon Kendra: document search service (text, PDF, HTML, PowerPoint, Word, ...) to extract answers from within documents
- Amazon Personalize: real-time personalised recommendations
- Amazon Textract: automatically extract text and stuff from documents
- Site-site VPN
- Customer side: Customer Gateway
- VPC side: Virtual Private Gateway
- AWS Transit Gateway connects your Amazon Virtual Private Clouds (VPCs) and on-premises networks through a central hub.