The learning AWS journey continues. This week was all about storage. Because there were too many things to share, my timeline was quite busy with multiple tweets per day.
At the beginning I started with EBS (Elastic Block Store) and EFS (Elastic File System) which are about HDD type of storage for EC2. That was interesting because I learned about a few hardware terms that I was not aware of.
The week continued with RDS which I initially thought of skipping as “it was the AWS service i’m the most familiar with” but thankfully ended up doing it and learned a lot about how to scale Databases in AWS. Especially the part about Aurora DB was really exciting.
The week ended with ElastiCache which also revealed some great techniques for managing cache. Let’s go straight to the Tweets.
What is an AWS EBS (Elastic Block Store)
A volume in which you store your EC2 data so that they don’t get lost when the instance is terminated
An EBS is network drive which works like a USB stick on the cloud.
⏳ It has a bit of latency
🔌 It can quickly be attached to other instances
🔒 It is locked to an AZ. To move it across you need to snapshot it
💰 You are billed for the provisioned capacity, not the one you use
There are 4 types of AWS EBS (Elastic Block Store)
- GP2: A general purpose SSD that balances performances and price
- IO1: A high performance SSD for mission critical low latency / high throughput work (good for large DBs) Only these 2 can be boot volumes.
- ST1: Low cost HDD for frequently accessed and throughput intense workloads
- SCI: Lowest cost HDD for less frequently access workloads.
All 4 are characterised in size, throughput and IOPS (I/O Ops Per Second)
Finally for AWS EBS we have the instance store.
It is a physical HDD attached to the instance.
➕ I/O performance, good for cache / temp content, survives reboots
➖ Lost on stop / termination, can’t resize and require manual backups
What is an AWS EFS (Elastic File System)
A managed NFS (Network File System) that can be mounted on multiple EC2 instances
It is multi AZ, highly available and scalable It is pay per use and is good for web serving, data sharing and Wordpress.
Unlike EBS, it has no capacity planning and uses a pay-per-use pricing model which scales automatically.
📈 Can have thousands of concurrent NFS clients and can grow to Petabytes scale.
Let’s compare EBS and EFS
1 instance at a time locked in 1 AZ.
To migrate we need to take snapshots and restore them to another AZ.
When running backups they use a lot of IO and shouldn’t be run when app handles lot of traffic.
Root volumes are lost upon termination
We can mount hundreds of of instances across multiple AZs
We can use them to share website files but is Linux only.
Is ~ 3 times more expensive than EBS. For cost savings can use EFS-IA (cost to retrieve files, lower to store)
What is an AWS RDS (Relation Database Service)
An AWS managed SQL only database which can be one of: Postgres, MySQL, MariaDB, Oracle, MSSQL and Aurora
As it is a managed service, we cannot SSH into our machine but only connect to the remote DB
Why RDS and not deploying a DB on EC2?
Automated provisioning, OS patching. and maintenance windows for upgrades
Continuous backups and can restore to specific timestamps
Option for multi AZ setup for Disaster recovery
Horizontal / vertical scaling
EBS storage (GP2 or IO1)
Let’s talk about backups in AWS RDS
They are automatically enabled
Daily during the maintenance window
Transaction logs every 5 minutes with ability to restore to any point in time up to 5 minutes ago
7 days retention of backups (up to 35)
Now let’s talk about snapshots.
The difference between a backup and a snapshot is that a snapshot is manually triggered by the user unlike the backups which are automatic.
Because they are manual, the retention period is as long as the user wants.
For IAM auth, we don’t need a password, just authenticate with IAM authentication tokens.
That IAM authentication token is short lived and has lifetime of 15 minutes
What are AWS RDS Read Replicas?
They are additional servers which have copies of our DB and are used for read scalability.
RDS supports up to 5 read replicas that can be within AZ, across AZ and even cross region.
Replication is ASYNC so reads are eventually consistent.
Replicas can be promoted to their own DB.
Each replica has its own connection string which an app needs to have to connect to.
A use case of using read replicas.
Your prod DB is taking on normal load and you want to run a reporting app to run some analytics.
You can create a read replica to run the new workload there and the prod DB is not affected.
Now let’s talk about RDS multi AZ DR (Disaster Recovery)
It is a SYNC replication which is when a change happens on main, also needs to happen on the secondary replicas.
There’s 1 DNS name and supports automatic app failover to secondary replicas to increase availability.
The failover might happen in case of loss of AZ, loss of network, or instance / storage failure.
No manual intervention in apps and used for scaling.
Can set read replicas as multi AZ to ensure even higher availability.
Let’s talk about AWS RDS encryption and security.
There are 2 types of encryption.
🛏 At rest encryption (data not in movement)
🛩 In flight encryption
For rest encryption, we can encrypt master & read replicas and is defined at launch time. Master not encrypted = replicas not encrypted
For in-flight encryption, we use SSL certificates to encrypt data to RDS in flight. SSL options will trust certificates when connecting to DB
To encrypt and un-encrypted RDS DB
✅ We create a snapshot
✅ We copy is and enable encryption
✅ We restore the DB from the encrypted snapshot
✅ Then migrate to new DB and delete old one
More about AWS RDS security 🔐
RDS clusters are deployed within private subnets and not public ones.
We can add Security Group rules which uses the same inbound / outbound logic as EC2.
We can also setup IAM policies about who can manage RDS.
To login to the Database we use the traditional username and password way.
For MySQL and Postgres in RDS we can also use IAM based authentication
Today we’re going to talk about AWS Aurora DB.
🥷 Not open sourced
✅ Supports MySQL and Postgres
💪 Cloud optimised with 5x performance over classic RDS
🏔 Automatically grows up to 64TB (starts with 10GB)
📈 Can have up to 15 replicas with replication process of 10ms
🌋 Instantaneous failover being high availability native
💸 20% more expensive than classic RDS
Aurora provides Serverless plans in certain regions which is great for automated DB instantiation and auto-scaling
It’s great for infrequent or unpredictable workloads
Requires no capacity planing and you pay per second which makes it super cost effective
Let’s talk about AWS Aurora Global
Cross Region read replicas that are easy to setup and useful for disaster recovery
The Aurora Global database gives 1 primary region for reads and writes and can have up to 5 secondary read regions with 16 replicas (80 in total)
Today let’s talk about AWS ElastiCache
It is a managed Redis or Memcached service
Caching is a great way to help reduce load off our DB and make our app stateless
Elastic cache has write scaling using sharing and read scaling using Read Replicas
It is multi AZ with auto failover
You can enhance the reads with read replicas for high availability
Even if your cache restarts, you still have access to your data using AOF (Append Only File) persistence
Then you can backup and restore features
Uses multiple nodes for partitioning data (sharding)
Unlike Redis, if the cache restarts, all of the data has been lost
Also there is no option to backup and restore your data
- Lazy loading 🥱 (or cache-aside or lazy population)
When reading from the cache
If data found in the cache, return them (cache hit)
If they aren’t there (cache miss), read from the DB, write them into the cache and then return them
➕ we only cache data that is used if there’s a failure, it’s not fatal as we still have the DB
➖ In case of cache miss, we do 3 round journeys to the DB and cache We might cache data that will not be used again and need to also implement an invalidation strategy
- Write through ✍🏼
That’s when we write to the DB and also to the cache
➕ Data in cache is always up to date
Reads are always fast
No one expects writes to be ultra fast
Makes sense from a UX point of view
➖ In case a page needs to read before a write takes place, there will be no data in the cache
In many cases we rely lazy loading to be there as well
We add too much into the cache and it is very likely that a lot of data will never be read
- Cache evictions and TTL (Time to Live) ⌛
When we explicitly define that cache will be available for x seconds and that it will automatically be deleted
The items that were Least Recently Used (LRU) can be evicted
TTL are good for leaderboards, comments on social media and activity streams
They can last from seconds, hours or even days
If there are too many evictions due to memory, then we need to scale either vertically or horizontally
Uuu boy, that was a lot this week. I really enjoyed it though, especially the RDS and ElastiCache part. It is really good to see that AWS training not only educates you on their services but on architectural / scaling concepts as well. These concepts are pretty much applicable to any cloud service.
Next week we’re looking at Route53, some more stuff about VPCs and S3.