[Study Notes] Designing Systems that Scale

Jay Kim

4 min readApr 15, 2024

Scalability

Horizontal Scaling

Scale in/out
More servers to maintain (including load balancer)
Practically infinite scalability
Web servers need to be stateless

Vertical Scaling

Scale up/down
Fewer servers to maintain
Limited resource capacity
Single point of failure

Failover Strategies

Cold Standby

Cheap
Needs periodic backup
There is downtime and possible data loss until new instance is provisioned and backup is restored

Warm Standby

Replication is up constantly copying from primary database host
Most databases have their own replication mechanisms built-in
Minimal downtime and data loss

Hot Standby

Web host simultaneously writes to primary and secondary database hosts (no replication)
Read operations can be distributed (kind of a horizontal scaling)

Horizontal Scaling of Databases

Sharding

Shard: a horizontal partition of a database
Each shard might have its own backup host running replication in the background
Possible to route a given request to a given partition (increased scalability and resiliency)

NoSQL

Shaded databases are sometimes called NoSQL.

Characteristics

Works best with simple key/value lookups
Formal schemas may not be needed
Tough to do joins across shards
Examples: MongoDB, DynamoDB, Cassandra, HBase

MongoDB

mongos instances route queries and write operations to shards in a sharded cluster. it provides only interface to a sharded cluster so applications should never connect directly with the shards.
mongos tracks what data is on which shard by caching the metadata from config servers. It uses metadata to route operations from applications and clients to the mongod instances.

Cassandra

Solves single point of failure by having a ring of nodes (shards)
Any one of these nodes can act as the primary interface point
Data needs to be replicated to all nodes (eventual consistency)
Nodes can be added or removed with no downtime

Data Lakes

Another strategy is to put raw data into a distributed storage system (e.g Amazon S3). This is a common approach for “big data” and unstructured data. Another process (e.g. Amazon Glue) creates a schema for that data. Some cloud-based features (e.g. Amazon Athena, Amazon Redshift) can be used to query the data. Partitioning the raw data for best performance may be required. For partitioning, think about how the end-user will be using the data or querying data.

ACID Compliance

Atomicity: Either the entire transaction succeeds or fails
Consistency: All database rules are enforced or the entire transaction is rolled back
Isolation: No transaction is affected by any other transaction that is in progress
Durability: Once a transaction is committed, it stays even if the system crashes immediately after

The CAP Theorem

You can have any two of them but not all three. Understand the requirements about scalability, consistency, and availability before proposing a specific database solution.

Consistency: Every read receives the most recent write or an error (single-master designs favor consistency and partition tolerance)
Availability: Every request receives a non-error response without the guarantee that it contains the most recent write
Partition Tolerance: System continues to operate despite any number of communication breakdowns between nodes

Caching

Discussion points

Expiration policy
Hotspots problem (the “celebrity” problem)
Cold-start problem: How to warm up the cache initially to prvent DB failures?

Eviction Policies

LRU: Least Recently Used
LFU: Least Frequently Used
FIFO: First In First Out

LRU Implementation using Doubly Linked List

Caching Technologies

Memcached

In-memory key/value store
open source

Redis

More Features: snapshots, replication, transactions, pub/sub
Advanced data structures

ElasticCache

Fully-managed Redis or Memcached AWS solution

Others

Ncache, Ehcache, etc

Content Delivery Networks (CDNs)

Geographically distributed fleet of servers
Local hosting of HTML, CSS, JavaScript, Images, etc
Useful for static content or some limited computation
Cost can be expensive quickly depending on the where the edge locations are implemented (Think about what needs to be hosted on CDN)

Resiliency

Secondaries should be spread across multiple availability zones and regions
System needs enough capacity to survive a failure at any reasonable scale (over-provisioning)
Think about balance between budget and availability

Distributed Storage Solutions

Cloud: Amazon S3, Google Cloud Storage, Microsoft Azure
Self-Hosting: Hadoop HDFS

Example of Distributed Storage Solution: HDFS Architecture

Files are broken up into blocks
Rack-aware replication
Writes get replicated across different racks
For high availability, there maybe 3 or more NameNodes and highly available data store for metadata
Clients try to read from the nearest replica