Cache Considerations
- Only worthwhile for read-heavy workloads; writes still have to reach the data store.
- Expiration: decide how long entries stay valid and when they expire; stale data is served until then.
- Eviction policy: if the cache is full and a new entry needs to be added, which entry gets evicted (e.g. LRU, LFU, FIFO)?
- Consistency: how do the data store and the cache stay in sync? Modifications to both won't happen in a single transaction, so plan for windows where they disagree.
- This is particularly challenging when scaling across multiple regions.
- TODO: see the "Scaling Memcache at Facebook" paper
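The expiration and eviction points above can be sketched together. This is a minimal in-process toy, assuming a per-entry TTL and LRU eviction; the class name and parameters are illustrative, not a real library API:

```python
import time
from collections import OrderedDict

class TTLCache:
    """Toy cache: per-entry TTL (expiration) plus LRU eviction when full."""

    def __init__(self, capacity, ttl_seconds):
        self.capacity = capacity
        self.ttl = ttl_seconds
        self.entries = OrderedDict()  # key -> (value, expires_at)

    def get(self, key):
        item = self.entries.get(key)
        if item is None:
            return None
        value, expires_at = item
        if time.time() >= expires_at:   # expired: treat as a miss
            del self.entries[key]
            return None
        self.entries.move_to_end(key)   # mark as recently used
        return value

    def put(self, key, value):
        if key in self.entries:
            self.entries.move_to_end(key)
        elif len(self.entries) >= self.capacity:
            self.entries.popitem(last=False)  # evict least recently used
        self.entries[key] = (value, time.time() + self.ttl)

cache = TTLCache(capacity=2, ttl_seconds=60)
cache.put("a", 1)
cache.put("b", 2)
cache.get("a")     # touch "a", so "b" becomes least recently used
cache.put("c", 3)  # cache is full: "b" is evicted
```

A real cache (Redis, Memcached) handles both concerns server-side; the sketch just shows where the two policies hook in.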
CDN Considerations
- Cost: storing infrequently accessed assets in a CDN doesn't pay off; serve those from origin.
- Expiry: same as caching, set a proper TTL so updated assets get re-fetched. Rapidly changing assets should have a shorter TTL.
- Invalidation: we can invalidate entries in the CDN before their TTL expires by
    - vendor-specific purge APIs
    - changing the request URL through versioning, so the CDN has no cached copy under the new URL
- Handle CDN failure: clients should fall back to fetching assets from origin if the CDN is unavailable.
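The URL-versioning form of invalidation can be sketched by embedding a content hash in the asset name. The host and asset below are hypothetical; the point is only that changed content yields a new URL, so the CDN's copy of the old URL is never hit again:

```python
import hashlib

def versioned_url(cdn_host, path, content):
    """Build an asset URL containing a hash of the asset's bytes.
    A changed file hashes differently, producing a fresh URL that the
    CDN has never cached, which sidesteps explicit purging."""
    digest = hashlib.sha256(content).hexdigest()[:8]
    name, _, ext = path.rpartition(".")
    return f"https://{cdn_host}/{name}.{digest}.{ext}"

# Hypothetical asset: a new build produces new bytes, hence a new URL.
url_v1 = versioned_url("cdn.example.com", "css/site.css", b"body { color: red }")
url_v2 = versioned_url("cdn.example.com", "css/site.css", b"body { color: blue }")
```

Build tools typically do this automatically (fingerprinted filenames); the HTML referencing the asset is what gets updated on deploy.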
Stateless Web Tier
A stateful web tier can be worked around with sticky sessions at the load balancer, but that hampers failover and autoscaling. Moving session state out of the web servers into a shared store keeps the tier stateless, so any server can handle any request.
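A minimal sketch of the stateless approach, with a plain dict standing in for a shared session store such as Redis (assumption: the real store would be networked, with TTLs on sessions; the function names are illustrative):

```python
import uuid

# Stand-in for a shared session store (e.g. Redis); in the real system
# this lives outside the web servers, so none of them hold state.
session_store = {}

def handle_login(user_id):
    """Runs on any web server: the session it creates lives in the store."""
    session_id = str(uuid.uuid4())
    session_store[session_id] = {"user_id": user_id}
    return session_id  # returned to the client, e.g. in a cookie

def handle_request(session_id):
    """Can run on a *different* server: it just looks the session up,
    so no sticky routing at the load balancer is needed."""
    session = session_store.get(session_id)
    return session["user_id"] if session else None

sid = handle_login(user_id=42)
```

Since servers hold no session data, one can be removed or added without logging users out.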
Message Queue
Decouple producers and consumers so that producers can continue to work if consumers are slow/down and vice versa.
Consumers can also be easily scaled up if the queue starts to fill up, transparently to the producer.
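The decoupling can be sketched with Python's in-process `queue.Queue` standing in for a real broker (RabbitMQ, Kafka, SQS); the sentinel-based shutdown is an assumption of this toy, not a broker feature:

```python
import queue
import threading

q = queue.Queue()  # stands in for a message broker
results = []

def producer():
    # The producer enqueues and moves on; it never waits on the consumer,
    # so a slow or crashed consumer doesn't block it.
    for i in range(5):
        q.put(i)
    q.put(None)  # sentinel: no more messages

def consumer():
    # The consumer drains the queue at its own pace; scaling up means
    # starting more consumers against the same queue.
    while True:
        msg = q.get()
        if msg is None:
            break
        results.append(msg * 10)

t1 = threading.Thread(target=producer)
t2 = threading.Thread(target=consumer)
t1.start(); t2.start()
t1.join(); t2.join()
```

With multiple consumers, per-message ordering guarantees depend on the broker; that's a separate design decision.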
Database Scaling
Vertical Scaling
Vertical scaling has a hard upper bound, and other drawbacks besides: more powerful servers are disproportionately expensive, and one big server is a single point of failure.
Horizontal Scaling
Also called sharding, horizontal scaling distributes data across multiple servers such that the data held by the servers is mutually exclusive. Every shard has the same schema.
A sharding key, one or more columns of the table, determines how data is distributed across servers: a hash function over the sharding key picks the shard a row goes to.
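The hash-based routing can be sketched in a few lines, assuming a single `user_id` column as the sharding key (the function name and shard count are illustrative):

```python
import hashlib

NUM_SHARDS = 4

def shard_for(user_id):
    """Hash the sharding key to pick a shard. A stable hash (md5 here,
    not Python's built-in hash(), which varies per process) keeps a
    given key on the same shard across servers and restarts."""
    digest = hashlib.md5(str(user_id).encode()).hexdigest()
    return int(digest, 16) % NUM_SHARDS

# Every read and write for a given key routes to the same shard.
shard = shard_for(12345)
```

Note the weakness of plain `% NUM_SHARDS` routing: changing the shard count remaps almost every key, which motivates the consistent hashing mentioned below.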
Challenges of sharding:
- Resharding: if a shard gets full, data needs to be redistributed. That can happen because every shard reaches capacity, or because data is unevenly distributed and one shard is overloaded. In the latter case the hash function must change, which with naive modulo hashing remaps most keys; this is commonly solved by consistent hashing.
- Celebrity problem: If a shard has data that is much more frequently accessed, it can get overloaded
- Joins across database shards are hard to perform. A workaround is to denormalize the database so that the queries a service needs can be answered from a single shard.
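A minimal sketch of the consistent hashing mentioned above, with no virtual nodes for brevity (real implementations add many virtual nodes per shard to even out the distribution); shard names are hypothetical:

```python
import bisect
import hashlib

class ConsistentHashRing:
    """Shards and keys hash onto the same ring; a key belongs to the first
    shard clockwise from it. Adding a shard only remaps the keys between
    the new shard and its predecessor, not nearly all keys as with % N."""

    def __init__(self, shards):
        self.ring = sorted((self._hash(s), s) for s in shards)

    @staticmethod
    def _hash(value):
        return int(hashlib.md5(value.encode()).hexdigest(), 16)

    def shard_for(self, key):
        points = [p for p, _ in self.ring]
        i = bisect.bisect(points, self._hash(key)) % len(self.ring)
        return self.ring[i][1]  # wrap around past the last point

ring3 = ConsistentHashRing(["shard-0", "shard-1", "shard-2"])
before = {k: ring3.shard_for(k) for k in ("alice", "bob", "carol", "dave")}
ring4 = ConsistentHashRing(["shard-0", "shard-1", "shard-2", "shard-3"])
after = {k: ring4.shard_for(k) for k in before}
moved = [k for k in before if before[k] != after[k]]  # typically a small subset
```

The same structure is what systems like DynamoDB-style stores and Memcached client libraries use for shard placement.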