Benefits
- prevent DoS attacks, either intentional or unintentional
- cost: fewer servers needed; also saves on third-party API calls
High Level Design
As cloud microservices have become popular, they typically sit behind a component called an API gateway, which supports rate limiting, SSL termination, serving static content, etc.
Rate Limiting Algorithms
Token bucket algorithm
Tokens are added to the bucket at a fixed refill rate (e.g. every second). The bucket holds at most bucket-capacity tokens; extra tokens overflow and are discarded.
Each request takes one token and then proceeds. If the bucket is empty, the request is dropped.
There is one bucket per rate-limiting granularity. If the rule is something like 5 posts per second per user, each user gets its own bucket; buckets can also be per IP address.
Pros: simple to implement, allows bursts of requests, low memory consumption. Cons: the two parameters (bucket capacity and refill rate) can be hard to tune.
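A minimal sketch of the token bucket in Python (class and parameter names are illustrative, not from any particular library):

```python
import time

class TokenBucket:
    """Token bucket sketch: the two tunables are capacity and refill_rate."""

    def __init__(self, capacity, refill_rate):
        self.capacity = capacity          # max tokens the bucket can hold
        self.refill_rate = refill_rate    # tokens added per second
        self.tokens = capacity            # start full
        self.last_refill = time.monotonic()

    def allow(self):
        now = time.monotonic()
        # Refill proportionally to elapsed time; overflow beyond capacity is discarded.
        self.tokens = min(self.capacity,
                          self.tokens + (now - self.last_refill) * self.refill_rate)
        self.last_refill = now
        if self.tokens >= 1:
            self.tokens -= 1
            return True
        return False
```

Because the bucket starts full, a burst of up to `capacity` requests is allowed immediately, which is the behavior noted in the pros above.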
Leaking Bucket Algorithm
Requests are added to a fixed-size queue (the bucket) and processed at a constant rate. If the queue is full, incoming requests are dropped.
Pros: good when the service needs a stable processing rate; memory efficient.
Cons: a burst of traffic fills the queue with old requests, and newer requests get dropped.
Fixed window counter algo
Time is divided into fixed windows. Each window allows some number of requests. If a request arrives and the quota is finished, it is dropped.
One issue is that a burst of traffic at the boundary between two windows can let through more than the quota. Consider 1-minute windows that each allow n requests: within ±5 seconds of a boundary, up to 2n requests can pass.
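A sketch of the fixed window counter; the test of the boundary problem below uses an explicit `now` so the behavior is deterministic (names are illustrative):

```python
import time

class FixedWindowCounter:
    """One counter per fixed time window; reject once the window's quota is used."""

    def __init__(self, limit, window_seconds):
        self.limit = limit
        self.window = window_seconds
        self.counts = {}  # window index -> request count

    def allow(self, now=None):
        now = time.monotonic() if now is None else now
        key = int(now // self.window)                 # which window this falls in
        self.counts[key] = self.counts.get(key, 0) + 1
        return self.counts[key] <= self.limit
```

Note the boundary weakness: with a limit of n per minute, n requests at 59.9s and another n at 60.1s all succeed, since they land in different windows.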
Sliding window log algo
Store the timestamps of request arrivals in a sorted log. When a request arrives, its timestamp is inserted and outdated timestamps (those outside the window) are removed. If the log size is less than or equal to the allowed count, the request is allowed; otherwise it is rejected.
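The log-based approach can be sketched as follows; per the description above, rejected requests still leave their timestamp in the log (names are illustrative):

```python
from collections import deque

class SlidingWindowLog:
    """Keep a sorted log of timestamps; size of the in-window log is the count."""

    def __init__(self, limit, window_seconds):
        self.limit = limit
        self.window = window_seconds
        self.log = deque()  # timestamps arrive in order, so the deque stays sorted

    def allow(self, now):
        # Evict timestamps that have fallen out of the window.
        while self.log and self.log[0] <= now - self.window:
            self.log.popleft()
        self.log.append(now)
        return len(self.log) <= self.limit
```

This is very accurate (no boundary problem) but memory-hungry: one timestamp per request, even rejected ones.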
Sliding Window Counter algo
Divide time into fixed windows (say 1 minute each). Suppose a request arrives 30% of the way into the current window (i.e. 18 seconds in). Estimate the rolling count as the number of requests in the current window plus 70% of the number of requests in the previous window. If the estimate is below the per-minute limit, the request is allowed to go through.
This smooths out traffic at window boundaries. It assumes requests were evenly distributed within the previous window, but according to studies that approximation has little impact in practice.
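The weighted estimate above can be sketched like this, tracking only two counters per client instead of a full log (names are illustrative):

```python
class SlidingWindowCounter:
    """Estimate the rolling rate from the current and previous windows' counters."""

    def __init__(self, limit, window_seconds):
        self.limit = limit
        self.window = window_seconds
        self.prev_count = 0
        self.curr_count = 0
        self.curr_window = 0

    def allow(self, now):
        window = int(now // self.window)
        if window != self.curr_window:
            # Roll forward; a window older than the immediately previous one counts as 0.
            self.prev_count = self.curr_count if window == self.curr_window + 1 else 0
            self.curr_count = 0
            self.curr_window = window
        elapsed_fraction = (now % self.window) / self.window
        # Weight the previous window by how much of it still overlaps the sliding window.
        estimated = self.prev_count * (1 - elapsed_fraction) + self.curr_count
        if estimated < self.limit:
            self.curr_count += 1
            return True
        return False
```

For example, with 8 requests in the previous 1-minute window and a request arriving 18 seconds (30%) into the current one, the estimate is 8 × 0.7 = 5.6, so a limit of 10 still admits it.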
High Level Design
Counters need to be stored somewhere. An in-memory data store like Redis is used, since disk access in a traditional database is too slow.
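The typical Redis pattern is a per-client, per-window key incremented with INCR and expired with EXPIRE. A runnable sketch of that pattern, using a plain dict to stand in for Redis (the key format and function names are illustrative):

```python
import time

# In-memory dict standing in for Redis; comments show the equivalent Redis commands.
store = {}  # key -> (count, expiry_time)

def allow(client_id, limit, window_seconds, now=None):
    now = time.time() if now is None else now
    # One key per client per window, e.g. "rate:user42:12345".
    key = f"rate:{client_id}:{int(now // window_seconds)}"
    count, expiry = store.get(key, (0, now + window_seconds))  # EXPIRE key window_seconds
    if now >= expiry:
        count = 0
    count += 1                                                 # INCR key
    store[key] = (count, expiry)
    return count <= limit
```

With real Redis, INCR and EXPIRE would run server-side; making the read-increment-check sequence atomic (e.g. via a Lua script) is what avoids race conditions between multiple rate limiter instances.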
Distributed Environment
We might need multiple rate limiter instances behind load balancers to handle the load. Since our compute is stateless and we don't want to use sticky sessions, keeping the counters in a centralized data store solves the issue.
Wrap up
- Hard rate limiting means requests can never exceed the threshold. Soft means they can exceed it for a short period
- can also do rate limiting at IP layer using iptables
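For example, the iptables `limit` match module can cap packet rates at the firewall; this is a config fragment with illustrative values, not something from the source:

```shell
# Accept at most 25 new TCP connections per minute to port 80, with an initial
# burst of 100; packets beyond the limit fall through to the DROP rule.
iptables -A INPUT -p tcp --dport 80 -m limit --limit 25/minute --limit-burst 100 -j ACCEPT
iptables -A INPUT -p tcp --dport 80 -j DROP
```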