Rate Limiting
A technique that controls the frequency of requests a user or client can make to an API or service within a specified time period.
Definition
Rate limiting restricts request volume to protect APIs from abuse, ensure fair usage, and maintain performance. Limits might allow 100 requests per minute per user, with excess requests rejected with an error response (typically HTTP 429 Too Many Requests).
Implementation strategies include fixed windows (counters that reset at interval boundaries), sliding windows (smoothed limits that avoid boundary spikes), token buckets (steady refill with a burst allowance), and leaky buckets (requests drained at a constant rate). The right approach depends on the use case and user-experience priorities.
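As a concrete illustration, here is a minimal token-bucket sketch in Python. The class name, parameters, and the 100-requests-per-minute figure are illustrative assumptions, not a reference implementation.

```python
import time

class TokenBucket:
    """Minimal token bucket: allows bursts up to `capacity`,
    refilling at `refill_rate` tokens per second."""

    def __init__(self, capacity: int, refill_rate: float):
        self.capacity = capacity          # maximum burst size
        self.refill_rate = refill_rate    # tokens added per second
        self.tokens = float(capacity)     # start full
        self.last_refill = time.monotonic()

    def allow(self) -> bool:
        """Return True if a request may proceed, consuming one token."""
        now = time.monotonic()
        elapsed = now - self.last_refill
        self.last_refill = now
        # Refill based on elapsed time, capped at capacity
        self.tokens = min(self.capacity, self.tokens + elapsed * self.refill_rate)
        if self.tokens >= 1:
            self.tokens -= 1
            return True
        return False

# Roughly 100 requests/minute on average, with bursts of up to 20
bucket = TokenBucket(capacity=20, refill_rate=100 / 60)
if not bucket.allow():
    print("429 Too Many Requests")
```

The token bucket is a common default because it tolerates short bursts while still holding the long-run average to the configured rate.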
Why It Matters
Rate limiting protects against abuse and ensures availability. Without limits, a single user or bot can overwhelm a system, degrading service for everyone.
Well-designed rate limits balance protection with usability. Limits that are too aggressive frustrate legitimate users, while limits that are too permissive fail to protect. A good limit also communicates clearly when it has been exceeded.
Examples in Practice
Twitter's API allows 500 requests per 15-minute window. Exceeding this returns HTTP 429 responses with headers indicating when the limit resets, a standard practice for public APIs.
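A client that receives such responses can back off until the limit resets. The sketch below is a generic pattern, not Twitter-specific: it assumes the standard Retry-After header gives a delay in seconds, falls back to a Unix-timestamp reset header (the exact name, such as x-rate-limit-reset, varies by API), and uses the requests library.

```python
import time
import requests

def get_with_backoff(url: str, max_retries: int = 3) -> requests.Response:
    """Fetch a URL, pausing until the rate limit resets on HTTP 429."""
    for _ in range(max_retries):
        response = requests.get(url)
        if response.status_code != 429:
            return response
        # Prefer the standard Retry-After header (assumed to be seconds);
        # otherwise fall back to a Unix-timestamp reset header.
        retry_after = response.headers.get("Retry-After")
        if retry_after is not None:
            wait = float(retry_after)
        else:
            reset = float(response.headers.get("x-rate-limit-reset", time.time() + 60))
            wait = max(0.0, reset - time.time())
        time.sleep(wait)
    return response
```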
An e-commerce site implements stricter limits on checkout endpoints (10 requests/minute) than browsing (200/minute) to prevent automated purchase attacks while allowing normal shopping.
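One way to express such per-endpoint limits is a simple lookup plus a fixed-window counter keyed by client, endpoint, and minute. The paths, numbers, and function names below mirror the example above and are hypothetical.

```python
import time
from collections import defaultdict

# Per-endpoint limits in requests per minute (illustrative values)
ENDPOINT_LIMITS = {"/checkout": 10, "/browse": 200}
DEFAULT_LIMIT = 60

# Fixed-window counters keyed by (client, endpoint prefix, current minute)
_counters: dict = defaultdict(int)

def allow_request(client_id: str, path: str) -> bool:
    """Apply the stricter checkout limit or the looser browsing limit."""
    prefix = next((p for p in ENDPOINT_LIMITS if path.startswith(p)), None)
    limit = ENDPOINT_LIMITS.get(prefix, DEFAULT_LIMIT)
    window = int(time.time() // 60)           # current one-minute window
    key = (client_id, prefix or "default", window)
    _counters[key] += 1
    return _counters[key] <= limit
```

Keeping the limits in a single table like this makes it easy to tune sensitive endpoints independently of general browsing traffic.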