What is RPC rate limiting?
TL;DR: RPC rate limiting is a mechanism that restricts how many requests your application can send to a blockchain node within a given time window. When you exceed the limit, subsequent requests are rejected (usually with an HTTP 429 error) until the window resets. Rate limits exist to protect shared infrastructure from abuse and ensure fair access across all users. Understanding and managing rate limits is essential for building reliable blockchain applications that do not break under load.
The Simple Explanation
Imagine a restaurant kitchen that can prepare 100 meals per hour. If 50 customers each order 2 dishes, the kitchen is at capacity. If one customer tries to order 50 dishes, the kitchen has to turn them away so that everyone else can still eat. Rate limiting works the same way. A blockchain node has finite processing capacity, and rate limits ensure that no single application monopolizes that capacity at the expense of everyone else using the same infrastructure.
When your application sends too many RPC requests too quickly, the node responds with an error instead of the data you requested. On most providers, this is an HTTP 429 "Too Many Requests" status code, often accompanied by a JSON-RPC error body that tells you how many requests are allowed, how many you attempted, and how long to wait before trying again. If your application ignores these signals and keeps sending requests, the provider may temporarily block your IP or API key entirely.
Rate limits are typically expressed as requests per second (RPS), requests per minute, or a combination of both. A provider might allow 25 RPS on a free tier, 300 RPS on a growth tier, and 1,000+ RPS on an enterprise tier. Some providers also apply per-method limits, meaning certain computationally expensive methods like debug_traceTransaction or eth_getLogs with large block ranges have lower individual limits than lightweight methods like eth_blockNumber.
Why Rate Limits Exist
Rate limits serve three primary purposes. First, they protect infrastructure stability. Blockchain nodes are stateful systems that maintain the entire blockchain's data and execute complex operations like EVM calls and state lookups. An unconstrained flood of requests can overwhelm a node's CPU, memory, disk I/O, or network bandwidth, degrading performance for all users or crashing the node entirely. Rate limits act as a circuit breaker that prevents any single client from causing cascading failures.
Second, rate limits ensure fair resource allocation. On shared infrastructure (which is what most developers use), hundreds or thousands of applications share the same pool of nodes. Without rate limits, a single application running an aggressive indexing script or a misconfigured polling loop could consume the majority of available capacity, starving other applications of the resources they need. Rate limits enforce a fair-use policy that guarantees every application gets a reasonable share.
Third, rate limits are a defense against abuse. Denial-of-service attacks, whether intentional or accidental, can take down infrastructure that many applications depend on. Rate limits prevent malicious actors from weaponizing public endpoints, and they stop well-meaning developers from accidentally overwhelming a node with a runaway script.
How Rate Limits Affect Your Application
If your application is not designed to handle rate limits, it will break in production. The most common failure mode is a polling loop that queries the blockchain too frequently. An application that calls "eth_blockNumber" every 100ms to check for new blocks is making 600 requests per minute on that method alone. Add balance checks, log queries, and transaction status polls for every active user, and you can easily blow through even generous rate limits.
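The arithmetic behind that polling example is easy to reproduce. The sketch below is illustrative only: the function name and the second workload (200 users polling 3 methods every 2 seconds) are assumptions chosen for the example, not figures from any provider.

```python
def requests_per_minute(poll_interval_ms: float,
                        methods_per_poll: int = 1,
                        active_users: int = 1) -> float:
    """Requests generated per minute by a fixed-interval polling loop."""
    polls_per_minute = 60_000 / poll_interval_ms
    return polls_per_minute * methods_per_poll * active_users


# The eth_blockNumber example from the text: one call every 100 ms.
block_polls = requests_per_minute(100)  # 600 requests per minute

# Hypothetical workload: 200 active users, each polling 3 methods every 2 s.
user_polls = requests_per_minute(2_000, methods_per_poll=3, active_users=200)

# Convert the combined load to requests per second to compare with a plan limit.
total_rps = (block_polls + user_polls) / 60
print(f"{total_rps:.0f} requests per second")
```

Under these assumed numbers the combined load is 310 requests per second, which would already exceed a 300 RPS plan before any retries are counted.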
WebSocket subscriptions can also trigger rate limits. While the initial subscription request counts as a single call, each incoming message from the server (like a new block notification or a pending transaction alert) may count toward your usage on some providers. If your application subscribes to high-frequency event streams across multiple contracts, the inbound message volume can accumulate rapidly.
Failed requests due to rate limiting create a ripple effect. When your app does not receive the data it requested, it may retry the request, further increasing load. If retries are not implemented with exponential backoff, you create a feedback loop where rate-limited requests generate more requests, which generate more rate limiting. This pattern can effectively lock your application out of the RPC endpoint until the rate limit window resets.
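One way to break that feedback loop is capped exponential backoff with jitter. The following is a minimal sketch, not a definitive implementation: `RateLimited`, `backoff_delay`, and `call_with_backoff` are hypothetical names, and it assumes your own request function raises `RateLimited` whenever the provider answers with a 429.

```python
import random
import time
from typing import Callable, Optional


class RateLimited(Exception):
    """Raised by the caller's request function on an HTTP 429 (assumed convention)."""

    def __init__(self, retry_after: Optional[float] = None):
        super().__init__("rate limited")
        self.retry_after = retry_after  # seconds suggested by the server, if any


def backoff_delay(attempt: int, base: float = 0.5, cap: float = 30.0) -> float:
    """Full-jitter delay: a random value between 0 and min(cap, base * 2**attempt)."""
    return random.uniform(0.0, min(cap, base * (2 ** attempt)))


def call_with_backoff(send: Callable[[], dict],
                      max_retries: int = 5,
                      sleep: Callable[[float], None] = time.sleep) -> dict:
    """Call send(); on RateLimited, wait and retry at most max_retries times."""
    for attempt in range(max_retries + 1):
        try:
            return send()
        except RateLimited as err:
            if attempt == max_retries:
                raise  # give up instead of retrying forever
            delay = backoff_delay(attempt)
            if err.retry_after is not None:
                # Never retry sooner than the server asked us to.
                delay = max(delay, err.retry_after)
            sleep(delay)
    raise AssertionError("unreachable")
```

The jitter matters when many clients are limited at the same moment: without it, they all retry on the same schedule and hit the endpoint in synchronized waves.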
Strategies for Managing Rate Limits
The most effective strategy is reducing the number of requests your application needs to make. Caching is the foundation of this approach. If your app displays ETH prices, token metadata, or other slowly changing data, cache those results locally and refresh them on a reasonable interval rather than querying the node on every page load or user action. For data that changes every block, subscribe via WebSocket rather than polling via HTTP, since a single subscription replaces thousands of individual polling requests.
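As a rough illustration of the caching idea, here is a small time-to-live cache. It is a sketch under stated assumptions: `TTLCache` and `get_or_fetch` are hypothetical names, the `fetch` callable stands in for whatever RPC call your application makes, and the cache is in-memory and not thread-safe.

```python
import time
from typing import Any, Callable, Dict, Hashable, Tuple


class TTLCache:
    """In-memory cache whose entries expire after ttl_seconds."""

    def __init__(self, ttl_seconds: float,
                 clock: Callable[[], float] = time.monotonic):
        self.ttl = ttl_seconds
        self.clock = clock  # injectable so expiry can be tested without sleeping
        self._store: Dict[Hashable, Tuple[float, Any]] = {}

    def get_or_fetch(self, key: Hashable, fetch: Callable[[], Any]) -> Any:
        """Return the cached value for key, calling fetch() only if it is stale."""
        now = self.clock()
        entry = self._store.get(key)
        if entry is not None and now - entry[0] < self.ttl:
            return entry[1]  # still fresh: no RPC request is made
        value = fetch()
        self._store[key] = (now, value)
        return value
```

With a 30-second TTL, a page that renders token metadata a hundred times per minute would trigger roughly two upstream requests per minute for that key instead of a hundred.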
Batching requests reduces HTTP overhead and can help you stay within rate limits. Instead of making 10 separate calls for 10 different wallet balances, batch them into a single JSON-RPC array request. The node processes each individually, but you only use one HTTP request to deliver them all. Be aware that some providers count each method call within a batch against your rate limit, not just the HTTP request itself.
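A batch is simply a JSON array of ordinary JSON-RPC request objects. The sketch below builds one for `eth_getBalance` and matches responses back to addresses by `id`, since the JSON-RPC 2.0 specification allows batch responses to arrive in any order. The helper names are hypothetical, and actually sending the payload to your endpoint is left out.

```python
import json
from typing import Dict, List


def build_balance_batch(addresses: List[str], block: str = "latest") -> List[dict]:
    """Build one JSON-RPC batch containing an eth_getBalance call per address."""
    return [
        {"jsonrpc": "2.0", "id": i, "method": "eth_getBalance", "params": [addr, block]}
        for i, addr in enumerate(addresses)
    ]


def match_responses(addresses: List[str], responses: List[dict]) -> Dict[str, int]:
    """Map each address to its balance in wei, matching responses by id."""
    by_id = {r["id"]: r for r in responses}
    balances: Dict[str, int] = {}
    for i, addr in enumerate(addresses):
        response = by_id.get(i)
        if response is not None and "result" in response:
            balances[addr] = int(response["result"], 16)  # hex quantity to int
    return balances


# Placeholder addresses for illustration only.
addresses = ["0x" + "11" * 20, "0x" + "22" * 20]
payload = json.dumps(build_balance_batch(addresses))  # body of a single HTTP POST
```

Entries that come back with an `error` instead of a `result` are skipped here; a real client would likely want to surface or retry them individually.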
Implementing client-side rate limiting is a defensive best practice. Rather than waiting for the server to reject your requests, track your outgoing request rate and queue or delay requests that would exceed your known limit. This keeps your application within bounds proactively and avoids the latency penalty of rejected requests. Libraries like bottleneck (Node.js) or ratelimit (Python) make this straightforward to implement.
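If you prefer not to add a dependency, a token bucket is one common way to implement client-side limiting. This is a minimal, single-threaded sketch with hypothetical names; it is not how bottleneck or ratelimit work internally, and the rate and burst values are whatever your plan allows.

```python
import time
from typing import Callable


class TokenBucket:
    """Allows up to `burst` requests at once, refilling at `rate_per_second`."""

    def __init__(self, rate_per_second: float, burst: int,
                 clock: Callable[[], float] = time.monotonic):
        self.rate = rate_per_second
        self.capacity = burst
        self.clock = clock
        self.tokens = float(burst)  # start with a full bucket
        self.updated = clock()

    def _refill(self) -> None:
        # Add tokens for the time elapsed since the last update, up to capacity.
        now = self.clock()
        self.tokens = min(self.capacity, self.tokens + (now - self.updated) * self.rate)
        self.updated = now

    def try_acquire(self) -> bool:
        """Take one token if available; return False if the caller should wait."""
        self._refill()
        if self.tokens >= 1:
            self.tokens -= 1
            return True
        return False

    def wait_time(self) -> float:
        """Seconds until the next token becomes available (0.0 if one is ready)."""
        self._refill()
        return max(0.0, (1 - self.tokens) / self.rate)
```

A caller would check `try_acquire()` before each request and, when it returns `False`, sleep for `wait_time()` or push the request onto a queue. Setting the rate slightly below the provider's published limit leaves headroom for clock drift and other processes sharing the same key.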
Choosing the right infrastructure tier for your usage is ultimately the most reliable solution. If your application still hits rate limits consistently after you have applied caching, batching, and streaming, it needs more capacity, not more optimization tricks. Upgrading to a higher tier raises the ceiling, and moving to dedicated infrastructure can remove shared rate limits as a concern altogether.
What does an HTTP 429 error mean?
An HTTP 429 "Too Many Requests" response means you have exceeded your allowed request rate for the current time window. The node is not broken and your credentials are usually fine; you are simply sending requests faster than your plan permits. The fix is to read the response, which often includes how long to wait, then retry after that delay using exponential backoff so repeated failures do not pile on more load. Persistent 429s mean you should reduce request volume or move to a higher-capacity tier rather than retrying harder. Understanding how RPC requests work and how an RPC endpoint processes calls makes it easier to see why some methods hit limits sooner than others.
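Reading the wait hint can look like the sketch below, which assumes the provider sends the standard HTTP `Retry-After` header. Not every provider does, and some put the hint in the JSON-RPC error body instead, in a provider-specific shape that is not covered here. The function name is hypothetical.

```python
from datetime import datetime, timezone
from email.utils import parsedate_to_datetime
from typing import Optional


def retry_after_seconds(header_value: Optional[str],
                        now: Optional[datetime] = None) -> Optional[float]:
    """Parse a Retry-After header value into seconds, or None if absent or invalid.

    The header may carry either a number of seconds or an HTTP date.
    """
    if not header_value:
        return None
    value = header_value.strip()
    if value.isdigit():
        return float(value)
    try:
        when = parsedate_to_datetime(value)
    except (TypeError, ValueError):
        return None  # unparseable: fall back to your own backoff schedule
    if when.tzinfo is None:
        when = when.replace(tzinfo=timezone.utc)
    now = now or datetime.now(timezone.utc)
    return max(0.0, (when - now).total_seconds())
```

When the function returns `None`, the client has no server hint and should rely on its own backoff delays.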
How can you avoid hitting RPC rate limits?
Avoiding rate limits is mostly about sending fewer, cheaper requests and spreading them out. The table below summarizes the highest-impact techniques and when each one fits best.
| Technique | What it does | Best for |
| --- | --- | --- |
| Caching | Reuses results for slowly changing data | Prices, metadata, balances |
| Streaming over polling | Replaces repeated polls with pushed data | Per-block updates and indexing |
| Batching | Combines many calls into one request | Bulk reads like many balances |
| Client-side rate limiting | Queues requests to stay under the cap | Bursty or user-driven traffic |
| Higher tier or dedicated infra | Raises or removes the limit | Sustained high request volume |
Switching from polling to streaming is often the single biggest reduction in request count. When optimization is no longer enough, the decision becomes a build versus buy question about how much capacity and reliability you want to own.
What is the difference between rate limiting and throttling?
The terms are often used interchangeably, but they describe different responses to excess load. Rate limiting enforces a hard ceiling: once you cross it, requests are rejected outright, typically with a 429. Throttling is softer: instead of rejecting requests, the system slows them down or queues them so they still complete, just more slowly. Both protect shared infrastructure, but they fail in different ways from the client's perspective.
| Aspect | Rate limiting | Throttling |
| --- | --- | --- |
| Response to excess | Rejects requests | Slows or queues requests |
| Typical signal | HTTP 429 error | Increased latency |
| Client impact | Failed calls to retry | Delayed but completed calls |
| Goal | Enforce a hard usage cap | Smooth out traffic spikes |
Both can show up as worse RPC latency from the application's point of view, and both are tied to overall throughput and latency tradeoffs in your infrastructure.
Frequently Asked Questions
What is an acceptable RPC rate limit?
It depends on your workload and tier. Free tiers may allow around 25 requests per second, while growth and enterprise tiers reach hundreds or thousands. The right limit is whatever comfortably exceeds your peak request rate after you have applied caching, batching, and streaming.
How should I handle a 429 response?
Back off and retry rather than hammering the endpoint. Use exponential backoff with jitter, respect any retry-after hint in the response, and cap the number of retries. If 429s persist, reduce request volume or upgrade capacity instead of retrying more aggressively.
Do batch requests count as one request?
Not always. The batch travels as a single HTTP request, but many providers count each method call inside the batch against your rate limit. Batching still reduces network overhead, but it does not necessarily reduce the number of metered calls.
Does streaming avoid rate limits?
Largely, yes, for data ingestion. A push-based stream delivers new blockchain data as it is produced instead of requiring thousands of polling calls, which removes most of the request volume that triggers rate limiting in the first place.
Why are some methods rate limited more strictly?
Computationally expensive methods like transaction tracing or large log queries consume far more node resources than lightweight calls such as fetching the latest block number. Providers apply tighter per-method limits to those heavy calls to protect overall node stability.
How Quicknode Handles Rate Limits
Quicknode provides flexible rate limiting controls that put developers in charge of their endpoint's usage. Beyond plan-level RPS limits, Quicknode offers method-level rate limiting that lets you set precise limits on individual RPC methods. This is particularly useful for protecting your endpoint from runaway scripts or third-party integrations that might call expensive methods excessively. You can configure limits per second, per minute, or per day on a per-method basis, either through the dashboard UI or programmatically via the Console API.
Quicknode also provides IP-based rate limiting for controlling total requests from individual IP addresses, which is useful when your endpoint is exposed to client-side applications where you cannot fully control request volume. Real-time analytics on the Quicknode dashboard show your method call volume, response statuses, and response times, giving you full visibility into your usage patterns so you can identify and address issues before they become rate-limiting problems.
For teams that need to move beyond rate limits entirely, Quicknode Streams replaces the request-response pattern with a push-based data delivery model. Instead of your application polling the node thousands of times per minute, Streams pushes blockchain data directly to your webhook, database, or data warehouse as new blocks are produced. This eliminates RPC rate limits from the equation entirely for data ingestion workloads, while also reducing latency and simplifying your backend architecture.