Question 1

What is RPC rate limiting?

Accepted Answer

TL;DR: RPC rate limiting is a mechanism that restricts how many requests your application can send to a blockchain node within a given time window. When you exceed the limit, subsequent requests are rejected (usually with an HTTP 429 error) until the window resets. Rate limits exist to protect shared infrastructure from abuse and ensure fair access across all users. Understanding and managing rate limits is essential for building reliable blockchain applications that do not break under load.

The Simple Explanation

Imagine a restaurant kitchen that can prepare 100 meals per hour. If 50 customers each order 2 dishes, the kitchen is at capacity. If one customer tries to order 50 dishes, the kitchen has to turn them away so that everyone else can still eat. Rate limiting works the same way. A blockchain node has finite processing capacity, and rate limits ensure that no single application monopolizes that capacity at the expense of everyone else using the same infrastructure.

When your application sends too many RPC requests too quickly, the node responds with an error instead of the data you requested. On most providers, this is an HTTP 429 "Too Many Requests" status code, often accompanied by a JSON-RPC error body that tells you how many requests are allowed, how many you attempted, and how long to wait before trying again. If your application ignores these signals and keeps sending requests, the provider may temporarily block your IP or API key entirely.

Rate limits are typically expressed as requests per second (RPS), requests per minute, or a combination of both. A provider might allow 25 RPS on a free tier, 300 RPS on a growth tier, and 1,000+ RPS on an enterprise tier. Some providers also apply per-method limits, meaning certain computationally expensive methods like debug_traceTransaction or eth_getLogs with large block ranges have lower individual limits than lightweight methods like eth_blockNumber.
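To make the "requests per time window" idea concrete, here is a minimal Python sketch of a fixed-window counter, one of the simplest ways such a limit can be enforced. It is an illustration only: the class name, the 25 RPS figure reused from the free-tier example, and the injectable clock are assumptions for this sketch, and real providers may use different algorithms (sliding windows, token buckets, per-method weights).

```python
import time
from typing import Callable


class FixedWindowLimiter:
    """Hypothetical limiter: allow at most `limit` requests per window, reject the rest."""

    def __init__(self, limit: int, window_seconds: float,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self.limit = limit
        self.window_seconds = window_seconds
        self.clock = clock            # injectable so the sketch can be tested without sleeping
        self.window_start = clock()
        self.count = 0

    def allow(self) -> bool:
        """Return True if the request fits in the current window, False otherwise."""
        now = self.clock()
        if now - self.window_start >= self.window_seconds:
            # The window has elapsed, so the counter resets.
            self.window_start = now
            self.count = 0
        if self.count < self.limit:
            self.count += 1
            return True
        # Over the limit: this is where a provider would answer with HTTP 429.
        return False


if __name__ == "__main__":
    fake_now = [0.0]
    limiter = FixedWindowLimiter(limit=25, window_seconds=1.0, clock=lambda: fake_now[0])
    accepted = sum(limiter.allow() for _ in range(30))
    print(accepted)         # 25 of the 30 requests fit in the window
    fake_now[0] = 1.0       # one second later the window resets
    print(limiter.allow())  # True
```

The same counter can be run on the client side to hold requests back before they are sent, which is the idea behind the client-side rate limiting strategy described later in this answer.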
Why Rate Limits Exist

Rate limits serve three primary purposes.

First, they protect infrastructure stability. Blockchain nodes are stateful systems that maintain the entire blockchain's data and execute complex operations like EVM calls and state lookups. An unconstrained flood of requests can overwhelm a node's CPU, memory, disk I/O, or network bandwidth, degrading performance for all users or crashing the node entirely. Rate limits act as a circuit breaker that prevents any single client from causing cascading failures.

Second, rate limits ensure fair resource allocation. On shared infrastructure (which is what most developers use), hundreds or thousands of applications share the same pool of nodes. Without rate limits, a single application running an aggressive indexing script or a misconfigured polling loop could consume the majority of available capacity, starving other applications of the resources they need. Rate limits enforce a fair-use policy that guarantees every application gets a reasonable share.

Third, rate limits are a defense against abuse. Denial-of-service attacks, whether intentional or accidental, can take down infrastructure that many applications depend on. Rate limits prevent both malicious actors from weaponizing public endpoints and well-meaning developers from accidentally DDoSing a node with a runaway script.

How Rate Limits Affect Your Application

If your application is not designed to handle rate limits, it will break in production. The most common failure mode is a polling loop that queries the blockchain too frequently. An application that calls "eth_blockNumber" every 100ms to check for new blocks is making 600 requests per minute on that method alone. Add balance checks, log queries, and transaction status polls for every active user, and you can easily blow through even generous rate limits.

WebSocket subscriptions can also trigger rate limits.
While the initial subscription request counts as a single call, each incoming message from the server (like a new block notification or a pending transaction alert) may count toward your usage on some providers. If your application subscribes to high-frequency event streams across multiple contracts, the inbound message volume can accumulate rapidly.

Failed requests due to rate limiting create a ripple effect. When your app does not receive the data it requested, it may retry the request, further increasing load. If retries are not implemented with exponential backoff, you create a feedback loop where rate-limited requests generate more requests, which generate more rate limiting. This pattern can effectively lock your application out of the RPC endpoint until the rate limit window resets.

Strategies for Managing Rate Limits

The most effective strategy is reducing the number of requests your application needs to make. Caching is the foundation of this approach. If your app displays ETH prices, token metadata, or other slowly changing data, cache those results locally and refresh them on a reasonable interval rather than querying the node on every page load or user action. For data that changes every block, subscribe via WebSocket rather than polling via HTTP, since a single subscription replaces thousands of individual polling requests.

Batching requests reduces HTTP overhead and can help you stay within rate limits. Instead of making 10 separate calls for 10 different wallet balances, batch them into a single JSON-RPC array request. The node processes each call individually, but you only use one HTTP request to deliver them all. Be aware that some providers count each method call within a batch against your rate limit, not just the HTTP request itself.

Implementing client-side rate limiting is a defensive best practice.
Rather than waiting for the server to reject your requests, track your outgoing request rate and queue or delay requests that would exceed your known limit. This keeps your application within bounds proactively and avoids the latency penalty of rejected requests. Libraries like bottleneck (Node.js) or ratelimit (Python) make this straightforward to implement.

Choosing the right infrastructure tier for your usage is ultimately the most reliable solution. If your application consistently hits rate limits after you have cached, batched, and switched to subscriptions, it needs more capacity, not more optimization tricks. Upgrading to a higher tier raises the ceiling, and moving to dedicated infrastructure can remove shared-plan limits altogether.

What does an HTTP 429 error mean?

An HTTP 429 "Too Many Requests" response means you have exceeded your allowed request rate for the current time window. The node is not broken and your credentials are usually fine; you are simply sending requests faster than your plan permits. The fix is to read the response, which often includes how long to wait, then retry after that delay using exponential backoff so repeated failures do not pile on more load. Persistent 429s mean you should reduce request volume or move to a higher-capacity tier rather than retrying harder. Understanding how RPC requests work and how an RPC endpoint processes calls makes it easier to see why some methods hit limits sooner than others.

How can you avoid hitting RPC rate limits?

Avoiding rate limits is mostly about sending fewer, cheaper requests and spreading them out. The table below summarizes the highest-impact techniques and when each one fits best.
Technique	What it does	Best for
Caching	Reuses results for slowly changing data	Prices, metadata, balances
Streaming over polling	Replaces repeated polls with pushed data	Per-block updates and indexing
Batching	Combines many calls into one request	Bulk reads like many balances
Client-side rate limiting	Queues requests to stay under the cap	Bursty or user-driven traffic
Higher tier or dedicated infra	Raises or removes the limit	Sustained high request volume

Switching from polling to streaming is often the single biggest reduction in request count. When optimization is no longer enough, the decision becomes a build versus buy question about how much capacity and reliability you want to own.

What is the difference between rate limiting and throttling?

The terms are often used interchangeably, but they describe different responses to excess load. Rate limiting enforces a hard ceiling: once you cross it, requests are rejected outright, typically with a 429. Throttling is softer: instead of rejecting requests, the system slows them down or queues them so they still complete, just more slowly. Both protect shared infrastructure, but they fail in different ways from the client's perspective.

Aspect	Rate limiting	Throttling
Response to excess	Rejects requests	Slows or queues requests
Typical signal	HTTP 429 error	Increased latency
Client impact	Failed calls to retry	Delayed but completed calls
Goal	Enforce a hard usage cap	Smooth out traffic spikes

Both can show up as worse RPC latency from the application's point of view, and both are tied to overall throughput and latency tradeoffs in your infrastructure.

Frequently Asked Questions

What is an acceptable RPC rate limit?

It depends on your workload and tier. Free tiers may allow around 25 requests per second, while growth and enterprise tiers reach hundreds or thousands. The right limit is whatever comfortably exceeds your peak request rate after you have applied caching, batching, and streaming.

How should I handle a 429 response?
Back off and retry rather than hammering the endpoint. Use exponential backoff with jitter, respect any retry-after hint in the response, and cap the number of retries. If 429s persist, reduce request volume or upgrade capacity instead of retrying more aggressively.

Do batch requests count as one request?

Not always. The batch travels as a single HTTP request, but many providers count each method call inside the batch against your rate limit. Batching still reduces network overhead, but it does not necessarily reduce the number of metered calls.

Does streaming avoid rate limits?

Largely, yes, for data ingestion. A push-based stream delivers new blockchain data as it is produced instead of requiring thousands of polling calls, which removes most of the request volume that triggers rate limiting in the first place.

Why are some methods rate limited more strictly?

Computationally expensive methods like transaction tracing or large log queries consume far more node resources than lightweight calls such as fetching the latest block number. Providers apply tighter per-method limits to those heavy calls to protect overall node stability.

How Quicknode Handles Rate Limits

Quicknode provides flexible rate limiting controls that put developers in charge of their endpoint's usage. Beyond plan-level RPS limits, Quicknode offers method-level rate limiting that lets you set precise limits on individual RPC methods. This is particularly useful for protecting your endpoint from runaway scripts or third-party integrations that might call expensive methods excessively. You can configure limits per second, per minute, or per day on a per-method basis, either through the dashboard UI or programmatically via the Console API.

Quicknode also provides IP-based rate limiting for controlling total requests from individual IP addresses, which is useful when your endpoint is exposed to client-side applications where you cannot fully control request volume.
Real-time analytics on the Quicknode dashboard show your method call volume, response statuses, and response times, giving you full visibility into your usage patterns so you can identify and address issues before they become rate-limiting problems.

For teams that need to move beyond rate limits entirely, Quicknode Streams replaces the request-response pattern with a push-based data delivery model. Instead of your application polling the node thousands of times per minute, Streams pushes blockchain data directly to your webhook, database, or data warehouse as new blocks are produced. This eliminates RPC rate limits from the equation entirely for data ingestion workloads, while also reducing latency and simplifying your backend architecture.

Further Reading

How to Set Up Method Rate Limits - Quicknode Guide
Guide to Efficient RPC Requests - Quicknode
Getting Started with Streams - Quicknode Docs
Quicknode Core API
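The 429-handling advice in this answer (back off with jitter, respect any retry-after hint, cap the number of retries) can be sketched in Python roughly as follows. This is a sketch under stated assumptions, not any provider's client: `send` is a hypothetical transport callable that returns a `(status, retry_after_seconds, body)` tuple, and the base delay, cap, and retry budget are placeholder values to tune for your plan.

```python
import random
import time
from typing import Callable, Optional, Tuple

# Hypothetical transport result: (http_status, retry_after_seconds_or_None, body).
Response = Tuple[int, Optional[float], object]


class RateLimitedError(Exception):
    """Raised when the retry budget is spent and the endpoint still returns 429."""


def backoff_delay(attempt: int, base: float = 0.5, cap: float = 30.0,
                  rng: Callable[[], float] = random.random) -> float:
    """Exponential backoff with full jitter: rng() scaled by min(cap, base * 2**attempt)."""
    return rng() * min(cap, base * (2 ** attempt))


def call_with_backoff(send: Callable[[], Response], max_retries: int = 5,
                      sleep: Callable[[float], None] = time.sleep,
                      rng: Callable[[], float] = random.random) -> object:
    """Call `send`, retrying only on HTTP 429, at most `max_retries` times."""
    for attempt in range(max_retries + 1):
        status, retry_after, body = send()
        if status != 429:
            return body  # other statuses are returned to the caller unchanged
        if attempt == max_retries:
            break        # retry budget exhausted, stop instead of hammering the endpoint
        # Prefer the server's hint when it gives one, otherwise use jittered backoff.
        delay = retry_after if retry_after is not None else backoff_delay(attempt, rng=rng)
        sleep(delay)
    raise RateLimitedError(f"still rate limited after {max_retries} retries")
```

Capping retries matters as much as the delay itself: once the budget is spent, the caller gets an explicit error and can reduce request volume rather than adding to the feedback loop described under "How Rate Limits Affect Your Application".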

Question 2

How do RPC requests work?

Accepted Answer

An RPC (Remote Procedure Call) request is how your application communicates with a blockchain node. Your app sends a JSON-formatted message to an RPC endpoint specifying which method to call and what parameters to include. The node processes the request, executes the corresponding logic against the blockchain's state, and returns a JSON response with the result. Every wallet balance check, transaction submission, and smart contract interaction follows this request-response pattern.
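As a rough illustration of that request-response pattern, the Python sketch below builds JSON-RPC 2.0 messages and unpacks a response without touching the network. The helper names are invented for this example, the zero address is a placeholder, and the sample response string is made up; only the envelope fields (jsonrpc, id, method, params, result, error) come from the JSON-RPC 2.0 format.

```python
import json
from typing import Any, Dict, List, Sequence, Tuple


def build_request(method: str, params: Sequence[Any], request_id: int = 1) -> Dict[str, Any]:
    """Build one JSON-RPC 2.0 request object."""
    return {"jsonrpc": "2.0", "id": request_id, "method": method, "params": list(params)}


def build_batch(calls: Sequence[Tuple[str, Sequence[Any]]]) -> List[Dict[str, Any]]:
    """Build a batch: a JSON array of request objects, each with its own id."""
    return [build_request(method, params, request_id=i)
            for i, (method, params) in enumerate(calls, start=1)]


def parse_result(raw_response: str) -> Any:
    """Return the `result` field, or raise if the node answered with an `error` object."""
    message = json.loads(raw_response)
    if "error" in message:
        err = message["error"]
        raise RuntimeError(f"RPC error {err.get('code')}: {err.get('message')}")
    return message["result"]


if __name__ == "__main__":
    placeholder_address = "0x0000000000000000000000000000000000000000"
    single = build_request("eth_blockNumber", [])
    batch = build_batch([
        ("eth_blockNumber", []),
        ("eth_getBalance", [placeholder_address, "latest"]),
    ])
    print(json.dumps(single))
    print(len(batch))  # one HTTP request, but two method calls a provider may meter
    # Made-up response body; real nodes return the block number as a hex string.
    print(int(parse_result('{"jsonrpc": "2.0", "id": 1, "result": "0x10"}'), 16))
```

The serialized payload would be sent as the body of an HTTP POST to your RPC endpoint; the transport is left out here so the sketch stays runnable offline.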

Question 3

What is an RPC endpoint?

Accepted Answer

An RPC endpoint is a URL that your application uses to communicate with a blockchain node. RPC stands for Remote Procedure Call, which is a protocol that lets one program request data or actions from another program over a network. In blockchain, RPC endpoints are how wallets check balances, dapps execute smart contracts, and developers read and write onchain data. Every blockchain interaction you have, whether you realize it or not, flows through an RPC endpoint.

Question 4

What is RPC latency?

Accepted Answer

RPC latency is the time it takes for your application to send a request to a blockchain node and receive a response. It is measured in milliseconds and directly impacts how fast your dapp feels to users. High latency means slow balance updates, delayed transaction confirmations, and missed trading opportunities. Low latency means a responsive, real-time experience. RPC latency depends on network distance, node performance, request complexity, and the quality of your infrastructure provider.
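One way to put numbers on this is to time a batch of identical calls and look at the median and a high percentile instead of a single sample. The Python sketch below assumes you supply `call`, a hypothetical zero-argument function that performs one RPC round trip; the sample count and the nearest-rank 95th percentile are illustrative choices, not a standard.

```python
import math
import statistics
import time
from typing import Callable, Dict, List


def measure_latency_ms(call: Callable[[], object], samples: int = 20,
                       clock: Callable[[], float] = time.perf_counter) -> List[float]:
    """Time `samples` invocations of `call` and return each duration in milliseconds."""
    timings = []
    for _ in range(samples):
        start = clock()
        call()
        timings.append((clock() - start) * 1000.0)
    return timings


def summarize(timings_ms: List[float]) -> Dict[str, float]:
    """Summarize timings with min, median, nearest-rank p95, and max."""
    ordered = sorted(timings_ms)
    p95_index = max(0, math.ceil(0.95 * len(ordered)) - 1)
    return {
        "min": ordered[0],
        "median": statistics.median(ordered),
        "p95": ordered[p95_index],
        "max": ordered[-1],
    }


if __name__ == "__main__":
    # Stand-in for a real request: sleeps for about 5 ms per call.
    stats = summarize(measure_latency_ms(lambda: time.sleep(0.005), samples=10))
    print(stats)
```

Comparing the median with the p95 helps separate steady-state latency (network distance, node performance) from occasional slow calls, such as heavy methods or requests delayed by throttling.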
