Rate Limits

ClawHQ uses per-key rate limiting to ensure fair usage and platform stability. This page explains how rate limits work, how to configure them, and best practices for handling rate-limited responses in your application.

How Rate Limiting Works

Every API key has a configurable requests-per-minute (RPM) limit. When a request arrives, the API counts how many requests the key has made in the current sliding 60-second window. If the new request would push that count past the configured limit, it is rejected with a 429 Too Many Requests status code.
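The sliding-window check described above can be modeled on the client side. The following is a minimal sketch, not ClawHQ's server implementation: the class name SlidingWindowLimiter and its methods are hypothetical, and it assumes that rejected requests do not count toward the window.

```python
import time
from collections import deque


class SlidingWindowLimiter:
    """Client-side model of a sliding 60-second request window (illustrative)."""

    def __init__(self, rpm_limit, window_seconds=60.0):
        self.rpm_limit = rpm_limit
        self.window_seconds = window_seconds
        self._timestamps = deque()  # times of accepted requests, oldest first

    def allow(self, now=None):
        """Return True if a request at time `now` fits within the limit."""
        now = time.monotonic() if now is None else now
        # Drop requests that have aged out of the window.
        while self._timestamps and now - self._timestamps[0] >= self.window_seconds:
            self._timestamps.popleft()
        if len(self._timestamps) >= self.rpm_limit:
            return False  # the API would answer 429 at this point
        self._timestamps.append(now)
        return True


limiter = SlidingWindowLimiter(rpm_limit=30)
# 31 requests, one per second: the 31st falls inside the same 60-second window.
results = [limiter.allow(now=float(i)) for i in range(31)]
```

Calling allow() before each request lets a client hold back traffic locally instead of spending requests on responses that would be rejected.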

Rate limits are applied per API key, not per account. If you have multiple keys, each has its own independent limit. This allows you to allocate different rate limits to different services or environments.

Available Tiers

When creating or editing an API key in the dashboard, you can select from four rate limit tiers:

Tier	Requests per Minute	Recommended Use
Standard	30 RPM	Development, testing, and low-traffic applications
Enhanced	60 RPM	Small to medium production workloads
Professional	120 RPM	High-traffic applications and multi-channel deployments
Enterprise	300 RPM	Large-scale production systems with burst capacity

Tip: Start with the Standard tier (30 RPM) during development and increase as your traffic grows. You can change the rate limit tier for any key at any time from the API Access page in the dashboard without regenerating the key.
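As a rough aid for choosing a tier, the sketch below picks the lowest tier whose limit covers an expected peak request rate. The tier names and limits come from the table above; the helper smallest_tier_for is hypothetical and not part of any ClawHQ SDK.

```python
# Tier names and RPM limits as listed in the tier table.
TIERS = [
    ("Standard", 30),
    ("Enhanced", 60),
    ("Professional", 120),
    ("Enterprise", 300),
]


def smallest_tier_for(peak_rpm):
    """Return (name, rpm) of the lowest tier covering peak_rpm, or None."""
    for name, rpm in TIERS:
        if peak_rpm <= rpm:
            return name, rpm
    return None  # exceeds every tier; consider spreading load across keys


choice = smallest_tier_for(45)
```

A peak of 45 requests per minute does not fit Standard (30 RPM), so the helper returns the Enhanced tier.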

Rate Limit Response

When a request is rate-limited, the API returns a 429 Too Many Requests response with a Retry-After header indicating how many seconds to wait before retrying.

Response Headers

Header	Description
`Retry-After`	Number of seconds to wait before the next request will be accepted
`X-RateLimit-Limit`	The configured RPM limit for this key
`X-RateLimit-Remaining`	Number of requests remaining in the current window
`X-RateLimit-Reset`	Unix timestamp when the current rate limit window resets

Response Body

{
  "error": "Rate limit exceeded",
  "code": "RATE_LIMITED",
  "status": 429,
  "retry_after": 12
}
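A client can read the wait time from either the Retry-After header or the retry_after field of the body. The sketch below assumes both carry the same integer number of seconds, as in the example above; parse_rate_limit is a hypothetical helper name.

```python
def parse_rate_limit(status_code, headers, body):
    """Return the number of seconds to wait for a 429 response, else None.

    Prefers the Retry-After header and falls back to the body's retry_after.
    `headers` is treated as a plain dict here, so the lookup is case-sensitive;
    HTTP client libraries usually provide a case-insensitive mapping.
    """
    if status_code != 429:
        return None
    header_value = headers.get("Retry-After")
    if header_value is not None and header_value.strip().isdigit():
        return int(header_value)
    retry_after = body.get("retry_after")
    return int(retry_after) if retry_after is not None else None


example_body = {
    "error": "Rate limit exceeded",
    "code": "RATE_LIMITED",
    "status": 429,
    "retry_after": 12,
}
wait_seconds = parse_rate_limit(429, {"Retry-After": "12"}, example_body)
```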

Best Practices

Use session_id for Conversations

When building conversational applications, always include a session_id parameter in your Chat API requests. This allows the agent to maintain context across messages without you needing to send the full conversation history in every request, which reduces both request size and token consumption.
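A request body that reuses one session might look like the sketch below. It rests on two assumptions not confirmed on this page: that session_id is sent in the JSON body next to message, and that the client may choose the identifier (a UUID is used here purely for illustration). The helper build_chat_payload is hypothetical.

```python
import uuid


def build_chat_payload(message, session_id):
    """Build a Chat API request body that carries only the new message."""
    # Assumption: session_id travels in the JSON body alongside "message".
    return {"message": message, "session_id": session_id}


# Assumption: the client picks the identifier; a UUID is one possible choice.
session_id = str(uuid.uuid4())
first = build_chat_payload("What are your opening hours?", session_id)
second = build_chat_payload("And on public holidays?", session_id)
```

The second payload contains only the follow-up question. The earlier turn is not resent because the shared session_id lets the agent recover it.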

Handle 429 with Exponential Backoff

Implement exponential backoff when you receive a 429 response. Start with the delay indicated by the Retry-After header, then double the wait time for each consecutive rate-limited response. Add a small random jitter to prevent thundering herd problems when multiple clients retry simultaneously.

Python Example

import random
import time

import requests

API_KEY = "clw_your_api_key_here"
BASE_URL = "https://app.clawhq.tech/api/v1"

def send_message(message, max_retries=5):
    """Send a chat message, retrying on 429 with exponential backoff."""
    headers = {
        "Authorization": f"Bearer {API_KEY}",
        "Content-Type": "application/json",
    }

    for attempt in range(max_retries):
        response = requests.post(
            f"{BASE_URL}/chat",
            headers=headers,
            json={"message": message},
            timeout=30,  # avoid hanging indefinitely on a stalled connection
        )

        if response.status_code == 429:
            # Start from the server's Retry-After value and double it per attempt.
            retry_after = int(response.headers.get("Retry-After", 5))
            jitter = random.uniform(0, 1)  # spreads out simultaneous retries
            wait_time = retry_after * (2 ** attempt) + jitter
            print(f"Rate limited. Retrying in {wait_time:.1f}s...")
            time.sleep(wait_time)
            continue

        # Any other error status is raised rather than retried.
        response.raise_for_status()
        return response.json()

    raise RuntimeError("Max retries exceeded")
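To see how quickly these waits grow, the sketch below evaluates the same retry_after * 2 ** attempt formula without jitter. The max_wait cap is a hypothetical addition, not something the API requires; it only shows one way to keep late retries from waiting several minutes.

```python
def backoff_delays(retry_after, max_retries=5, max_wait=None):
    """Return the jitter-free wait before each retry, in seconds."""
    delays = []
    for attempt in range(max_retries):
        delay = retry_after * (2 ** attempt)
        if max_wait is not None:
            delay = min(delay, max_wait)  # optional cap on any single wait
        delays.append(delay)
    return delays


print(backoff_delays(12))               # [12, 24, 48, 96, 192]
print(backoff_delays(12, max_wait=60))  # [12, 24, 48, 60, 60]
```

With a Retry-After of 12 seconds, five uncapped retries add up to more than six minutes of waiting, so a cap may be worth considering for interactive applications.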

JavaScript Example

const API_KEY = "clw_your_api_key_here";
const BASE_URL = "https://app.clawhq.tech/api/v1";

async function sendMessage(message, maxRetries = 5) {
  for (let attempt = 0; attempt < maxRetries; attempt++) {
    const response = await fetch(`${BASE_URL}/chat`, {
      method: "POST",
      headers: {
        "Authorization": `Bearer ${API_KEY}`,
        "Content-Type": "application/json",
      },
      body: JSON.stringify({ message }),
    });

    if (response.status === 429) {
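      // Fall back to 5 seconds if the Retry-After header is missing.
      const retryAfter = parseInt(response.headers.get("Retry-After") || "5", 10);
      const jitter = Math.random(); // up to 1 second, spreads out simultaneous retries
      const waitTime = retryAfter * 2 ** attempt + jitter;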
      console.log(`Rate limited. Retrying in ${waitTime.toFixed(1)}s...`);
      await new Promise((r) => setTimeout(r, waitTime * 1000));
      continue;
    }

    if (!response.ok) throw new Error(`HTTP ${response.status}`);
    return await response.json();
  }

  throw new Error("Max retries exceeded");
}

Monitor Usage

Use the /api/v1/usage endpoint to track your API consumption over time. Set up a daily check that compares your usage against your rate limit to detect when you are approaching capacity and need to upgrade your tier.

# Check your current usage over the last 24 hours
curl "https://app.clawhq.tech/api/v1/usage?days=1" \
  -H "Authorization: Bearer clw_your_api_key_here"
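The daily comparison could be sketched as follows. The response schema of the usage endpoint is not shown on this page, so the sketch takes the total request count as a plain number and leaves its extraction from the response to you. The function names and the 50% threshold are assumptions chosen for illustration.

```python
MINUTES_PER_DAY = 24 * 60


def average_utilization(total_requests, rpm_limit, days=1):
    """Return average requests per minute as a fraction of the RPM limit."""
    average_rpm = total_requests / (days * MINUTES_PER_DAY)
    return average_rpm / rpm_limit


def needs_upgrade(total_requests, rpm_limit, days=1, threshold=0.5):
    """Flag a key whose average load reaches `threshold` of its limit."""
    return average_utilization(total_requests, rpm_limit, days) >= threshold


# 43,200 requests in one day is an average of 30 per minute.
utilization = average_utilization(43200, rpm_limit=60)
```

An average hides bursts: a key can hit its limit during peak minutes while its daily average stays low, which is why the threshold here sits well below 100%.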

Distribute Load Across Keys

If a single key's rate limit is insufficient, create multiple API keys and distribute requests across them. For example, create a dedicated key for each service or microservice that calls the ClawHQ API, each with its own appropriate rate limit.
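Where one service needs more throughput than a single key allows, requests can also be rotated across several keys. The sketch below shows simple round-robin rotation; KeyRotator is a hypothetical helper and the key values are placeholders.

```python
import itertools


class KeyRotator:
    """Hand out API keys in round-robin order (illustrative helper)."""

    def __init__(self, api_keys):
        if not api_keys:
            raise ValueError("at least one API key is required")
        self._cycle = itertools.cycle(list(api_keys))

    def next_headers(self):
        """Return request headers that use the next key in the rotation."""
        return {
            "Authorization": f"Bearer {next(self._cycle)}",
            "Content-Type": "application/json",
        }


rotator = KeyRotator(["clw_key_a", "clw_key_b"])  # placeholder key values
used = [rotator.next_headers()["Authorization"] for _ in range(3)]
```

Because each key has its own independent limit, two keys at 60 RPM give a combined ceiling of 120 requests per minute when traffic is split evenly between them.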

Use Streaming for Long Responses

Streaming requests (stream: true) count as a single request against your rate limit, regardless of how many SSE chunks are delivered. For long responses, streaming can improve the user experience without increasing your rate limit consumption.
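However many chunks arrive, they all belong to one HTTP request. The sketch below collects the data payloads from a sequence of SSE lines using the standard server-sent events framing (data: fields, with a blank line ending each event). The payload format ClawHQ places inside each chunk is not described on this page, so the payloads are kept as raw strings; iter_sse_data is a hypothetical helper.

```python
def iter_sse_data(lines):
    """Yield the data payload of each server-sent event found in `lines`."""
    buffer = []
    for raw in lines:
        line = raw.rstrip("\r\n")
        if line == "":
            # A blank line ends the current event.
            if buffer:
                yield "\n".join(buffer)
                buffer = []
        elif line.startswith("data:"):
            value = line[len("data:"):]
            # A single leading space after the colon is not part of the payload.
            buffer.append(value[1:] if value.startswith(" ") else value)
        # Other fields (event:, id:, retry:) and ":" comments are ignored here.
    if buffer:
        # Lenient: also emit a final event that lacks a trailing blank line.
        yield "\n".join(buffer)


sample = ["data: Hello", "", "data: wor", "data: ld", "", ": keep-alive", ""]
chunks = list(iter_sse_data(sample))
```

The sample yields two events from a single stream, which under the rule above would still count as one request against the rate limit.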

Tip: The rate limit headers (X-RateLimit-Remaining and X-RateLimit-Reset) are included in every successful response, not just 429 responses. Use them proactively to throttle your request rate before hitting the limit.
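One way to act on these headers is to pause only when no requests remain in the window. This is a minimal sketch: seconds_to_pause is a hypothetical helper, and the defaults used when a header is missing are assumptions.

```python
import time


def seconds_to_pause(headers, now=None):
    """Return how long to wait before the next request, based on the headers.

    `headers` is treated as a plain dict, so the lookup is case-sensitive.
    """
    now = time.time() if now is None else now
    # Assumed defaults: treat a missing header as "no need to wait".
    remaining = int(headers.get("X-RateLimit-Remaining", 1))
    reset_at = float(headers.get("X-RateLimit-Reset", now))
    if remaining > 0:
        return 0.0
    # No requests left: wait until the Unix timestamp at which the window resets.
    return max(0.0, reset_at - now)


pause = seconds_to_pause(
    {"X-RateLimit-Remaining": "0", "X-RateLimit-Reset": "1700000030"},
    now=1700000000,
)
```

Sleeping for the returned value before the next call avoids sending a request that would only come back as a 429.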

Rate Limits by Endpoint

All API endpoints share the same per-key rate limit. There are no separate limits for individual endpoints. A key configured for 60 RPM can make 60 requests per minute across any combination of Chat, Agents, Models, Conversations, Threads, Usage, and Health endpoints.

Configuring Rate Limits

To change the rate limit for an existing key:

1. Navigate to the API Access page in your dashboard.
2. Find the key you want to modify.
3. Click the Edit button (pencil icon).
4. Select the new rate limit tier from the dropdown.
5. Click Save.

The new rate limit takes effect immediately. There is no need to regenerate or replace the key.

Next Steps

Authentication — Create and manage API keys
Chat API — Start sending messages
Usage API — Monitor your consumption