# Rate limits

Per-token request budgets and backoff guidance.
The API enforces per-token rate limits with a sliding-window counter.
A request that exceeds the budget gets a `429` with code
`rate_limit_exceeded`; the counter resets at the timestamp in the
`X-RateLimit-Reset` header.
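Conceptually, a sliding-window counter works like the sketch below. This is illustrative only, not the server's actual implementation; the class and method names are invented for the example.

```typescript
// Keeps the timestamps of recent requests per token; a request is
// allowed only if fewer than `limit` fall inside the trailing window.
class SlidingWindowLimiter {
  private hits = new Map<string, number[]>();

  constructor(private limit: number, private windowMs: number) {}

  allow(token: string, now = Date.now()): boolean {
    const cutoff = now - this.windowMs;
    // Drop timestamps that have slid out of the window.
    const recent = (this.hits.get(token) ?? []).filter((t) => t > cutoff);
    if (recent.length >= this.limit) {
      this.hits.set(token, recent);
      return false; // over budget: this is where a 429 would be returned
    }
    recent.push(now);
    this.hits.set(token, recent);
    return true;
  }
}
```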
## Default budget
60 requests per minute per token. This is generous for normal use — even a script doing one request per second sits comfortably under it.
Heavy endpoints (bulk analytics, full-corpus reads) may be put in a separate, lower bucket. The reference page for an affected endpoint calls this out.
## Headers on every response
Every response — 2xx, 4xx, even 429 itself — carries:
| Header | Meaning |
|---|---|
| `X-RateLimit-Limit` | Your budget for the current bucket (per minute) |
| `X-RateLimit-Remaining` | Calls you have left before the reset |
| `X-RateLimit-Reset` | Unix timestamp (seconds) when the counter resets |
Read these proactively rather than waiting for 429.
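For example, a client can compute a pause from these headers after each call. This is a minimal sketch; the threshold of 2 remaining requests is an arbitrary choice for the example, not something the API prescribes.

```typescript
// Decide how long to pause before the next call, given the
// rate-limit headers from the previous response.
function backoffMs(headers: Headers, now = Date.now()): number {
  const remaining = Number(headers.get("x-ratelimit-remaining"));
  const reset = Number(headers.get("x-ratelimit-reset")) * 1000; // sec → ms
  if (Number.isNaN(remaining) || Number.isNaN(reset)) return 0;
  // Nearly out of budget: wait until the window resets.
  return remaining <= 2 ? Math.max(0, reset - now) : 0;
}
```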
## When you hit the limit
A `429` response has this shape:

```json
{
  "error": {
    "code": "rate_limit_exceeded",
    "message": "Rate limit exceeded. Try again in 23 seconds.",
    "doc_url": "https://papermark.com/docs/api/errors#rate-limit-exceeded"
  }
}
```

The CLI marks this error as retryable in its output contract and retries with exponential backoff capped at the value in `X-RateLimit-Reset`. If you're hand-rolling a client:
```js
// Wait until the window resets, then retry.
const reset = Number(res.headers.get('x-ratelimit-reset')) * 1000;
const waitMs = Math.max(0, reset - Date.now());
await new Promise((r) => setTimeout(r, waitMs));
```

## Reducing pressure
- Batch reads. `GET /v1/documents?limit=100` is one request, not 100.
- Cache by ID. Document, link, and dataroom IDs are stable; cache lookups in your client.
- Use webhooks instead of polling. (coming; see Webhooks)
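Caching by ID can be as simple as a map in front of your API call. In this sketch, `fetchDocument` is a placeholder for however your client actually loads a document.

```typescript
// In-memory cache keyed by stable document ID. A second lookup for
// the same ID spends no API request.
const docCache = new Map<string, unknown>();

async function getDocument(
  id: string,
  fetchDocument: (id: string) => Promise<unknown>
): Promise<unknown> {
  const hit = docCache.get(id);
  if (hit !== undefined) return hit; // cache hit: no request spent
  const doc = await fetchDocument(id);
  docCache.set(id, doc);
  return doc;
}
```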
## Need more?
If your integration legitimately needs more than 60 RPM sustained (an analytics export pipeline, a bulk-import tool), get in touch. We'll bump your token's bucket.