AI Engineering
Claude Batch API
Anthropic's Message Batches API lets you send up to 100,000 requests in a single batch and get results asynchronously within 24 hours — at **50% off** standard per-token pricing. Ideal for any workload that doesn't need real-time responses.
How It Works
POST /v1/messages/batches
{
  "requests": [
    {
      "custom_id": "req-001",
      "params": {
        "model": "claude-sonnet-4-20250514",
        "max_tokens": 1024,
        "messages": [{"role": "user", "content": "..."}]
      }
    }
    // ... up to 100,000 requests
  ]
}
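As a sketch, the request body above can be assembled with nothing but the standard library. (The official `anthropic` SDK provides a batches wrapper, but the raw payload is just JSON; `build_batch` and the example prompts here are illustrative, not part of the API.)

```python
import json

def build_batch(prompts, model="claude-sonnet-4-20250514", max_tokens=1024):
    """Assemble a Message Batches payload; each request needs a unique custom_id."""
    return {
        "requests": [
            {
                "custom_id": f"req-{i:03d}",
                "params": {
                    "model": model,
                    "max_tokens": max_tokens,
                    "messages": [{"role": "user", "content": prompt}],
                },
            }
            for i, prompt in enumerate(prompts, start=1)
        ]
    }

payload = build_batch(["Summarize this ticket.", "Classify this email."])
body = json.dumps(payload)  # this string is what gets POSTed to /v1/messages/batches
```

The `custom_id` is what lets you match results back to requests later, since batch results are not guaranteed to arrive in submission order.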
  • Pricing: 50% discount on both input and output tokens
  • SLA: Results within 24 hours (often much faster)
  • Limits: 100K requests per batch, standard model context windows apply
  • Status polling: GET /v1/messages/batches/{batch_id} → processing_status: in_progress | ended
  • Results: GET /v1/messages/batches/{batch_id}/results → JSONL stream
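Because the results endpoint returns JSONL and results can arrive in any order, a typical consumer indexes them by `custom_id`. A minimal sketch, assuming the documented per-line shape of `{"custom_id": ..., "result": {"type": ...}}` (the sample lines and the error payload below are illustrative):

```python
import json

# Two example result lines as they might appear in the JSONL stream.
jsonl = """\
{"custom_id": "req-001", "result": {"type": "succeeded", "message": {"content": [{"type": "text", "text": "OK"}]}}}
{"custom_id": "req-002", "result": {"type": "errored", "error": {"type": "invalid_request_error"}}}
"""

def index_results(stream: str) -> dict:
    """Map custom_id -> result object, one JSON record per line."""
    return {rec["custom_id"]: rec["result"] for rec in map(json.loads, stream.splitlines())}

results = index_results(jsonl)
succeeded = {cid for cid, r in results.items() if r["type"] == "succeeded"}
```

Checking `result["type"]` per record matters: a batch can end with a mix of succeeded, errored, canceled, and expired requests, so failures should be retried individually rather than resubmitting the whole batch.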