What is batch pricing?
Batch pricing is a discounted rate offered by model providers for API requests submitted in bulk with no latency guarantee. Instead of processing each request immediately, the provider queues batch requests and processes them when compute capacity is available — typically within 24 hours.
OpenAI offers 50% off for batch API requests. Anthropic offers a similar discount. The model output is identical — the only difference is response time.
Why it matters
If your workload does not require real-time responses — data processing, content generation at scale, overnight evaluation runs — batch pricing cuts your model cost in half. The constraint is latency, not capability.