What is async vs sync?
Synchronous (sync) API calls block until the response is complete, so your code waits. Asynchronous (async) calls return immediately with a reference, such as a job ID or a future, and you check back later for the result. Streaming is a third pattern: the response arrives token by token as the model generates it.
Sync is simpler to implement. Async is more efficient at scale — your application can process other work while waiting for model responses. Streaming gives the fastest perceived latency because the user sees output appearing before generation is complete.
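A minimal sketch of the three patterns, assuming a made-up in-process "model" that emits one token per short delay. The names here (TOKENS, generate_sync, generate_async, generate_stream, run_demo) are hypothetical and do not belong to any real client library. The asyncio Task stands in for the "reference" an async API would hand back.

```python
import asyncio
import time
from typing import AsyncIterator, Dict, List

# Hypothetical stand-in for a model: one token per DELAY seconds.
TOKENS: List[str] = ["Hello", ",", " world", "!"]
DELAY = 0.01


def generate_sync(prompt: str) -> str:
    """Sync: the caller is blocked until the whole response exists."""
    parts = []
    for token in TOKENS:
        time.sleep(DELAY)  # blocks the calling thread
        parts.append(token)
    return "".join(parts)


async def generate_async(prompt: str) -> str:
    """Async: same work, but the wait yields control to other tasks."""
    parts = []
    for token in TOKENS:
        await asyncio.sleep(DELAY)  # other coroutines can run here
        parts.append(token)
    return "".join(parts)


async def generate_stream(prompt: str) -> AsyncIterator[str]:
    """Streaming: hand each token to the caller as soon as it is ready."""
    for token in TOKENS:
        await asyncio.sleep(DELAY)
        yield token


async def _async_and_stream(prompt: str) -> Dict[str, object]:
    results: Dict[str, object] = {}

    # create_task returns a Task (the "reference") right away.
    task = asyncio.create_task(generate_async(prompt))
    results["done_immediately"] = task.done()
    # ... the application could do other work here ...
    results["async"] = await task  # check back for the result

    # Consume tokens one at a time instead of waiting for the full text.
    chunks = []
    async for token in generate_stream(prompt):
        chunks.append(token)
    results["stream_chunks"] = chunks
    return results


def run_demo(prompt: str = "hi") -> Dict[str, object]:
    results = {"sync": generate_sync(prompt)}
    results.update(asyncio.run(_async_and_stream(prompt)))
    return results


if __name__ == "__main__":
    print(run_demo())
```

All three paths produce the same text. What differs is when the caller gets control back: after the full response (sync), immediately with a handle (async), or after each token (streaming).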
Why it matters
The choice between sync, async, and streaming shapes your application’s perceived speed and scalability. A chatbot benefits from streaming because users see output right away. A batch pipeline benefits from async because many requests can be in flight at once. A simple script can use sync for simplicity. Many production applications combine the patterns: streaming for user-facing features and async for background processing.