Streaming

Users expect to see responses as they’re generated, not after a long wait. The SDK streams responses back from the API in chunks, enabling real-time UI updates where text appears word by word.

Callbacks

The SDK provides callbacks for each phase of the streaming lifecycle. The onData callback receives each content chunk as it arrives; use it to append text to the displayed response. When streaming completes, onFinish is called with the full response object. If the stream fails, onError receives the error.
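The lifecycle can be sketched with a minimal mock driver. The callback names (onData, onFinish, onError) come from the docs above; the streamChat function and its types here are hypothetical stand-ins, not the SDK's real API.

```typescript
// Shape of the three documented lifecycle callbacks (types assumed).
type StreamCallbacks = {
  onData: (chunk: string) => void;
  onFinish: (response: { text: string }) => void;
  onError: (err: Error) => void;
};

// Hypothetical driver: replays chunks through the lifecycle callbacks.
function streamChat(chunks: string[], cb: StreamCallbacks): void {
  let full = "";
  try {
    for (const chunk of chunks) {
      full += chunk;
      cb.onData(chunk); // fires once per content chunk, as it arrives
    }
    cb.onFinish({ text: full }); // fires once, with the full response
  } catch (err) {
    cb.onError(err as Error);
  }
}

let displayed = "";
streamChat(["Hello", ", ", "world"], {
  onData: (c) => { displayed += c; },   // append to the UI incrementally
  onFinish: (r) => console.log(r.text), // prints "Hello, world"
  onError: (e) => console.error(e),
});
```

The key point is that onData sees only deltas, while onFinish sees the assembled whole, so UI code never has to reconcatenate the response itself.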

You can cancel streaming at any time by calling stop(). Partial responses are automatically saved to the conversation, so nothing is lost.
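A sketch of the cancellation behavior, assuming a hypothetical internal loop; stop() is the documented call, everything else here is illustrative:

```typescript
// Hypothetical stream holder: stop() halts consumption of further
// chunks, but text received so far is kept rather than discarded.
class ChatStream {
  private stopped = false;
  partial = "";

  stop(): void {
    this.stopped = true; // subsequent chunks are ignored
  }

  run(chunks: string[], onData: (c: string) => void): string {
    for (const chunk of chunks) {
      if (this.stopped) break;
      this.partial += chunk;
      onData(chunk);
    }
    return this.partial; // partial response, saved to the conversation
  }
}

const stream = new ChatStream();
// Stop after the first chunk, as if the user clicked a stop button.
const saved = stream.run(["One. ", "Two. ", "Three."], () => stream.stop());
// saved === "One. "
```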

Usage

All chat hooks support streaming out of the box. For basic streaming without persistence, use useChat. For streaming with automatic message storage, use useChatStorage.
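The difference between the two hooks is persistence. The mock below models only that distinction with plain functions; useChat and useChatStorage are the documented names, but their real signatures are not shown here and these stand-ins do not reproduce them.

```typescript
type Message = { role: "user" | "assistant"; content: string };

// Mock factory: with no store it behaves like useChat (in-memory only);
// with a store it behaves like useChatStorage (messages also persisted).
function makeChat(store?: Message[]) {
  const messages: Message[] = [];
  return {
    send(userText: string, reply: string): Message[] {
      const pair: Message[] = [
        { role: "user", content: userText },
        { role: "assistant", content: reply },
      ];
      messages.push(...pair);
      store?.push(...pair); // the storage-backed variant also writes here
      return messages;
    },
  };
}

const storage: Message[] = [];
const chat = makeChat();              // like useChat: nothing persisted
const chatStored = makeChat(storage); // like useChatStorage: persisted

chat.send("hi", "hello");
chatStored.send("hi", "hello");
// storage holds only the persisted conversation: 2 messages
```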

The Chat with Storage tutorial shows a complete streaming chat UI implementation.

Extended Thinking

Some models, such as OpenAI’s o-series or Claude with extended thinking enabled, emit their reasoning process separately from the final answer. The onThinking callback surfaces these reasoning chunks, letting you show users how the model works through a problem before the response arrives.
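A sketch of routing the two streams: onThinking is the documented callback, but the event shape and the consume driver below are assumptions for illustration.

```typescript
// Assumed event shape: the provider tags reasoning and answer chunks.
type StreamEvent =
  | { type: "thinking"; text: string }
  | { type: "text"; text: string };

// Route reasoning chunks to onThinking and answer chunks to onData.
function consume(
  events: StreamEvent[],
  onThinking: (t: string) => void,
  onData: (t: string) => void,
): void {
  for (const ev of events) {
    if (ev.type === "thinking") onThinking(ev.text); // reasoning stream
    else onData(ev.text);                            // final-answer stream
  }
}

let reasoning = "";
let answer = "";
consume(
  [
    { type: "thinking", text: "Compare the options... " },
    { type: "text", text: "Use option B." },
  ],
  (t) => { reasoning += t; },
  (t) => { answer += t; },
);
// reasoning and answer accumulate independently, so a UI can render
// the thinking trace in a collapsible panel above the answer
```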
