This example is available on GitHub: examples/01_standalone_sdk/24_responses_streaming.py
Enable live token streaming when using the OpenAI Responses API path. This guide shows how to:
  • Subscribe to streaming deltas from the model
  • Log streamed chunks to a JSONL file
  • Optionally render the stream as a live view in the terminal or print deltas to stdout
Running the Example
export LLM_API_KEY="your-openai-compatible-api-key"
# Optional overrides
# export LLM_MODEL="openhands/gpt-5-codex"
# export LLM_BASE_URL="https://your-litellm-or-provider-base-url"

cd agent-sdk
uv run python examples/01_standalone_sdk/24_responses_streaming.py

How It Works

  • Pass a token callback to Conversation to receive streaming chunks as they arrive:
conversation = Conversation(
    agent=agent,
    workspace=os.getcwd(),
    token_callbacks=[on_token],
)
  • Each chunk contains a delta: text_delta for content tokens or arguments_delta for tool-call arguments. The example logs a serialized record per chunk to ./logs/stream/*.jsonl.
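A minimal callback might look like the sketch below. It assumes each chunk exposes optional text_delta and arguments_delta attributes (as described above) and uses a hypothetical log file name; the actual example's callback and record format may differ.

import json
import os
from datetime import datetime, timezone

# Hypothetical log file under the directory the example writes to.
LOG_PATH = "./logs/stream/run.jsonl"
os.makedirs(os.path.dirname(LOG_PATH), exist_ok=True)


def on_token(chunk) -> None:
    """Handle one streamed chunk: print content tokens and append a JSONL record."""
    text = getattr(chunk, "text_delta", None)
    args = getattr(chunk, "arguments_delta", None)

    if text:
        # Print content tokens inline as they arrive.
        print(text, end="", flush=True)

    record = {
        "ts": datetime.now(timezone.utc).isoformat(),
        "text_delta": text,
        "arguments_delta": args,
    }
    with open(LOG_PATH, "a", encoding="utf-8") as f:
        f.write(json.dumps(record) + "\n")

Printing with end="" and flush=True keeps content tokens appearing as a continuous stream rather than one token per line.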
  • For a visual live view, use the built-in streaming visualizer:
from openhands.sdk.conversation.streaming_visualizer import create_streaming_visualizer

visualizer = create_streaming_visualizer()
conversation = Conversation(
    agent=agent,
    workspace=os.getcwd(),
    token_callbacks=[on_token],
    callbacks=[visualizer.on_event],
    visualize=False,  # disable the default event rendering; visualizer.on_event renders the stream instead
)
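
After a run, you can read the stream log back for inspection. A small sketch, assuming one JSON object per line in the files under ./logs/stream/ (the exact fields depend on how the example serializes each chunk):

import glob
import json

for path in sorted(glob.glob("./logs/stream/*.jsonl")):
    with open(path, encoding="utf-8") as f:
        for line in f:
            record = json.loads(line)
            # Print only non-empty fields to keep the output readable.
            print({k: v for k, v in record.items() if v})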

Next Steps