Chaining and Streaming with Responses API in Azure
The Responses API is an enhancement of the existing Chat Completions API. It is stateful and supports agentic capabilities.
As a superset of the Chat Completions API, it continues to support chat completions functionality. In addition, reasoning models such as GPT-5 produce better results when used through the Responses API than through Chat Completions. It also accepts a range of input types. It is currently available in selected Azure regions (listed in the Microsoft Learn documentation) and can be used with all the models available in those regions.
The API supports response streaming, chaining, and function calling. In the examples below, we use the gpt-5-nano model for a simple response, a chained response, and a streaming response.
To get started, update the installed openai library.
pip install --upgrade openai
Simple Message
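1) Build the client with the following code. The endpoint and API key come from your Azure resource; the environment variable names used here are only examples.
import os
from openai import OpenAI

# Example variable names; endpoint is the base URL of your Azure OpenAI resource
endpoint = os.environ["AZURE_OPENAI_ENDPOINT"]
api_key = os.environ["AZURE_OPENAI_API_KEY"]

client = OpenAI(
    base_url=endpoint,
    api_key=api_key,
)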
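On Azure, the base_url passed to the client is typically the resource endpoint followed by /openai/v1/. The helper below is a minimal sketch of that normalization. The function name build_base_url and the resource name are hypothetical, and the path suffix is an assumption taken from the Microsoft Learn documentation, so check it against your own resource.

```python
def build_base_url(resource_endpoint: str) -> str:
    """Return the endpoint with the assumed '/openai/v1/' suffix applied exactly once."""
    suffix = "/openai/v1/"
    trimmed = resource_endpoint.rstrip("/")
    # Already normalized (with or without a trailing slash): return the canonical form.
    if trimmed.endswith(suffix.rstrip("/")):
        return trimmed + "/"
    return trimmed + suffix


# Hypothetical resource name, for illustration only.
print(build_base_url("https://my-resource.openai.azure.com"))
```

The result can be passed as base_url when constructing the client.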
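2) Send the request. The call returns a response object whose id can then be used to retrieve the message.
# Non-streaming request
deployment = "gpt-5-nano"  # name of your model deployment
messages = [{"role": "user", "content": "What is the Responses API?"}]  # example prompt
resp_id = client.responses.create(
    model=deployment,
    input=messages,
)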
3) Retrieve the message using the response id from the previous step.
response = client.responses.retrieve(resp_id.id)
print(response.output_text)
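The retrieved object carries the model's reply in its output list, and the Python SDK also exposes the joined text as response.output_text. As a rough sketch of what that convenience does, the function below walks a response in its JSON (dict) form and joins the output_text parts. The function name is hypothetical and the sample dict is hand-written for illustration, not real API output.

```python
def extract_output_text(response: dict) -> str:
    """Join every output_text part found in the response's message items."""
    parts = []
    for item in response.get("output", []):
        # Reasoning models can emit non-message items (e.g. reasoning); skip them.
        if item.get("type") != "message":
            continue
        for content in item.get("content", []):
            if content.get("type") == "output_text":
                parts.append(content.get("text", ""))
    return "".join(parts)


# Hand-written sample in the general shape of a Responses API result (illustrative only).
sample = {
    "id": "resp_example",
    "output": [
        {"type": "reasoning", "summary": []},
        {
            "type": "message",
            "role": "assistant",
            "content": [{"type": "output_text", "text": "Hello from the model."}],
        },
    ],
}

print(extract_output_text(sample))
```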
Chaining
For a chained message, the extra step is sharing the context. This is done by sending the id of the previous response in the subsequent request.
resp_id = client.responses.create(
    model=deployment,
    previous_response_id=resp_id.id,
    input=[{"role": "user", "content": "Explain this at a level that could be understood by a college freshman"}],
)
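When a conversation has several turns, it can help to wrap the chaining step in a small function that passes previous_response_id only when there is an earlier response. The sketch below is one possible shape, not the author's code: ask and FakeResponses are hypothetical names, the first question is a made-up prompt, and the fake client stands in for the real one so the example runs without calling the service.

```python
from types import SimpleNamespace


def ask(client, deployment, text, previous_id=None):
    """Send one user turn, linking it to previous_id when one is given."""
    kwargs = {
        "model": deployment,
        "input": [{"role": "user", "content": text}],
    }
    if previous_id is not None:
        kwargs["previous_response_id"] = previous_id
    return client.responses.create(**kwargs)


class FakeResponses:
    """Stand-in for client.responses that records calls instead of hitting the API."""

    def __init__(self):
        self.calls = []

    def create(self, **kwargs):
        self.calls.append(kwargs)
        return SimpleNamespace(id=f"resp_{len(self.calls)}")


fake_client = SimpleNamespace(responses=FakeResponses())
first = ask(fake_client, "gpt-5-nano", "What is a neural network?")
second = ask(
    fake_client,
    "gpt-5-nano",
    "Explain this at a level that could be understood by a college freshman",
    previous_id=first.id,
)
print(second.id, fake_client.responses.calls[1]["previous_response_id"])
```

With a real client, the same ask function can be called in a loop, feeding each response's id into the next call.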
Streaming
Streaming uses a different call, client.responses.stream, which is used as a context manager. The events it yields have to be handled until the end of the event stream.
text_out = []
with client.responses.stream(
    model=deployment,
    input=messages,  # structured messages
) as s:
    for event in s:
        # Accumulate only text deltas for clean output
        if event.type == "response.output_text.delta":
            delta = event.delta or ""
            text_out.append(delta)
            # Echo streaming output to console as it arrives
            print(delta, end="", flush=True)
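The accumulation logic in the loop above can be pulled into a function and checked without a live stream. This is a minimal sketch under that assumption: collect_text is a hypothetical name, and the fake events only mimic the type and delta attributes that the loop relies on.

```python
from types import SimpleNamespace


def collect_text(events):
    """Concatenate the text deltas from an iterable of stream events."""
    text_out = []
    for event in events:
        # Ignore lifecycle events; keep only the incremental text.
        if getattr(event, "type", None) == "response.output_text.delta":
            text_out.append(getattr(event, "delta", "") or "")
    return "".join(text_out)


# Fake events for illustration; a real stream yields SDK event objects.
fake_events = [
    SimpleNamespace(type="response.created"),
    SimpleNamespace(type="response.output_text.delta", delta="Hel"),
    SimpleNamespace(type="response.output_text.delta", delta="lo"),
    SimpleNamespace(type="response.completed"),
]
print(collect_text(fake_events))
```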
The code is available in the following GitHub repository: https://github.com/arunacarunac/ResponsesAPI
Additional details are available in the Microsoft Learn documentation: Azure OpenAI Responses API - Azure OpenAI | Microsoft Learn