How to smooth traffic and avoid gridlocks
Updated Mar 20, 2025
Version 1.0

Hello, does this currently work any other way than the cURL requests described in the documentation?
I am trying to make it work by sending extra_headers to `chat.completions.create` from Python, but it doesn't seem to trigger spillover.
Are you able to test with the requests library in Python? I'm not sure how the SDK forwards extra_headers to the backend.
Something like this:
import requests

# Replace $AZURE_OPENAI_ENDPOINT, {ptu-deployment}, and {spillover-standard-deployment}
# with your resource endpoint and deployment names.
url = "$AZURE_OPENAI_ENDPOINT/openai/deployments/{ptu-deployment}/chat/completions?api-version=2025-02-01-preview"

headers = {
    "Content-Type": "application/json",
    # Names the standard deployment that requests should spill over to.
    "x-ms-spillover-deployment": "{spillover-standard-deployment}",
    "Authorization": "Bearer YOUR_AUTH_TOKEN",
}

data = {
    "messages": [
        {"role": "system", "content": "You are a helpful assistant."},
        {"role": "user", "content": "Does Azure OpenAI support customer managed keys?"},
        {"role": "assistant", "content": "Yes, customer managed keys are supported by Azure OpenAI."},
        {"role": "user", "content": "Do other Azure AI services support this too?"},
    ]
}

response = requests.post(url, headers=headers, json=data)
print(response.json())
I would like to make it work directly with the Azure OpenAI client, i.e. `client.chat.completions.create`.
Yes, this is possible; please refer to this example: Passing headers with the Azure OpenAI client in 1.1.1 · Issue #740 · openai/openai-python
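For reference, a minimal sketch of what that could look like with the Python SDK, assuming the openai package v1.x and API-key auth; the endpoint, API key, and deployment names are placeholders, not values from this thread:

import os
from openai import AzureOpenAI

# Minimal sketch: the client targets the PTU deployment, and the spillover header
# is passed per request via extra_headers. Deployment names below are placeholders.
client = AzureOpenAI(
    azure_endpoint=os.environ["AZURE_OPENAI_ENDPOINT"],
    api_key=os.environ["AZURE_OPENAI_API_KEY"],
    api_version="2025-02-01-preview",
)

response = client.chat.completions.create(
    model="{ptu-deployment}",  # deployment name, not the underlying model name
    messages=[
        {"role": "system", "content": "You are a helpful assistant."},
        {"role": "user", "content": "Does Azure OpenAI support customer managed keys?"},
    ],
    # extra_headers is merged into the HTTP headers of this single request
    extra_headers={"x-ms-spillover-deployment": "{spillover-standard-deployment}"},
)
print(response.choices[0].message.content)

Alternatively, default_headers can be set when constructing the client if the header should be sent on every request.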