Forum Discussion

psychobunny83's avatar
psychobunny83
Brass Contributor
Sep 13, 2024

Azure OpenAI - If I add a data source, it no longer follows my instructions on how to respond?

Hi everyone, I've been messing with Azure OpenAI for a few days and find it pretty cool.  One issue I have though is I will give it example prompts, and fill in the "Give the model instructions and context" section which works great, but when you add a data source like Azure Blob, it ONLY uses that and ignores all instructions.

 

I have "limit responses to your data content" under data source disabled.

 

Is this normal behaviour?

 

The second issue is that even with this unchecked above, it still only responds based on the documents.  For example, I'll ask about a product and it responds based on my documents.  I'll then ask if a competitor company also offers this product, and it will say that it doesn't know as that wasn't found in the documents.  I tell it to search online and it says it's not capable of searching online and can only provide information based on the documents I've provided."

 

Just a bit confused by that since using ChatGPT directly would have answered that question.

 

My goal is to have it be a general AI like ChatGPT, as in, it will answer general questions, but ALSO reference our documents.  Is that not possible, like is it one or the other?

2 Replies

  • This is expected when you add a data source with Azure OpenAI "on your data". The retrieved content and grounding rules become part of the model context, so they can strongly influence the answer.

     

    Turning off "limit responses to your data" lets the model use its base knowledge, but it still will not browse the internet unless you add a separate search/tooling layer.

     

    The clean pattern is orchestration:

     

    1. Route document-specific questions to Azure AI Search/RAG.

    2. Route general questions to a normal chat completion.

    3. Add a web-search tool if current internet answers are required.

    4. Tune strictness and system instructions so the model knows when it may answer from retrieved data versus general model knowledge.