Microsoft Developer Community Blog

Making Sense of Azure AI Foundry IQ

Samarpitaa (Microsoft)
Apr 29, 2026

“By the end, you’ll know when Foundry IQ is the right fit, when it’s not, and how to query a knowledge base directly using Azure AI Search APIs/SDKs.”

As enterprise teams build AI agents, the hardest design decisions often have nothing to do with models. Instead, they revolve around a more fundamental question:

How should an agent access organizational knowledge in a way that is accurate, secure, and sustainable over time?

Azure AI Foundry IQ is designed to address a specific version of that problem. It is not a general‑purpose data access layer, and it is not a replacement for every retrieval pattern.

Understanding where it fits and where it does not is key to using it effectively.

This post explores those boundaries and grounds them in concrete, enterprise‑relevant scenarios, before showing how Foundry IQ can be implemented directly via Azure AI Search APIs and SDKs.

What Azure AI Foundry IQ Is (and Is Not)

Azure AI Foundry IQ is a managed knowledge layer built on Azure AI Search. It lets you define a knowledge base that spans multiple content sources (SharePoint, Azure Blob Storage, OneLake, existing Azure AI Search indexes, and selected external sources) and exposes them through a single, permission‑aware endpoint.

When an agent queries a knowledge base, Foundry IQ:

  •  Plans how the query should be executed
  •  Selects relevant knowledge sources
  •  Runs retrieval (optionally in multiple steps)
  •  Enforces user permissions
  •  Returns grounded results with citations
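The five steps above can be sketched as a conceptual pipeline. This is illustrative pseudocode in Python, not Foundry IQ internals; every function and data structure here is a stand-in invented for the sketch.

```python
# Conceptual sketch of the retrieval steps; the in-memory DOCS corpus and the
# naive keyword match stand in for real knowledge sources and search.
DOCS = [
    {"text": "Travel policy: ...", "source": "sharepoint/hr", "allowed": {"alice"}},
    {"text": "Safety manual: ...", "source": "blob/safety", "allowed": {"alice", "bob"}},
]

def retrieve(query: str, user: str) -> dict:
    steps = [query]                                   # 1. plan: a single step here
    sources = {d["source"] for d in DOCS}             # 2. select knowledge sources
    hits = [d for s in steps for d in DOCS
            if s.split()[0].lower() in d["text"].lower()]   # 3. run retrieval
    visible = [d for d in hits if user in d["allowed"]]     # 4. enforce permissions
    return {"content": [d["text"] for d in visible],        # 5. grounded results
            "citations": [d["source"] for d in visible]}    #    with citations

print(retrieve("safety procedures", "bob")["citations"])   # ['blob/safety']
```

The point of the sketch is the ordering: permission trimming happens after retrieval but before anything reaches the agent, so an unauthorized user simply gets an empty result set.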

A single knowledge base can be reused across multiple agents or applications, avoiding duplicated indexing and inconsistent retrieval logic.

What Foundry IQ is not:

It does not execute SQL queries, perform aggregations, or provide real‑time numeric accuracy. Foundry IQ retrieves unstructured text, not transactional or analytical data.

Where Foundry IQ Is a Good Fit

1. Multi‑Source, Distributed Knowledge

Foundry IQ is most valuable when relevant knowledge is spread across multiple systems. It removes the need for each agent to manage source‑specific routing and retrieval logic. This benefit increases as the number of sources grows; with a single source, the overhead is rarely justified.

2. Complex or Multi‑Part Questions

Foundry IQ’s agentic retrieval model is designed for questions that require:

  •  Decomposition into sub‑questions
  •  Retrieval from multiple documents
  •  Synthesis across sources

Its multi‑step retrieval approach is especially effective when a single document cannot answer the question on its own.

3. Reduced Custom Retrieval Engineering

Foundry IQ automates indexing, chunking, vectorization, and orchestration across sources. This makes it a strong choice for teams that want to focus on agent behavior rather than building and maintaining custom RAG pipelines.

4. Enterprise Security and Governance

Foundry IQ integrates with Microsoft Entra ID and supports document‑level permissions and Purview sensitivity labels where the underlying source allows it. This makes it suitable for internal or regulated scenarios where permission trimming is a hard requirement.

5. Shared Knowledge Across Multiple Agents

A single knowledge base can serve multiple agents or applications, reducing operational overhead and ensuring consistent retrieval behavior across experiences.

6. High Emphasis on Answer Quality and Trust

For scenarios where correctness, grounding, and citations matter more than latency or cost, Foundry IQ’s multi‑step retrieval typically produces higher‑quality answers than basic single‑pass RAG approaches.

Example Scenarios Where Foundry IQ Works Well

Scenario A: Internal Policy and Operations Assistant

An enterprise builds an internal assistant for store managers. Relevant information lives in:

• HR policies in SharePoint

• Safety procedures in Blob Storage

• Operations manuals in OneLake

Questions often span multiple documents. A single Foundry IQ knowledge base unifies these sources and enforces permissions automatically.

Scenario B: Compliance or Regulatory Knowledge Assistant

A compliance team needs answers strictly grounded in approved documents, with citations and access control. Foundry IQ ensures only authorized content is retrieved, reducing the risk of accidental data exposure.

Scenario C: Shared Knowledge Layer for Multiple Internal Agents

Multiple internal agents, such as chat assistants, workflow helpers, and embedded copilots, rely on the same procedural content. A shared knowledge base avoids duplicate indexing and centralizes governance.

Where Foundry IQ Is Not a Good Fit

1. Simple or Single‑Source Q&A

For a single, well‑defined source, Foundry IQ’s orchestration adds complexity without proportional benefit.

2. Structured or Analytical Data Queries

Foundry IQ does not execute live queries or calculations. It retrieves text, not metrics.

3. Ultra‑Low Latency or High‑Throughput Requirements

Agentic retrieval introduces LLM‑in‑the‑loop latency and token costs. For sub‑second responses at scale, simpler retrieval pipelines are more appropriate.

4. Highly Customized Retrieval Logic

Foundry IQ abstracts the retrieval pipeline. If you require fine‑grained control over scoring or transformations, a fully custom search pipeline may be preferable.

Example Scenarios Where Foundry IQ Is the Wrong Tool

Scenario D: Sales and Inventory Analytics Agent

Questions like “What were Q4 sales by region?” require live data queries. Indexing reports leads to stale answers. A direct SQL or analytics tool is the correct solution.

Scenario E: High‑Volume, Low‑Latency Assistant

Voice‑based assistants requiring sub‑second responses cannot tolerate the latency of agentic retrieval.

A Common Architecture Pattern

Most successful implementations combine:

  •  Foundry IQ for unstructured documents and policies
  •  Structured data tools for analytics and live queries
  •  An application or agent layer that routes questions based on intent

This avoids forcing a single tool to solve every problem.
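A minimal sketch of the routing layer described above, assuming a simple keyword heuristic for detecting analytical questions. The keyword list and handler names are hypothetical; a production router would more likely use an LLM or classifier for intent detection.

```python
# Hypothetical intent router: analytical questions go to a structured-data
# tool, everything else to a Foundry IQ knowledge base. The keyword set is
# illustrative only.
ANALYTICAL_KEYWORDS = {"sales", "revenue", "count", "average", "total",
                       "q1", "q2", "q3", "q4"}

def route(question: str) -> str:
    """Return which backend should handle the question."""
    words = {w.strip("?,.").lower() for w in question.split()}
    if words & ANALYTICAL_KEYWORDS:
        return "structured-data-tool"   # e.g. a SQL or analytics endpoint
    return "knowledge-base"             # Foundry IQ retrieval

print(route("What were Q4 sales by region?"))      # structured-data-tool
print(route("What is our remote work policy?"))    # knowledge-base
```

The design choice that matters is that routing happens before retrieval, so analytical questions never hit the (stale, text-only) knowledge base at all.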

Querying Foundry IQ Knowledge Bases Directly via Azure AI Search SDK

You can query Azure AI Foundry IQ knowledge bases directly using the azure-search-documents Python SDK without using Foundry Agent Service.

Your App → Azure AI Search SDK → Foundry IQ Knowledge Base → Grounded Results

Ideal when you want full orchestration control while still benefiting from managed, agentic retrieval.

How this works

 

Note: the code below is a reference implementation; adapt it before production use.

Install

pip install --pre azure-search-documents azure-identity

Setup (High Level)
  1. Provision Azure AI Search (Basic or higher)
    • Enable Azure AD and API key authentication
    • Enable a system‑assigned managed identity
  2. Ingest Content via Knowledge Sources
    • Blob Storage, SharePoint, or OneLake
    • Index, indexer, data source, and skillset are created automatically
    • Knowledge sources and KBs are created via REST API (2025‑11‑01‑preview)
  3. Create a Knowledge Base
    • minimal reasoning → semantic retrieval only (no LLM)
    • low / medium reasoning → requires an Azure OpenAI model
      • The search service’s managed identity needs the Cognitive Services User role
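Since knowledge sources are created over REST rather than the SDK, the request looks roughly like the sketch below. The post doesn't show the payload schema, so the path segment, `kind` discriminator, and parameter field names here are assumptions; check the 2025‑11‑01‑preview REST reference for the actual contract.

```python
import json

# Illustrative only: builds the URL and body for a PUT that would create a
# blob-backed knowledge source. Field names ("kind", "azureBlobParameters",
# "containerName") are assumed, not confirmed by this post.
SERVICE = "https://<search-service>.search.windows.net"
API_VERSION = "2025-11-01-preview"

def knowledge_source_request(name: str, container: str) -> tuple[str, str]:
    """Return the (url, json_body) pair for creating a knowledge source."""
    url = f"{SERVICE}/knowledgeSources/{name}?api-version={API_VERSION}"
    body = json.dumps({
        "name": name,
        "kind": "azureBlob",                       # assumed discriminator value
        "azureBlobParameters": {"containerName": container},
    })
    return url, body

url, body = knowledge_source_request("policies-src", "policy-docs")
print(url)
```

In a real script you would send this with an authenticated HTTP client (bearer token from `DefaultAzureCredential` or an admin API key).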
Querying the Knowledge Base (Python)

Initialize the Client

from azure.identity import DefaultAzureCredential
from azure.search.documents.knowledgebases import KnowledgeBaseRetrievalClient

client = KnowledgeBaseRetrievalClient(
    endpoint="https://<search-service>.search.windows.net",
    knowledge_base_name="<kb-name>",
    credential=DefaultAzureCredential(),
)

Minimal Reasoning (Fast, No LLM)

from azure.search.documents.knowledgebases.models import (
    KnowledgeBaseRetrievalRequest,
    KnowledgeRetrievalSemanticIntent,
    KnowledgeRetrievalMinimalReasoningEffort,
    KnowledgeRetrievalOutputMode,
)

request = KnowledgeBaseRetrievalRequest(
    intents=[KnowledgeRetrievalSemanticIntent(search="your question here")],
    retrieval_reasoning_effort=KnowledgeRetrievalMinimalReasoningEffort(),
    output_mode=KnowledgeRetrievalOutputMode.EXTRACTIVE_DATA,
)

response = client.retrieve(retrieval_request=request)

Conversational Reasoning (LLM‑Backed)
from azure.search.documents.knowledgebases.models import (
    KnowledgeBaseRetrievalRequest,
    KnowledgeBaseMessage,
    KnowledgeBaseMessageTextContent,
    KnowledgeRetrievalLowReasoningEffort,
    KnowledgeRetrievalOutputMode,
)

request = KnowledgeBaseRetrievalRequest(
    messages=[
        KnowledgeBaseMessage(
            role="user",
            content=[KnowledgeBaseMessageTextContent(text="<first user question>")]
        ),
        KnowledgeBaseMessage(
            role="assistant",
            content=[KnowledgeBaseMessageTextContent(text="<assistant response>")]
        ),
        KnowledgeBaseMessage(
            role="user",
            content=[KnowledgeBaseMessageTextContent(text="<follow-up question>")]
        ),
    ],
    retrieval_reasoning_effort=KnowledgeRetrievalLowReasoningEffort(),
    output_mode=KnowledgeRetrievalOutputMode.EXTRACTIVE_DATA,
)

response = client.retrieve(retrieval_request=request)

Keep in mind:

  • intents → minimal reasoning only
  • messages → low / medium reasoning only

They are not interchangeable.

 

Processing the Response

# Extracted content
for msg in (response.response or []):
    for item in (msg.content or []):
        print(item.text)

# Citations (handles blob, SharePoint, OneLake, and search index references)
for ref in (response.references or []):
    ref_id = getattr(ref, "id", None)
    url = getattr(ref, "blob_url", None) or getattr(ref, "url", None)
    print(f"[{ref_id}] {url}")

# Retrieval diagnostics
for record in (response.activity or []):
    elapsed = getattr(record, "elapsed_ms", None) or ""
    print(f"{record.type}: {elapsed}ms")

Output Modes

Mode             | When to Use
extractiveData   | Feed grounded chunks into your own LLM
answerSynthesis  | Return a ready‑made answer with citations (LLM required)


Security & Permissions

  • RBAC: Search Index Data Reader with DefaultAzureCredential

  • Permission trimming

    • Must be enabled at ingestion (ingestionPermissionOptions)

    • Enforced at query time by passing the user’s bearer token

response = client.retrieve(
    retrieval_request=request,
    x_ms_query_source_authorization="Bearer <user-token>"
)

 

Foundry IQ won't solve every retrieval problem. But when your agents need grounded, permission-aware answers from content scattered across SharePoint, Blob Storage, and OneLake, it handles the hard parts so you can focus on what your agent actually does.

Updated Apr 22, 2026
Version 1.0