azure ai foundry
198 TopicsAZD for Beginners: A Practical Introduction to Azure Developer CLI
If you are learning how to get an application from your machine into Azure without stitching together every deployment step by hand, Azure Developer CLI, usually shortened to azd , is one of the most useful tools to understand early. It gives developers a workflow-focused command line for provisioning infrastructure, deploying application code, wiring environment settings, and working with templates that reflect real cloud architectures rather than toy examples. This matters because many beginners hit the same wall when they first approach Azure. They can build a web app locally, but once deployment enters the picture they have to think about resource groups, hosting plans, databases, secrets, monitoring, configuration, and repeatability all at once. azd reduces that operational overhead by giving you a consistent developer workflow. Instead of manually creating each resource and then trying to remember how everything fits together, you start with a template or an azd -compatible project and let the tool guide the path from local development to a running Azure environment. If you are new to the tool, the AZD for Beginners learning resources are a strong place to start. The repository is structured as a guided course rather than a loose collection of notes. It covers the foundations, AI-first deployment scenarios, configuration and authentication, infrastructure as code, troubleshooting, and production patterns. In other words, it does not just tell you which commands exist. It shows you how to think about shipping modern Azure applications with them. What Is Azure Developer CLI? The Azure Developer CLI documentation on Microsoft Learn, azd is an open-source tool designed to accelerate the path from a local development environment to Azure. That description is important because it explains what the tool is trying to optimise. azd is not mainly about managing one isolated Azure resource at a time. It is about helping developers work with complete applications. The simplest way to think about it is this. Azure CLI, az , is broad and resource-focused. It gives you precise control over Azure services. Azure Developer CLI, azd , is application-focused. It helps you take a solution made up of code, infrastructure definitions, and environment configuration and push that solution into Azure in a repeatable way. Those tools are not competitors. They solve different problems and often work well together. For a beginner, the value of azd comes from four practical benefits: It gives you a consistent workflow built around commands such as azd init , azd auth login , azd up , azd show , and azd down . It uses templates so you do not need to design every deployment structure from scratch on day one. It encourages infrastructure as code through files such as azure.yaml and the infra folder. It helps you move from a one-off deployment towards a repeatable development workflow that is easier to understand, change, and clean up. Why Should You Care About azd A lot of cloud frustration comes from context switching. You start by trying to deploy an app, but you quickly end up learning five or six Azure services, authentication flows, naming rules, environment variables, and deployment conventions all at once. That is not a good way to build confidence. azd helps by giving a workflow that feels closer to software delivery than raw infrastructure management. You still learn real Azure concepts, but you do so through an application lens. You initialise a project, authenticate, provision what is required, deploy the app, inspect the result, and tear it down when you are done. That sequence is easier to retain because it mirrors the way developers already think about shipping software. This is also why the AZD for Beginners resource is useful. It does not assume every reader is already comfortable with Azure. It starts with foundation topics and then expands into more advanced paths, including AI deployment scenarios that use the same core azd workflow. That progression makes it especially suitable for students, self-taught developers, workshop attendees, and engineers who know how to code but want a clearer path into Azure deployment. What You Learn from AZD for Beginners The AZD for Beginners course is structured as a learning journey rather than a single quickstart. That matters because azd is not just a command list. It is a deployment workflow with conventions, patterns, and trade-offs. The course helps readers build that mental model gradually. At a high level, the material covers: Foundational topics such as what azd is, how to install it, and how the basic deployment loop works. Template-based development, including how to start from an existing architecture rather than building everything yourself. Environment configuration and authentication practices, including the role of environment variables and secure access patterns. Infrastructure as code concepts using the standard azd project structure. Troubleshooting, validation, and pre-deployment thinking, which are often ignored in beginner content even though they matter in real projects. Modern AI and multi-service application scenarios, showing that azd is not limited to basic web applications. One of the strongest aspects of the course is that it does not stop at the first successful deployment. It also covers how to reason about configuration, resource planning, debugging, and production readiness. That gives learners a more realistic picture of what Azure development work actually looks like. The Core azd Workflow The official overview on Microsoft Learn and the get started guide both reinforce a simple but important idea: most beginners should first understand the standard workflow before worrying about advanced customisation. That workflow usually looks like this: Install azd . Authenticate with Azure. Initialise a project from a template or in an existing repository. Run azd up to provision and deploy. Inspect the deployed application. Remove the resources when finished. Here is a minimal example using an existing template: # Install azd on Windows winget install microsoft.azd # Check that the installation worked azd version # Sign in to your Azure account azd auth login # Start a project from a template azd init --template todo-nodejs-mongo # Provision Azure resources and deploy the app azd up # Show output values such as the deployed URL azd show # Clean up everything when you are done learning azd down --force --purge This sequence is important because it teaches beginners the full lifecycle, not only deployment. A lot of people remember azd up and forget the cleanup step. That leads to wasted resources and avoidable cost. The azd down --force --purge step is part of the discipline, not an optional extra. Installing azd and Verifying Your Setup The official install azd guide on Microsoft Learn provides platform-specific instructions. Because this repository targets developer learning, it is worth showing the common install paths clearly. # Windows winget install microsoft.azd # macOS brew tap azure/azd && brew install azd # Linux curl -fsSL https://aka.ms/install-azd.sh | bash After installation, verify the tool is available: azd version That sounds obvious, but it is worth doing immediately. Many beginner problems come from assuming the install completed correctly, only to discover a path issue or outdated version later. Verifying early saves time. The Microsoft Learn installation page also notes that azd installs supporting tools such as GitHub CLI and Bicep CLI within the tool's own scope. For a beginner, that is helpful because it removes some of the setup friction you might otherwise need to handle manually. What Happens When You Run azd up ? One of the most important questions is what azd up is actually doing. The short answer is that it combines provisioning and deployment into one workflow. The longer answer is where the learning value sits. When you run azd up , the tool looks at the project configuration, reads the infrastructure definition, determines which Azure resources need to exist, provisions them if necessary, and then deploys the application code to those resources. In many templates, it also works with environment settings and output values so that the project becomes reproducible rather than ad hoc. That matters because it teaches a more modern cloud habit. Instead of building infrastructure manually in the portal and then hoping you can remember how you did it, you define the deployment shape in source-controlled files. Even at beginner level, that is the right habit to learn. Understanding the Shape of an azd Project The Azure Developer CLI templates overview explains the standard project structure used by azd . If you understand this structure early, templates become much less mysterious. A typical azd project contains: azure.yaml to describe the project and map services to infrastructure targets. An infra folder containing Bicep or Terraform files for infrastructure as code. A src folder, or equivalent source folders, containing the application code that will be deployed. A local .azure folder to store environment-specific settings for the project. Here is a minimal example of what an azure.yaml file can look like in a simple app: name: beginner-web-app metadata: template: beginner-web-app services: web: project: ./src/web host: appservice This file is small, but it carries an important idea. azd needs a clear mapping between your application code and the Azure service that will host it. Once you see that, the tool becomes easier to reason about. You are not invoking magic. You are describing an application and its hosting model in a standard way. Start from a Template, Then Learn the Architecture Beginners often assume that using a template is somehow less serious than building something from scratch. In practice, it is usually the right place to begin. The official docs for templates and the Awesome AZD gallery both encourage developers to start from an existing architecture when it matches their goals. That is a sound learning strategy for two reasons. First, it lets you experience a working deployment quickly, which builds confidence. Second, it gives you a concrete project to inspect. You can look at azure.yaml , explore the infra folder, inspect the app source, and understand how the pieces connect. That teaches more than reading a command reference in isolation. The AZD for Beginners material also leans into this approach. It includes chapter guidance, templates, workshops, examples, and structured progression so that readers move from successful execution into understanding. That is much more useful than a single command demo. A practical beginner workflow looks like this: # Pick a known template azd init --template todo-nodejs-mongo # Review the files that were created or cloned # - azure.yaml # - infra/ # - src/ # Deploy it azd up # Open the deployed app details azd show Once that works, do not immediately jump to a different template. Spend time understanding what was deployed and why. Where AZD for Beginners Fits In The official docs are excellent for accurate command guidance and conceptual documentation. The AZD for Beginners repository adds something different: a curated learning path. It helps beginners answer questions such as these: Which chapter should I start with if I know Azure a little but not azd ? How do I move from a first deployment into understanding configuration and authentication? What changes when the application becomes an AI application rather than a simple web app? How do I troubleshoot failures instead of copying commands blindly? The repository also points learners towards workshops, examples, a command cheat sheet, FAQ material, and chapter-based exercises. That makes it particularly useful in teaching contexts. A lecturer or workshop facilitator can use it as a course backbone, while an individual learner can work through it as a self-study track. For developers interested in AI, the resource is especially timely because it shows how the same azd workflow can be used for AI-first solutions, including scenarios connected to Microsoft Foundry services and multi-agent architectures. The important beginner lesson is that the workflow stays recognisable even as the application becomes more advanced. Common Beginner Mistakes and How to Avoid Them A good introduction should not only explain the happy path. It should also point out the places where beginners usually get stuck. Skipping authentication checks. If azd auth login has not completed properly, later commands will fail in ways that are harder to interpret. Not verifying the installation. Run azd version immediately after install so you know the tool is available. Treating templates as black boxes. Always inspect azure.yaml and the infra folder so you understand what the project intends to provision. Forgetting cleanup. Learning environments cost money if you leave them running. Use azd down --force --purge when you are finished experimenting. Trying to customise too early. First get a known template working exactly as designed. Then change one thing at a time. If you do hit problems, the official troubleshooting documentation and the troubleshooting sections inside AZD for Beginners are the right next step. That is a much better habit than searching randomly for partial command snippets. How I Would Approach AZD as a New Learner If I were introducing azd to a student or a developer who is comfortable with code but new to Azure delivery, I would keep the learning path tight. Read the official What is Azure Developer CLI? overview so the purpose is clear. Install the tool using the Microsoft Learn install guide. Work through the opening sections of AZD for Beginners. Deploy one template with azd init and azd up . Inspect azure.yaml and the infrastructure files before making any changes. Run azd down --force --purge so the lifecycle becomes a habit. Only then move on to AI templates, configuration changes, or custom project conversion. That sequence keeps the cognitive load manageable. It gives you one successful deployment, one architecture to inspect, and one repeatable workflow to internalise before adding more complexity. Why azd Is Worth Learning Now azd matters because it reflects how modern Azure application delivery is actually done: repeatable infrastructure, source-controlled configuration, environment-aware workflows, and application-level thinking rather than isolated portal clicks. It is useful for straightforward web applications, but it becomes even more valuable as systems gain more services, more configuration, and more deployment complexity. That is also why the AZD for Beginners resource is worth recommending. It gives new learners a structured route into the tool instead of leaving them to piece together disconnected docs, samples, and videos on their own. Used alongside the official Microsoft Learn documentation, it gives you both accuracy and progression. Key Takeaways azd is an application-focused Azure deployment tool, not just another general-purpose CLI. The core beginner workflow is simple: install, authenticate, initialise, deploy, inspect, and clean up. Templates are not a shortcut to avoid learning. They are a practical way to learn architecture through working examples. AZD for Beginners is valuable because it turns the tool into a structured learning path. The official Microsoft Learn documentation for Azure Developer CLI should remain your grounding source for commands and platform guidance. Next Steps If you want to keep going, start with these resources: AZD for Beginners for the structured course, examples, and workshop materials. Azure Developer CLI documentation on Microsoft Learn for official command, workflow, and reference guidance. Install azd if you have not set up the tool yet. Deploy an azd template for the first full quickstart. Azure Developer CLI templates overview if you want to understand the project structure and template model. Awesome AZD if you want to browse starter architectures. If you are teaching others, this is also a good sequence for a workshop: start with the official overview, deploy one template, inspect the project structure, and then use AZD for Beginners as the path for deeper learning. That gives learners both an early win and a solid conceptual foundation.Agents League: Meet the Winners
Agents League brought together developers from around the world to build AI agents using Microsoft's developer tools. With 100+ submissions across three tracks, choosing winners was genuinely difficult. Today, we're proud to announce the category champions. 🎨 Creative Apps Winner: CodeSonify View project CodeSonify turns source code into music. As a genuinely thoughtful system, its functions become ascending melodies, loops create rhythmic patterns, conditionals trigger chord changes, and bugs produce dissonant sounds. It supports 7 programming languages and 5 musical styles, with each language mapped to its own key signature and code complexity directly driving the tempo. What makes CodeSonify stand out is the depth of execution. CodeSonify team delivered three integrated experiences: a web app with real-time visualization and one-click MIDI export, an MCP server exposing 5 tools inside GitHub Copilot in VS Code Agent Mode, and a diff sonification engine that lets you hear a code review. A clean refactor sounds harmonious. A messy one sounds chaotic. The team even built the MIDI generator from scratch in pure TypeScript with zero external dependencies. Built entirely with GitHub Copilot assistance, this is one of those projects that makes you think about code differently. 🧠 Reasoning Agents Winner: CertPrep Multi-Agent System View project CertPrep Multi-Agent System team built a production-grade 8-agent system for personalized Microsoft certification exam preparation, supporting 9 exam families including AI-102, AZ-204, AZ-305, and more. Each agent has a distinct responsibility: profiling the learner, generating a week-by-week study schedule, curating learning paths, tracking readiness, running mock assessments, and issuing a GO / CONDITIONAL GO / NOT YET booking recommendation. The engineering behind the scene here is impressive. A 3-tier LLM fallback chain ensures the system runs reliably even without Azure credentials, with the full pipeline completing in under 1 second in mock mode. A 17-rule guardrail pipeline validates every agent boundary. Study time allocation uses the Largest Remainder algorithm to guarantee no domain is silently zeroed out. 342 automated tests back it all up. This is what thoughtful multi-agent architecture looks like in practice. 💼 Enterprise Agents Winner: Whatever AI Assistant (WAIA) View project WAIA is a production-ready multi-agent system for Microsoft 365 Copilot Chat and Microsoft Teams. A workflow agent routes queries to specialized HR, IT, or Fallback agents, transparently to the user, handling both RAG-pattern Q&A and action automation — including IT ticket submission via a SharePoint list. Technically, it's a showcase of what serious enterprise agent development looks like: a custom MCP server secured with OAuth Identity Passthrough, streaming responses via the OpenAI Responses API, Adaptive Cards for human-in-the-loop approval flows, a debug mode accessible directly from Teams or Copilot, and full OpenTelemetry integration visible in the Foundry portal. Franck also shipped end-to-end automated Bicep deployment so the solution can land in any Azure environment. It's polished, thoroughly documented, and built to be replicated. Thank you To every developer who submitted and shipped projects during Agents League: thank you 💜 Your creativity and innovation brought Agents League to life! 👉 Browse all submissions on GitHubWhy Data Platforms Must Become Intelligence Platforms for AI Agents to Work
The promise and the gap Your organization has invested in an AI agent. You ask it: "Prepare a summary of Q3 revenue by region, including year-over-year trends and top product lines." The agent finds revenue numbers in a SQL warehouse, product metadata in Dataverse, regional mappings in SharePoint, historical data in Azure Blob Storage, and organizational context in Microsoft Graph. Five data sources. Five schemas. No shared definitions. The result? The agent hallucinates, returns incomplete data, or asks a dozen clarifying questions that defeat its purpose. This isn't a model limitation — modern AI models are highly capable. The real constraint is that enterprise data is not structured for reasoning. Traditional data platforms were built for humans to query. Intelligence platforms must be built for agents to _reason_ over. That distinction is the subject of this post. What you'll understand Why fragmented enterprise data blocks effective AI agents What distinguishes a storage platform from an intelligence platform How Microsoft Fabric and Azure AI Foundry work together to enable trustworthy, agent-ready data access The enterprise pain: Fragmented data breaks AI agents Enterprise data is spread across relational databases, data lakes, business applications, collaboration platforms, third-party APIs, and Microsoft Graph — each with its own schema and security model. Humans navigate this fragmentation through institutional knowledge and years of muscle memory. A seasoned analyst knows that "revenue" in the data warehouse means net revenue after returns, while "revenue" in the CRM means gross bookings. An AI agent does not. The cost of this fragmentation isn't hypothetical. Each new AI agent deployment can trigger another round of bespoke data preparation — custom integrations and transformation pipelines just to make data usable, let alone agent-ready. This approach doesn't scale. Why agents struggle without a semantic layer To produce a trustworthy answer, an AI agent needs: (1) **data access** to reach relevant sources, (2) **semantic context** to understand what the data _means_ (business definitions, relationships, hierarchies), and (3) **trust signals** like lineage, permissions, and freshness metadata. Traditional platforms provide the first but rarely the second or third — leaving agents to infer meaning from column names and table structures. This is fragile at best and misleading at worst. Figure 1: Without a shared semantic layer, AI agents must interpret raw, disconnected data across multiple systems — often leading to inconsistent or incomplete results. From storage to intelligence: What must change The fix isn't another ETL pipeline or another data integration tool. The fix is a fundamental shift in what we expect from a data platform. A storage platform asks: "Where is the data, and how do I access it?" An intelligence platform asks: "What does the data mean, who can use it, and how can an agent reason over it?" This shift requires four foundational pillars: Pillar 1: Unified data access OneLake, the data lake built into Microsoft Fabric, provides a single logical namespace across an organization. Whether data originates in a Fabric lakehouse, a warehouse, or an external storage account, OneLake makes it accessible through one interface — using shortcuts and mirroring rather than requiring data migration. This respects existing investments while reducing fragmentation. Pillar 2: Shared semantic layer Semantic models in Microsoft Fabric define business measures, table relationships, human-readable field descriptions, and row-level security. When an agent queries a semantic model instead of raw tables, it gets _answers_ — like `Total Revenue = $42.3M for North America in Q3` — not raw result sets requiring interpretation and aggregation. Before vs After: What changes for an agent? Without semantic layer: Queries raw tables Infers business meaning Risk of incorrect aggregation With semantic layer: Queries `[Total Revenue]` Uses business-defined logic Gets consistent, governed results Pillar 3: Context enrichment Microsoft Graph adds organizational signals — people and roles, activity patterns, and permissions — helping agents produce responses that are not just accurate, but _relevant_ and _appropriately scoped_ to the person asking. Pillar 4: Agent-ready APIs Data Agents in Microsoft Fabric (currently in preview) provide a natural-language interface to semantic models and lakehouses. Instead of generating SQL, an AI agent can ask: "What was Q3 revenue by region?" and receive a structured, sourced response. This is the critical difference: the platform provides structured context and business logic, helping reduce the reasoning burden on the agent. Figure 2: An intelligence platform adds semantic context, trust signals, and agent-ready APIs on top of unified data access — enabling AI agents to combine structured data, business definitions, and relationships to produce more consistent responses. Microsoft Fabric as the intelligence layer Microsoft Fabric is often described as a unified analytics platform. That description is accurate but incomplete. In the context of AI agents, Fabric's role is better understood as an **intelligence layer** — a platform that doesn't just store and process data, but _makes data understandable_ to autonomous systems. Let's look at each capability through the lens of agent readiness. OneLake: One namespace, many sources OneLake provides a single logical namespace backed by Azure Data Lake Storage Gen2. For AI agents, this means one authentication context, one discovery mechanism, and one governance surface. Key capabilities: **shortcuts** (reference external data without copying), **mirroring** (replicate from Azure SQL, Cosmos DB, or Snowflake), and a **unified security model**. For more on OneLake architecture, see [OneLake documentation on Microsoft Learn](https://learn.microsoft.com/fabric/onelake/onelake-overview). Semantic models: Business logic that agents can understand Semantic models (built on the Analysis Services engine) transform raw tables into business concepts: Raw Table Column Semantic Model Measure `fact_sales.amount` `[Total Revenue]` — Sum of net sales after returns `fact_sales.amount / dim_product.cost` `[Gross Margin %]` — Revenue minus COGS as a percentage `fact_sales.qty` YoY comparison `[YoY Growth %]` — Year-over-year quantity growth Code Snippet 1 — Querying a Fabric Semantic Model with Semantic Link (Python) import sempy.fabric as fabric # Query business-defined measures — no need to know underlying table schemas dax_query = """ EVALUATE SUMMARIZECOLUMNS( 'Geography'[Region], 'Calendar'[FiscalQuarter], "Total Revenue", [Total Revenue], "YoY Growth %", [YoY Growth %] ) """ result_df = fabric.evaluate_dax( dataset="Contoso Sales Analytics", workspace="Contoso Analytics Workspace", dax_string=dax_query ) print(result_df.head()) # NOTE: Output shown is illustrative and based on the semantic model definition # Output (illustrative): # Region FiscalQuarter Total Revenue YoY Growth % # North America Q3 FY2026 42300000 8.2 # Europe Q3 FY2026 31500000 5.7 Key takeaway: The agent doesn’t need to know that revenue is in `fact_sales.amount` or that fiscal quarters don’t align with calendar quarters. The semantic model handles all of this. Code Snippet 2 — Discovering Available Models and Measures (Python) Before an agent can query, it needs to _discover_ what data is available. Semantic Link provides programmatic access to model metadata — enabling agents to find relevant measures without hardcoded knowledge. import sempy.fabric as fabric # Discover available semantic models in the workspace datasets = fabric.list_datasets(workspace="Contoso Analytics Workspace") print(datasets[["Dataset Name", "Description"]]) # NOTE: Output shown is illustrative and based on the semantic model definition # Output (illustrative): # Dataset Name Description # Contoso Sales Analytics Revenue, margins, and growth metrics # Contoso HR Analytics Headcount, attrition, and hiring pipeline # Contoso Supply Chain Inventory, logistics, and supplier data # Inspect available measures — these are the business-defined metrics an agent can query measures = fabric.list_measures( dataset="Contoso Sales Analytics", workspace="Contoso Analytics Workspace" ) print(measures[["Table Name", "Measure Name", "Description"]]) # Output (illustrative): # Table Name Measure Name Description # Sales Total Revenue Sum of net sales after returns # Sales Gross Margin % Revenue minus COGS as a percentage # Sales YoY Growth % Year-over-year quantity growth Key takeaway: An agent can programmatically discover which semantic models exist and what measures they expose — turning the platform into a self-describing data catalog that agents can navigate autonomously. For more on Semantic Link, see the Semantic Link documentation on Microsoft Learn. Data Agents: Natural-language access for AI (preview) Note: Fabric Data Agents are currently in preview. See [Microsoft preview terms](https://learn.microsoft.com/legal/microsoft-fabric-preview) for details. A Data Agent wraps a semantic model and exposes it as a natural-language-queryable endpoint. An AI Foundry agent can register a Fabric Data Agent as a tool — when it needs data, it calls the Data Agent like any other tool. Important: In production scenarios, use managed identities or Microsoft Entra ID authentication. Always follow the [principle of least privilege](https://learn.microsoft.com/entra/identity-platform/secure-least-privileged-access) when configuring agent access. Microsoft Graph: Organizational context Microsoft Graph adds the final layer: who is asking (role-appropriate detail), what’s relevant (trending datasets), and who should review (data stewards). Fabric’s integration with Graph brings these signals into the data platform so agents produce contextually appropriate responses. Tying it together: Azure AI Foundry + Microsoft Fabric The real power of the intelligence platform concept emerges when you see how Azure AI Foundry and Microsoft Fabric are designed to work together. The integration pattern Azure AI Foundry provides the orchestration layer (conversations, tool selection, safety, response generation). Microsoft Fabric provides the data intelligence layer (data access, semantic context, structured query resolution). The integration follows a tool-calling pattern: 1.User prompt → End user asks a question through an AI Foundry-powered application. 2.Tool call → The agent selects the appropriate Fabric Data Agent and sends a natural-language query. 3.Semantic resolution → The Data Agent translates the query into DAX against the semantic model and executes it via OneLake. 4.Structured response → Results flow back through the stack, with each layer adding context (business definitions, permissions verification, data lineage). 5.User response → The AI Foundry agent presents a grounded, sourced answer to the user. Why these matters No custom ETL for agents — Agents query the intelligence platform directly No prompt-stuffing — The semantic model provides business context at query time No trust gap — Governed semantic models enforce row-level security and lineage No one-off integrations — Multiple agents reuse the same Data Agents Code Snippet 3 — Azure AI Foundry Agent with Fabric Data Agent Tool (Python) The following example shows how an Azure AI Foundry agent registers a Fabric Data Agent as a tool and uses it to answer a business question. The agent handles tool selection, query routing, and response grounding automatically. from azure.ai.projects import AIProjectClient from azure.ai.projects.models import FabricTool from azure.identity import DefaultAzureCredential # Connect to Azure AI Foundry project project_client = AIProjectClient.from_connection_string( credential=DefaultAzureCredential(), conn_str="<your-ai-foundry-connection-string>" ) # Register a Fabric Data Agent as a grounding tool # The connection references a Fabric workspace with semantic models fabric_tool = FabricTool(connection_id="<fabric-connection-id>") # Create an agent that uses the Fabric Data Agent for data queries agent = project_client.agents.create_agent( model="gpt-4o", name="Contoso Revenue Analyst", instructions="""You are a business analytics assistant for Contoso. Use the Fabric Data Agent tool to answer questions about revenue, margins, and growth. Always cite the source semantic model.""", tools=fabric_tool.definitions ) # Start a conversation thread = project_client.agents.create_thread() message = project_client.agents.create_message( thread_id=thread.id, role="user", content="What was Q3 revenue by region, and which region grew fastest?" ) # The agent automatically calls the Fabric Data Agent tool, # queries the semantic model, and returns a grounded response run = project_client.agents.create_and_process_run( thread_id=thread.id, agent_id=agent.id ) # Retrieve the agent's response messages = project_client.agents.list_messages(thread_id=thread.id) print(messages.data[0].content[0].text.value) # NOTE: Output shown is illustrative and based on the semantic model definition # Output (illustrative): # "Based on the Contoso Sales Analytics model, Q3 FY2026 revenue by region: # - North America: $42.3M (+8.2% YoY) # - Europe: $31.5M (+5.7% YoY) # - Asia Pacific: $18.9M (+12.1% YoY) — fastest growing # Source: Contoso Sales Analytics semantic model, OneLake" Key takeaway: The AI Foundry agent never writes SQL or DAX. It calls the Fabric Data Agent as a tool, which resolves the query against the semantic model. The response comes back grounded with source attribution — matching the five-step integration pattern described above. Figure 3: Each layer adds context — semantic models provide business definitions, Graph adds permissions awareness, and Data Agents provide the natural-language interface. Getting started: Practical next steps You don't need to redesign your entire data platform to begin this shift. Start with one high-value domain and expand incrementally. Step 1: Consolidate data access through OneLake Create OneLake shortcuts to your most critical data sources — core business metrics, customer data, financial records. No migration needed. [Create OneLake shortcuts](https://learn.microsoft.com/fabric/onelake/create-onelake-shortcut) Step 2: Build semantic models with business definitions For each major domain (sales, finance, operations), create a semantic model with key measures, table relationships, human-readable descriptions, and row-level security. [Create semantic models in Microsoft Fabric](https://learn.microsoft.com/fabric/data-warehouse/semantic-models) Step 3: Enable Data Agents (preview) Expose your semantic models as natural-language endpoints. Start with a single domain to validate the pattern. Note: Review the [preview terms](https://learn.microsoft.com/legal/microsoft-fabric-preview) and plan for API changes. [Fabric Data Agents overview](https://learn.microsoft.com/fabric/data-science/concept-data-agent) Step 4: Connect Azure AI Foundry agents Register Data Agents as tools in your AI Foundry agent configuration. Azure AI Foundry documentation Conclusion: The bottleneck isn't the model — it's the platform Models can reason, plan, and hold multi-turn conversations. But in the enterprise, the bottleneck for effective AI agents is the data platform underneath. Agents can’t reason over data they can’t find, apply business logic that isn’t encoded, respect permissions that aren’t enforced, or cite sources without lineage. The shift from storage to intelligence requires unified data access, a shared semantic layer, organizational context, and agent-ready APIs. Microsoft Fabric provides these capabilities, and its integration with Azure AI Foundry makes this intelligence layer accessible to AI agents. Disclaimer: Some features described in this post, including Fabric Data Agents, are currently in preview. Preview features may change before general availability, and their availability, functionality, and pricing may differ from the final release. See [Microsoft preview terms](https://learn.microsoft.com/legal/microsoft-fabric-preview) for details.The Business Foundation: Why Most Companies Aren’t Ready for Agentic AI
Before agents can execute decisions, organizations must redesign how they structure responsibility, data, governance, and operational context before autonomy can scale. The enterprise AI landscape has shifted. Organizations are moving beyond chatbots and isolated predictive models toward systems that can plan, decide, and execute multi-step work across finance, engineering operations, supply chains, and customer service. Many analysts now expect agentic AI to unlock major productivity gains across knowledge work. But despite the momentum, adoption remains limited. As of 2025, only about 2% of organizations have deployed agent-based systems at real operational scale, while most remain stuck in pilots. The reason is not model capability. It is readiness. The Core Problem Most organizations still treat AI adoption as a technical rollout exercise and measure progress through deployment indicators such as copilots enabled, pilots launched, or models evaluated. These metrics reflect experimentation activity, but they do not show whether an organization is ready to operate systems that make decisions and execute actions inside business workflows. Agentic systems do more than generate insights; they participate directly in operational processes. The gap between deploying AI tools and safely delegating decision-making authority to them is where many transformation efforts begin to stall. True enterprise readiness for agentic AI is not defined by how many models an organization deploys or how many pilots it launches. It depends on whether the organization can safely delegate bounded decisions to autonomous systems. In practice, this requires: Strategy and decision scoping: identifying where autonomous execution creates value and where human oversight must remain in place Process and decision-system maturity: redesigning workflows for human-agent collaboration with clear escalation boundaries Context-ready data foundations: ensuring agents operate on consistent, policy-aware operational context rather than fragmented data silos Governance and accountability structures: defining what agents may recommend, execute, escalate, or never touch, supported by auditability and oversight Team readiness and lifecycle management: preparing teams to supervise autonomous execution and managing agents as ongoing operational participants rather than static tools Coordination architecture readiness: aligning multiple agents across domains so local optimization does not create organizational conflict This article explains why traditional enterprise environments are not yet prepared for autonomous agents, what true agentic readiness actually looks like in practice, and the sequence of organizational changes required before decision-capable systems can be deployed safely at scale. I. The Readiness Illusion and the Root Causes of Failure Most organizations are deploying agentic systems into environments designed exclusively for human execution. That mismatch produces predictable friction across five structural layers. 1. Fragmented Operational Context (The Data Problem) Enterprises have a lot of data. What they often lack is usable context. Traditional systems record what happened. Agents also need to understand why something happened, how systems are connected, and where policy limits apply. In most organizations, customer systems, telemetry platforms, identity services, and finance tools do not stay aligned in real time. As a result, agents operate across disconnected information rather than a shared operational picture. This creates real risk. With generative AI, poor data quality usually produces a weak answer. With agentic AI, poor data quality can produce the wrong action at scale. More APIs, more pipelines, and more dashboards do not fix this by themselves. Without a shared semantic context across systems, agents can still make decisions that are internally logical but operationally wrong. For example, an agent may see that a customer received a large discount and conclude that future discounts should be limited, while missing that the original discount was approved because of a service outage and a retention risk. The data is available, but the business meaning behind it is not. 2. Undocumented Decision Systems Most organizations document workflows. However, very few document decision authority clearly enough for autonomous execution. Agents need to know where they are allowed to act, when they must escalate, and which decisions remain human-only. Without these boundaries, organizations often follow the same pattern: the first unexpected situation appears, confidence drops, and the agent is switched off. This is not a model problem. It is a decision-structure problem. Before deploying agents, organizations must be able to explain which decisions can be delegated and who remains responsible for each step. Many cannot yet do this. 3. The Governance Paradox Agentic systems do not fit traditional governance models. Most organizations still assume a simple structure: user → application → resource Agent-based systems introduce a new layer: user → agent → tools → resource This change affects access control, compliance processes, and audit visibility. Organizations usually buy agents like software tools but must manage them more like team members. That gap is rarely addressed before deployment begins. This issue is already visible today. Many enterprises are using vendor copilots and embedded AI features inside business systems without clear ownership, audit coverage, or governance rules. This creates a growing “shadow AI” layer even before intentional agent programs start. 4. Identity and Accountability Ambiguity Many organizations cannot clearly answer a simple question: who is responsible when an agent makes a mistake? In practice, agents often receive permissions that are broader than necessary, execution traces are difficult to follow across multiple systems, and accountability is split between IT, compliance, and business teams. Without clear attribution, autonomy introduces hidden risk instead of efficiency. Delegation without accountability is not automation. It is unmanaged risk. 5. Organizational Misalignment Most transformation programs still assume employees will use AI as a tool. Agentic environments change the role of employees from operators to supervisors. People are expected to review outcomes, guide behavior, and manage exceptions instead of executing every step themselves. Research from BCG shows that around 70% of AI project challenges come from people and process issues rather than technology. Organizations that invest in change management are significantly more likely to see successful results. Organizational readiness is not something to address later. It is required before agents can operate safely. Common Failure Patterns at a Glance Common failure patterns like these are already visible in real deployments. The Klarna case illustrates the challenge well. After replacing several hundred customer service roles with AI, the company later reported lower resolution quality for complex cases, declining satisfaction scores, and higher escalation rates, which led to renewed hiring in support roles. The outcome did not point to a failure of the model itself. It highlighted what happens when autonomous systems are introduced without the supporting process, governance, and team structures required for sustained operation. II. Defining True Agentic Readiness Agentic readiness is not just about having the right tools in place. It is about whether the organization has the capability to use autonomous systems safely and effectively. Definition Agentic readiness is the ability to safely delegate bounded operational decisions to autonomous systems while maintaining accountability, observability, and policy alignment across the full execution chain. Research consistently shows that organizations benefit from AI only when multiple maturity layers advance together. The MIT CISR AI Maturity Model, based on data from 721 companies, demonstrates that financial performance improves as organizations progress through the stages. Companies in early stages often perform below industry averages, while those reaching later stages perform significantly better. The key insight is that maturity is cumulative. Organizations cannot skip foundational steps and still expect reliable outcomes. For agentic systems, those cumulative layers include strategy alignment, decision-ready processes, context-ready data, governance structures, organizational roles, and technical architecture. When only some of these elements are in place, organizations produce pilots. When they advance together, organizations produce transformation. From Activity Metrics to Outcome Metrics One of the clearest signs of readiness is how an organization measures progress. Organizations at an early stage usually focus on activity: Number of models deployed Pilots launched Features enabled User onboarding numbers and API call volume More mature organizations focus on outcomes: Better decision quality and fewer errors Higher throughput for clearly defined tasks Consistent operation within safe autonomy boundaries Complete audit trails and accurate escalation handling This is not a semantic distinction. Organizations measuring activity invest indefinitely in pilots because they have no signal telling them a pilot has succeeded or failed. The measurement framework is itself a prerequisite for the transformation sequence. III. The Transformation Sequence Most Organizations Skip Many organizations begin agent adoption in the wrong order. Platforms are procured before governance is defined. Models are evaluated before workflows are structured. Autonomy is introduced before decision authority is mapped. The result is not faster progress. It is earlier failure, followed by expensive cleanup later. In traditional cloud transformation, architecture precedes automation. Agentic transformation follows the same rule: decision structure must exist before delegation can scale. Step 1: Strategic Alignment and Decision Scoping Organizations should begin by identifying where autonomy creates value safely — not where it is technically possible and not where ambitions are highest. Strong early candidates usually share the same characteristics: structured decisions, bounded scope, reversible actions, and high execution frequency. Typical examples include incident triage and routing, capacity classification, environment status updates, and prioritization support. These are good starting points not because they are simple, but because failures are visible, recoverable, and useful for learning. Delegation should grow gradually from bounded decision spaces toward broader authority. Organizations that struggle often start with highly visible, high-risk use cases because the business case looks attractive. Organizations that succeed usually begin with frequent, lower-impact decisions where feedback loops are short and improvements can happen quickly. Step 2: Process Maturity and Boundary Setting Agents do not fix broken workflows. They execute them faster. If a process depends on informal judgment, tribal knowledge, or undocumented exception handling, an agent will reproduce those weaknesses at machine speed. Before introducing autonomy, organizations should establish structured runbooks with clear execution paths, explicit escalation logic an agent can evaluate, defined exception-handling rules that do not rely on intuition, and clear boundaries between decisions an agent may take and those that must remain with humans. This level of discipline requires documentation precision that many organizations have never needed before. A statement such as “the engineer uses judgment” is not a runbook step. It is an undocumented dependency that will later appear as an agent failure. This is also where leaders face a practical choice: add agents on top of fragile legacy workflows, or redesign those workflows so delegation can happen safely. In many cases, the second path is slower at first but far more durable. Step 3: Data Context and Decision Awareness Agents cannot operate reliably in fragmented environments. The solution is not simply collecting more data. What they require is decision-aware context: structured knowledge about relationships between systems, service dependencies, environment classification, policy boundaries, and operational intent. This is a different challenge from building analytics platforms. Analytics depends on broad visibility across large datasets. Agentic execution depends on precise, current, and consistent information at the moment a decision is made. A customer record that is accurate enough for reporting may not be reliable enough for an agent executing a contract action. Because of this difference, data readiness becomes a leadership concern rather than only an infrastructure task. Microsoft’s digital transformation guidance captures this clearly with the principle “no AI without data”: organizations should identify critical data sources, establish governance ownership, improve quality, and define controlled access before introducing agents into operational workflows. Step 4: Governance and Delegation Redesign Organizations must explicitly define four categories of agent authority before deployment: What agents may recommend (advisory boundary) What agents may execute autonomously (execution boundary) What requires human approval before execution (escalation boundary) What remains permanently restricted regardless of confidence (prohibition boundary) These policies cannot remain static. Agentic systems require continuous supervision, not periodic review. Research supports this shift. Studies of governance professionals working with autonomous systems show that adopting traditional Enterprise Risk Management frameworks alone does not significantly reduce governance incidents. What makes the difference is integrating human oversight into execution loops and strengthening machine identity security. In practice, this means organizations need a delegated-autonomy governance function: a cross-functional group with representation from IT, compliance, legal, and business teams that continuously defines and monitors the boundaries of agent behavior. This is different from extending existing approval committees. Governance must move from acting as a gate before deployment to operating as a supervision layer throughout the lifecycle of the agent. This creates a basic operational tension: organizations adopt agents to reduce manual work, but safe autonomy requires stronger supervision, better observability, and tighter control over identity and permissions — especially in the early stages. Step 5: Operating Model Redesign: Operationalizing Human-Agent Collaboration Agentic systems create responsibilities that do not yet exist in most organizations. This shift is not mainly about replacing people with agents. It is about redesigning how people work with them, supervise them, and remain accountable for outcomes. New operational roles typically include: Agent reliability engineers who monitor performance, detect degradation, and define retraining triggers Policy designers who translate business rules into machine-evaluable decision logic Workflow supervisors who oversee autonomous execution and handle escalations Context curators who maintain the data foundations agents depend on for accurate reasoning Organizations that succeed with agents do not treat them as static automation tools. They treat them as managed participants inside workflows. That is why they need an HR layer for agents. An HR layer for agents means applying the same lifecycle thinking used for people to autonomous systems. Before an agent is allowed to operate, it needs a clearly defined role, scope, level of authority, and access to the right systems. Once deployed, its performance must be reviewed over time, its behavior monitored, and its permissions adjusted when quality drops or risks increase. When the agent no longer fits the workflow, it should be retired or replaced instead of being left running by default. In practice, this means agent management should include: Onboarding, by defining scope, authority, and access boundaries Supervision, through observability, escalation paths, and performance review Retraining or re-scoping, when quality declines or conditions change Retirement, when the agent no longer fits the process or creates more risk than value In higher-risk workflows, this HR layer must also include graceful degradation. For example, an underperforming agent may automatically lose write access, be moved to read-only mode, and hand control back to a human supervisor until its behavior is corrected. This shift also requires leadership readiness. The Harvard 2025 Global Leadership Development Study found that 71% of senior leaders now see the ability to lead through continuous change as critical, yet only 36% say AI is fully integrated into their strategy. That gap between intention and execution is where many organizational transformation programs begin to stall. Step 6: Coordination Architecture Readiness As organizations deploy agents across multiple domains, a new challenge appears: agents begin optimizing locally instead of organizationally. An agent focused on cost efficiency in one area may conflict with another agent responsible for quality assurance elsewhere. Without coordination structures, these conflicts often remain invisible until they surface as operational failures. Coordination architecture helps align agent behavior across the organization. It ensures policy consistency between agents, maintains a shared understanding of the operational environment, prevents conflicts when actions intersect, and supports stable communication between agents working together across workflows. This capability is not required for the first agent deployment. It becomes important as soon as organizations begin operating multiple agents in parallel. Many organizations encounter coordination problems earlier than expected, which is why coordination readiness belongs in the transformation sequence even if its implementation happens later. Local optimization is rarely what enterprises intend. Coordination architecture is how you prevent it from becoming what they get. IV. The Regulatory Clock Is Already Running For organizations operating in or serving European markets, readiness is no longer only a strategic question. It is also a regulatory one. The EU AI Act’s high-risk provisions take effect in August 2026, with potential penalties reaching €35 million or 7% of global revenue. Colorado’s AI Act follows in June 2026, and a growing number of U.S. states now require documented AI governance programs for specific sectors and use cases. The governance and data foundations described earlier in this article are therefore not only best practice. For many organizations, they are becoming compliance prerequisites. Treating readiness as optional before deployment increasingly means accepting regulatory exposure before value is realized. The transformation sequence described here is not a slower path to deployment. It is the only path that avoids accumulating technical and legal risk at the same time. V. Conclusion: Shifting Toward Outcome-Based Pragmatism Agentic systems rarely fail because language models are incapable. They fail because they are introduced into environments designed for human execution, governed by frameworks built for deterministic software, and evaluated using metrics that cannot distinguish a promising pilot from a production-ready capability. The readiness gap is structural and, in many cases, self-inflicted. Organizations skip foundational steps because platform procurement is faster, more visible, and easier to justify internally than operating-model redesign. The result is earlier failure, higher remediation cost, and — in regulated industries — increasing legal exposure. What this means in practice Organizations should stop measuring readiness through activity indicators and start measuring it through decision quality, execution safety, throughput improvement, and bounded autonomy performance. Governance and data foundations must be established before platform rollout. Organizational transition planning must begin before deployment. Decision authority must be defined before the first agent workflow is introduced. Only then can enterprises safely unlock the productivity gains promised by agentic systems — not because the technology suddenly becomes capable, but because the organization becomes ready to use it. Up Next in This Series Part 2 looks at the cloud foundation needed for safe agent deployment, including identity-first architecture, observability, policy controls, and the platform constraints that often appear only after design decisions have been made. Part 3 focuses on how to design agents that work reliably in enterprise environments, including RAG maturity, loop design, multi-agent coordination, and human oversight built into the architecture from the start. References Weinberg, A. I. (2025). A Framework for the Adoption and Integration of Generative AI in Midsize Organizations and Enterprises (FAIGMOE). Patel, R. (2026). Agentic AI Frameworks: A Complete Enterprise Guide for 2026. Space-O Technologies. Microsoft Learn. Agentic AI maturity model. Keenan, K. (2026). How the right context can reshape agentic AI’s productivity output. Business Insider / Reltio. Ransbotham, S., Kiron, D., Khodabandeh, S., Iyer, S., & Das, A. (2025). The Emerging Agentic Enterprise: How Leaders Must Navigate a New Age of AI. MIT Sloan Management Review & Boston Consulting Group.Foundry IQ: Unlocking ubiquitous knowledge for agents
Introducing Foundry IQ by Azure AI Search in Microsoft Foundry. Foundry IQ is a centralized knowledge layer that connects agents to data with the next generation of retrieval-augmented generation (RAG). Foundry IQ includes the following features: Knowledge bases: Available directly in the new Foundry portal, knowledge bases are reusable, topic-centric collections that ground multiple agents and applications through a single API. Automated indexed and federated knowledge sources – Expand what data an agent can reach by connecting to both indexed and remote knowledge sources. For indexed sources, Foundry IQ delivers automatic indexing, vectorization, and enrichment for text, images, and complex documents. Agentic retrieval engine in knowledge bases – A self-reflective query engine that uses AI to plan, select sources, search, rank and synthesize answers across sources with configurable “retrieval reasoning effort.” Enterprise-grade security and governance – Support for document-level access control, alignment with existing permissions models, and options for both indexed and remote data. Foundry IQ is available in public preview through the new Foundry portal and Azure portal with Azure AI Search. Foundry IQ is part of Microsoft's intelligence layer with Fabric IQ and Work IQ.39KViews6likes4CommentsMulti-agent systems on Azure: identity, monitoring, and security guardrails
I wrote this piece because I know security concerns around AI agents are one of the main things holding many companies back from getting started. There is a lot of excitement around what agents can do on Azure, especially as multi-agent systems become more practical to build. But for many teams, the real hesitation starts when questions come up around trust, identity, permissions, monitoring, and what happens when something goes wrong. This PDF is my attempt to break that down in a practical way, from an Azure architect’s perspective: what multi-agent systems are, where they can fail, and which security layers matter most if you want to build them responsibly in Azure. It is based on hands-on architecture experience, Microsoft guidance, and recent security thinking around agentic systems. Read it on https://medium.com/@SCSA_MJ/multi-agent-systems-on-azure-identity-monitoring-and-security-guardrails-b8b7c82a0c57:Building an Offline AI Interview Coach with Foundry Local, RAG, and SQLite
How to build a 100% offline, AI-powered interview preparation tool using Microsoft Foundry Local, Retrieval-Augmented Generation, and nothing but JavaScript. Foundry Local 100% Offline RAG + TF-IDF JavaScript / Node.js Contents Introduction What is RAG and Why Offline? Architecture Overview Setting Up Foundry Local Building the RAG Pipeline The Chat Engine Dual Interfaces: Web & CLI Testing Adapting for Your Own Use Case What I Learned Getting Started Introduction Imagine preparing for a job interview with an AI assistant that knows your CV inside and out, understands the job you're applying for, and generates tailored questions, all without ever sending your data to the cloud. That's exactly what Interview Doctor does. Interview Doctor's web UI, a polished, dark-themed interface running entirely on your local machine. In this post, I'll walk you through how I built an interview prep tool as a fully offline JavaScript application using: Foundry Local — Microsoft's on-device AI runtime SQLite — for storing document chunks and TF-IDF vectors RAG (Retrieval-Augmented Generation) — to ground the AI in your actual documents Express.js — for the web server Node.js built-in test runner — for testing with zero extra dependencies No cloud. No API keys. No internet required. Everything runs on your machine. What is RAG and Why Does It Matter? Retrieval-Augmented Generation (RAG) is a pattern that makes AI models dramatically more useful for domain-specific tasks. Instead of relying solely on what a model learned during training (which can be outdated or generic), RAG: Retrieves relevant chunks from your own documents Augments the model's prompt with those chunks as context Generates a response grounded in your actual data For Interview Doctor, this means the AI doesn't just ask generic interview questions, it asks questions specific to your CV, your experience, and the specific job you're applying for. Why Offline RAG? Privacy is the obvious benefit, your CV and job applications never leave your device. But there's more: No API costs — run as many queries as you want No rate limits — iterate rapidly during your prep Works anywhere — on a plane, in a café with bad Wi-Fi, anywhere Consistent performance — no cold starts, no API latency Architecture Overview Complete architecture showing all components and data flow. The application has two interfaces (CLI and Web) that share the same core engine: Document Ingestion — PDFs and markdown files are chunked and indexed Vector Store — SQLite stores chunks with TF-IDF vectors Retrieval — queries are matched against stored chunks using cosine similarity Generation — relevant chunks are injected into the prompt sent to the local LLM Step 1: Setting Up Foundry Local First, install Foundry Local: # Windows winget install Microsoft.FoundryLocal # macOS brew install microsoft/foundrylocal/foundrylocal The JavaScript SDK handles everything else — starting the service, downloading the model, and connecting: import { FoundryLocalManager } from "foundry-local-sdk"; import { OpenAI } from "openai"; const manager = new FoundryLocalManager(); const modelInfo = await manager.init("phi-3.5-mini"); // Foundry Local exposes an OpenAI-compatible API const openai = new OpenAI({ baseURL: manager.endpoint, // Dynamic port, discovered by SDK apiKey: manager.apiKey, }); ⚠️ Key Insight Foundry Local uses a dynamic port never hardcode localhost:5272 . Always use manager.endpoint which is discovered by the SDK at runtime. Step 2: Building the RAG Pipeline Document Chunking Documents are split into overlapping chunks of ~200 tokens. The overlap ensures important context isn't lost at chunk boundaries: export function chunkText(text, maxTokens = 200, overlapTokens = 25) { const words = text.split(/\s+/).filter(Boolean); if (words.length <= maxTokens) return [text.trim()]; const chunks = []; let start = 0; while (start < words.length) { const end = Math.min(start + maxTokens, words.length); chunks.push(words.slice(start, end).join(" ")); if (end >= words.length) break; start = end - overlapTokens; } return chunks; } Why 200 tokens with 25-token overlap? Small chunks keep retrieved context compact for the model's limited context window. Overlap prevents information loss at boundaries. And it's all pure string operations, no dependencies needed. TF-IDF Vectors Instead of using a separate embedding model (which would consume precious memory alongside the LLM), we use TF-IDF, a classic information retrieval technique: export function termFrequency(text) { const tf = new Map(); const tokens = text .toLowerCase() .replace(/[^a-z0-9\-']/g, " ") .split(/\s+/) .filter((t) => t.length > 1); for (const t of tokens) { tf.set(t, (tf.get(t) || 0) + 1); } return tf; } export function cosineSimilarity(a, b) { let dot = 0, normA = 0, normB = 0; for (const [term, freq] of a) { normA += freq * freq; if (b.has(term)) dot += freq * b.get(term); } for (const [, freq] of b) normB += freq * freq; if (normA === 0 || normB === 0) return 0; return dot / (Math.sqrt(normA) * Math.sqrt(normB)); } Each document chunk becomes a sparse vector of word frequencies. At query time, we compute cosine similarity between the query vector and all stored chunk vectors to find the most relevant matches. SQLite as a Vector Store Chunks and their TF-IDF vectors are stored in SQLite using sql.js (pure JavaScript — no native compilation needed): export class VectorStore { // Created via: const store = await VectorStore.create(dbPath) insert(docId, title, category, chunkIndex, content) { const tf = termFrequency(content); const tfJson = JSON.stringify([...tf]); this.db.run( "INSERT INTO chunks (...) VALUES (?, ?, ?, ?, ?, ?)", [docId, title, category, chunkIndex, content, tfJson] ); this.save(); } search(query, topK = 5) { const queryTf = termFrequency(query); // Score each chunk by cosine similarity, return top-K } } 💡 Why SQLite for Vectors? For a CV plus a few job descriptions (dozens of chunks), brute-force cosine similarity over SQLite rows is near-instant (~1ms). No need for Pinecone, Qdrant, or Chroma — just a single .db file on disk. Step 3: The RAG Chat Engine The chat engine ties retrieval and generation together: async *queryStream(userMessage, history = []) { // 1. Retrieve relevant CV/JD chunks const chunks = this.retrieve(userMessage); const context = this._buildContext(chunks); // 2. Build the prompt with retrieved context const messages = [ { role: "system", content: SYSTEM_PROMPT }, { role: "system", content: `Retrieved context:\n\n${context}` }, ...history, { role: "user", content: userMessage }, ]; // 3. Stream from the local model const stream = await this.openai.chat.completions.create({ model: this.modelId, messages, temperature: 0.3, stream: true, }); // 4. Yield chunks as they arrive for await (const chunk of stream) { const content = chunk.choices[0]?.delta?.content; if (content) yield { type: "text", data: content }; } } The flow is straightforward: vectorize the query, retrieve with cosine similarity, build a prompt with context, and stream from the local LLM. The temperature: 0.3 keeps responses focused — important for interview preparation where consistency matters. Step 4: Dual Interfaces — Web & CLI Web UI The web frontend is a single HTML file with inline CSS and JavaScript — no build step, no framework, no React or Vue. It communicates with the Express backend via REST and SSE: File upload via multipart/form-data Streaming chat via Server-Sent Events (SSE) Quick-action buttons for common follow-up queries (coaching tips, gap analysis, mock interview) The setup form with job title, seniority level, and a pasted job description — ready to generate tailored interview questions. CLI The CLI provides the same experience in the terminal with ANSI-coloured output: npm run cli It walks you through uploading your CV, entering the job details, and then generates streaming questions. Follow-up questions work interactively. Both interfaces share the same ChatEngine class, they're thin layers over identical logic. Edge Mode For constrained devices, toggle Edge mode to use a compact system prompt that fits within smaller context windows: Edge mode activated, uses a minimal prompt for devices with limited resources. Step 5: Testing Tests use the Node.js built-in test runner, no Jest, no Mocha, no extra dependencies: import { describe, it } from "node:test"; import assert from "node:assert/strict"; describe("chunkText", () => { it("returns single chunk for short text", () => { const chunks = chunkText("short text", 200, 25); assert.equal(chunks.length, 1); }); it("maintains overlap between chunks", () => { // Verifies overlapping tokens between consecutive chunks }); }); npm test Tests cover the chunker, vector store, config, prompts, and server API contract, all without needing Foundry Local running. Adapting for Your Own Use Case Interview Doctor is a pattern, not just a product. You can adapt it for any domain: What to Change How Domain documents Replace files in docs/ with your content System prompt Edit src/prompts.js Chunk sizes Adjust config.chunkSize and config.chunkOverlap Model Change config.model — run foundry model list UI Modify public/index.html — it's a single file Ideas for Adaptation Customer support bot — ingest your product docs and FAQs Code review assistant — ingest coding standards and best practices Study guide — ingest textbooks and lecture notes Compliance checker — ingest regulatory documents Onboarding assistant — ingest company handbooks and processes What I Learned Offline AI is production-ready. Foundry Local + small models like Phi-3.5 Mini are genuinely useful for focused tasks. You don't need vector databases for small collections. SQLite + TF-IDF is fast, simple, and has zero infrastructure overhead. RAG quality depends on chunking. Getting chunk sizes right for your use case is more impactful than the retrieval algorithm. The OpenAI-compatible API is a game-changer. Switching from cloud to local was mostly just changing the baseURL . Dual interfaces are easy when you share the engine. The CLI and Web UI are thin layers over the same ChatEngine class. ⚡ Performance Notes On a typical laptop (no GPU): ingestion takes under 1 second for ~20 documents, retrieval is ~1ms, and the first LLM token arrives in 2-5 seconds. Foundry Local automatically selects the best model variant for your hardware (CUDA GPU, NPU, or CPU). Getting Started git clone https://github.com/leestott/interview-doctor-js.git cd interview-doctor-js npm install npm run ingest npm start # Web UI at http://127.0.0.1:3000 # or npm run cli # Interactive terminal The full source code is on GitHub. Star it, fork it, adapt it — and good luck with your interviews! Resources Foundry Local — Microsoft's on-device AI runtime Foundry Local SDK (npm) — JavaScript SDK Foundry Local GitHub — Source, samples, and documentation Local RAG Reference — Reference RAG implementation Interview Doctor (JavaScript) — This project's source codeEstablish an Oracle Database Connection hosted on Azure VM via AI Foundry Agent
I have came across a requirement to create a AI Foundry agent that will accept requests from user like below: a. "I want to connect to abcprd database hosted on subscription sub1, and resource group rg1 and check the AWR report from xAM-yPM on a specific date (eg 21-Oct-2025) b. Check locking session/RMAN backup failures/active sessions from the database abcprd hosted on subscription sub1, and resource group rg1. The agent should be able to fetch the relevant query from knowledge base . connect to the database and run the report for the duration mentioned. It should then fetch the report and pass it to the LLM (GPT 4.1 in our case) for investigations. I am looking for approach to connect to the oracle database based on user's request and execute the query obtained from knowledge base.110Views0likes2CommentsMicrosoft Foundry Labs: A Practical Fast Lane from Research to Real Developer Work
Why developers need a fast lane from research → prototypes AI engineering has a speed problem, but it is not a shortage of announcements. The hard part is turning research into a useful prototype before the next wave of models, tools, or agent patterns shows up. That gap matters. AI engineers want to compare quality, latency, and cost before they wire a model into a product. Full-stack teams want to test whether an agent workflow is real or just demo. Platform and operations teams want to know when an experiment can graduate into something observable and supportable. Microsoft makes that case directly in introducing Microsoft Foundry Labs: breakthroughs are arriving faster, and time from research to product has compressed from years to months. If you build real systems, the question is not "What is the coolest demo?" It is "Which experiments are worth my next hour, and how do I evaluate them without creating demo-ware?" That is where Microsoft Foundry Labs becomes interesting. What is Microsoft Foundry Labs? Microsoft Foundry Labs is a place to explore early-stage experiments and prototypes from Microsoft, with an explicit focus on research-driven innovation. The homepage describes it as a way to get a glimpse of potential future directions for AI through experimental technologies from Microsoft Research and more. The announcement adds the operating idea: Labs is a single access point for developers to experiment with new models from Microsoft, explore frameworks, and share feedback. That framing matters. Labs is not just a gallery of flashy ideas. It is a developer-facing exploration surface for projects that are still close to research: models, agent systems, UX ideas, and tool experiments. Here's some things you can do on Labs: Play with tomorrow’s AI, today: 30+ experimental projects—from models to agents—are openly available to fork and build upon, alongside direct access to breakthrough research from Microsoft. Go from prototype to production, fast: Seamless integration with Microsoft Foundry gives you access to 11,000+ models with built-in compute, safety, observability, and governance—so you can move from local experimentation to full-scale production without complex containerization or switching platforms. Build with the people shaping the future of AI: Join a thriving community of 25,000+ developers across Discord and GitHub with direct access to Microsoft researchers and engineers to share feedback and help shape the most promising technologies. What Labs is not: it is not a promise that every project has a production deployment path today, a long-term support commitment, or a hardened enterprise operating model. Spotlight: a few Labs experiments worth a developer's attention Phi-4-Reasoning-Vision-15B: A compact open-weight multimodal reasoning model that is interesting if you care about the quality-versus-efficiency tradeoff in smaller reasoning systems. BitNet: A native 1-bit large language model that is compelling for engineers who care about memory, compute, and energy efficiency. Fara-7B: An ultra-compact agentic small language model designed for computer use, which makes it relevant for builders exploring UI automation and on-device agents. OmniParser V2: A screen parsing module that turns interfaces into actionable elements, directly relevant to computer-use and UI-interaction agents. If you want to inspect actual code, the Labs project pages also expose official repository links for some of these experiments, including OmniParser, Magentic-UI, and BitNet. Labs vs. Foundry: how to think about the boundary The simplest mental model is this: Labs is the exploration edge; Foundry is the platform layer. The Microsoft Foundry documentation describes the broader platform as "the AI app and agent factory" to build, optimize, and govern AI apps and agents at scale. That is a different promise from Labs. Foundry is where you move from curiosity to implementation: model access, agent services, SDKs, observability, evaluation, monitoring, and governance. Labs helps you explore what might matter next. Foundry helps you build, optimize, and govern what matters now. Labs is where you test a research-shaped idea. Foundry is where you decide whether that idea can survive integration, evaluation, tracing, cost controls, and production scrutiny. That also means Labs is not a replacement for the broader Foundry workflow. If an experiment catches your attention, the next question is not "Can I ship this tomorrow?" It is "What is the integration path, and how will I measure whether it deserves promotion?" What's real today vs. what's experimental Real today: Labs is live as an official exploration hub, and Foundry is the broader platform for building, evaluating, monitoring, and governing AI apps and agents. Experimental by design: Labs projects are presented as experiments and prototypes, so they still need validation for your use case. A developer's lens: Models, Agents, Observability What makes Labs useful is not that it shows new things. It is that it gives developers a way to inspect those things through the same three concerns that matter in every serious AI system: model choice, agent design, and observability. Diagram description: imagine a loop with three boxes in a row: Models, Agents, and Observability. A forward arrow runs across the row, and a feedback arrow loops from Observability back to Models. The point is that evaluation data should change both model choices and agent design, instead of arriving too late. Models: what to look for in Labs experiments If you are model-curious, Labs should trigger an evaluation mindset, not a fandom mindset. When you see something like Phi-4-Reasoning-Vision-15B or BitNet on the Labs homepage, ask three things: what capability is being demonstrated, what constraints are obvious, and what the integration path would look like. This is where the Microsoft Foundry Playgrounds mindset is useful even if you started in Labs. The documentation emphasizes model comparison, prompt iteration, parameter tuning, tools, safety guardrails, and code export. It also pushes the right pre-production questions: price-to-performance, latency, tool integration, and code readiness. That is how I would use Labs for models: not to choose winners, but to generate hypotheses worth testing. If a Labs experiment looks promising, move quickly into a small evaluation matrix around capability, latency, cost, and integration friction. Agents: what Labs unlocks for agent builders Labs is especially interesting for agent builders because many of the projects point toward orchestration and tool-use patterns that matter in practice. The official announcement highlights projects across models and agentic frameworks, including Magentic-One and OmniParser v2. On the homepage, projects such as Fara-7B, OmniParser V2, TypeAgent, and Magentic-UI point in a similar direction: agents get more useful when they can reason over tools, interfaces, plans, and human feedback loops. For working developers, that means Labs can act as a scouting surface for agent patterns rather than just agent demos. Look for UI or computer-use style agents when your system needs to act through an interface rather than an API. Look for planning or tool-selection patterns when orchestration matters more than raw model quality. My suggestion: when a Labs project looks relevant to agent work, do not ask "Can I copy this architecture?" Ask "Which agent pattern is being explored here, and under what constraints would it be useful in my system?" Observability: how to experiment responsibly and measure what matters Observability is where prototypes usually go to die, because teams postpone it until after they have something flashy. That is backwards. If you care about real systems, tracing, evaluation, monitoring, and governance should start during prototyping. The Microsoft Foundry documentation already puts that operating model in plain view through guidance for tracing applications, evaluating agentic workflows, and monitoring generative AI apps. The Microsoft Foundry Playgrounds page is also explicit that the agents playground supports tracing and evaluation through AgentOps. At the governance layer, the AI gateway in Azure API Management documentation reinforces why this matters beyond demos. It covers monitoring and logging AI interactions, tracking token metrics, logging prompts and completions, managing quotas, applying safety policies, and governing models, agents, and tools. You do not need every one of those controls on day one, but you do need the habit: if a prototype cannot tell you what it did, why it failed, and what it cost, it is not ready to influence a roadmap. "Pick one and try it": a 20-minute hands-on path Keep this lightweight and tool-agnostic. The point is not to memorize a product UI. The point is to run a disciplined experiment. Browse Labs and pick an experiment aligned to your work. Start at Microsoft Foundry Labs and choose one project that is adjacent to a real problem you have: model efficiency, multimodal reasoning, UI agents, debugging workflows, or human-in-the-loop design. Read the project page and jump to the repo or paper if available. Use the Labs entry to understand the claim being made. Then read the supporting material, not just the summary sentence. Define one small test task and explicit success criteria. Keep it concrete: latency budget, accuracy target, cost ceiling, acceptable safety behavior, or failure rate under a narrow scenario. Capture telemetry from the start. At minimum, keep prompts or inputs, outputs, intermediate decisions, and failures. If the experiment involves tools or agents, include tool choices and obvious reasons for failure or recovery. Make a hard call. Decide whether to keep exploring or wait for a stronger production-grade path. "Interesting" is not the same as "ready for integration." Minimal experiment logger (my suggestion): if you want a lightweight way to avoid demo-ware, even a local JSONL log is enough to capture prompts, outputs, decisions, failures, and latency while you compare ideas from Labs. import json import time from pathlib import Path LOG_PATH = Path("experiment-log.jsonl") def record_event(name, payload): # Append one event per line so runs are easy to diff and analyze later. with LOG_PATH.open("a", encoding="utf-8") as handle: handle.write(json.dumps({"event": name, **payload}) + "\n") def run_experiment(user_input): started = time.time() try: # Replace this stub with your real model or agent call. output = user_input.upper() decision = "keep exploring" if len(output) < 80 else "wait" record_event( "experiment_result", { "input": user_input, "output": output, "decision": decision, "latency_ms": round((time.time() - started) * 1000, 2), "failure": None, }, ) except Exception as error: record_event( "experiment_result", { "input": user_input, "output": None, "decision": "failed", "latency_ms": round((time.time() - started) * 1000, 2), "failure": str(error), }, ) raise if __name__ == "__main__": run_experiment("Summarize the constraints of this Labs project.") That script is intentionally boring. That is the point. It gives you a repeatable, runnable starting point for comparing experiments without pretending you already have a full observability stack. Practical tips: how I evaluate Labs experiments before betting a roadmap on them Separate the idea from the implementation path. A strong research direction can still have a weak near-term integration story. Test one workload, not ten. Pick a narrow task that resembles your production reality and see whether the experiment moves the needle. Track cost and latency as first-class metrics. A novel capability that breaks your budget or response-time envelope is still a failed fit. Treat agent demos skeptically unless you can inspect behavior. Tool calls, traces, failure cases, and recovery paths matter more than polished output. Common pitfalls are predictable here. Do not confuse a research win with a deployment path. Labs is for exploration, so you still need to validate integration, safety, and operations. Do not evaluate with vague prompts. Use a narrow task and explicit success criteria, or you will end up comparing vibes instead of outcomes. Do not skip telemetry because the prototype is small. If you cannot inspect failures early, the prototype will teach you very little. Do not ignore known limitations. For example, the Fara-7B project page explicitly notes challenges on more complex tasks, instruction-following mistakes, and hallucinations, which is exactly the kind of constraint you should carry into evaluation. What to explore next Azure AI Foundry Labs matters because it gives developers a practical way to explore research-shaped ideas before they harden into mainstream patterns. The smart move is to use Labs as an input into better platform decisions: explore in Labs, validate with the discipline encouraged by Foundry playgrounds, and then bring the learnings back into the broader Foundry workflow. Takeaway 1: Labs is an exploration surface for early-stage, research-driven experiments and prototypes, not a blanket promise of production readiness. Takeaway 2: The right workflow is Labs for discovery, then Microsoft Foundry for implementation, optimization, evaluation, monitoring, and governance. Takeaway 3: Tracing, evaluations, and telemetry should start during prototyping, because that is how you avoid confusing a compelling demo with a viable system. If you are curious, start with Microsoft Foundry Labs, read the official context in Introducing Microsoft Foundry Labs, and then map what you learn into the platform guidance in Microsoft Foundry documentation. Try this next Open Microsoft Foundry Labs and choose one experiment that matches a real workload you care about. Use the mindset from Microsoft Foundry Playgrounds to define a small validation task around quality, latency, cost, and safety. Write down the minimum telemetry you need before continuing: inputs, outputs, decisions, failures, and token or cost signals. Read the relevant operating guidance in AI gateway in Azure API Management if your experiment may eventually need monitoring, quotas, safety policies, or governance. Promote only the experiments that can explain their value clearly in a Foundry-shaped build, evaluation, and observability workflow.Vectorless Reasoning-Based RAG: A New Approach to Retrieval-Augmented Generation
Introduction Retrieval-Augmented Generation (RAG) has become a widely adopted architecture for building AI applications that combine Large Language Models (LLMs) with external knowledge sources. Traditional RAG pipelines rely heavily on vector embeddings and similarity search to retrieve relevant documents. While this works well for many scenarios, it introduces challenges such as: Requires chunking documents into small segments Important context can be split across chunks Embedding generation and vector databases add infrastructure complexity A new paradigm called Vectorless Reasoning-Based RAG is emerging to address these challenges. One framework enabling this approach is PageIndex, an open-source document indexing system that organizes documents into a hierarchical tree structure and allows Large Language Models (LLMs) to perform reasoning-based retrieval over that structure. Vectorless Reasoning-Based RAG Instead of vectors, this approach uses structured document navigation. User Query ->Document Tree Structure ->LLM Reasoning ->Relevant Nodes Retrieved ->LLM Generates Answer This mimics how humans read documents: Look at the table of contents Identify relevant sections Read the relevant content Answer the question Core features No Vector Database: It relies on document structure and LLM reasoning for retrieval. It does not depend on vector similarity search. No Chunking: Documents are not split into artificial chunks. Instead, they are organized using their natural structure, such as pages and sections. Human-like Retrieval: The system mimics how human experts read documents. It navigates through sections and extracts information from relevant parts. Better Explainability and Traceability: Retrieval is based on reasoning. The results can be traced back to specific pages and sections. This makes the process easier to interpret. It avoids opaque and approximate vector search, often called “vibe retrieval.” When to Use Vectorless RAG Vectorless RAG works best when: Data is structured or semi-structured Documents have clear metadata Knowledge sources are well organized Queries require reasoning rather than semantic similarity Examples: enterprise knowledge bases internal documentation systems compliance and policy search healthcare documentation financial reporting Implementing Vectorless RAG with Azure AI Foundry Step 1 : Install Pageindex using pip command, from pageindex import PageIndexClient import pageindex.utils as utils # Get your PageIndex API key from https://dash.pageindex.ai/api-keys PAGEINDEX_API_KEY = "YOUR_PAGEINDEX_API_KEY" pi_client = PageIndexClient(api_key=PAGEINDEX_API_KEY) Step 2 : Set up your LLM Example using Azure OpenAI: from openai import AsyncAzureOpenAI client = AsyncAzureOpenAI( api_key=AZURE_OPENAI_API_KEY, azure_endpoint=AZURE_OPENAI_ENDPOINT, api_version=AZURE_OPENAI_API_VERSION ) async def call_llm(prompt, temperature=0): response = await client.chat.completions.create( model=AZURE_DEPLOYMENT_NAME, messages=[{"role": "user", "content": prompt}], temperature=temperature ) return response.choices[0].message.content.strip() Step 3: Page Tree Generation import os, requests pdf_url = "https://arxiv.org/pdf/2501.12948.pdf" //give the pdf url for tree generation, here given one for example pdf_path = os.path.join("../data", pdf_url.split('/')[-1]) os.makedirs(os.path.dirname(pdf_path), exist_ok=True) response = requests.get(pdf_url) with open(pdf_path, "wb") as f: f.write(response.content) print(f"Downloaded {pdf_url}") doc_id = pi_client.submit_document(pdf_path)["doc_id"] print('Document Submitted:', doc_id) Step 4 : Print the generated pageindex tree structure if pi_client.is_retrieval_ready(doc_id): tree = pi_client.get_tree(doc_id, node_summary=True)['result'] print('Simplified Tree Structure of the Document:') utils.print_tree(tree) else: print("Processing document, please try again later...") Step 5 : Use LLM for tree search and identify nodes that might contain relevant context import json query = "What are the conclusions in this document?" tree_without_text = utils.remove_fields(tree.copy(), fields=['text']) search_prompt = f""" You are given a question and a tree structure of a document. Each node contains a node id, node title, and a corresponding summary. Your task is to find all nodes that are likely to contain the answer to the question. Question: {query} Document tree structure: {json.dumps(tree_without_text, indent=2)} Please reply in the following JSON format: {{ "thinking": "<Your thinking process on which nodes are relevant to the question>", "node_list": ["node_id_1", "node_id_2", ..., "node_id_n"] }} Directly return the final JSON structure. Do not output anything else. """ tree_search_result = await call_llm(search_prompt) Step 6 : Print retrieved nodes and reasoning process node_map = utils.create_node_mapping(tree) tree_search_result_json = json.loads(tree_search_result) print('Reasoning Process:') utils.print_wrapped(tree_search_result_json['thinking']) print('\nRetrieved Nodes:') for node_id in tree_search_result_json["node_list"]: node = node_map[node_id] print(f"Node ID: {node['node_id']}\t Page: {node['page_index']}\t Title: {node['title']}") Step 7: Answer generation node_list = json.loads(tree_search_result)["node_list"] relevant_content = "\n\n".join(node_map[node_id]["text"] for node_id in node_list) print('Retrieved Context:\n') utils.print_wrapped(relevant_content[:1000] + '...') answer_prompt = f""" Answer the question based on the context: Question: {query} Context: {relevant_content} Provide a clear, concise answer based only on the context provided. """ print('Generated Answer:\n') answer = await call_llm(answer_prompt) utils.print_wrapped(answer) When to Use Each Approach Both vector-based RAG and vectorless RAG have their strengths. Choosing the right approach depends on the nature of the documents and the type of retrieval required. When to Use Vector Database–Based RAG Vector-based retrieval works best when dealing with large collections of unrelated or loosely structured documents. In such cases, semantic similarity is often sufficient to identify relevant information quickly. Use vector RAG when: Searching across many independent documents Semantic similarity is sufficient to locate relevant content Real-time retrieval is required over very large datasets Common use cases include: Customer support knowledge bases Conversational chatbots Product and content search systems When to Use Vectorless RAG Vectorless approaches such as PageIndex are better suited for long, structured documents where understanding the logical organization of the content is important. Use vectorless RAG when: Documents contain clear hierarchical structure Logical reasoning across sections is required High retrieval accuracy is critical Typical examples include: Financial filings and regulatory reports Legal documents and contracts Technical manuals and documentation Academic and research papers In these scenarios, navigating the document structure allows the system to identify the exact section that logically contains the answer, rather than relying only on semantic similarity. Conclusion Vector databases significantly advanced RAG architectures by enabling scalable semantic search across large datasets. However, they are not the optimal solution for every type of document. Vectorless approaches such as PageIndex introduce a different philosophy: instead of retrieving text that is merely semantically similar, they retrieve text that is logically relevant by reasoning over the structure of the document. As RAG architectures continue to evolve, the future will likely combine the strengths of both approaches. Hybrid systems that integrate vector search for broad retrieval and reasoning-based navigation for precision may offer the best balance of scalability and accuracy for enterprise AI applications.3.3KViews2likes0Comments