automation
434 Topics

XDR advanced hunting region specific endpoints
Hi, I am exploring the XDR advanced hunting API to fetch data specific to Microsoft Defender for Endpoint tenants. The official documentation (https://learn.microsoft.com/en-us/defender-xdr/api-advanced-hunting) recommends switching to the Microsoft Graph advanced hunting API. I have the following questions about it:

1. To fetch the region-specific (US, China, Global) token and Microsoft Graph service root endpoints (https://learn.microsoft.com/en-us/graph/deployments#app-registration-and-token-service-root-endpoints), is the recommended way to fetch the OpenID configuration document (https://learn.microsoft.com/en-us/entra/identity-platform/v2-protocols-oidc#fetch-the-openid-configuration-document) for a tenant ID and read the region-specific service and token endpoints from the response? With that approach there would be no need to maintain different endpoints for tenants in different regions. And do we use the global service URL https://login.microsoftonline.com to fetch the OpenID configuration document for a tenant ID in any region?

2. As per the documentation, the Microsoft Graph advanced hunting API is not supported in the China region (https://learn.microsoft.com/en-us/graph/api/security-security-runhuntingquery?view=graph-rest-1.0&tabs=http). In this case, is it recommended to use the Microsoft XDR advanced hunting APIs (https://learn.microsoft.com/en-us/defender-xdr/api-advanced-hunting) to support tenants in all regions (China, US, Global)?

48 Views 0 likes 1 Comment

Update content package Metadata
Hello Sentinel community and Microsoft. I've been working on a script where I use this command: https://learn.microsoft.com/en-us/rest/api/securityinsights/content-package/install?view=rest-securityinsights-2024-09-01&tabs=HTTP

I've managed to successfully create everything, from retrieving what's installed to uninstalling, reinstalling and lastly updating (updating needed to be "list, delete, install", however :'), as there was no flag for "update available").

However, now to my issue. While this works like a charm through PowerShell, the metadata and hyperlinking are not being deployed at all. So I have my 40 content packages successfully installed through the REST API, but then I have to visit the content hub in Sentinel in the GUI, filter for "Installed", mark them all, then press "Install". When I do this, the metadata and hyperlinking are created. (It's most noticeable that the analytic rules for the content hub packages are not available under Analytic rules -> Rule templates after installing through the REST API.) But once you press the Install button in the GUI, they appear.

So I looked into the request that is made when pressing the button. It uses another API version; fine, I can add that to my script. But it also uses two variables that are not documented and contain encrypted data. They are called c and t.

I'm also located in the EU and it makes a request to SentinelUS. I'm OK with that, and also, as mentioned, with another API version (2020-06-01), while the REST API to install content packages above uses 2024-09-01. No problem. But I cannot simulate this last request, as the variables are encrypted and not available through the install REST API. They are also not possible to simulate. It ONLY works in the GUI when pressing Install. Lastly, I get another API version back when it has successfully run through Install in the GUI, so in total it's three API versions.
Here is the code snippet I tried (it basically mimics the POST request seen in the browser's network tab when pressing "Install" on the package in the content hub, after I had successfully installed it through the official REST API).

    function Refresh-WorkspaceMetadata {
        param (
            [Parameter(Mandatory = $true)]
            [string]$SubscriptionId,
            [Parameter(Mandatory = $true)]
            [string]$ResourceGroup,
            [Parameter(Mandatory = $true)]
            [string]$WorkspaceName,
            [Parameter(Mandatory = $true)]
            [string]$AccessToken
        )

        # Use the API version from the portal sample
        $apiVeri = "?api-version="
        $RefreshapiVersion = "2020-06-01"

        # Build the batch endpoint URL with the query string on the batch URI.
        # The backtick keeps "$batch" literal; with "\$batch" PowerShell would
        # expand an (empty) $batch variable and leave a stray backslash in the URL.
        $batchUri = "https://management.azure.com/`$batch$apiVeri$RefreshapiVersion"

        # Construct a relative URL for the workspace resource.
        # Append dummy t and c parameters to mimic the portal's request.
        $workspaceUrl = "/subscriptions/$SubscriptionId/resourceGroups/$ResourceGroup/providers/Microsoft.OperationalInsights/workspaces/$WorkspaceName$apiVeri$RefreshapiVersion&t=123456789&c=dummy"

        # Create a batch payload with several GET requests
        $requests = @()
        for ($i = 0; $i -lt 5; $i++) {
            $requests += @{
                httpMethod           = "GET"
                name                 = [guid]::NewGuid().ToString()
                requestHeaderDetails = @{
                    commandName = "Microsoft_Azure_SentinelUS.ContenthubWorkspaceClient/get"
                }
                url                  = $workspaceUrl
            }
        }

        $body = @{ requests = $requests } | ConvertTo-Json -Depth 5

        try {
            $response = Invoke-RestMethod -Uri $batchUri -Method Post -Headers @{
                "Authorization" = "Bearer $AccessToken"
                "Content-Type"  = "application/json"
            } -Body $body
            Write-Host "[+] Workspace metadata refresh triggered successfully." -ForegroundColor Green
        }
        catch {
            Write-Host "[!] Failed to trigger workspace metadata refresh. Error: $_" -ForegroundColor Red
        }
    }

    Refresh-WorkspaceMetadata -SubscriptionId $subscriptionId -ResourceGroup $resourceGroup -WorkspaceName $workspaceName -AccessToken $accessToken

(Note: I have variables higher up in my script for the subscription ID, resource group, workspace name, token, etc.) I've tried with and without mimicking the t and c variables. Neither works. So for me, currently, installing content hub packages for Sentinel is always:

1. Install through the script to get all 40 packages.
2. Visit the web page, filter for 'Installed', mark them and press 'Install'.
3. You now have all metadata and hyperlinking available to you in your Sentinel (such as hunting rule, analytic rule, workbook and playbook templates).

Has anyone else managed to get around this, or is it "GUI" gated? Greatly appreciated.

Solved 281 Views 1 like 6 Comments

Detecting browser anomalies to disrupt attacks early
Uncover the secrets of early attack disruption with browser anomaly detections! This blog post explores how Microsoft Defender XDR leverages advanced techniques to identify unusual browser activities and stop cyber threats in their tracks. Learn about the importance of monitoring unusual browser activities, session hijacking, Business Email Compromise (BEC), and other critical attack paths. With real-world examples and insights into the systematic approach used by Defender XDR, you'll gain a deeper understanding of how to enhance your organization's security posture. Don't miss out on this essential read for staying ahead of cyber threats!

9.3K Views 6 likes 1 Comment

Azure Policies for Automating Azure Governance - Automating Policies
In the earlier post, I covered issues and concerns organizations may face and how many built-in Azure policies can address these problems. Now we are going to take it a step further and discuss how to enforce policies and automate their creation.

9.1K Views 1 like 1 Comment

The Future of AI: From Noise to Insight - An AI Agent for Customer Feedback
This post explores how Microsoft’s AI Futures team built a multi-agent system to transform scattered customer feedback into actionable insights. The solution aggregates feedback from multiple channels, uses advanced language models to cluster themes, summarize content, and identify sentiment, and delivers prioritized insights directly in Microsoft Teams. With human-in-the-loop safeguards, the system accelerates triage, prioritization, and follow-ups while maintaining compliance and traceability. Future enhancements include richer automation, trend visualization, and expanded feedback sources.

310 Views 0 likes 0 Comments

Automating Microsoft Sentinel: Part 2: Automate the mundane away
Welcome to the second entry of our blog series on automating Microsoft Sentinel. In this series, we’re showing you how to automate various aspects of Microsoft Sentinel, from simple automation of Sentinel Alerts and Incidents to more complicated response scenarios with multiple moving parts. So far, we’ve covered Part 1: Introduction to Automating Microsoft Sentinel, where we talked about why you would want to automate as well as an overview of the different types of automation you can do in Sentinel.

Here is a preview of what you can expect in the upcoming posts [we’ll be updating this post with links to new posts as they happen]:

Part 1: Introduction to Automating Microsoft Sentinel
Part 2: Automation Rules [You are here] – Automate the mundane away
Part 3: Playbooks 1 – Playbooks Part I – Fundamentals
Part 4: Playbooks 2 – Playbooks Part II – Diving Deeper
Part 5: Azure Functions / Custom Code
Part 6: Capstone Project (Art of the Possible) – Putting it all together

Part 2: Automation Rules – Automate the mundane away

Automation rules can be used to automate Sentinel itself. For example, let’s say there is a group of machines that have been classified as business critical and if there is an alert related to those machines, then the incident needs to be assigned to a Tier 3 response team and the severity of the alert needs to be raised to at least “high”. Using an automation rule, you can take one analytic rule, apply it to the entire enterprise, but then have an automation rule that only applies to those business-critical systems to make those changes. That way only the items that need that immediate escalation receive it, quickly and efficiently.

Automation Rules In Depth

So, now that we know what Automation Rules are, let’s dive into them a bit deeper to better understand how to configure them and how they work.
Creating Automation Rules

There are three main places where we can create an Automation Rule:

1) Navigating to Automation under the left menu
2) In an existing Incident via the “Actions” button
3) When writing an Analytic Rule, under the “Automated response” tab

The process for each is generally the same, except for the Incident route, and we’ll break that down more in a bit. When we create an Automation Rule, we need to give the rule a name. It should be descriptive and indicative of what the rule is going to do and what conditions it applies to. For example, a rule that automatically resolves an incident based on a known false positive condition on a server named SRV02021 could be titled “Automatically Close Incident When Affected Machine is SRV02021”, but really it’s up to you to decide what you want to name your rules.

Trigger

The next thing we need to define for our Automation Rule is the Trigger. Triggers are what cause the automation rule to begin running. They can fire when an incident is created or updated, or when an alert is created. Of the two options (incident based or alert based), it’s preferred to use incident triggers, as an incident is potentially the aggregation of multiple alerts and the odds are that you’re going to want to take the same automation steps for all of those alerts since they’re all related. It’s better to reserve alert-based triggers for scenarios where an analytic rule is firing an alert but is set to not create an incident.

Conditions

Conditions are, well, the conditions to which this rule applies. There are two conditions that are always present: the Incident provider and the Analytic rule name. You can choose multiple criteria and steps. For example, you could have it apply to all incident providers and all rules, or only a specific provider and all rules, or not apply to a particular provider, and so on. You can also add additional Conditions that will either include or exclude the rule from running.
When you create a new condition, you can build it out by multiple properties ranging from information about the Incident all the way to information about the Entities that are tagged in the incident. Remember our earlier Automation Rule title where we said this was a false positive about a server named SRV02021? This is where we make the rule match that title by setting the Condition to only fire this automation if the Entity has a host name of “SRV02021”. By combining AND and OR group clauses with the built-in conditional filters, you can make the rule as specific as you need it to be.

You might be thinking to yourself that while there is a lot of power in creating these conditions, it might be a bit onerous to create them for each condition. Recall earlier where I said the process for the three ways of creating Automation Rules was generally the same except using the Incident Action route? Well, that route will pre-fill variables for that selected instance. For example, the rule automatically picks up the rule name, the analytic rules it applies to, and the entities that were mapped in the incident. You can add, remove, or modify any of the variables that the process auto-maps.

NOTE: The new Unified Security Operations Platform (Defender XDR + Sentinel) comes with some new best-practice guidance:

If you've created an automation using "Title", use "Analytic rule name" instead. The Title value could change with Defender's correlation engine.
The option for "Incident provider" has been removed and replaced by "Alert product names" to filter based on the alert provider.

Actions

Now that we’ve tuned our Automation Rule to only fire for the situations we want, we can set up what actions we want the rule to execute. Clicking the “Actions” drop-down list will show you the options you can choose. When you select an option, the user interface will change to map to your selected option.
For example, if I choose to change the status of the Incident, the UX will update to show me a drop-down menu with options about which status I would like to set. If I choose other options (Run playbook, change severity, assign owner, add tags, add task) the UX will change to reflect my option. You can assign multiple actions within one Automation Rule by clicking the “Add action” button and selecting the next action you want the system to take. For example, you might want to assign an Incident to a particular user or group, change its severity to “High” and then set the status to Active.

Notably, when you create an Automation Rule from an Incident, Sentinel automatically sets a default action of Change Status. It sets the automation up to set the Status to “Closed” with a classification of “Benign Positive – Suspicious but expected”. This default action can be deleted and you can then set up your own action. In a future episode of this blog we’re going to be talking about Playbooks in detail, but for now just know that this is the place where you can assign a Playbook to your Automation Rules. There is one other option in the Actions menu that I wanted to specifically talk about in this blog post though: Incident Tasks.

Incident Tasks

Like most cybersecurity teams, you probably have a runbook of the different tasks or steps that your analysts and responders should take for different situations. By using Incident Tasks, you can now embed those runbook steps directly in the Incident. Incident tasks can be as lightweight or as detailed as you need them to be and can include rich formatting, links to external content, images, etc. When an incident with Tasks is generated, the SOC team will see these tasks attached as part of the Incident and can then take the defined actions and check off that they’ve been completed.
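To make the trigger, conditions and actions pieces concrete, here is a minimal Python sketch of how they fit together. It is only an illustrative model, not Sentinel's API or rule schema: the `Incident` and `AutomationRule` classes, the event names and the host name `CRIT-DB-01` are all hypothetical. It mirrors the escalation example from earlier in this post (business-critical machines get assigned to Tier 3 and raised to at least “High”).

```python
from dataclasses import dataclass, field
from typing import Callable, List, Optional

# Severity levels from lowest to highest.
SEVERITY_ORDER = ["Informational", "Low", "Medium", "High"]


@dataclass
class Incident:
    """Hypothetical, simplified stand-in for a Sentinel incident."""
    title: str
    analytic_rule_name: str
    host_names: List[str]
    severity: str = "Medium"
    status: str = "New"
    owner: Optional[str] = None
    tags: List[str] = field(default_factory=list)


@dataclass
class AutomationRule:
    """Hypothetical model: one trigger, AND-ed conditions, ordered actions."""
    name: str
    trigger: str  # e.g. "incident_created", "incident_updated", "alert_created"
    conditions: List[Callable[[Incident], bool]]
    actions: List[Callable[[Incident], None]]

    def apply(self, event: str, incident: Incident) -> bool:
        """Run the actions if the event matches the trigger and all conditions hold."""
        if event != self.trigger:
            return False
        if not all(condition(incident) for condition in self.conditions):
            return False
        for action in self.actions:
            action(incident)
        return True


# Hypothetical list of business-critical machines.
CRITICAL_HOSTS = {"CRIT-DB-01"}


def assign_tier3(incident: Incident) -> None:
    incident.owner = "Tier 3"


def raise_severity_to_high(incident: Incident) -> None:
    # Raise to "High" only if the current severity is lower.
    if SEVERITY_ORDER.index(incident.severity) < SEVERITY_ORDER.index("High"):
        incident.severity = "High"


rule = AutomationRule(
    name="Escalate incidents on business-critical hosts",
    trigger="incident_created",
    conditions=[lambda inc: any(h in CRITICAL_HOSTS for h in inc.host_names)],
    actions=[assign_tier3, raise_severity_to_high],
)

incident = Incident(
    title="Suspicious sign-in",
    analytic_rule_name="Example analytic rule",
    host_names=["CRIT-DB-01"],
)
applied = rule.apply("incident_created", incident)
print(applied, incident.owner, incident.severity)
```

The point of the sketch is the evaluation order: the trigger is checked first, then every condition, and only then do the actions run in the order they were added.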
Rule Lifetime and Order

There is one last section of Automation Rules that we need to define before we can start automating the mundane away: when should the rule expire, and in what order should the rule run compared to other rules. When you create a rule in the standalone automation UX, the default is for the rule to have an indefinite expiration, i.e. it never expires. You can change the expiration date and time to any date and time in the future. If you are creating the automation rule from an Incident, Sentinel will automatically assume that this rule should have an expiration date and time and sets it to 24 hours in the future. Just as with the default action when created from an incident, you can change the date and time of expiration to any datetime in the future, or set it to “Indefinite” by deleting the date.

Conclusion

In this blog post, we talked about Automation Rules in Sentinel and how you can use them to automate mundane tasks in Sentinel, as well as leverage them to help your SOC analysts be more effective and consistent in their day-to-day with capabilities like Incident Tasks. Stay tuned for more updates and tips on automating Microsoft Sentinel!

1.5K Views 2 likes 1 Comment

Defender for Endpoint Firewall Rules Not Applying to Devices
Hello Security Experts,

I’m currently deploying Microsoft Defender for Business and trying to enforce firewall configurations directly from the Defender portal. However, I’ve noticed that the settings are not applying to any of the onboarded devices; nothing changes on the endpoints. Do firewall rules in Defender for Endpoint require Intune to be enforced, or should they work standalone? And if Intune isn’t used, what’s the best approach to apply consistent Defender firewall rules across devices?

Thanks,
Luca

28 Views 0 likes 1 Comment

[DevOps] dps.sentinel.azure.com no longer responds
Hello, I've been using Repository connections in Sentinel to a central DevOps for almost two years now. Today I got my first automated error email for a webhook related to my last commit from the central repo to my Sentinel instances. It's a webhook that is automatically created for connections made within the last year (the ones from two years ago don't have this webhook automatically created). The hook is found in DevOps -> Service hooks -> Webhooks, "run state change", for each connected Sentinel.

However, after today's run (which was successful, all content deployed) this hook generates alerts. It says it can't reach (EU in my case) eu.prod.dps.sentinel.azure.com. Full URL: https://eu.prod.dps.sentinel.azure.com/webhooks/ado/workspaces/[REDACTED]/sourceControls/[REDACTED]

So, what happened to this domain? Why is it no longer responding, and when did it go offline? I THINK this is the hook that sets the status under Sentinel -> Repositories in the GUI. The success status in the screenshot is from 2025/02/06; no new success has been registered in the receiving Sentinel instance. For the Sentinel that is two years old and doesn't have a hook in my DevOps, the last deployment status says "Unknown", so I'm fairly sure that's what the webhook is doing.

So a second question would be: how can I set up a new webhook? (It wants the ID and password of the "Azure Sentinel Content Deployment App", and I will never know that password....) So I can't add it manually either (if the URL ever comes back online, or if a new one exists?). Please let me know.

134 Views 2 likes 3 Comments

Update 'Update-AzWvdSessionHost' cmdlet
Today, via the PowerShell cmdlet 'Update-AzWvdSessionHost', an administrator can assign a user to a session host without the user being assigned to the application group. This can confuse administrators, because the assignment succeeds but the user will not be able to see the host in the Windows App. The suggestion would be to either add a check that denies the assignment if the user is not associated with the application group (directly or indirectly via group association), or update the cmdlet to also add an assignment to the application group through a required parameter. It's a small tweak, but it may help with the overall stability of the Desktop.Virtualization PowerShell stack. Thanks!

14 Views 0 likes 0 Comments

Context-Aware RAG System with Azure AI Search to Cut Token Costs and Boost Accuracy
🚀 Introduction

As AI copilots and assistants become integral to enterprises, one question dominates architecture discussions: “How can we make large language models (LLMs) provide accurate, source-grounded answers without blowing up token costs?” Retrieval-Augmented Generation (RAG) is the industry’s go-to strategy for this challenge. But traditional RAG pipelines often use static document chunking, which breaks semantic context and drives inefficiencies. To address this, we built a context-aware, cost-optimized RAG pipeline using Azure AI Search and Azure OpenAI, leveraging AI-driven semantic chunking and intelligent retrieval. The result: accurate answers with up to 85% lower token consumption.

This blog mainly covers:

Tokenization
Chunking

The Problem with Naive Chunking

Most RAG systems split documents by token or character count (e.g., every 1,000 tokens). This is easy to implement but introduces real-world problems:

🧩 Loss of context: sentences or concepts get split mid-idea.
⚙️ Retrieval noise: irrelevant fragments appear in top results.
💸 Higher cost: you often send 5× more text than necessary.

These issues degrade both accuracy and cost efficiency.

🧠 Context-Aware Chunking: Smarter Document Segmentation

Instead of breaking text arbitrarily, our system uses an LLM-powered preprocessor to identify semantic boundaries, meaning each chunk represents a complete and coherent concept.

Example

Naive chunking: “Azure OpenAI Service offers… [cut] …integrates with Azure AI Search for intelligent retrieval.”

Context-aware chunking: “Azure OpenAI Service provides access to models like GPT-4o, enabling developers to integrate advanced natural language understanding and generation into their applications. It can be paired with Azure AI Search for efficient, context-aware information retrieval.”

✅ The chunk is self-contained and semantically meaningful.

This allows the retriever to match queries with conceptually complete information rather than partial sentences, leading to higher precision and fewer chunks needed per query.

Architecture Diagram

Chunking Service

Purpose: Transforms messy enterprise data (wikis, PDFs, transcripts, repos, images) into structured, model-friendly chunks for Retrieval-Augmented Generation (RAG).

| Challenge | Chunking Fix |
| LLM context limits | Breaks docs into smaller pieces |
| Embedding size | Keeps within token bounds |
| Retrieval accuracy | Granular, relevant sections only |
| Noise | Removes irrelevant blocks |
| Traceability | Chunk IDs for auditability |
| Cost/latency | Re-embed only changed chunks |

The Chunking Flow (End-to-End)

The Chunking Service sits in the ingestion pipeline and follows this sequence:

1. Ingestion: Raw text arrives from sources (wiki, repo, transcript, PDF, image description).
2. Token-aware splitting: Large text is cut into manageable pre-chunks with a 100-token overlap, which helps preserve context across boundaries.
3. Semantic segmentation: Each pre-chunk is passed to an Azure OpenAI Chat model with a structured prompt. Output = JSON array of semantic chunks (sectiontitle, speaker, content).
4. Optional overlap injection: Character-level overlap can be applied across chunks for discourse-heavy text like meeting transcripts.
5. Embedding generation: Each chunk is passed to the Azure OpenAI Embeddings API (text-embedding-3-small), producing a 1536-dimension vector.
6. Indexing: Chunks (text + vectors) are uploaded to Azure AI Search.
7. Retrieval: During question answering or document generation, the system pulls the top-k chunks, concatenates them, and enriches the prompt for the LLM.

Resilience & Traceability

The service is built to handle real-world pipeline issues. It retries once on rate limits, validates JSON outputs, and fails fast on malformed data instead of silently dropping chunks.
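The resilience behaviour just described (retry once on rate limits, validate the JSON output, fail fast on malformed data) can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the service's actual code: `RateLimitError`, `call_with_one_retry`, `parse_chunks` and `flaky_model_call` are hypothetical names, and only the field names sectiontitle, speaker and content come from the chunking output described above.

```python
import json
import time
from typing import Callable, Dict, List, TypeVar

T = TypeVar("T")

# Field names taken from the semantic-segmentation output described above.
REQUIRED_FIELDS = ("sectiontitle", "speaker", "content")


class RateLimitError(Exception):
    """Hypothetical stand-in for the rate-limit error a model client raises."""


def call_with_one_retry(call: Callable[[], T], wait_seconds: float = 1.0) -> T:
    """Run call(); on a rate-limit error, wait and try exactly once more."""
    try:
        return call()
    except RateLimitError:
        time.sleep(wait_seconds)
        return call()  # a second failure propagates to the caller


def parse_chunks(model_output: str) -> List[Dict[str, str]]:
    """Parse the reply as a JSON array of chunk objects, or raise ValueError."""
    try:
        data = json.loads(model_output)
    except json.JSONDecodeError as exc:
        raise ValueError(f"model output is not valid JSON: {exc}") from exc
    if not isinstance(data, list):
        raise ValueError("expected a JSON array of chunks")
    for position, chunk in enumerate(data):
        if not isinstance(chunk, dict):
            raise ValueError(f"chunk {position} is not a JSON object")
        for name in REQUIRED_FIELDS:
            if not isinstance(chunk.get(name), str):
                raise ValueError(f"chunk {position}: field '{name}' must be a string")
        if not chunk["content"].strip():
            raise ValueError(f"chunk {position} has empty content")
    return data


# Usage: a fake model call that is throttled once, then returns valid JSON.
attempts = {"count": 0}


def flaky_model_call() -> str:
    attempts["count"] += 1
    if attempts["count"] == 1:
        raise RateLimitError("throttled")
    return (
        '[{"sectiontitle": "Intro", "speaker": "", '
        '"content": "Azure AI Search stores the chunks."}]'
    )


chunks = parse_chunks(call_with_one_retry(flaky_model_call, wait_seconds=0.0))
print(len(chunks), chunks[0]["sectiontitle"])
```

Raising on the first malformed chunk, rather than skipping it, is what keeps a bad model reply from silently shrinking the index.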
Each chunk is assigned a unique ID (chunk_<sequence>_<sourceTag>), making retrieval auditable and enabling selective re-embedding when only parts of a document change.

☁️ Why Azure AI Search Matters Here

Azure AI Search (formerly Cognitive Search) is the heart of the retrieval pipeline. Key Roles:

Vector Search Engine: Stores embeddings of chunks and performs semantic similarity search.
Hybrid Search (Keyword + Vector): Combines lexical and semantic matching for high precision and recall.
Scalability: Supports millions of chunks with low search latency.
Metadata Filtering: Enables fine-grained retrieval (e.g., by document type, author, section).
Native Integration with Azure OpenAI: Allows a seamless, end-to-end RAG pipeline without third-party dependencies.

In short, Azure AI Search provides the speed, scalability, and semantic intelligence to make your RAG pipeline enterprise-grade.

💡 Importance of Azure OpenAI

Azure OpenAI complements Azure AI Search by providing:

High-quality embeddings (text-embedding-3-small in this pipeline; text-embedding-3-large is the larger alternative) for accurate vector search.
Powerful generative reasoning (GPT-4o or GPT-4.1) to craft contextually relevant answers.
Security and compliance within your organization’s Azure boundary, which is critical for regulated environments.

Together, these two services form the retrieval (Azure AI Search) and generation (Azure OpenAI) halves of your RAG system.

💰 Token Efficiency

By limiting the model’s input to only the most relevant, semantically meaningful chunks, you drastically reduce prompt size and cost.

| Approach | Tokens per Query | Typical Cost | Accuracy |
| Full-document prompt | ~15,000–20,000 | Very high | Medium |
| Fixed-size RAG chunks | ~5,000–8,000 | Moderate | Medium-high |
| Context-aware RAG (this approach) | ~2,000–3,000 | Low | High |

💰 Token Cost Reduction Analysis

Let’s quantify it:

| Step | Naive Approach (no RAG) | Your Approach (Context-Aware RAG) |
| Prompt context size | Entire document (e.g., 15,000 tokens) | Top 3 chunks (e.g., 2,000 tokens) |
| Tokens per query | ~16,000 (incl. user + system) | ~2,500 |
| Cost reduction | n/a | ~84% reduction in token usage |
| Accuracy | Often low (hallucinations) | Higher (targeted retrieval) |

With these example figures, (16,000 - 2,500) / 16,000 ≈ 84%. That’s roughly an 80–85% reduction in token usage while improving both accuracy and response speed.

🧱 Tech Stack Overview

| Component | Service | Purpose |
| Chunking Engine | Azure OpenAI (GPT models) | Generate context-aware chunks |
| Embedding Model | Azure OpenAI Embedding API | Create high-dimensional vectors |
| Retriever | Azure AI Search | Perform hybrid and vector search |
| Generator | Azure OpenAI GPT-4o | Produce final answer |
| Orchestration Layer | Python / FastAPI / .NET C# | Handle RAG pipeline |

🔍 The Bottom Line

By adopting context-aware chunking and Azure AI Search-powered RAG, you achieve:

✅ Higher accuracy (contextually complete retrievals)
💸 Lower cost (token-efficient prompts)
⚡ Faster latency (smaller context per call)
🧩 Scalable and secure architecture (fully Azure-native)

This is the same design philosophy powering Microsoft Copilot and other enterprise AI assistants today.

🧪 Real-Life Example: Context-Aware RAG in Action

To bring this architecture to life, let’s walk through a simple example of how documents can be chunked, embedded, stored in Azure AI Search, and then queried to generate accurate, cost-efficient answers. Imagine you want to build an internal knowledge assistant that answers developer questions from your company’s Azure documentation.

⚙️ Step 1: Intelligent Document Chunking

We’ll use a small LLM call to segment text into context-aware chunks rather than fixed token counts.

    // Context-aware chunking.
    // "text" can be the text retrieved from any page or document.
    private async Task<List<SemanticChunk>> AzureOpenAIChunk(string text)
    {
        try
        {
            string prompt = $@"
    Divide the following text into logical, meaningful chunks.
    Each chunk should represent a coherent section, topic, or idea.
    Return the result as a JSON array, where each object contains:
    - sectiontitle
    - speaker (if applicable, otherwise leave empty)
    - content
    Do not add any extra commentary or explanation.
    Only output the JSON array. Do not give content an array, try to keep all in string.
    TEXT:
    {text}";

            var client = GetAzureOpenAIClient();
            var chatCompletionsOptions = new ChatCompletionOptions
            {
                Temperature = 0,
                FrequencyPenalty = 0,
                PresencePenalty = 0
            };
            var Messages = new List<OpenAI.Chat.ChatMessage>
            {
                new SystemChatMessage("You are a text processing assistant."),
                new UserChatMessage(prompt)
            };
            var chatClient = client.GetChatClient(deploymentName: _appSettings.Agent.Model);
            var response = await chatClient.CompleteChatAsync(Messages, chatCompletionsOptions);
            string responseText = response.Value.Content[0].Text.ToString();

            // Strip any Markdown code fences the model wrapped around the JSON.
            string cleaned = Regex.Replace(responseText, @"```[\s\S]*?```", match =>
            {
                var match1 = match.Value.Replace("```json", "").Trim();
                return match1.Replace("```", "").Trim();
            });

            // Try to parse the response as a JSON array of chunks.
            return CreateChunkArray(cleaned);
        }
        catch (JsonException ex)
        {
            _logger.LogError("Failed to parse GPT response: " + ex.Message);
            throw;
        }
        catch (Exception ex)
        {
            _logger.LogError("Error in AzureOpenAIChunk: " + ex.Message);
            throw;
        }
    }

🧠 Step 2: Adding Overlaps for Better Results

We add an overlap between chunks for better, more accurate answers. The overlap window can be tuned to the documents.

    public List<SemanticChunk> AddOverlap(List<SemanticChunk> chunks, string IDText, int overlapChars = 0)
    {
        var overlappedChunks = new List<SemanticChunk>();
        for (int i = 0; i < chunks.Count; i++)
        {
            var current = chunks[i];

            // Tail of the previous chunk (at most overlapChars characters); empty for the first chunk.
            string previousOverlap = i > 0
                ? chunks[i - 1].Content[^Math.Min(overlapChars, chunks[i - 1].Content.Length)..]
                : "";
            string combinedText = previousOverlap + "\n" + current.Content;

            // Build the ID as chunk_<sequence>_<sourceTag>. Interpolating i and IDText separately
            // avoids i + '_' being evaluated as integer addition (int + char).
            var Id = $"chunk_{i}_{IDText}";

            overlappedChunks.Add(new SemanticChunk
            {
                Id = Regex.Replace(Id, @"[^A-Za-z0-9_\-=]", "_"),
                Content = combinedText,
                SectionTitle = current.SectionTitle
            });
        }
        return overlappedChunks;
    }

🧠 Step 3: Generate and Store Embeddings in Azure AI Search

We convert each chunk into an embedding vector and push it to an Azure AI Search index.

    public async Task<List<SemanticChunk>> AddEmbeddings(List<SemanticChunk> chunks)
    {
        var client = GetAzureOpenAIClient();
        var embeddingClient = client.GetEmbeddingClient("text-embedding-3-small");
        foreach (var chunk in chunks)
        {
            // Generate embedding using the EmbeddingClient
            var embeddingResult = await embeddingClient.GenerateEmbeddingAsync(chunk.Content).ConfigureAwait(false);
            chunk.Embedding = embeddingResult.Value.ToFloats();
        }
        return chunks;
    }

    public async Task UploadDocsAsync(List<SemanticChunk> chunks)
    {
        try
        {
            var indexClient = GetSearchindexClient();
            var searchClient = indexClient.GetSearchClient(_indexName);
            var result = await searchClient.UploadDocumentsAsync(chunks);
        }
        catch (Exception ex)
        {
            _logger.LogError("Failed to upload documents: " + ex);
            throw;
        }
    }

🤖 Step 4: Generate the Final Answer with Azure OpenAI

Now we combine the top chunks with the user query to create a cost-efficient, context-rich prompt. Note: this example uses a Semantic Kernel agent; in practice any agent can be used and the prompt can be adapted.

    // Gets chunks from Azure AI Search.
    // UserQuery is the question asked by the user, or any question prompt that needs to be answered.
    var context = await _aiSearchService.GetSemanticSearchresultsAsync(UserQuery);

    // A regular (non-verbatim) interpolated string is used so that \n\n is emitted as real line breaks.
    string questionWithContext = $"Answer the question briefly in short relevant words based on the context provided. Context : {context}. \n\n Question : {UserQuery}?";

    var _agentModel = new AgentModel()
    {
        Model = _appSettings.Agent.Model,
        AgentName = "Answering_Agent",
        Temperature = _appSettings.Agent.Temperature,
        TopP = _appSettings.Agent.TopP,
        AgentInstructions = $@"You are a cloud Migration Architect. " +
            "Analyze all the details from top to bottom in context based on the details provided for the Migration of APP app using Azure Services. Do not assume anything." +
            "There can be conflicting details for a question , please verify all details of the context. If there are any conflict please start your answer with word - **Conflict**." +
            "There might not be answers for all the questions, please verify all details of the context. If there are no answer for question just mention - **No Information**"
    };
    _agentModel = await _agentService.CreateAgentAsync(_agentModel);
    _agentModel.QuestionWithContext = questionWithContext;
    var modelWithResponse = await _agentService.GetAnswerAsync(_agentModel);

🧠 Final Thoughts

Context-aware RAG isn’t just a performance optimization; it’s an architectural evolution. It shifts the focus from feeding LLMs more data to feeding them the right data. By letting Azure AI Search handle intelligent retrieval and Azure OpenAI handle reasoning, you create an efficient, explainable, and scalable AI assistant. The outcome: smarter answers, lower costs, and a pipeline that scales with your enterprise.

Wiki Link: Tokenization and Chunking
IP Link: AI Migration Accelerator

996 Views 4 likes 0 Comments