OneLake
Building Enterprise Voice-Enabled AI Agents with Azure Voice Live API
The sample application covered in this post demonstrates two approaches in an end-to-end solution that includes product search, order management, automated shipment creation, intelligent analytics, and comprehensive business intelligence through Microsoft Fabric integration.

Use Case Scenario: Retail Fashion Agent

Core Business Capabilities:
Product Discovery and Ordering: Natural language product search across fashion categories (Winter wear, Active wear, etc.) and order placement. REST APIs hosted in Azure Function Apps provide this functionality, and a Swagger definition is configured in the Application for the tool action.
Automated Fulfillment: Integration with Azure Logic Apps for shipment creation in Azure SQL Database.
Policy Support: Vector-powered QnA for returns, payment issues, and customer policies. Azure AI Search and File Search capabilities are used for this requirement.
Conversation Analytics: AI-powered analysis using GPT-4o for sentiment scoring and performance evaluation. The Application captures the entire conversation between the customer and the Agent and sends it to an Agent running in Azure Logic Apps, which performs a call quality assessment and stores the results in Azure Cosmos DB. When the customer indicates during the voice call that the conversation can be concluded, the Agent autonomously sends the conversation history to the Azure Logic App for quality assessment.
Advanced Analytics Pipeline:
Real-time Data Mirroring: Automatic synchronization from Azure Cosmos DB to Microsoft Fabric OneLake
Business Intelligence: Custom Data Agents in Fabric for trend analysis and insights
Executive Dashboards: Power BI reports for comprehensive performance monitoring

Technical Architecture Overview

The solution presents two approaches, each optimized for different enterprise scenarios.

Approach 1: Direct Model Integration with GPT-Realtime

Architecture Components

This approach integrates directly with the Azure Voice Live API using the GPT-Realtime model, providing immediate speech-to-speech conversational experiences without intermediate text processing. The Application connects to the Voice Live API over a WebSocket connection. The semantics of this API are similar to those of the GPT-Realtime API when connecting to it directly. The Voice Live API provides additional configurability, such as the choice of a custom voice from Azure Speech Services, options for echo cancellation and noise reduction, and the ability to plug in an Avatar integration.
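To make the WebSocket semantics a little more concrete, here is a minimal sketch of how a client might frame the JSON messages it sends over that connection. It assumes the Voice Live API accepts the same client event envelopes as the GPT-Realtime API (session.update and input_audio_buffer.append); the helper names build_session_update and build_audio_append are hypothetical and are not taken from the sample application. The field names in example_session mirror the session configuration used in this post.

```python
import base64
import json


def build_session_update(session: dict) -> str:
    """Wrap session settings in a session.update client event (assumed envelope)."""
    return json.dumps({"type": "session.update", "session": session})


def build_audio_append(pcm_chunk: bytes) -> str:
    """Wrap a raw audio chunk in an input_audio_buffer.append event (assumed envelope)."""
    # Audio bytes are base64-encoded so they can travel inside a JSON text frame.
    encoded = base64.b64encode(pcm_chunk).decode("ascii")
    return json.dumps({"type": "input_audio_buffer.append", "audio": encoded})


# Illustrative settings only: a custom Azure Speech voice plus the
# noise-reduction and echo-cancellation options mentioned above.
example_session = {
    "voice": {"name": "en-IN-AartiIndicNeural", "type": "azure-standard"},
    "input_audio_noise_reduction": {"type": "azure_deep_noise_suppression"},
    "input_audio_echo_cancellation": {"type": "server_echo_cancellation"},
}

session_event = build_session_update(example_session)
audio_event = build_audio_append(b"\x00\x01" * 4)  # placeholder bytes, not real audio

print(session_event)
print(audio_event)
```

In a real client, each of these strings would be passed to the open WebSocket's send method. Keeping the message framing in small pure functions like these makes it straightforward to unit test without a live connection.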
Core Technical Stack:
GPT-Realtime Model: Direct audio-to-audio processing
Azure Speech Voice: High-quality TTS synthesis (en-IN-AartiIndicNeural)
WebSocket Communication: Real-time bidirectional audio streaming
Voice Activity Detection: Server-side VAD for natural conversation flow
Client-Side Function Calling: Full control over tool execution logic

Key Session Configuration

The Direct Model Integration uses the session configuration below:

    session_config = {
        "input_audio_sampling_rate": 24000,
        "instructions": system_instructions,
        "turn_detection": {
            "type": "server_vad",
            "threshold": 0.5,
            "prefix_padding_ms": 300,
            "silence_duration_ms": 500,
        },
        "tools": tools_list,
        "tool_choice": "auto",
        "input_audio_noise_reduction": {"type": "azure_deep_noise_suppression"},
        "input_audio_echo_cancellation": {"type": "server_echo_cancellation"},
        "voice": {
            "name": "en-IN-AartiIndicNeural",
            "type": "azure-standard",
            "temperature": 0.8,
        },
        "input_audio_transcription": {"model": "whisper-1"},
    }

Configuration Highlights:
24kHz Audio Sampling: High-quality audio processing for natural speech
Server VAD: Threshold of 0.5 with 300ms prefix padding for natural conversation flow
Azure Deep Noise Suppression: Advanced noise reduction for clear audio
Indic Voice Support: en-IN-AartiIndicNeural for localized customer experience
Whisper-1 Transcription: Accurate speech recognition for conversation logging

Connecting to the Azure Voice Live API

The voicelive_modelclient.py demonstrates advanced WebSocket handling for real-time audio streaming:

    import asyncio
    import uuid

    import websockets

    # Methods of the client class. endpoint, api_version and model_name
    # are expected to be defined elsewhere in the module.
    def get_websocket_url(self, access_token: str) -> str:
        """Generate WebSocket URL for Voice Live API."""
        azure_ws_endpoint = endpoint.rstrip("/").replace("https://", "wss://")
        return (
            f"{azure_ws_endpoint}/voice-live/realtime?api-version={api_version}"
            f"&model={model_name}"
            f"&agent-access-token={access_token}"
        )

    async def connect(self):
        """Connects the client using a WS Connection to the Voice Live API."""
        if self.is_connected():
            # raise Exception("Already connected")
            self.log("Already connected")
            return  # avoid opening a second socket on top of the live one

        # Get access token once and reuse it for both the URL and the header
        access_token = self.get_azure_token()

        # Build WebSocket URL and headers
        ws_url = self.get_websocket_url(access_token)
        self.ws = await websockets.connect(
            ws_url,
            additional_headers={
                "Authorization": f"Bearer {access_token}",
                "x-ms-client-request-id": str(uuid.uuid4()),
            },
        )
        print("Connected to Azure Voice Live API....")

        # Start the receive loop in the background, then push the session config
        asyncio.create_task(self.receive())
        await self.update_session()

Function Calling Implementation

The Direct Model Integration provides client-side function execution with complete control:

    tools_list = [
        {
            "type": "function",
            "name": "perform_search_based_qna",
            "description": "call this function to respond to the user query on Contoso retail policies, procedures and general QnA",
            "parameters": {
                "type": "object",
                "properties": {"query": {"type": "string"}},
                "required": ["query"],
            },
        },
        {
            "type": "function",
            "name": "create_delivery_order",
            "description": "call this function to create a delivery order based on order id and destination location",
            "parameters": {
                "type": "object",
                "properties": {
                    "order_id": {"type": "string"},
                    "destination": {"type": "string"},
                },
                "required": ["order_id", "destination"],
            },
        },
        {
            "type": "function",
            "name": "perform_call_log_analysis",
            "description": "call this function to analyze call log based on input call log conversation text",
            "parameters": {
                "type": "object",
                "properties": {
                    "call_log": {"type": "string"},
                },
                "required": ["call_log"],
            },
        },
        {
            "type": "function",
            "name": "search_products_by_category",
            "description": "call this function to search for products by category",
            "parameters": {
                "type": "object",
                "properties": {
                    "category": {"type": "string"},
                },
                "required": ["category"],
            },
        },
        {
            "type": "function",
            "name": "order_products",
            "description": "call this function to order products by product id and quantity",
            "parameters": {
                "type": "object",
                "properties": {
                    "product_id": {"type": "string"},
                    "quantity": {"type": "integer"},
                },
                "required": ["product_id", "quantity"],
            },
        },
    ]

Approach 2: Azure AI Foundry Agent Integration

Architecture Components

This approach leverages existing Azure AI Foundry Service Agents, providing enterprise-grade voice capabilities as a clean wrapper over pre-configured agents. Voice-enabling an Agent this way requires no code changes to the Agent itself.

Core Technical Stack:
Azure Fast Transcript: Advanced multi-language speech-to-text processing
Azure AI Foundry Agent: Pre-configured Agent with autonomous capabilities
GPT-4o-mini Model: Agent-configured model for text processing
Neural Voice Synthesis: Indic language optimized TTS
Semantic VAD: Azure semantic voice activity detection

Session Configuration

The Agent Integration approach uses advanced semantic voice activity detection:

    session_config = {
        "input_audio_sampling_rate": 24000,
        "turn_detection": {
            "type": "azure_semantic_vad",
            "threshold": 0.3,
            "prefix_padding_ms": 200,
            "silence_duration_ms": 200,
            "remove_filler_words": False,
            "end_of_utterance_detection": {
                "model": "semantic_detection_v1",
                "threshold": 0.01,
                "timeout": 2,
            },
        },
        "input_audio_noise_reduction": {"type": "azure_deep_noise_suppression"},
        "input_audio_echo_cancellation": {"type": "server_echo_cancellation"},
        "voice": {
            "name": "en-IN-AartiIndicNeural",
            "type": "azure-standard",
            "temperature": 0.8,
        },
        "input_audio_transcription": {"model": "azure-speech", "language": "en-IN, hi-IN"},
    }

Key Differentiators:
Semantic VAD: Intelligent voice activity detection with utterance prediction
Multi-language Support: Azure Speech with en-IN and hi-IN language support
End-of-Utterance Detection: AI-powered conversation turn management
Filler Word Handling: Configurable processing of conversational fillers

Agent Integration Code

The voicelive_client.py demonstrates seamless integration with Azure AI Foundry Agents. Note that the connection needs the Azure AI Foundry project name and the ID of the Agent in that project. The model name does not need to be passed here, since the Agent is already configured with one.
    import asyncio
    import uuid

    import websockets

    # Methods of the client class. endpoint, api_version, project_name and
    # agent_id are expected to be defined elsewhere in the module.
    def get_websocket_url(self, access_token: str) -> str:
        """Generate WebSocket URL for Voice Live API."""
        azure_ws_endpoint = endpoint.rstrip("/").replace("https://", "wss://")
        return (
            f"{azure_ws_endpoint}/voice-live/realtime?api-version={api_version}"
            f"&agent-project-name={project_name}&agent-id={agent_id}"
            f"&agent-access-token={access_token}"
        )

    async def connect(self):
        """Connects the client using a WS Connection to the Realtime API."""
        if self.is_connected():
            # raise Exception("Already connected")
            self.log("Already connected")
            return  # avoid opening a second socket on top of the live one

        # Get access token once and reuse it for both the URL and the header
        access_token = self.get_azure_token()

        # Build WebSocket URL and headers
        ws_url = self.get_websocket_url(access_token)
        self.ws = await websockets.connect(
            ws_url,
            additional_headers={
                "Authorization": f"Bearer {access_token}",
                "x-ms-client-request-id": str(uuid.uuid4()),
            },
        )
        print("Connected to Azure Voice Live API....")

        # Start the receive loop in the background, then push the session config
        asyncio.create_task(self.receive())
        await self.update_session()

Advanced Analytics Pipeline

GPT-4o Powered Call Analysis

The solution implements conversation analytics using Azure Logic Apps with GPT-4o:

    {
      "functions": [
        {
          "name": "evaluate_call_log",
          "description": "Evaluate call log for Contoso Retail customer service call",
          "parameters": {
            "properties": {
              "call_reason": {
                "description": "Categorized call reason from 50+ predefined scenarios",
                "type": "string"
              },
              "customer_satisfaction": {
                "description": "Overall satisfaction assessment",
                "type": "string"
              },
              "customer_sentiment": {
                "description": "Emotional tone analysis",
                "type": "string"
              },
              "call_rating": {
                "description": "Numerical rating (1-5 scale)",
                "type": "number"
              },
              "call_rating_justification": {
                "description": "Detailed reasoning for rating",
                "type": "string"
              }
            }
          }
        }
      ]
    }

Microsoft Fabric Integration

The analytics pipeline extends into Microsoft Fabric for enterprise business intelligence.

Fabric Integration Features:
Real-time Data Mirroring: Cosmos DB to OneLake synchronization
Custom Data Agents: Business-specific analytics agents in Fabric
Copilot Integration: Natural language business intelligence queries
Power BI Dashboards: Interactive reports and executive summaries

Artefacts for reference

The source code of the solution is available in the GitHub Repo here.
An article on this topic is published on LinkedIn here.
A video recording of the demonstration of this App is available below:
Part 1 - walkthrough of the Agent configuration in Azure AI Foundry - here
Part 2 - demonstration of the Application that integrates with the Azure Voice Live API - here
Part 3 - demonstration of the Microsoft Fabric Integration, Data Agents, Copilot in Fabric and Power BI for insights and analysis - here

Conclusion

Azure Voice Live API enables enterprises to build sophisticated voice-enabled AI assistants using two distinct architectural approaches. The Direct Model Integration provides ultra-low latency for real-time applications, while the Azure AI Foundry Agent Integration offers enterprise-grade governance and autonomous operation.

Both approaches deliver the same comprehensive business capabilities:
Natural voice interactions with advanced VAD and noise suppression
Complete retail workflow automation from inquiry to fulfillment
AI-powered conversation analytics with sentiment scoring
Enterprise business intelligence through Microsoft Fabric integration

The choice between approaches depends on your specific requirements:
Choose Direct Model Integration for custom function calling and minimal latency
Choose Azure AI Foundry Agent Integration for enterprise governance and existing investments

Data security controls in OneLake
Unify and secure your data, no matter where it lives, without sacrificing control using OneLake security, part of Microsoft Fabric. With granular permissions down to the row, column, and table level, you can confidently manage access across engines like Power BI, Spark, and T-SQL, all from one place. Discover, label, and govern your data with clarity using the integrated OneLake catalog that surfaces the right items fast. Aaron Merrill, Microsoft Fabric Principal Program Manager, shows how you can stay in control, from security to discoverability: owning, sharing, and protecting data on your terms.

Protect sensitive information at scale. Set precise data access rules, down to individual rows. Check out OneLake security in Microsoft Fabric.

No data duplication needed. Hide sensitive columns while still allowing access to relevant data. See it here with OneLake security.

Built-in compliance insights. Streamline discovery, governance, and sharing. Get started with the OneLake catalog.

QUICK LINKS:
00:00 - OneLake & Microsoft Fabric core concepts
01:28 - Table level security
02:11 - Column level security
03:06 - Power BI report
03:28 - Row level security
04:23 - Data classification options
05:19 - OneLake catalog
06:22 - View and manage data
06:48 - Governance
07:36 - Microsoft Fabric integration
07:59 - Wrap up

Link References
Check out our blog at https://aka.ms/OneLakeSecurity
Sign up for a 60-day free trial at https://fabric.microsoft.com

Unfamiliar with Microsoft Mechanics?
As Microsoft's official video series for IT, you can watch and share valuable content and demos of current and upcoming tech from the people who build it at Microsoft.
Subscribe to our YouTube: https://www.youtube.com/c/MicrosoftMechanicsSeries
Talk with other IT Pros, join us on the Microsoft Tech Community: https://techcommunity.microsoft.com/t5/microsoft-mechanics-blog/bg-p/MicrosoftMechanicsBlog
Watch or listen from anywhere, subscribe to our podcast: https://microsoftmechanics.libsyn.com/podcast

Keep getting this insider knowledge, join us on social:
Follow us on Twitter: https://twitter.com/MSFTMechanics
Share knowledge on LinkedIn: https://www.linkedin.com/company/microsoft-mechanics/
Enjoy us on Instagram: https://www.instagram.com/msftmechanics/
Loosen up with us on TikTok: https://www.tiktok.com/@msftmechanics

Video Transcript:

-As you build AI and analytic workloads, unifying your data from wherever it lives and making it accessible doesn't have to come at the cost of security. In fact, today we dive deeper into Microsoft's approach to data unification, accessibility, and security with OneLake, part of Microsoft Fabric, where we'll focus on OneLake's security control set and how it complements data discovery via the new OneLake catalog.

-Now, in case you're new to OneLake and Microsoft Fabric, I'll start by explaining a few core concepts. OneLake is the logical multi-cloud data lake that is foundational to Microsoft Fabric, Microsoft's fully managed data analytics and AI platform. OneLake, with its support for open data formats, provides a single and unified place across your entire company for data to be discovered, accessed, and controlled across your data estate. Data can reside anywhere, and you can connect to it using shortcuts or via mirroring. And once in OneLake, you have a single place where data can be centrally classified and labeled as the basis for policy controls. You can then configure granular, role-based permissions that can apply down to the folder level for unstructured data and by table for structured data.

-Then all the way down to the column and row levels within each table.
This way, security is enforced across all connected data. Meaning that whether you're accessing the data through Spark, Power BI, T-SQL, or any other engine, it's protected and you have the controls to allow or limit access to data on your terms. In fact, let me show you a few examples for enforcing OneLake security at all of these levels. I'll start with an example showing OneLake security at the table level. I want to grant our suppliers team access to a specific table in this lakehouse. I'll create a OneLake security role to do that. So I'll just give it a name, SuppliersReaders. Then I'll choose selected data and find the table that I want to share by expanding the table list, pick suppliers and then confirm.

-Now, I just need to assign the right users. I'll just add Mona in this case, and create the role. Then if I move over to Mona's experience, I can run queries against the supplier data in the SQL endpoint. But if I try to query any other table, I'm blocked, as you can see here. Now, let me show you another option. This time, I'll lock access down to the column level. I want to grant our customer relations team access to the data they need, but I don't want to give them access to PII data. Using OneLake security controls, I can create a role that restricts access to sensitive columns. Like before, I'll name it. Then I need to select my data. This time, I'll choose three different tables for customer and order data. But notice this grayed out legacy orders table here that we would like to apply column security to as well. I don't own the permissions for this table because it's a shortcut to other data. However, the owner of that data can grant permission to it using the steps I'll show next. From the role I just created, I'll expand on my tables. And for the customer's table, I'll enable column security. Once I confirm, I can select the columns I want to remove and that we don't want them to see and save it.
-Now, let's look at the results of this from another engine, Power BI, while building a report. I'll choose a semantic model for my Power BI report. With the column level security in place, notice the sensitive columns I removed before, contact name and address, are hidden from me. And when I expand the legacy orders table, which was a shortcut, it's also not showing PII columns. Now, some scenarios require that security controls are applied where records might be interspersed with the same table, so a row level filter is needed. For example, our US-based HR team should only see data for US-based employees. I've created another security role with the right data selected, HRUS.

-Now, I'll move to my tables and choose from the options for this employee's table and I'll select row security. Row level security in OneLake uses SQL statements to limit what people can see. I'll do that here with a simple select statement to limit country to USA. Now, from the HR team's perspective, they can start to query the data using another engine, Spark, to analyze employer retention. But only across US based employees, as you can see from the country column. And as mentioned, this applies to all engines, no matter how you access it, including the Parquet files directly in OneLake. Next, let's move on to data classification options that can be used to inform policy controls. Here, the good news is the same labels you've defined in Microsoft Purview for your organization used in Microsoft 365 for emails, messaging, files, sites, and meetings can be applied to data items in OneLake.

-Additionally, Microsoft Purview policy controls can be used to automatically label content in OneLake. And another benefit I can show you from the lineage view is label inheritance. Notice this Lakehouse is labeled Non-Business, as is NorthwindTest, but look at the connected data items on the right of NorthwindTest. They are also non-business.
If I move into the test lakehouse and apply a label either automatically or manually to my data, like I'm doing here, then I move back to the lineage view. My downstream data items like this model and the SQL analytics endpoint below it have automatically inherited the upstream label.

-So now we've explored OneLake security controls, their implementation, and enforcement, let's look at how this works hand in hand with the OneLake catalog for data discovery and management. First, to know that you're in the right place, you can use branded domains to organize collections of data. I'll choose the sales domain. To get the data I want, I can see my items as the ones I own, endorsed items, and my favorites. I can filter by workspace. And on top, I can select the type of data item that I'm looking for. Then if I move over to tags, I can find ones associated with cost centers, dates, or other collection types.

-Now, let's take a look at a data item. This shows me more detail, like the owner and location. I can also see table schemas and more below. I can preview data within the tables directly from here. Then using the lineage tab, it shows me a list of connected and related items. Lastly, the monitor tab lets me track data refresh history. Now, let me show you how as a data owner you can view and manage these data items. From the settings of this lakehouse, I can change its properties and metadata, such as the endorsement or update the sensitivity label. And as the data owner, I can also share it securely internally or even externally with approved recipients. I'll choose a colleague, dave@contoso.com, and share it.

-Next, the govern tab in the OneLake catalog gives you even more control as a data owner, as well as recommendations to make data more secure and compliant. You'll find it on the OneLake catalog main page. This gives me key insights at a glance, like the number and type of items I own.
And when I click into view more, I see additional information like my data hierarchy. Below that, item inventory and data refresh status. Sensitivity label coverage gives me an idea of how compliant my data items are. And I can assess data completeness based on whether an item is properly tagged, described, and endorsed across the items I own. Back on the main view, I can see governance actions tailored specifically to my data, like increasing sensitivity label coverage, and more.

-The OneLake catalog is integrated across Microsoft Fabric experiences to help people quickly discover the items they need. And it's also integrated with your favorite Office apps, including Microsoft Excel, where you can use the get data control to select and access data in OneLake. And right in context, without leaving the app, you can define what you want and pull it directly into your Excel file for analysis. The OneLake catalog is the one place where you can discover the data that you want and manage the data that you own. And combined with OneLake security controls, you can do all of this without increasing your data security risks.

-To find out more and get started, check out our blog at aka.ms/OneLakeSecurity. Also, be sure to sign up for a 60 day free trial at fabric.microsoft.com. And keep watching Mechanics for the latest updates across Microsoft, subscribe to our channel, and thanks for watching.

MGDC for SharePoint FAQ: How to flatten datasets for SQL or Fabric
When you get your data from Microsoft Graph Data Connect (MGDC), you will typically get that data as a collection of JSON objects in an Azure Data Lake Storage (ADLS) Gen2 storage account. For those handling large datasets, it might be useful to move the data to a SQL Server or to OneLake (lakehouse). In those cases, you might need to flatten the datasets. This post describes how to do that. If you're not familiar with MGDC for SharePoint, start with https://aka.ms/SharePointData.

1. Flattening

Most of the MGDC for SharePoint datasets come with nested objects. That means that a certain object has other objects inside it. For instance, if you have a SharePoint Groups object, it might have multiple Group Members inside. If you have a SharePoint Permissions object, you could have many Permissions Recipients (also known as Sharees). For each SharePoint File object, you will have a single Author object inside.

When you convert the datasets from JSON to other formats, those formats may require (or perform better with) data that has no objects inside objects. To overcome that, you can turn those child objects into properties of the parent object. For instance, instead of having the File object with an Author object inside, you can have multiple author-related columns, such as Author.Name and Author.Email, as properties of the flattened File object.

2. Nested Objects

You can get the full list of SharePoint datasets in MGDC at https://aka.ms/SharePointDatasets. Here is a table with a list of objects and their nested objects:

Object | How many? | Primary Key | Nested Object | How many? | Add to Primary Key
Sites | 1 per Site | Id | RootWeb | 1 per Site |
Sites | 1 per Site | Id | StorageMetrics | 1 per Site |
Sites | 1 per Site | Id | SensitivityLabelInfo | 1 per Site |
Sites | 1 per Site | Id | Owner | 1 per Site |
Sites | 1 per Site | Id | SecondaryContact | 1 per Site |
Groups | 1 per Group | SiteId + GroupId | Owner | 1 per Group |
Groups | 1 per Group | SiteId + GroupId | Members | 1 per Member | COALESCE(AADObjectId, Email, Name)
Permissions | 1 per Permission | SiteId + ScopeId + RoleDefinition + LinkId | SharedWithCount | 1 per Recipient Type | Type
Permissions | 1 per Permission | SiteId + ScopeId + RoleDefinition + LinkId | SharedWith | 1 per Recipient or Sharee | COALESCE(AADObjectId, Email, Name)
Files | 1 per File | SiteId + WebId + ListId + ItemId | Author | 1 per File |
Files | 1 per File | SiteId + WebId + ListId + ItemId | ModifiedBy | 1 per File |

When you flatten a dataset and there is an object with multiple objects inside (like Group Members or Permission Recipients), the number of rows will increase. You also need to add the column shown under "Add to Primary Key" to the primary key to keep it unique. Also note that the File Actions, Sync Health and Sync Errors datasets do not have any nested objects.

3. One Object per Parent

When the nested object has only one instance, things are simple. As we described for the Author nested object inside the File object, you promote the properties of the nested object to be properties of the parent object. This works because the Author is defined as the user that initially created the file. There is always one and only one Author.

This can even happen multiple times for the same object. The File also has a ModifiedBy property. That is the single user that last changed the file. In that case, there is also only one ModifiedBy per File.

The Site object also includes several properties in this style, like RootWeb, StorageMetrics, SensitivityLabelInfo, Owner and SecondaryContact. Note that, in the context of the Site object, there is only one owner.
Actually two, but that second one is tracked in a separate object called SecondaryContact, which is effectively the secondary owner.

4. Multiple Objects per Parent

The SharePoint Permissions dataset has a special condition that might create trouble for flattening. There are two sets of nested objects with multiple objects each: SharedWith and SharedWithCount. SharedWith has the list of Recipients and SharedWithCount has a list of Recipient Types. If you just let the tools flatten it, you will end up with a cross join of the two. As an example, if you have 10 recipients in an object and 2 types of recipients (internal users and external users, for instance), you will end up with 20 objects in the flattened dataset instead of the expected 10 objects (one per recipient). To avoid this, in this specific condition, I would recommend just excluding the SharedWithCount column from the object before flattening.

5. Conclusion

I hope this clarifies how you can flatten the MGDC for SharePoint datasets, particularly the SharePoint Permissions dataset. For further details about MGDC for SharePoint, see https://aka.ms/SharePointData.

Advanced Time Series Anomaly Detector in Fabric
Anomaly Detector, one of the Azure AI services, enables you to monitor and detect anomalies in your time series data. This service is being retired by October 2026. As part of the migration process, the anomaly detection algorithms were open sourced and published as a new Python package, and a time series anomaly detection workflow is offered in the Microsoft Fabric data platform.

Understanding OneLake Architecture: The OneDrive for Data
Learn how OneLake simplifies data engineering. Data engineers face many difficulties every day. Data sources are diverse and fragmented, containing different file types and data quality levels. Finding specific files and determining their owners and access permissions can be frustrating. OneLake helps you overcome these challenges.