azure machine learning
Fine-tuning gpt-oss-20b Now Available on Managed Compute
Earlier this month, we made available OpenAI's open‑source model gpt‑oss on Azure AI Foundry and Windows AI Foundry. Today, you can fine-tune gpt‑oss‑20b using Managed Compute on Azure — available in preview and accessible via notebook.

Azure Machine Learning now supports Large-Scale AI Training and Inference with ND H200 v5 VMs
TL;DR: Azure Machine Learning now offers ND H200 v5 VMs accelerated by NVIDIA H200 Tensor Core GPUs, purpose‑built to train and serve modern generative AI more efficiently at cloud scale. With massive on‑GPU memory and high intra‑node bandwidth, you can fit larger models and batches, keep tensors local, and cut cross‑GPU transfers - doing more with fewer nodes. Start with a single VM or scale out to hundreds in a managed cluster to capture cloud economics, while Azure's AI‑optimized infrastructure delivers consistent performance across training and inference.

Why this matters
The AI stack is evolving with bigger parameter counts, longer context windows, multimodal pipelines, and production-scale inference. ND H200 v5 on Azure ML is designed to address these needs with a memory-first, network-optimized, and workflow-friendly approach, enabling data science and MLOps teams to move from experiment to production efficiently.

Memory, the real superpower
At the heart of each ND H200 v5 VM are eight NVIDIA H200 GPUs, each packing 141 GB of HBM3e memory - representing a 76% increase in HBM capacity over H100. That means you can now process more per GPU: larger models, more tokens, and better performance. Aggregate that across all eight GPUs and you get a massive 1,128 GB of GPU memory per VM.
- HBM3e throughput: 4.8 TB/s per GPU ensures continuous data flow, preventing compute starvation.
- Larger models with fewer compromises: Accommodate wider context windows, larger batch sizes, deeper expert mixtures, or higher-resolution vision tokens without needing aggressive sharding or offloading techniques.
- Improved scaling: Increased on-GPU memory reduces cross-device communication and enhances step-time stability.

Built to scale - within a VM and across the cluster
When training across multiple GPUs, communication speed is crucial.
- Inside the VM: Eight NVIDIA H200 GPUs are linked via NVIDIA NVLink, delivering 900 GB/s of bidirectional bandwidth per GPU for ultra-fast all-reduce and model-parallel operations with minimal synchronization overhead.
- Across VMs: Each instance comes with eight 400 Gb/s NVIDIA ConnectX-7 InfiniBand adapters connecting to NVIDIA Quantum-2 InfiniBand switches, totaling 3.2 Tb/s interconnect per VM.
- GPUDirect RDMA: Enables data to move GPU-to-GPU across nodes with lower latency and lower CPU overhead, which is essential for distributed data/model/sequence parallelism.
The result is near-linear scaling characteristics for many large-model training and fine-tuning workloads.

Built into Azure ML workflows (no friction)
Azure Machine Learning integrates ND H200 v5 with the tools your teams already use:
- Frameworks: PyTorch, TensorFlow, JAX, and more
- Containers: Optimized Docker images available via Azure Container Registry
- Distributed training: NVIDIA NCCL fully supported to maximize performance of NVLink and InfiniBand
Bring your existing training scripts, launch distributed runs, and integrate into pipelines, registries, managed endpoints, and MLOps with minimal change.

Real-world gains
Early benchmarks show up to 35% throughput improvements for large language model inference compared to the previous generation, particularly on models like Llama 3.1 405B. The increased HBM capacity allows for larger inference batches, improving utilization and cost efficiency. For training, the combination of additional memory and higher bandwidth supports larger models or more data per step, often reducing overall training time.
Your mileage will vary by model architecture, precision, parallelism strategy, and data loader efficiency—but the headroom is real.

Quick spec snapshot
- GPUs: 8× NVIDIA H200 Tensor Core GPUs
- HBM3e: 141 GB per GPU (1,128 GB per VM)
- HBM bandwidth: 4.8 TB/s per GPU
- Inter-GPU: NVIDIA NVLink 900 GB/s (intra-VM)
- Host: 96 vCPUs (Intel Xeon Sapphire Rapids), 1,850 GiB RAM
- Local storage: 28 TB NVMe SSD
- Networking: 8× 400 Gb/s NVIDIA ConnectX-7 InfiniBand adapters (3.2 Tb/s total) with GPUDirect RDMA

Getting started (it's just a CLI away)
Create an auto-scaling compute cluster in Azure ML:

az ml compute create \
  --name h200-training-cluster \
  --size Standard_ND96isr_H200_v5 \
  --min-instances 0 \
  --max-instances 8 \
  --type amlcompute

Auto-scaling means you only pay for what you use - perfect for research bursts, scheduled training, and production inference with variable demand.

What you can do now
- Train foundation models with larger batch sizes and longer sequences
- Fine-tune LLMs with fewer memory workarounds, reducing the need for offloading and resharding
- Deploy high-throughput inference for chat, RAG, MoE, and multimodal use cases
- Accelerate scientific and simulation workloads that require high bandwidth + memory

Pro tips to unlock performance
- Optimize HBM usage: Increase batch size/sequence length until you approach the HBM bandwidth limit of approximately 4.8 TB/s per GPU.
- Utilize parallelism effectively: Combine tensor/model parallelism (NVLink-aware) with data parallelism across nodes (InfiniBand + GPUDirect RDMA).
- Optimize your input pipeline: Parallelize tokenization/augmentation, and store frequently accessed data on local NVMe to prevent GPU stalls.
- Leverage NCCL: Configure your communication backend to take advantage of the topology, using NVLink intra-node and InfiniBand inter-node.
A sketch of submitting a distributed job to this cluster follows at the end of this post.

The bottom line
This is more than a hardware bump - it's a platform designed for the next wave of AI. With ND H200 v5 on Azure ML, you gain the memory capacity, network throughput, and operational simplicity needed to transform ambitious models into production-grade systems. For comprehensive technical specifications and deployment guidance, visit the official ND H200 v5 documentation and explore our detailed announcement blog for additional insights and use cases.
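To make the tips above concrete, here is a minimal sketch of submitting a distributed PyTorch job to the cluster created above with the Azure ML CLI v2. The cluster name (h200-training-cluster) comes from the snippet above; the code folder, training script, container image, resource group, and workspace names are placeholders you would replace with your own.

# Hypothetical distributed job spec for the H200 cluster created above.
# The image, script, resource group, and workspace names are placeholders.
cat > h200-distributed-job.yaml << 'EOF'
$schema: https://azuremlschemas.azureedge.net/latest/commandJob.schema.json
code: ./src                      # folder containing train.py (placeholder)
command: python train.py
environment:
  image: <your-cuda-enabled-training-image>
compute: azureml:h200-training-cluster
resources:
  instance_count: 2              # two ND H200 v5 VMs (cross-node)
distribution:
  type: pytorch
  process_count_per_instance: 8  # one process per GPU within each VM
display_name: h200-distributed-smoke-test
experiment_name: nd-h200-v5-training
EOF

az ml job create \
  --file h200-distributed-job.yaml \
  --resource-group <your-resource-group> \
  --workspace-name <your-workspace>

Here process_count_per_instance is set to 8 to match the eight GPUs per VM; adjust it and instance_count to fit your parallelism strategy.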
Connecting Azure Kubernetes Service Cluster to Azure Machine Learning for Multi-Node GPU Training

TL;DR
Create an Azure Kubernetes Service cluster with GPU nodes and connect it to Azure Machine Learning to run distributed ML training workloads. This integration provides a managed data science platform while maintaining Kubernetes flexibility under the hood, enables multi-node training that spans multiple GPUs, and bridges the gap between infrastructure and ML teams. The solution works for both new and existing clusters, supporting specialized GPU hardware and hybrid scenarios.

Why Should You Care?
Integrating Azure Kubernetes Service (AKS) clusters with GPUs into Azure Machine Learning (AML) offers several key benefits:
- Utilize existing infrastructure: Leverage your existing AKS clusters with GPUs via a managed data science platform like AML
- Flexible resource sharing: Allow both AKS workloads and AML jobs to access the same GPU resources
- Organizational alignment: Bridge the gap between infrastructure teams (who prefer AKS) and ML teams (who prefer AML)
- Hybrid scenarios: Connect on-premises GPUs to AML using Azure Arc in a similar way to this tutorial
We focus on multi-node training because most larger training jobs need it; the same setup also covers single-GPU and single-VM scenarios.

Prerequisites
Before you begin, ensure you have:
- An Azure subscription with privileges to create and manage AKS clusters and add compute targets in AML. We recommend keeping the AKS and AML resources in the same region.
- Sufficient quota for GPU compute resources. Check this article on how to request quota: How to Increase Quota for Specific Types of Azure Virtual Machines. We are using two Standard_NC8as_T4_v3 VMs; you can also opt for other GPU-enabled compute.
- Azure CLI version 2.24.0 or higher (az upgrade)
- Azure CLI k8s-extension version 1.2.3 or higher (az extension update --name k8s-extension)
- kubectl installed and updated

Step 1: Create an AKS Cluster with GPU Nodes
For Windows users, it's recommended to use WSL (Ubuntu 22.04 or similar).

# Login to Azure
az login

# Create resource group
az group create -n ResourceGroup -l francecentral

# Create AKS cluster with a system node pool
az aks create -g ResourceGroup -n MyCluster \
  --node-vm-size Standard_D16s_v5 \
  --node-count 2 \
  --enable-addons monitoring

# Get cluster credentials
az aks get-credentials -g ResourceGroup -n MyCluster

# Add GPU node pool (Spot instances are not recommended)
az aks nodepool add \
  --resource-group ResourceGroup \
  --cluster-name MyCluster \
  --name gpupool \
  --node-count 2 \
  --vm-size Standard_NC8as_T4_v3

# Verify cluster configuration
kubectl get namespaces
kubectl get nodes

Step 2: Install the NVIDIA Device Plugin
Next, we need to make sure that our GPUs work as expected. The NVIDIA device plugin is a Kubernetes plugin that enables the use of NVIDIA GPUs in containers running on Kubernetes clusters. It acts as a bridge between Kubernetes and the physical GPU hardware.

Apply the NVIDIA device plugin to enable GPU access within AKS:

kubectl apply -f https://raw.githubusercontent.com/NVIDIA/k8s-device-plugin/v0.14.1/nvidia-device-plugin.yml

To confirm that the GPUs work as expected, follow the steps in Use GPUs on Azure Kubernetes Service (AKS) - Azure Kubernetes Service | Microsoft Learn and run a test workload. A quick check is sketched below.
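As a quick sanity check, a minimal sketch is shown here (the gpupool node label matches the commands above; the CUDA image tag is an assumption and may need to be adjusted for your environment):

# Confirm that the GPU nodes advertise nvidia.com/gpu capacity
kubectl describe nodes -l agentpool=gpupool | grep -i "nvidia.com/gpu"

# Run a one-off pod that requests one GPU and prints driver/GPU info
cat << 'EOF' | kubectl apply -f -
apiVersion: v1
kind: Pod
metadata:
  name: gpu-smoke-test
spec:
  restartPolicy: Never
  containers:
  - name: cuda
    image: nvidia/cuda:12.4.1-base-ubuntu22.04  # tag is an assumption; any CUDA base image should work
    command: ["nvidia-smi"]
    resources:
      limits:
        nvidia.com/gpu: 1
EOF

# After the pod completes, check its output and clean up
kubectl logs pod/gpu-smoke-test
kubectl delete pod gpu-smoke-test

If nvidia-smi lists a T4 GPU, the device plugin is scheduling GPUs correctly.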
Step 3: Register the KubernetesConfiguration Provider
The KubernetesConfiguration provider enables Azure to deploy and manage extensions on Kubernetes clusters, including the Azure Machine Learning extension. Before installing extensions, ensure the required resource provider is registered:

# Install the k8s-extension Azure CLI extension
az extension add --name k8s-extension

# Check if the provider is already registered
az provider list --query "[?contains(namespace,'Microsoft.KubernetesConfiguration')]" -o table

# If not registered, register it
az provider register --namespace Microsoft.KubernetesConfiguration

az account set --subscription <YOUR-AZURE-SUBSCRIPTION-ID>
az feature registration create --namespace Microsoft.KubernetesConfiguration --name ExtensionTypes

# Check the status after a few minutes and wait until it shows Registered
az feature show --namespace Microsoft.KubernetesConfiguration --name ExtensionTypes

# Install the Dapr extension
az k8s-extension create --cluster-type managedClusters \
  --cluster-name MyCluster \
  --resource-group ResourceGroup \
  --name dapr \
  --extension-type Microsoft.Dapr \
  --auto-upgrade-minor-version false

You can also check out the "Before you begin" section here: Install the Dapr extension for Azure Kubernetes Service (AKS) and Arc-enabled Kubernetes - Azure Kubernetes Service | Microsoft Learn.

Step 4: Deploy the Azure Machine Learning Extension
Install the AML extension on your AKS cluster for training:

az k8s-extension create \
  --name azureml-extension \
  --extension-type Microsoft.AzureML.Kubernetes \
  --config enableTraining=True enableInference=False \
  --cluster-type managedClusters \
  --cluster-name MyCluster \
  --resource-group ResourceGroup \
  --scope cluster

Several options are available for the extension installation; they are listed here: Deploy Azure Machine Learning extension on Kubernetes cluster - Azure Machine Learning | Microsoft Learn.

Verify Extension Deployment

az k8s-extension show \
  --name azureml-extension \
  --cluster-type managedClusters \
  --cluster-name MyCluster \
  --resource-group ResourceGroup

kubectl get pods -n azureml

The extension is successfully deployed when the provisioning state shows "Succeeded" and all pods in the "azureml" namespace are in the "Running" state.

Step 5: Create a GPU-Enabled Instance Type
By default, AML only has access to an instance type that doesn't include GPU resources. Create a custom instance type to utilize your GPUs:

# Create a custom instance type definition
cat > t4-full-node.yaml << EOF
apiVersion: amlarc.azureml.com/v1alpha1
kind: InstanceType
metadata:
  name: t4-full-node
spec:
  nodeSelector:
    agentpool: gpupool
    kubernetes.azure.com/accelerator: nvidia
  resources:
    limits:
      cpu: "6"
      nvidia.com/gpu: 2  # Integer value equal to the number of GPUs
      memory: "55Gi"
    requests:
      cpu: "6"
      memory: "55Gi"
EOF

# Apply the instance type
kubectl apply -f t4-full-node.yaml

This configuration creates an instance type that requests two T4 GPUs, making it suitable for multi-GPU training jobs.

Step 6: Attach the Cluster to Azure Machine Learning
Once your instance type is created, you can attach the AKS cluster to your AML workspace:
1. In the Azure Machine Learning Studio, navigate to Compute > Kubernetes clusters
2. Click New and select your AKS cluster
3. Specify your custom instance type ("t4-full-node") when configuring the compute target
4. Complete the attachment process following the UI workflow

Alternatively, you can use the Azure CLI or Python SDK to attach the cluster programmatically (a CLI sketch follows below): Attach a Kubernetes cluster to Azure Machine Learning workspace - Azure Machine Learning | Microsoft Learn.
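For reference, a minimal CLI sketch of the programmatic attach might look like the following. The compute name, workspace, and resource group values are placeholders, and the flags reflect the Azure ML CLI v2 compute attach command described in the linked documentation; verify them against your CLI version.

# Attach the AKS cluster to an Azure ML workspace as a Kubernetes compute target
# (workspace, resource group, and compute names are placeholders)
AKS_ID=$(az aks show -g ResourceGroup -n MyCluster --query id -o tsv)

az ml compute attach \
  --resource-group <your-aml-resource-group> \
  --workspace-name <your-aml-workspace> \
  --type Kubernetes \
  --name aks-gpu-compute \
  --resource-id "$AKS_ID" \
  --namespace azureml \
  --identity-type SystemAssigned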
Step 7: Test Distributed Training
With your GPU-enabled AKS cluster now attached to AML, you can:
- Create an AML experiment that uses distributed training
- Specify your custom instance type in the training configuration
- Submit the job to take advantage of multi-node GPU capabilities

You can now run advanced ML workloads like distributed deep learning, which requires multiple GPUs across nodes, all managed through the AML platform. To submit such a job, you simply need to specify the compute name, the registered instance_type, and the number of instances.

As an example, clone yuvmaz/aml_labs: Labs to showcase the capabilities of Azure ML and switch to Lab 4 - Foundations of Distributed Deep Learning. Lab 4 introduces how distributed training works in general and in AML. In the Jupyter Notebook that guides you through that tutorial, you will find that the first job definition is in simple_environment.yaml. Open this file and make the following adjustments to use the AKS compute target:

$schema: https://azuremlschemas.azureedge.net/latest/commandJob.schema.json
command: env | sort | grep -e 'WORLD' -e 'RANK' -e 'MASTER' -e 'NODE'
environment:
  image: library/python:latest
distribution:
  type: pytorch
  process_count_per_instance: 2  # We use 2 GPUs per node (cross-GPU)
compute: azureml:<Kubernetes-compute_target_name>
resources:
  instance_count: 2  # We want two VMs/instances in total (cross-node)
  instance_type: <instance-type-name>
display_name: simple-env-vars-display
experiment_name: distributed-training-foundations

You can proceed in the same way for all other distributed training jobs.

Conclusion
By integrating AKS clusters with GPUs into Azure Machine Learning, you get the best of both worlds - the container orchestration and infrastructure capabilities of Kubernetes with the ML workflow management features of AML. This setup is particularly valuable for organizations that want to:
- Maximize GPU utilization across both operational and ML workloads
- Provide data scientists with self-service access to GPU resources
- Establish a consistent ML platform that spans both cloud and on-premises resources

For production deployments, consider implementing additional security measures, networking configurations, and monitoring solutions appropriate for your organization's requirements.

Thanks a lot to Yuval Mazor and Alan Weaver for their collaboration on this blog post.
Distributed Databases: Adaptive Optimization with Graph Neural Networks and Causal Inference

This blog post introduces a new adaptive framework for distributed databases that leverages Graph Neural Networks (GNNs) and causal inference to overcome the classic limitations imposed by the CAP theorem. Traditional distributed systems often rely on static policies for consistency, availability, and partitioning, which struggle to keep up with rapidly changing workloads and data relationships. The proposed GNN-based approach models the complex, interconnected nature of distributed databases, enabling predictive consistency management, intelligent load balancing for availability, and dynamic, graph-aware partitioning. By integrating temporal modeling and reinforcement learning, the framework adapts in real time, delivering significant improvements in latency, load balancing, and partition efficiency across real-world and synthetic benchmarks. This marks a major step toward intelligent, self-optimizing database systems that can meet the demands of modern applications.
RAFT: A new way to teach LLMs to be better at RAG

In this article, we will look at the limitations of RAG and domain-specific fine-tuning to adapt LLMs to existing knowledge, and how a team of UC Berkeley researchers, Tianjun Zhang and Shishir G. Patil, may have just discovered a better approach.
Understanding the Fundamentals of AI Concepts for Nonprofits

Artificial Intelligence (AI) has become a cornerstone of modern technology, driving innovation across various sectors. Nonprofits, too, can harness the power of AI to enhance their operations and amplify their impact. In this blog, we'll explore fundamental AI concepts, common AI workloads, Microsoft's Responsible AI policies, and the tools and services available through Azure AI, all tailored for the nonprofit sector.

Understanding AI Workloads
AI workloads refer to the different types of tasks that AI systems can perform. Here are some common AI workloads relevant to nonprofits:
- Machine Learning: This involves training a computer model to make predictions and draw conclusions from data. Nonprofits can use machine learning to predict donor behavior, optimize fundraising strategies, and analyze program outcomes.
- Computer Vision: This capability allows software to interpret the world visually through cameras, video, and images. Applications include identifying and tracking wildlife for conservation efforts or analyzing images to assess disaster damage.
- Natural Language Processing (NLP): NLP enables computers to understand and respond to human language. Nonprofits can use NLP for sentiment analysis of social media posts, language translation for multilingual communities, and developing conversational AI like chatbots for donor engagement.
- Anomaly Detection: This involves automatically detecting errors or unusual activity. It is useful for fraud detection in financial transactions, monitoring network security, and ensuring data integrity.
- Conversational AI: This refers to the capability of a software agent to engage in conversations with humans. Examples include chatbots and virtual assistants that can answer questions, provide recommendations, and perform tasks, enhancing donor and beneficiary interactions.

Responsible AI Practices
As AI technology continues to evolve, it is crucial to ensure it is developed and used responsibly. Microsoft's Responsible AI policies emphasize the importance of fairness, reliability, safety, privacy, security, inclusiveness, transparency, and accountability in AI systems. These principles guide the development and deployment of AI solutions to ensure they benefit everyone and do not cause harm.
To learn more about Microsoft Responsible AI practices, click here: Empowering responsible AI practices | Microsoft AI

Azure AI Services for Nonprofits
Microsoft Azure offers a suite of AI services that enable nonprofits to build intelligent applications. Some key services include:
- Azure Machine Learning: A comprehensive platform for building, training, and deploying machine learning models. It supports a wide range of machine learning frameworks and tools, helping nonprofits analyze data and make informed decisions. To learn more or get started with Azure Machine Learning, click here: Azure Machine Learning - ML as a Service | Microsoft Azure
- Azure AI Bot Service: A service for building conversational AI applications. It provides tools for creating, testing, and deploying chatbots that can interact with users through various channels, improving donor engagement and support services. To learn more or get started with Azure AI Bot Service, click here: Azure AI Bot Service | Microsoft Azure
- Azure Cognitive Services: A collection of APIs that enable developers to add AI capabilities to their applications. These services include vision, speech, language, and decision-making APIs, which can be used for tasks like image recognition, language translation, and sentiment analysis.
To learn more about the various Cognitive Services, please click here: Azure AI Services – Using AI for Intelligent Apps | Microsoft Azure

Conclusion
AI has the potential to transform the nonprofit sector by enhancing efficiency, driving innovation, and providing valuable insights. By understanding AI workloads, adhering to responsible AI practices, and leveraging Azure AI services, nonprofits can unlock the full potential of AI to better serve their communities and achieve their missions. Embrace the power of AI to take your nonprofit organization to new heights and make a greater impact.

For a deeper dive into the fundamental concepts of AI, please visit the module Fundamental AI Concepts. This resource will provide you with essential insights and a solid foundation to enhance your knowledge in the ever-evolving field of artificial intelligence.

Transforming Customer Support with Azure OpenAI, Azure AI Services, and Voice AI Agents
Customer support today is under immense pressure to meet rising expectations of speed, personalization, and always-on availability. Yet, businesses still struggle with:
1. Long wait times and call center queues
2. Disconnected support channels
3. Limited availability of agents outside business hours
4. Repetitive issues consuming valuable human time
5. Frustrated users due to lack of immediate and contextual answers

These inefficiencies are costing businesses over $3.7 trillion annually in poor service delivery, while over 70% of agents (based on industry research) spend excessive time searching for the right answers instead of resolving problems directly.

How Voice AI Agents Are Transforming the Support Experience
Enter the era of voice-enabled AI agents—powered by Azure OpenAI, Azure AI Services, and ServiceNow—designed to completely transform the way customers engage with support systems. These agents can now:
- Handle complex user queries in natural language
- Access enterprise systems (like CRM, ITSM, HR) in real time
- Automate repetitive tasks such as password resets, ticket status updates, or return tracking
- Escalate only when human assistance is truly needed
- Create connected, seamless, and intelligent support experiences across departments

Let's take a closer look at four architecture patterns that showcase how enterprises can deploy these agents effectively.

🔷 Architecture Pattern 1: Unified Voice Agent with Azure AI + ServiceNow + CRM Integration
In this architecture, the customer support journey begins when a user initiates a voice-based conversation through a front-end interface such as a web application, mobile app, or smart device. The captured audio is streamed directly to Azure OpenAI GPT-4o's real-time API, which performs immediate speech-to-text transcription, interprets the intent behind the request, and prepares the initial system response—all in a single seamless stream.

Once the user's intent is understood (e.g., "create a ticket", "check incident status", or "list recent issues"), GPT-4o passes control to Semantic Kernel, which orchestrates the next steps through function calling. Semantic Kernel hosts pre-defined tools (functions) that map to ServiceNow API actions, such as createIncident, getIncidentStatus, listIncidents, or searchKnowledgeBase. These function calls are then securely routed to ServiceNow via REST APIs (a minimal example follows this section).

ServiceNow executes the appropriate actions—whether it's creating a new support ticket, retrieving the status of an open incident, or searching its Knowledge Base. CRM data is also seamlessly accessed, if needed, to enrich responses with personalized context such as customer history or case metadata. The result from ServiceNow (e.g., an incident ID or KB article summary) is then sent back to Azure GPT-4o, which converts the structured data into a natural spoken response. This final audio output is delivered to the user in real time, completing the end-to-end conversational loop.

Additionally, tools like Azure Monitor or Application Insights can be integrated to log telemetry, track usage trends, monitor latency, and analyze user satisfaction over time. This architecture enables organizations to streamline customer support operations, reduce wait times, and deliver natural, intelligent assistance across any channel—voice-first.
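For illustration only, a tool like createIncident in Pattern 1 could wrap a call to the standard ServiceNow Table API. The sketch below is a hedged example rather than the actual implementation; the instance name, credentials, incident number, and field values are placeholders, and a production agent would typically call this from the Semantic Kernel function with OAuth rather than basic auth.

# Hypothetical sketch: create a ServiceNow incident via the Table API,
# roughly what a createIncident tool would do under the hood.
# <instance>, credentials, and field values are placeholders.
curl -s -X POST "https://<instance>.service-now.com/api/now/table/incident" \
  -u "<api_user>:<api_password>" \
  -H "Content-Type: application/json" \
  -H "Accept: application/json" \
  -d '{
        "short_description": "Voice agent: user cannot access VPN",
        "urgency": "2"
      }'

# Check the status of an existing incident (a getIncidentStatus equivalent);
# the incident number is a placeholder.
curl -s "https://<instance>.service-now.com/api/now/table/incident?sysparm_query=number=<incident_number>&sysparm_fields=number,state,short_description" \
  -u "<api_user>:<api_password>" \
  -H "Accept: application/json"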
🔷 Architecture Pattern 2: Scalable Customer Support with Multi-Agent Voice Architecture
This architecture introduces a modular and distributed agent-based design to deliver intelligent, scalable customer support through a voice interface. The process starts with the User Proxy Agent, which acts as the entry point for all user conversations. It captures voice input and forwards the request to the Master Agent, which serves as the brain of the architecture.

The Master Agent, empowered with a large language model (LLM) and memory, interprets the intent behind the user's input and dynamically routes the request to the most appropriate domain-specific agent. These include specialized agents such as the Activation Agent, Root Agent, Sales Agent, or Technical Agent, each designed to handle specific workflows or business tasks.
- The Activation Agent connects to web services and handles provisioning or onboarding scenarios.
- The Root Agent taps into document search systems (like Azure Cognitive Search) to answer questions grounded in internal documentation.
- The Sales Agent is equipped with structured logic models (SLMs) and CRM access to retrieve sales-related data from backend databases.
- The Technical Agent is containerized via Docker and built to manage backend diagnostics, code-level issues, or infrastructure status—often connecting to systems like ServiceNow for real-time ITSM execution.

Once the task is executed by the respective agent, results are passed back through the Master Agent and ultimately to the User Proxy Agent, which synthesizes the output into a voice response and delivers it to the user. The presence of shared memory between agents allows for maintaining context across multi-turn conversations, enabling complex, multi-step interactions (e.g., "Create a ticket, check the latest order status, and escalate it if unresolved.") without breaking continuity.

This architecture is ideal for enterprises looking to scale customer support horizontally, adding new agents without disrupting existing workflows. It enables parallelism, specialization, and real-time orchestration, providing faster resolutions while reducing the burden on human agents. Best suited for distributed support operations across IT, HR, sales, and field support—where task-specific intelligence and modular scale are critical.

🔷 Architecture Pattern 3: Customer Support Reinvented with Voice RAG + Azure AI + ServiceNow
This architecture brings a cutting-edge twist to Retrieval-Augmented Generation (RAG) by enabling it through a voice AI agent—creating a truly conversational experience grounded in enterprise knowledge. By combining Azure OpenAI models with the ServiceNow Knowledge Base, this pattern ensures accurate, voice-driven support for employees or customers in real time.

The process begins when a user interacts with a voice-enabled interface—via phone, web, or embedded assistant. The Voice AI agent streams the audio to Azure OpenAI GPT-4o, which transcribes the voice input, understands the intent, and then triggers a RAG pipeline. Instead of relying solely on the model's internal memory, the system performs a real-time query against the ServiceNow Product Knowledge Base, retrieving relevant knowledge articles, troubleshooting guides, or support workflows. These results are embedded directly into the prompt, creating an enriched context that is passed to the language model via Azure AI Foundry.
The model then generates a natural, contextually accurate spoken response, which is converted back into audio and voiced to the user—creating a seamless end-to-end Voice RAG experience. This approach ensures that responses are not only conversational but also deeply grounded in trusted enterprise knowledge. Ideal for helpdesk automation, HR support, and IT troubleshooting—where users prefer speaking naturally and need verified, document-backed responses in real time.

🔷 Architecture Pattern 4: Conversational Customer Support with AI Avatars and Azure AI
This architecture delivers rich, conversational experiences by integrating AI avatars, Azure AI, and ServiceNow to offer human-like, intelligent customer support across channels. It merges natural speech, facial expression, and enterprise data to create a highly engaging support assistant.

The interaction begins when a user speaks with an AI avatar application, whether embedded in a web portal, mobile device, or kiosk. The voice is captured and processed through a speech-to-text pipeline, which feeds the Avatar Module and Live Discussions Engine to manage lip-sync, emotional tone, and turn-taking. Behind the scenes, the avatar is connected to Azure AI services, including Custom Neural Voice (CNV) and Azure OpenAI, which enable the avatar to understand intent and generate responses in natural, conversational language.

Most critically, the system integrates directly with the ServiceNow platform. Through secure APIs, the avatar queries ServiceNow to:
- Retrieve case status updates
- Provide summaries of incident history
- Look up Knowledge Base articles
- Trigger incident creation if needed

These ServiceNow results are then passed through the text-to-speech module, with support for multilingual voice synthesis, and rendered by the avatar using expressive animation. Responses are visually delivered as live or pre-rendered avatar videos, creating a truly interactive and personalized experience.

This pattern not only answers basic questions but also surfaces dynamic enterprise data—turning the AI avatar into a frontline voice agent capable of real-time, connected support across IT, HR, or customer service domains. Best for branded digital experiences, frontline support stations, or HR/IT helpdesk automation where facial presence, empathy, and backend integration are essential.

✨ Closing Thoughts: The Future of Customer Support Is Here
Customer expectations have evolved—and so must the way we deliver support. By combining the power of Azure OpenAI, Azure AI Services, and ServiceNow, we're not just automating tasks—we're reinventing how organizations connect with their users.

Whether it's:
- A unified voice agent handling IT tickets and CRM queries,
- A multi-agent architecture scaling across departments,
- A voice-enabled RAG system delivering knowledge-grounded answers in real time, or
- A human-like AI avatar offering face-to-face support—
these architectures are driving a new era of intelligent, conversational, and scalable customer service.

👉 Join us at the Microsoft Booth during ServiceNow Knowledge 2025 (starting May 6th) to experience these solutions live, explore the tech behind them, and imagine how they can transform your business. Let's build the future of support—together.

Leveraging Power Platform Connectors in Copilot Studio for Nonprofits
Nonprofits can greatly benefit from using Microsoft Power Platform connectors in Copilot Studio. These connectors act as proxies or "wrappers" around APIs, enabling Microsoft Copilot Studio, Microsoft Power Automate, Microsoft Power Apps, and Azure Logic Apps to interact with various apps and services. By using connectors, nonprofits can streamline their operations, automate workflows, and enhance their engagement with stakeholders.

What Are Power Platform Connectors?
Connectors allow you to access a wide range of services, both within and outside the Microsoft ecosystem, to perform various tasks automatically. These connectors are categorized into:
- Standard Connectors: Included with all Copilot Studio plans (e.g., SharePoint).
- Premium Connectors: Available in select Copilot Studio plans.
- Custom Connectors: Enable connections to any publicly available API for services not covered by existing connectors.

Integration with Copilot Studio
Microsoft Power Platform connectors are essential tools that extend the functionality of Copilot Studio agents by connecting to various external services and applications. This integration allows nonprofits to create more dynamic, responsive, and useful agents tailored to their specific needs and processes. You can call connectors as connector actions in your agent, from an Action node in conversational topics, and through cloud flows as actions or within topics.

Adding a Connector Action
1. Select Add Node: On the authoring canvas, click the Add node (+) icon.
2. Choose Connector: In the node selection window, select Call an action > Connectors (preview), and search for the connector you want to add.
3. Configure Inputs and Outputs: Set up the required and optional inputs and outputs for your experience. By default, the connection is configured to use user credentials. For more information about supported authentication modes, see the section on configuring user authentication for actions.

Using Connectors with the Agent Author's Credentials
Connector actions require valid credentials. By default, these actions ask users to provide their credentials for the associated service when invoked. To use the agent author's credentials or a proxy account's credentials:
1. Configure an Authenticated Channel: Set up your agent to use an authenticated channel.
2. Add a Connector Action: Add a connector action to your agent as a plugin action and configure it.
3. Set the Authentication Method: Go to the connector action properties and, under End user authentication, select Agent author authentication.
4. Publish and Test: Publish the changes and test the experience in the Test your agent pane or your desired channel.

Sharing Connections
To share your connection with others:
1. Navigate to PowerApps: Go to make.powerapps.com.
2. Select Connections: In the left navigation bar, select Connections.
3. Share Connection: Choose the connection, click Share, search for the desired user, and select them.
4. Set Permissions: Under Permission, next to the user, select Can use + share.

Practical Applications for Nonprofits
- Automated Donor Management: Use connectors to integrate with CRM systems like Dynamics 365 or Salesforce to automate donor communications and manage donor data efficiently.
- Volunteer Coordination: Connect to scheduling tools and communication platforms to streamline volunteer sign-ups, scheduling, and notifications.
- Event Management: Integrate with event management services to automate the planning, coordination, and follow-up for fundraising events.
- Resource Allocation: Use connectors to manage and track resources, ensuring they are allocated efficiently to various projects and initiatives.
- Social Media Engagement: Connect to social media platforms like Twitter and Facebook to automate posts, track engagement, and respond to inquiries.

By leveraging Power Platform connectors, nonprofits can enhance their operational efficiency, improve stakeholder engagement, and focus more on their mission-driven activities.

Hubs and Workspaces on Azure Machine Learning – General Availability
We are pleased to announce that hubs and workspaces are now generally available on Azure Machine Learning, allowing teams to use a hub as their collaboration environment for machine learning applications. Azure hubs and workspaces provide a centralized platform capability for Azure Machine Learning. This feature enables developers to innovate faster by creating project workspaces and accessing shared company resources without needing repeated assistance from IT administrators.

Quick Model Building and Experimentation without IT Bottlenecks
Hubs and workspaces in Azure Machine Learning provide a centralized solution for managing machine learning resources. Hubs act as a central resource management construct that oversees security, connectivity, computing resources, and team quotas. Once created, they allow developers to create individual workspaces to manage their tasks while adhering to IT setup guidelines.

Key Benefits
- Centralized Management: Hubs allow for centralized settings such as connectivity, compute resources, and security, making it easier for IT admins to manage resources and monitor costs.
- Cost Efficiency: Utilizing a hub workspace for sharing and reusing configurations enhances cost efficiency when deploying Azure Machine Learning at scale. There is a cost associated with setting up a separate firewall per workspace, which grows as the number of workspaces increases. With hubs, only one firewall is needed, which extends across workspaces and saves cost.
- Resource Management: Hubs provide a single pool of compute across workspaces on a user level, eliminating repetitive compute setup and duplicate management steps. This ensures higher utilization of available capacity and a fair share of compute resources.
- Improved Security and Compliance: Hubs act as security boundaries, ensuring that different teams can work in isolated environments without compromising security.
- Simplified Workspace Creation: Hubs allow for the creation of lightweight workspaces in a single step by an ML professional.
- Enhanced Collaboration: Hubs enable better collaboration among data scientists by providing a centralized platform for managing projects and resources.

How to Get Started with Hubs and Projects
There are different ways to create hubs. You can create hubs via the Azure portal, with Azure Resource Manager templates, or via the Azure Machine Learning SDK/CLI. Hub properties like networking, monitoring, encryption, and identity can be customized while creating a hub and can be set depending on your organization's requirements. Workspaces associated with a hub share the hub's security, connectivity, and compute resources. While creating hubs via ML Studio is not currently supported, once a hub is created users can create workspaces, which get shared access to the company resources made available by the administrator, including compute, security, and connections. Besides ML Studio, workspaces can be created using the Azure SDK, automation templates, or the Azure CLI; a minimal CLI sketch follows below.

Secure Access for Azure Resources
For accessing data sources outside hubs, connections can help make data available to Azure Machine Learning. External sources like Snowflake DB, Amazon S3, and Azure SQL DB can be connected to AML resources. Users can also set access permissions to the Azure resources with role-based access controls. Besides the default built-in roles, users can also create custom roles for more granular access.
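As a rough illustration of the CLI route mentioned above, a minimal sketch follows. It assumes the Azure ML CLI v2 extension is installed; the names and region are placeholders, and the --kind hub/project and --hub-id flags should be verified against the linked documentation for your CLI version.

# Create a hub workspace (central security, networking, and compute settings)
# Names, resource group, and region are placeholders.
az ml workspace create \
  --kind hub \
  --name contoso-hub \
  --resource-group contoso-rg \
  --location eastus

# Look up the hub's resource ID
HUB_ID=$(az ml workspace show \
  --name contoso-hub \
  --resource-group contoso-rg \
  --query id -o tsv)

# Create a lightweight project workspace attached to the hub
az ml workspace create \
  --kind project \
  --hub-id "$HUB_ID" \
  --name contoso-project \
  --resource-group contoso-rg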
To conclude, the general availability of Azure Machine Learning hubs and workspaces marks a significant milestone in our commitment to providing scalable, secure, and efficient machine learning solutions. We look forward to seeing how our customers leverage this new feature to drive innovation and achieve their business goals.

For more information on hubs and workspaces in Azure Machine Learning, please refer to the following links:
- What are Azure hubs and workspaces - AML
- Manage AML hub workspaces in the portal
- Create a hub using AML SDK and CLI