azure ml

30 Topics

Power Up Your Open WebUI with Azure AI Speech: Quick STT & TTS Integration
Introduction Ever found yourself wishing your web interface could really talk and listen back to you? With a few clicks (and a bit of code), you can turn your plain Open WebUI into a full-on voice assistant. In this post, you’ll see how to spin up an Azure Speech resource, hook it into your frontend, and watch as user speech transforms into text and your app’s responses leap off the screen in a human-like voice. By the end of this guide, you’ll have a voice-enabled web UI that actually converses with users, opening the door to hands-free controls, better accessibility, and a genuinely richer user experience. Ready to make your web app speak? Let’s dive in. Why Azure AI Speech? We use Azure AI Speech service in Open Web UI to enable voice interactions directly within web applications. This allows users to: Speak commands or input instead of typing, making the interface more accessible and user-friendly. Hear responses or information read aloud, which improves usability for people with visual impairments or those who prefer audio. Provide a more natural and hands-free experience especially on devices like smartphones or tablets. In short, integrating Azure AI Speech service into Open Web UI helps make web apps smarter, more interactive, and easier to use by adding speech recognition and voice output features. If you haven’t hosted Open WebUI already, follow my other step-by-step guide to host Ollama WebUI on Azure. Proceed to the next step if you have Open WebUI deployed already. Learn More about OpenWeb UI here. Deploy Azure AI Speech service in Azure. Navigate to the Azure Portal and search for Azure AI Speech on the Azure portal search bar. Create a new Speech Service by filling up the fields in the resource creation page. Click on “Create” to finalize the setup. After the resource has been deployed, click on “View resource” button and you should be redirected to the Azure AI Speech service page. The page should display the API Keys and Endpoints for Azure AI Speech services, which you can use in Open Web UI. Settings things up in Open Web UI Speech to Text settings (STT) Head to the Open Web UI Admin page > Settings > Audio. Paste the API Key obtained from the Azure AI Speech service page into the API key field below. Unless you use different Azure Region, or want to change the default configurations for the STT settings, leave all settings to blank. Text to Speech settings (TTS) Now, let's proceed with configuring the TTS Settings on OpenWeb UI by toggling the TTS Engine to Azure AI Speech option. Again, paste the API Key obtained from Azure AI Speech service page and leave all settings to blank. You can change the TTS Voice from the dropdown selection in the TTS settings as depicted in the image below: Click Save to reflect the change. Expected Result Now, let’s test if everything works well. Open a new chat / temporary chat on Open Web UI and click on the Call / Record button. The STT Engine (Azure AI Speech) should identify your voice and provide a response based on the voice input. To test the TTS feature, click on the Read Aloud (Speaker Icon) under any response from Open Web UI. The TTS Engine should reflect Azure AI Speech service! Conclusion And that’s a wrap! You’ve just given your Open WebUI the gift of capturing user speech, turning it into text, and then talking right back with Azure’s neural voices. Along the way you saw how easy it is to spin up a Speech resource in the Azure portal, wire up real-time transcription in the browser, and pipe responses through the TTS engine. From here, it’s all about experimentation. Try swapping in different neural voices or dialing in new languages. Tweak how you start and stop listening, play with silence detection, or add custom pronunciation tweaks for those tricky product names. Before you know it, your interface will feel less like a web page and more like a conversation partner.
suzarilshah
Sep 10, 2025 Place Educator Developer Blog
977Views
2likes
1Comment
From Space to Subsurface: Using Azure AI to Predict Gold Rich Zones
In traditional mineral exploration, identifying gold bearing zones can take months of fieldwork and high cost drilling often with limited success. In our latest project, we flipped the process on its head by using Azure AI and Satellite data to guide geologists before they break ground. Using Azure AI and Azure Machine Learning, we built an intelligent, automated pipeline that identified high potential zones from geospatial data saving time, cost, and uncertainty. Here’s a behind the scenes look at how we did it.👇 📡 Step 1: Translating Satellite Imagery into Features We began with Sentinel-2 imagery covering our Area of Interest (AOI) and derived alteration indices commonly used in mineral exploration, including: 🟤 Clay Index – proxies for hydrothermal alteration 🟥 Fe (Iron Oxide) Index 🌫️ Silica Ratio 💧 NDMI (Normalized Difference Moisture Index) Using Azure Notebooks and Python, we processed and cleaned the imagery, transforming raw reflectance bands into meaningful geochemical features. 🔍 Step 2: Discovering Patterns with Unsupervised Learning (KMeans) With feature rich geospatial data prepared, we used unsupervised clustering (KMeans) in Azure Machine Learning Studio to identify natural groupings across the region. This gave us a first look at the terrain’s underlying geochemical structure one cluster in particular stood out as a strong candidate for gold rich zones. No geology degree needed: AI finds patterns humans can't see :) 🧠 Step 3: Scaling with Azure AutoML We then trained a classification model using Azure AutoML to predict these clusters over a dense prediction grid: ✅ 7,200+ data points generated ✅ ~50m resolution grid ✅ 14 km² area of interest This was executed as a short, early stopping run to minimize cost and optimize training time. Models were trained, validated, and registered using: Azure Machine Learning Compute Instance + Compute Cluster Azure Storage for dataset access 🔬 Step 4: Validation with Field Samples To ground our predictions, we validated against lab assayed (gold concentration) from field sampling points. The results? 🔥 The geospatial cluster labeled 'Class 0' by the model showed strong correlation with lab validated gold concentrations, supporting the model's predictive validity. This gave geologists AI augmented evidence to prioritize areas for further sampling and drilling. ⚖️ Traditional vs AI-based Workflow 🚀 Why Azure? ✅ Azure Machine Learning Studio for AutoML and experiment tracking ✅ Azure Storage for seamless access to geospatial data ✅ Azure OpenAI Service for advanced language understanding, report generation, and enhanced human AI interaction ✅ Azure Notebooks for scripting, preprocessing, and validation ✅ Azure Compute Cluster for scalable, cost effective model training ✅ Model Registry for versioning and deployment 🌍 Key Takeaways AI turns mineral exploration from reactive guesswork into proactive intelligence. In our workflow, AI plays a critical role by: ✅ Extracting key geochemical features from satellite imagery 🧠 Identifying patterns using unsupervised learning 🎯 Predicting high potential zones through automated classification 🌍 Delivering full spatial coverage at scale With Azure AIand Azure ML tools, we’ve built a complete pipeline that supports: End to end automation; from data prep to model deployment Faster, more accurate exploration with lower costs A reusable, scalable solution for global teams This isn’t just a proof of concept, it’s a production ready framework that empowers geologists with AI driven insights before the first drill hits the ground. 🔗 If you're working in Mining industry, geoscience, AI for Earth, or exploration tech, let’s connect! We’re on a mission to bring AI deeper into every industry through strategic partnerships and collaborative innovation.
MeysamRouhnavaz
Aug 07, 2025 Place Azure AI Foundry Discussions
136Views
2likes
0Comments
Introducing AzureImageSDK — A Unified .NET SDK for Azure Image Generation And Captioning
Hello 👋 I'm excited to share something I've been working on — AzureImageSDK — a modern, open-source .NET SDK that brings together Azure AI Foundry's image models (like Stable Image Ultra, Stable Image Core), along with Azure Vision and content moderation APIs and Image Utilities, all in one clean, extensible library. While working with Azure’s image services, I kept hitting the same wall: Each model had its own input structure, parameters, and output format — and there was no unified, async-friendly SDK to handle image generation, visual analysis, and moderation under one roof. So... I built one. AzureImageSDK wraps Azure's powerful image capabilities into a single, async-first C# interface that makes it dead simple to: 🎨 Inferencing Image Models 🧠 Analyze visual content (Image to text) 🚦 Image Utilities — with just a few lines of code. It's fully open-source, designed for extensibility, and ready to support new models the moment they launch. 🔗 GitHub Repo: https://github.com/DrHazemAli/AzureImageSDK Also, I've posted the release announcement on the https://github.com/orgs/azure-ai-foundry/discussions/47 👉🏻 feel free to join the conversation there too. The SDK is available on NuGet too. Would love to hear your thoughts, use cases, or feedback!
hazem
Jun 11, 2025 Place Azure AI Foundry Discussions
123Views
1like
0Comments
Building an AI-Powered ESG Consultant Using Azure AI Services: A Case Study
In today's corporate landscape, Environmental, Social, and Governance (ESG) compliance has become increasingly important for stakeholders. To address the challenges of analyzing vast amounts of ESG data efficiently, a comprehensive AI-powered solution called ESGai has been developed. This blog explores how Azure AI services were leveraged to create a sophisticated ESG consultant for publicly listed companies. https://youtu.be/5-oBdge6Q78?si=Vb9aHx79xk3VGYAh The Challenge: Making Sense of Complex ESG Data Organizations face significant challenges when analyzing ESG compliance data. Manual analysis is time-consuming, prone to errors, and difficult to scale. ESGai was designed to address these pain points by creating an AI-powered virtual consultant that provides detailed insights based on publicly available ESG data. Solution Architecture: The Three-Agent System ESGai implements a sophisticated three-agent architecture, all powered by Azure's AI capabilities: Manager Agent: Breaks down complex user queries into manageable sub-questions containing specific keywords that facilitate vector search retrieval. The system prompt includes generalized document headers from the vector database for context. Worker Agent: Processes the sub-questions generated by the Manager, connects to the vector database to retrieve relevant text chunks, and provides answers to the sub-questions. Results are stored in Cosmos DB for later use. Director Agent: Consolidates the answers from the Worker agent into a comprehensive final response tailored specifically to the user's original query. It's important to note that while conceptually there are three agents, the Worker is actually a single agent that gets called multiple times - once for each sub-question generated by the Manager. Current Implementation State The current MVP implementation has several limitations that are planned for expansion: Limited Company Coverage: The vector database currently stores data for only 2 companies, with 3 documents per company (Sustainability Report, XBRL, and BRSR). Single Model Deployment: Only one GPT-4o model is currently deployed to handle all agent functions. Basic Storage Structure: The Blob container has a simple structure with a single directory. While Azure Blob storage doesn't natively support hierarchical folders, the team plans to implement virtual folders in the future. Free Tier Limitations: Due to funding constraints, the AI Search service is using the free tier, which limits vector data storage to 50MB. Simplified Vector Database: The current index stores all 6 files (3 documents × 2 companies) in a single vector database without filtering capabilities or schema definition. Azure Services Powering ESGai The implementation of ESGai leverages multiple Azure services for a robust and scalable architecture: Azure AI Services: Provides pre-built APIs, SDKs, and services that incorporate AI capabilities without requiring extensive machine learning expertise. This includes access to 62 pre-trained models for chat completions through the AI Foundry portal. Azure OpenAI: Hosts the GPT-4o model for generating responses and the Ada embedding model for vectorization. The service combines OpenAI's advanced language models with Azure's security and enterprise features. Azure AI Foundry: Serves as an integrated platform for developing, deploying, and governing generative AI applications. It offers a centralized management centre that consolidates subscription information, connected resources, access privileges, and usage quotas. Azure AI Search (formerly Cognitive Search): Provides both full-text and vector search capabilities using the OpenAI ada-002 embedding model for vectorization. It's configured with hybrid search algorithms (BM25 RRF) for optimal chunk ranking. Azure Storage Services: Utilizes Blob Storage for storing PDFs, Business Responsibility Sustainability Reports (BRSRs), and other essential documents. It integrates seamlessly with AI Search using indexers to track database changes. Cosmos DB: Employs MongoDB APIs within Cosmos DB as a NoSQL database for storing chat history between agents and users. Azure App Services: Hosts the web application using a B3-tier plan optimized for cost efficiency, with GitHub Actions integrated for continuous deployment. Project Evolution: From Concept to Deployment The development of ESGai followed a structured approach through several phases: Phase 1: Data Cleaning Extracted specific KPIs from XML/XBRL datasets and BRSR reports containing ESG data for 1,000 listed companies Cleaned and standardized data to ensure consistency and accuracy Phase 2: RAG Framework Development Implemented Retrieval-Augmented Generation (RAG) to enhance responses by dynamically fetching relevant information Created a workflow that includes query processing, data retrieval, and response generation Phase 3: Initial Deployment Deployed models locally using Docker and n8n automation tools for testing Identified the need for more scalable web services Phase 4: Transition to Azure Services Migrated automation workflows from n8n to Azure AI Foundry services Leveraged Azure's comprehensive suite of AI services, storage solutions, and app hosting capabilities Technical Implementation Details Model Configurations: The GPT model is configured with: Model version: 2024-11-20 Temperature: 0.7 Max Response Token: 800 Past Messages: 10 Top-p: 0.95 Frequency/Presence Penalties: 0 The embedding model uses OpenAI-text-embedding-Ada-002 with 1536 dimensions and hybrid semantic search (BM25 RRF) algorithms. Cost Analysis and Efficiency A detailed cost breakdown per user query reveals: App Server: $390-400 AI Search: $5 per query RAG Query Processing: $4.76 per query Agent-specific costs: Manager: $0.05 (30 input tokens, 210 output tokens) Worker: $3.71 (1500 input tokens, 1500 output tokens) Director: $1.00 (600 input tokens, 600 output tokens) Challenges and Solutions The team faced several challenges during implementation: Quota Limitations: Initial deployments encountered token quota restrictions, which were resolved through Azure support requests (typically granted within 24 hours). Cost Optimization: High costs associated with vectorization required careful monitoring. The team addressed this by shutting down unused services and deploying on services with free tiers. Integration Issues: GitHub Actions raised errors during deployment, which were resolved using GitHub's App Service Build Service. Azure UI Complexity: The team noted that Azure AI service naming conventions were sometimes confusing, as the same name is used for both parent and child resources. Free Tier Constraints: The AI Search service's free tier limitation of 50MB for vector data storage restricts the amount of company information that can be included in the current implementation. Future Roadmap The current implementation is an MVP with several areas for expansion: Expand the database to include more publicly available sustainability reports beyond the current two companies Optimize token usage by refining query handling processes Research alternative embedding models to reduce costs while maintaining accuracy Implement a more structured storage system with virtual folders in Blob storage Upgrade from the free tier of AI Search to support larger data volumes Develop a proper schema for the vector database to enable filtering and more targeted searches Scale to multiple GPT model deployments for improved performance and redundancy Conclusion ESGai demonstrates how advanced AI techniques like Retrieval-Augmented Generation can transform data-intensive domains such as ESG consulting. By leveraging Azure's comprehensive suite of AI services alongside a robust agent-based architecture, this solution provides users with actionable insights while maintaining scalability and cost efficiency. https://youtu.be/5-oBdge6Q78?si=Vb9aHx79xk3VGYAh
Aditya_vp
Apr 06, 2025 Place Azure
230Views
0likes
0Comments
Operationalize your Prompt Engineering Skills with Azure Prompt Flow
In today’s AI-driven world, prompt engineering is a game-changing skill for developers and professionals alike. With Azure Prompt Flow, you can harness the power of open-source LLMs to solve real-world operational challenges! This article guides you through using Azure’s robust tools to build, deploy, and refine your own LLM apps—from chatbots to data extraction tools and beyond. Whether you're just starting or looking to sharpen your AI expertise, this guide has everything you need to unlock new possibilities with prompt engineering. Dive in and take your tech journey to the next level!
Ifeoluwa_Oduwaiye
Dec 27, 2024 Place Educator Developer Blog
1.4KViews
5likes
3Comments
Understanding the Difference in Using Different Large Language Models: Step-by-Step Guide
Unlock the secrets of deploying Large Language Models on Azure with our comprehensive guide! Learn step-by-step integration techniques for models like GPT-2, Llama 2, and Dolly v1 in your Web Applications or Power Apps. Explore detailed instructions, ready-made code, and expert tips. Join us for a live session on November 2nd, 2023, to harness the power of AI and Microsoft tools. Become an entrepreneur with Microsoft Founders Hub, offering up to $2,500 OpenAI credits and $1,000 Azure credits. Dive into the world of tech solutions and creative writing ideas today!
JohnAziz
Nov 15, 2023 Place Educator Developer Blog
14KViews
3likes
1Comment
Deploying a Large Language Model (GPT-2) on Azure Using Power Automate: Step-by-Step Guide
Step-by-step guide on deploying a large language model (GPT-2) to the Azure platform and consuming it using Power Platform (Power Automate, Power Apps) to generate text and give you creative writing ideas.
JohnAziz
Nov 01, 2023 Place Educator Developer Blog
24KViews
5likes
0Comments
The Full Guide to Packaging and Deploying ML Models to Production Using Azure: Step-by-Step Guide
Step-by-step guide on How to package and deploy any machine learning model using ONNX to the Azure platform and consume it using Power Platform (Power Automate, Power Apps) to predict house prices.
JohnAziz
Jul 14, 2023 Place Educator Developer Blog
11KViews
3likes
1Comment
Bringing Feature Store to Azure - from Microsoft Azure, Redis, and Feast Community
Bringing Feature Store to Azure - from Microsoft Azure, Redis, and Feast Community
azuredatateam
Sep 13, 2022 Place Microsoft Developer Community Blog
30KViews
3likes
1Comment
Pre-train and Fine-tune Language Model with Hugging Face and Gaudi HPU.
In this blog, we provide a general guideline of pre-training and fine-tuning language models using Hugging Face. For illustration, we use pre-training language models for question generation (answer-agnostic) in Korean as running example.
congmu
Aug 11, 2022 Place Microsoft Developer Community Blog
13KViews
0likes
0Comments