api
17 Topics

Model Mondays S2E8: On-Device & Local AI
Welcome to Episode 8! This week, we explored how AI is moving from the cloud to your own device, making it faster, more private, and more accessible. We also saw a real-world customer story from Xander Glasses, showing how AI can help people with hearing loss.
RFD Observability tools in Azure AI Foundry: Real-time model telemetry, auto evals, quick evals, Python grader.
GitHub Copilot Pro with Spark: AI pair programmer for code explanation and workflow suggestions.
Synthetic Data for Vision Models: Training accurate models with procedurally generated data.
Agent-Friendly Websites: Making sites accessible to AI agents via APIs, semantic markup, and OpenAPI specs.
MCP (Model Context Protocol): Standardizing agent memory and context for scalable AI.
118 Views, 0 likes, 0 Comments

Configure Embedding Models on Azure AI Foundry with Open Web UI
Introduction
Let’s take a closer look at an exciting development in the AI space. Embedding models turn complex data into vector representations that applications can search and compare, which is what drives smarter chatbots and tailored recommendations. Azure AI Foundry, Microsoft’s platform for building AI solutions, gives you the tools to deploy and scale these models. Add Open Web UI, an intuitive interface for interacting with AI systems, and you have a practical combination for developers and businesses. In this article, we’ll walk through configuring an embedding model from Azure AI Foundry in Open Web UI. Let’s dive in!

Before configuring the embedding model, make sure the requirements below are in place.

Requirements:
Set up an Azure AI Foundry hub and project.
Deploy Open Web UI. Refer to my previous article on how to deploy Open Web UI on an Azure VM.
Optional: Deploy LiteLLM with Azure AI Foundry models to work with Open Web UI. Refer to my previous article on how to do this as well.

Deploying Embedding Models on Azure AI Foundry
Navigate to the Azure AI Foundry site and deploy an embedding model from the “Models + Endpoints” section. For this demonstration, we will deploy the “text-embedding-3-large” model by OpenAI. Once the deployment finishes, you should receive an endpoint URL and an API key for the embedding model. Take note of both, because we will use them in Open Web UI.

Configuring the Embedding Model on Open Web UI
Head to the Open Web UI Admin Settings page > Documents and select Azure OpenAI as the Embedding Model Engine. Copy and paste the base URL, the API key, the name of the embedding model deployed on Azure AI Foundry, and the API version (not the model version) into the corresponding fields. Click “Save” to apply the changes.
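To sanity-check the four values before pasting them into Open Web UI, it can help to see how they fit together in a single request. The sketch below assumes the standard Azure OpenAI REST layout for embeddings (a `/openai/deployments/<deployment>/embeddings` path, an `api-version` query parameter, and an `api-key` header). The function name `build_embedding_request`, the example endpoint, and the API version shown are illustrative placeholders rather than values from this article. The code only assembles the request; it does not send it.

```python
import json
from urllib.parse import urlencode


def build_embedding_request(base_url, deployment, api_version, api_key, texts):
    """Assemble the URL, headers, and JSON body for an Azure OpenAI embeddings call."""
    url = (
        f"{base_url.rstrip('/')}/openai/deployments/{deployment}/embeddings"
        f"?{urlencode({'api-version': api_version})}"
    )
    headers = {"api-key": api_key, "Content-Type": "application/json"}
    body = json.dumps({"input": list(texts)})
    return url, headers, body


# Placeholder values (assumptions): substitute the ones from your own deployment.
url, headers, body = build_embedding_request(
    base_url="https://example.openai.azure.com",
    deployment="text-embedding-3-large",
    api_version="2024-02-01",
    api_key="<your-api-key>",
    texts=["hello world"],
)
print(url)
```

The base URL, deployment name, API version, and key here correspond one-to-one to the four fields in the Open Web UI form, which is why the API version (part of the URL) and the model version (a property of the deployment) must not be confused.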
Expected Output
Now let us compare what happens when the embedding model is configured on Open Web UI and when it is not.
Without embedding models configured.
With Azure OpenAI embedding models configured.

Conclusion
And there you have it! Embedding models on Azure AI Foundry, combined with the interface offered by Open Web UI, simplify the process of building and deploying intelligent systems and make this technology more accessible to developers and businesses of all sizes. Integrations like this will continue to lower barriers and open up new possibilities in the AI landscape. So, whether you’re a seasoned developer or just stepping into this field, now’s the time to explore what Azure AI Foundry and Open Web UI can do for you. Let’s keep pushing the boundaries of what’s possible!
1.1K Views, 0 likes, 0 Comments

Step-by-step: Integrate Ollama Web UI to use Azure Open AI API with LiteLLM Proxy
Introduction
Ollama WebUI is a streamlined interface for deploying and interacting with open-source large language models (LLMs) like Llama 3 and Mistral. It lets users manage models, test them in a ChatGPT-like chat environment, and integrate them into applications through Ollama’s local API. While it excels for self-hosted models on platforms like Azure VMs, it does not natively support Azure OpenAI API endpoints; OpenAI’s proprietary models (e.g., GPT-4) remain accessible only through OpenAI’s managed API. Tools like LiteLLM bridge this gap, allowing developers to combine Ollama-hosted models with OpenAI models in hybrid workflows while maintaining compliance and cost-efficiency. This setup lets users leverage both self-managed open-source models and cloud-based AI services.

Problem Statement
As of February 2025, Ollama WebUI still does not support the Azure OpenAI API. It only supports the self-hosted Ollama API and the managed OpenAI API service (PaaS). This is an issue for users who want to use OpenAI models they have already deployed on Azure AI Foundry.

Objective
To integrate the Azure OpenAI API into Ollama Web UI via a LiteLLM proxy. LiteLLM exposes an OpenAI-style API to Ollama Web UI and translates those requests into Azure OpenAI API calls, allowing users to use OpenAI models deployed on Azure AI Foundry.

If you haven’t hosted Ollama WebUI already, follow my other step-by-step guide to host Ollama WebUI on Azure. Proceed to the next step if you have Ollama WebUI deployed already.

Step 1: Deploy OpenAI models on Azure AI Foundry
If you haven’t created an Azure AI Hub already, search for Azure AI Foundry on Azure, and click on the “+ Create” button > Hub. Fill out all the empty fields with the appropriate configuration and click on “Create”. After the Azure AI Hub is successfully deployed, click on the deployed resource and launch the Azure AI Foundry service.
To deploy new models on Azure AI Foundry, find the “Models + Endpoints” section on the left-hand side and click on the “+ Deploy Model” button > “Deploy base model”. A popup will appear, and you can choose which models to deploy on Azure AI Foundry. Please note that the o-series models are only available to select customers at the moment. You can request access to the o-series models by completing this request access form, and wait until Microsoft approves the access request.
Click on “Confirm” and another popup will appear. Name the deployment and click on “Deploy” to deploy the model. Wait a few moments for the model to deploy. Once it has deployed successfully, save the “Target URI” and the API key.

Step 2: Deploy LiteLLM Proxy via Docker Container
Before pulling the LiteLLM image into the host environment, create a file named “litellm_config.yaml” and list the models you deployed on Azure AI Foundry, along with their API endpoints and keys. Replace "API_Endpoint" and "API_Key" with the “Target URI” and “Key” from Azure AI Foundry respectively.

Template for the “litellm_config.yaml” file:

    model_list:
      - model_name: [model_name]
        litellm_params:
          model: azure/[model_name_on_azure]
          api_base: "[API_ENDPOINT/Target_URI]"
          api_key: "[API_Key]"
          api_version: "[API_Version]"

Tip: You can find the API version at the end of the Target URI of the model's endpoint.
Sample Endpoint - https://example.openai.azure.com/openai/deployments/o1-mini/chat/completions?api-version=2024-08-01-preview

Run the docker command below to start LiteLLM Proxy with the correct settings:

    docker run -d \
      -v $(pwd)/litellm_config.yaml:/app/config.yaml \
      -p 4000:4000 \
      --name litellm-proxy-v1 \
      --restart always \
      ghcr.io/berriai/litellm:main-latest \
      --config /app/config.yaml --detailed_debug

Make sure to run the docker command inside the directory where you created the “litellm_config.yaml” file. LiteLLM Proxy listens for traffic on port 4000.
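The tip about reading the API version off the Target URI can be sketched in a few lines of Python. This is a minimal illustration, assuming the Target URI follows the shape of the sample endpoint above; the helper name `split_target_uri` and the returned key names are hypothetical, chosen only for this example. It pulls out the pieces you would otherwise copy by hand into the template.

```python
from urllib.parse import urlparse, parse_qs


def split_target_uri(target_uri):
    """Split a Target URI into resource base, deployment name, and API version."""
    parsed = urlparse(target_uri)
    parts = [p for p in parsed.path.split("/") if p]
    # The deployment name is the path segment right after "deployments".
    deployment = parts[parts.index("deployments") + 1]
    # The API version is carried in the "api-version" query parameter.
    api_version = parse_qs(parsed.query)["api-version"][0]
    return {
        "resource_base": f"{parsed.scheme}://{parsed.netloc}",
        "deployment": deployment,
        "api_version": api_version,
    }


# The sample endpoint from the tip above.
sample = (
    "https://example.openai.azure.com/openai/deployments/o1-mini/"
    "chat/completions?api-version=2024-08-01-preview"
)
info = split_target_uri(sample)
print(info)
```

Here `info["deployment"]` is the name that follows `azure/` in the template's `model` field, and `info["api_version"]` is the value for `api_version`.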
Now that LiteLLM Proxy has been deployed on port 4000, let’s change the OpenAI API settings on Ollama WebUI. Navigate to Ollama WebUI’s Admin Panel > Settings > Connections. Under the OpenAI API section, enter http://127.0.0.1:4000 as the API endpoint and set any key (the field cannot be left empty, so enter any value). Click on the “Save” button to apply the changes.
Refresh the browser and you should see the AI models deployed on Azure AI Foundry listed in Ollama WebUI.
Now let’s test the chat completion + Web Search capability using the "o1-mini" model on Ollama WebUI.

Conclusion
Hosting Ollama WebUI on an Azure VM and integrating it with OpenAI’s API via LiteLLM offers a powerful, flexible approach to AI deployment, combining the cost-efficiency of open-source models with the advanced capabilities of managed cloud services. While Ollama itself doesn’t support Azure OpenAI endpoints, the hybrid architecture lets IT teams balance data privacy (via self-hosted models on Azure AI Foundry) and cutting-edge performance (using the Azure OpenAI API), all within Azure’s scalable ecosystem. This guide covers every step required to deploy your OpenAI models on Azure AI Foundry, set up the required resources, deploy LiteLLM Proxy on your host machine, and configure Ollama WebUI to support Azure AI endpoints. From there, you can keep testing and improving your AI setup in the Ollama WebUI interface with Web Search, Text-to-Image Generation, and more, all in one place.
8.3K Views, 1 like, 4 Comments

Part 1 - Develop a VS Code Extension for Your Capstone Project
API Guardian - My Capstone Project
As software and APIs evolve, developers encounter significant difficulties in maintaining and updating API endpoints. Breaking changes can lead to system instability, while outdated or unclear documentation makes maintenance less efficient. These challenges are compounded by the time-consuming nature of updating dependencies and the tendency to prioritize new features over maintenance tasks. The absence of effective tools and processes to tackle these issues reduces overall productivity and developer efficiency.
To address this, API Guardian was created as a Visual Studio Code extension that identifies API endpoints in a project and checks their functionality before deployment. It was developed to help developers save time spent fixing issues caused by breaking or non-breaking changes and to ease the difficulty of performing maintenance with unclear or outdated documentation.

Features and Capabilities
This extension has 3 main features:

Feature 1. Developers can decide if the extension will scan or skip specified files in the project.
Press “Enter” to scan/skip all files.
Type the file name (e.g., main.py) and press “Enter” to scan/skip a single file.
Type file names with a delimiter (e.g., main.py | pythonFile.py) and press “Enter” to scan/skip multiple files.

Feature 2. Custom hover messages when developers mouse over identified APIs
The hover message varies based on the status of the API. If the API returns a success status, the hover message will only show the completed API and its status. However, if an error occurs, the hover message will include this additional information: (1) API Name, (2) Official API Link, (3) Error Message, (4) Title of Recommended Fix and (5) Link to the Recommended Fix.

Feature 3. Excel Report with Details of Identified APIs
After all the identified APIs have been tested, an Excel report will be exported with the following information to allow developers to easily identify the APIs in the project.

What Technologies and Products Does It Involve?
Building a Visual Studio Code extension and publishing it to the Visual Studio Marketplace involves a mix of technologies and tools. The project was scaffolded with the npm package generator-code, which sets up a JavaScript project for developing the extension. All the extension's logic is developed and managed within the "extension.js" file generated during setup. Once ready for deployment, we package the extension using "vsce" to generate a ".vsix" file, which is then used for deployment to the Visual Studio Code Marketplace.
Publishing requires creating a publisher account and using tools like vsce to upload and manage the extension's versions, updates, and metadata. As part of this process, you need to create a Personal Access Token (PAT) from Azure DevOps. This token verifies your identity and authenticates the publishing tool, allowing you to securely upload your extension to the Visual Studio Marketplace. The PAT provides the necessary permissions for tasks such as version management, publishing new releases, and updating the extension metadata.

What did I learn?
Throughout this journey, I learned not just about the technical stack but also about the value of detailed project setup and secure publishing processes. While the technical steps can be challenging, they’re incredibly rewarding, and I’m excited to dive deeper into them moving forward. I’m looking forward to exploring how the extension can be further improved and enhanced. If you're interested in learning more about how API Guardian was built, keep an eye out for my next post!
API Guardian
https://marketplace.visualstudio.com/items?itemName=APIGuardian-vsc.api

About the Authors
Main Author - Ms Joy Cheng Yee Shing, BSc (Hon) Computing Science
Academic Supervisor - Dr Peter Yau, Microsoft MVP
309 Views, 0 likes, 0 Comments

Visual Studio AI Toolkit : Building Phi-3 GenAI Applications
Port Forwarding, a valuable feature within the AI Toolkit, serves as a crucial gateway for seamless communication with the GenAI model. Whether it's through a straightforward API call or leveraging the SDKs, this functionality greatly enhances our ability to harness the power of the LLM/SLM. By enabling Port Forwarding, a plethora of new scenarios unfold, unlocking the full potential of our interactions with the model.
9.6K Views, 2 likes, 0 Comments

Create a Simple Speech REST API with Azure AI Speech Services
Explore the world of speech recognition and speech synthesis with Azure AI Services. In this tutorial, you will learn how to create your own simple Speech REST API using Azure AI Speech Synthesis and Azure OpenAI services or the OpenAI API. Experience the power of speech synthesis on Azure and explore the many possibilities Azure AI Services opens up for building powerful products.
5.9K Views, 2 likes, 0 Comments

Visualizing Top GitHub Programming Languages in Excel with Microsoft Graph .NET SDK
Have you ever thought about going through all your GitHub repositories, taking note of the languages used, aggregating them, and visualizing them in Excel? Well, that is what this post is all about, except you don’t have to do it manually.
9.4K Views, 1 like, 0 Comments

NoSQL databases with Azure Cosmos DB
Learn about the differences between relational and NoSQL databases, and discover the advantages of Azure Cosmos DB. This Microsoft Azure database service supports multiple NoSQL models and offers high security, scalability, and global availability. Get started with a free tier and try your hand at storing data with Azure Cosmos DB.
4.5K Views, 1 like, 0 Comments

Getting started with ML.NET
Machine learning was added to the .NET ecosystem a few years ago with the creation of an open-source framework (ML.NET), which enables developers to train, build, and ship custom ML models for a wide range of scenarios. Since then, the framework has evolved a lot, incorporating new features, with the preview release of the latest version (ML.NET 3.0) announced a few weeks ago.
4.5K Views, 0 likes, 0 Comments