Open AI
5 TopicsMastering Azure OpenAI Services: A Comprehensive Learning Path for Aspiring AI Engineers
Are you a computer science student looking to delve into the world of Azure OpenAI Services? Look no further! In this Microsoft Learning Pathway, "Develop Generative AI solutions with Azure OpenAI Service," you'll embark on an exciting journey to harness the power of OpenAI's vast language models like ChatGPT, GPT, Codex, and Embeddings. These models are pivotal for creating innovative Natural Language Processing (NLP) solutions that can comprehend, converse, and generate content.8.1KViews4likes0CommentsDeploy Open Web UI on Azure VM via Docker: A Step-by-Step Guide with Custom Domain Setup.
Introductions Open Web UI (often referred to as "Ollama Web UI" in the context of LLM frameworks like Ollama) is an open-source, self-hostable interface designed to simplify interactions with large language models (LLMs) such as GPT-4, Llama 3, Mistral, and others. It provides a user-friendly, browser-based environment for deploying, managing, and experimenting with AI models, making advanced language model capabilities accessible to developers, researchers, and enthusiasts without requiring deep technical expertise. This article will delve into the step-by-step configurations on hosting OpenWeb UI on Azure. Requirements: Azure Portal Account - For students you can claim $USD100 Azure Cloud credits from this URL. Azure Virtual Machine - with a Linux of any distributions installed. Domain Name and Domain Host Caddy Open WebUI Image Step One: Deploy a Linux – Ubuntu VM from Azure Portal Search and Click on “Virtual Machine” on the Azure portal search bar and create a new VM by clicking on the “+ Create” button > “Azure Virtual Machine”. Fill out the form and select any Linux Distribution image – In this demo, we will deploy Open WebUI on Ubuntu Pro 24.04. Click “Review + Create” > “Create” to create the Virtual Machine. Tips: If you plan to locally download and host open source AI models via Open on your VM, you could save time by increasing the size of the OS disk / attach a large disk to the VM. You may also need a higher performance VM specification since large resources are needed to run the Large Language Model (LLM) locally. Once the VM has been successfully created, click on the “Go to resource” button. You will be redirected to the VM’s overview page. Jot down the public IP Address and access the VM using the ssh credentials you have setup just now. Step Two: Deploy the Open WebUI on the VM via Docker Once you are logged into the VM via SSH, run the Docker Command below: docker run -d --name open-webui --network=host --add-host=host.docker.internal:host-gateway -e PORT=8080 -v open-webui:/app/backend/data --restart always ghcr.io/open-webui/open-webui:dev This Docker command will download the Open WebUI Image into the VM and will listen for Open Web UI traffic on port 8080. Wait for a few minutes and the Web UI should be up and running. If you had setup an inbound Network Security Group on Azure to allow port 8080 on your VM from the public Internet, you can access them by typing into the browser: [PUBLIC_IP_ADDRESS]:8080 Step Three: Setup custom domain using Caddy Now, we can setup a reverse proxy to map a custom domain to [PUBLIC_IP_ADDRESS]:8080 using Caddy. The reason why Caddy is useful here is because they provide automated HTTPS solutions – you don’t have to worry about expiring SSL certificate anymore, and it’s free! You must download all Caddy’s dependencies and set up the requirements to install it using this command: sudo apt install -y debian-keyring debian-archive-keyring apt-transport-https curl -1sLf 'https://dl.cloudsmith.io/public/caddy/stable/gpg.key' | sudo gpg --dearmor -o /usr/share/keyrings/caddy-stable-archive-keyring.gpg curl -1sLf 'https://dl.cloudsmith.io/public/caddy/stable/debian.deb.txt' | sudo tee /etc/apt/sources.list.d/caddy-stable.list sudo apt update && sudo apt install caddy Once Caddy is installed, edit Caddy’s configuration file at: /etc/caddy/Caddyfile , delete everything else in the file and add the following lines: yourdomainname.com { reverse_proxy localhost:8080 } Restart Caddy using this command: sudo systemctl restart caddy Next, create an A record on your DNS Host and point them to the public IP of the server. Step Four: Update the Network Security Group (NSG) To allow public access into the VM via HTTPS, you need to ensure the NSG/Firewall of the VM allow for port 80 and 443. Let’s add these rules into Azure by heading to the VM resources page you created for Open WebUI. Under the “Networking” Section > “Network Settings” > “+ Create port rule” > “Inbound port rule” On the “Destination port ranges” field, type in 443 and Click “Add”. Repeat these steps with port 80. Additionally, to enhance security, you should avoid external users from directly interacting with Open Web UI’s port - port 8080. You should add an inbound deny rule to that port. With that, you should be able to access the Open Web UI from the domain name you setup earlier. Conclusion And just like that, you’ve turned a blank Azure VM into a sleek, secure home for your Open Web UI, no magic required! By combining Docker’s simplicity with Caddy’s “set it and forget it” HTTPS magic, you’ve not only made your app accessible via a custom domain but also locked down security by closing off risky ports and keeping traffic encrypted. Azure’s cloud muscle handles the heavy lifting, while you get to enjoy the perks of a pro setup without the headache. If you are interested in using AI models deployed on Azure AI Foundry on OpenWeb UI via API, kindly read my other article: Step-by-step: Integrate Ollama Web UI to use Azure Open AI API with LiteLLM Proxy2.9KViews1like1CommentAzure OpenAI Service - Features Overview and Key Concepts
Azure artificial intelligence services including a variety services related to language and language processing (speech recognition, speech formation, translations), text recognition, and image and character recognition. What is Azure OpenAI Service? Azure OpenAI Service provides REST API access to OpenAI's powerful language models including the GPT-3, Codex and Embeddings model series. Azure OpenAI Model Azure OpenAI provides access to many different models, grouped by family and capability. A model family typically associates models by their intended task. Azure OpenAI Service Model capabilities Each model family has a series of models that are further distinguished by capability. These capabilities are typically identified by names, and the alphabetical order of these names generally signifies the relative capability and cost of that model within a given model family. Azure OpenAI models fall into a few main families: GPT-4: A set of models that improve on GPT-3.5 and can understand as well as generate natural language and code. GPT-3.5: A set of models that improve on GPT-3 and can understand as well as generate natural language and code. Embeddings: A set of models that can convert text into numerical vector form to facilitate text similarity. DALL-E: A series of models that can generate original images from natural language. Key concepts: Prompts & Completion The completions endpoint is the core component of the API service. This API provides access to the model's text-in, text-out interface. Users simply need to provide an input prompt containing the English text command, and the model will generate a text completion. Token Azure OpenAI processes text by breaking it down into tokens. Tokens can be words or just chunks of characters. For example, the word “hamburger” gets broken up into the tokens “ham”, “bur” and “ger” The total number of tokens processed in a given request depends on the length of your input, output and request parameters. The quantity of tokens being processed will also affect your response latency and throughput for the models. Resource Azure OpenAI is a new product offering on Azure. You can get started with Azure OpenAI the same way as any other Azure product where you create a resource, or instance of the service, in your Azure Subscription. You can read more about Azure's resource management design. Deployment Once you create an Azure OpenAI Resource, you must deploy a model before you can start making API calls and generating text. This action can be done using the Deployment APIs. These APIs allow you to specify the model you wish to use. In-context learning The models used by Azure OpenAI use natural language instructions and examples provided during the generation call to identify the task being asked and skill required. When you use this approach, the first part of the prompt includes natural language instructions and/or examples of the specific task desired. The model then completes the task by predicting the most probable next piece of text. This technique is known as "in-context" learning. There are three main approaches for in-context learning: Few-shot: In this case, a user includes several examples in the call prompt that demonstrate the expected answer format and content. One-shot: This case is the same as the few-shot approach except only one example is provided. Zero-shot: In this case, no examples are provided to the model and only the task request is provided. Model The service provides users access to several different models. Each model provides a different capability and price point. GPT-4 models are the latest available models. Due to high demand access to this model series is currently only available by request. The GPT-3 base models are known as Davinci, Curie, Babbage, and Ada in decreasing order of capability and increasing order of speed. The Codex series of models is a descendant of GPT-3 and has been trained on both natural language and code to power natural language to code use cases. Use cases: GPT 3.5 Generating natural language for chatbots and virtual assistants with awareness of the previous history of chat Power chatbots that can handle customer inquiries, provide assistance, and converse but doesn’t have memory of conversations Automatically summarize lengthy texts Assist writers by suggesting synonyms, correcting grammar and spelling errors, and even generating entire sentences or paragraphs Help researchers by quickly processing large amounts of data and generating insights, summaries, and visualizations to aid in analysis Generate good quality code based on natural language Use cases: GPT 4.0 Generating and understanding natural language for customer service interactions, chatbots, and virtual assistants – doesn’t have memory of conversations Generating high-quality code for programming languages based on natural language input. Providing accurate translations between languages Improving text summarization and content generation Provides for multi-modal interaction (text and images) Substantial reduction in Hallucinations Consistency between different runs is high Multi-Modal Transformer Architecture Multi-modal models combine text and other types of input (such as graphics, images etc.) and are more task-specific. One multi-modal model in the collection has not been pre-trained in the same self-supervised manner. These models have performed state-of-the-art tasks, including visual question answering, image captioning, and speech recognition. Pricing Pricing will be based on the pay-as-you-go consumption model with a price per unit for each model, which is similar to other Azure Cognitive Services pricing models. Language models Image models Fine-tuned models Embedding models DALL-E Image Generation Editing an image Creating variations of image Embedding models The embedding is an information dense representation of the semantic meaning of a piece of text. Microsoft currently offers three families of Embeddings models for different functionalities: Similarity embedding: are good at capturing semantic similarity between two or more pieces of text. Text search embedding: help measure whether long documents are relevant to a short query. Code search embedding: are useful for embedding code snippets and embedding natural language search queries.3.3KViews1like0Comments