VSCode
40 TopicsVisual Studio Code AI Toolkit: Run LLMs locally
AI Toolkit is here Getting the LLMs/ SLMs on our local machines. This toolkit lets us easily download the models on our local machine. Evaluation of the model. Whenever we need to evaluate a model to check for the feasibility to any particular application, then this tool lets us do it in a playground environment, which is what we will seeing in this blog. Fine-tuning, this majorly delas with training the model further to do the tasks that we specifically want the model to do. Usually, it does a generic task and has generic data, with fine-tuning we can give it a particular flavor to perform particular task.32KViews5likes3CommentsData Science Day 2024 , 14th March 2024 - Schedule Announcement
We are thrilled to announce Python Data Science Day will be taking place March 14th, 2024 with talks all day; a "PyDay" on Pi Day: 3.14. If you're a Python developer, entrepreneur, data scientist, student, or researcher working on projects from hobbyist and start up to enterprise level, you'll find solutions to modernize your data pipelines and answer complex queries with data.2.1KViews4likes0CommentsPrompt Engineering Simplified: AI Toolkit's Prompt Builder
In the age of generative AI, crafting effective prompts is no longer a nice-to-have, it's a must-have. Understanding how to communicate with these underlying models is the key to unlocking their true potential and getting the results we need. What are Prompts? Every time we want to communicate to the language model, we give set of instructions to these models, we refer to these inputs as Prompts. Prompts play a very crucial role while working with the GenAI models. The quality of a prompt directly impacts the output of GenAI models. Precise and well-crafted prompts are crucial for achieving desired results. What factors crafts an optimal Prompt? Crafting an optimal requires balancing clarity, specificity and context. Besides these, constraints are a critical factor in crafting effective prompts. Specificity Clearly define the expectations. The prompt should leave no room for misinterpretation. Precise language is the key. Avoid vague language. Vague prompts lead to vague or irrelevant responses. e.g., “Tell me about history” ➔ “Explain the economic causes of the French Revolution”. Clarity Use simple, unambiguous language. Avoid jargon unless your audience expects it, recommended to use action verbs like "write," "summarize," "explain," "translate". Context Provide background for e.g., “As a beginner in coding, how do I write a Python loop?”. Give the LLM enough context to understand the situation. Include relevant details, keywords, and background information Conciseness Trim unnecessary words (e.g., “Describe photosynthesis” vs. “Can you tell me about how plants use sunlight?”). Ensure the prompt remains relevant to the desired output Tone & Audience Alignment Match the tone to the goal (formal, casual, instructive). Example: For kids, “Explain how rainbows form in simple terms.” Explicit Instructions Directly state what is needed e.g., “Compare X and Y”, “List pros and cons,” “Write a poem about…”. Guiding Constraints Limit scope to avoid overly broad answers e.g., “Focus on environmental impacts, not economic ones”. Constraints reduce ambiguity, focus responses, and improve relevance. Few example constraints, Format: “Summarize in 3 bullet points.” Length: “Explain in 2 sentences.” Scope: “Focus on environmental impacts, not economic ones.” Style/Tone: “Write a casual email,” or “Use non-technical terms.” Technical limits: “Keep code examples under 50 lines.” Few advanced Considerations for AI/LLM Prompts Examples or Demonstrations Include examples to set expectations for e.g., “Write a limerick like this: There once was a cat from Peru…”. Step-by-Step Guidance Break complex tasks into steps for e.g., “First analyze the Python code, then suggest solutions”. Role Assignment Assign roles to guide the AI for e.g., “Act as a historian explaining World War 2”. Avoid Bias Neutral phrasing ensures fair responses for e.g., “Discuss pros and cons of renewable energy” vs. “Why is solar energy bad?” the former is a well-formed prompt. Prompt engineering is an iterative process. Experiment with different phrasings and structures to see what works best. Analyze the LLM's responses and refine prompts accordingly. Make adjustments to improve the accuracy and relevance of the output. Prompt Builder: From the above section, we know that crafting effective prompts is essential for robust AI engagement. Prompt Builder tool on AI Toolkit helps in this enhancement by streamlining the whole process of crafting prompts. Prompt builder helps the users by helping in the following areas, o Prompt Creation, Modification, and Evaluation: Customize prompts through an accessible and straightforward interface. o AI-Assisted Prompt Generation: Articulate the project concept using everyday language, and the AI-powered feature will produce prompts for your exploration. o Organized Output Capability: Craft the prompts to yield outputs in a consistent, standardized and predictable manner. o Automated Code Generation for Prompt Usage: Following model and prompt experimentation, transition to coding immediately by accessing automatically generated, executable Python code. This tool has three sections on the UI. Prompt configuration Response History Prompt Configuration Section: In the Prompt configuration section, there are 4 major sub sections, Model System Prompt User Prompt Add Prompt Model: The Model section is the first subsection of the Prompt Configuration. Here, we select the model to use. The AI Toolkit offers a wide range of models, including remote models served from GitHub and those from providers such as OpenAI, Google, Anthropic, and Nvidia. For this tutorial we will be using OpenAI GPT-4o mini via GitHub System Prompt: In System prompt section, we provide instructions with relevant context to guide the system response. We can think of a system prompt as the "role" we give an AI before we ask it anything, like telling an actor what character to play. Generate Prompt: Upon choosing cloud-based / GitHub / Remote models, a new tool called as “Generate Prompt” is enabled, this is an AI Powered tool especially useful for crafting AI Powered well defined prompts which can be used in the “System Prompt” Section. Upon clicking on the “Generate Prompt” we can see a small window that pops up and asks for the input prompt. This can generate a prompt template by sharing basic details about the task. In this tutorial, let’s ask the LLM to generate prompt about “Professor in university teaching math”. Once the message is updated click on “Generate” button, and in a few seconds, we will have a well-structured prompt in the “System Prompt” section. The prompt that we generated is as follows Provide a detailed syllabus for a university-level mathematics course, including course objectives, weekly topics, assessment methods, and required materials. The syllabus should cover all essential components such as the course title, description, prerequisites, learning outcomes, weekly schedules, and any relevant policies regarding attendance, grading, and participation. # Steps 1. **Course Title and Description**: Clearly state the title of the course and provide a brief description of what the course will cover. 2. **Prerequisites**: List any required courses or knowledge necessary for students to enroll. 3. **Learning Outcomes**: Define what students are expected to learn by the end of the course. 4. **Weekly Schedule**: Outline topics for each week, along with any associated readings or assignments. 5. **Assessment Methods**: Describe how students will be evaluated (e.g., exams, quizzes, projects). 6. **Required Materials**: Include information on textbooks and other resources needed for the course. 7. **Course Policies**: State attendance, grading, and participation rules. # Output Format The output should be formatted as a structured syllabus, presented in clear sections with headings for each part. The document should be detailed yet concise, ideally around 3-5 pages in length. # Examples **Example 1** **Input:** Create a syllabus for a Calculus I course. **Output:** - **Course Title**: Calculus I - **Description**: An introduction to limits, derivatives, and integrals. - **Prerequisites**: Pre-Calculus or equivalent. - **Learning Outcomes**: Students will be able to calculate limits, differentiate basic functions, and understand the Fundamental Theorem of Calculus. - **Weekly Schedule**: - Week 1: Introduction to Limits - Week 2: Continuity - Week 3: Derivatives - ... - **Assessment Methods**: Midterm exam (30%), Final exam (40%), Weekly quizzes (20%), Participation (10%). - **Required Materials**: "Calculus: Early Transcendentals" by James Stewart. - **Course Policies**: Attendance required, late assignments will incur a penalty. **Example 2** **Input:** Design a syllabus for a Linear Algebra course. **Output:** - **Course Title**: Linear Algebra - **Description**: Study vector spaces, matrices, and linear transformations. - **Prerequisites**: None. - **Learning Outcomes**: Mastery of matrix operations and ability to solve systems of linear equations. - **Weekly Schedule**: - Week 1: Introduction to Vector Spaces - Week 2: Matrix Operations - Week 3: Determinants - ... - **Assessment Methods**: Two midterms (50%), Homework assignments (30%), Attendance (20%). - **Required Materials**: "Linear Algebra Done Right" by Sheldon Axler. - **Course Policies**: Participation in class discussions is mandatory. # Notes Ensure that the syllabus is comprehensive and tailored to the specific course topic. Consider including any unique teaching methods or technologies that will be employed during the course. User Prompt: User prompt is the specific question, instruction, or request that a person provides to the AI to elicit a response. It's the direct input from the user that initiates the AI's processing and generation of text. In AI Toolkit for a few models that support the multimodal feature, we can also upload images in this section. For this tutorial let’s input “Explain to me the Fourier equation in simple terms” Add Prompt: If any additional prompt needs to be added, we can configure more User or assistant prompt. So, in a conversation, we have: User Prompt: What the human says. Assistant Prompt: What the AI says. The major configuration part is now completed through this window, its now time to test the responses based on the LLM’s knowledge, in this case how well does GPT 4o mini behave in the role as university-level mathematics professor. In order to test it, we navigate to the next window, the Response section. Response Section: The Response section is where we finally get to see the responses. This section has the “Run” and “View Code” buttons. We can also choose the type of response we need. It can be a simple text or json schema. Upon choosing Json Schema, user will be prompted to “Prepare Schema”. Users can define their own schema or select from example. There are a few examples for the user to choose from. For this tutorial we will be using the simple text format. As we have our setup ready, we can directly click on the “Run” button, In a few seconds we have our well formatted and accurate answer on the screen, AI Toolkit‘s markdown capability can neatly format all the mathematical signs and equations. We can also add this to the “Assistant Prompt” by using the button provided. It provides better example for the LLM in the code later. The result from the LLM now seems very satisfactory with our well-crafted prompt. We can now proceed with the Code generation feature of the Prompt Builder tool of AI Toolkit. Upon clicking the “View Code” button, user is prompted to choose the SDK of their choice. This SDK lets us communicate with the API from the code. For this tutorial, we will use Azure AI Inference SDK. For more details on this SDK refer here. The code requires azure-ai-inference. Install the library by pip install azure-ai-inference """Run this model in Python > pip install azure-ai-inference """ import os from azure.ai.inference import ChatCompletionsClient from azure.ai.inference.models import AssistantMessage, SystemMessage, UserMessage from azure.ai.inference.models import ImageContentItem, ImageUrl, TextContentItem from azure.core.credentials import AzureKeyCredential # To authenticate with the model you will need to generate a personal access token (PAT) in your GitHub settings. # Create your PAT token by following instructions here: https://docs.github.com/en/authentication/keeping-your-account-and-data-secure/managing-your-personal-access-tokens client = ChatCompletionsClient( endpoint = "https://models.inference.ai.azure.com", credential = AzureKeyCredential(os.environ["GITHUB_TOKEN"]), api_version = "2024-08-01-preview", ) response = client.complete( messages = [ SystemMessage(content = "Provide a detailed syllabus for a university-level mathematics course, including course objectives, weekly topics, assessment methods, and required materials.\n \nThe syllabus should cover all essential components such as the course title, description, prerequisites, learning outcomes, weekly schedules, and any relevant policies regarding attendance, grading, and participation.\n \n# Steps\n \n1. **Course Title and Description**: Clearly state the title of the course and provide a brief description of what the course will cover.\n2. **Prerequisites**: List any required courses or knowledge necessary for students to enroll.\n3. **Learning Outcomes**: Define what students are expected to learn by the end of the course.\n4. **Weekly Schedule**: Outline topics for each week, along with any associated readings or assignments.\n5. **Assessment Methods**: Describe how students will be evaluated (e.g., exams, quizzes, projects).\n6. **Required Materials**: Include information on textbooks and other resources needed for the course.\n7. **Course Policies**: State attendance, grading, and participation rules.\n \n# Output Format\n \nThe output should be formatted as a structured syllabus, presented in clear sections with headings for each part. The document should be detailed yet concise, ideally around 3-5 pages in length.\n \n# Examples\n \n**Example 1** \n**Input:** \nCreate a syllabus for a Calculus I course. \n**Output:** \n- **Course Title**: Calculus I \n- **Description**: An introduction to limits, derivatives, and integrals. \n- **Prerequisites**: Pre-Calculus or equivalent. \n- **Learning Outcomes**: Students will be able to calculate limits, differentiate basic functions, and understand the Fundamental Theorem of Calculus. \n- **Weekly Schedule**: \n - Week 1: Introduction to Limits \n - Week 2: Continuity \n - Week 3: Derivatives \n - ... \n- **Assessment Methods**: Midterm exam (30%), Final exam (40%), Weekly quizzes (20%), Participation (10%). \n- **Required Materials**: \"Calculus: Early Transcendentals\" by James Stewart. \n- **Course Policies**: Attendance required, late assignments will incur a penalty.\n \n**Example 2** \n**Input:** \nDesign a syllabus for a Linear Algebra course. \n**Output:** \n- **Course Title**: Linear Algebra \n- **Description**: Study vector spaces, matrices, and linear transformations. \n- **Prerequisites**: None. \n- **Learning Outcomes**: Mastery of matrix operations and ability to solve systems of linear equations. \n- **Weekly Schedule**: \n - Week 1: Introduction to Vector Spaces \n - Week 2: Matrix Operations \n - Week 3: Determinants \n - ... \n- **Assessment Methods**: Two midterms (50%), Homework assignments (30%), Attendance (20%). \n- **Required Materials**: \"Linear Algebra Done Right\" by Sheldon Axler. \n- **Course Policies**: Participation in class discussions is mandatory. \n \n# Notes\n \nEnsure that the syllabus is comprehensive and tailored to the specific course topic. Consider including any unique teaching methods or technologies that will be employed during the course."), UserMessage(content = [ TextContentItem(text = "Explain to me the Fourier equation in simple terms"), ]), ], model = "gpt-4o-mini", response_format = "text", max_tokens = 4096, temperature = 1, top_p = 1, ) print(response.choices[0].message.content) This Python code is ready to be modified and used in any Generative AI application. It can be modified with any Orchestration framework like Semantic Kernel to add more features or even make an agentic application. History Section: We also have the “History” and “New Prompt”. History shows all the previous sessions; we can revisit and resume working or perhaps check the output or regenerate the code. History” and “New Prompt” In essence, the Prompt Builder tool significantly streamlines the process of crafting effective prompts, saving developers valuable time. Beyond prompt creation, it also facilitates output evaluation, model behavior analysis, and generates quality code to accelerate application development. Stay tuned for upcoming blog posts, where we'll delve into even more advanced techniques for building powerful generative AI applications. You can also join our AI Sparks series to learn more about the capabilities of the AI Toolkit for Visual Studio Code.2.9KViews3likes0CommentsDeploy Your First App Using GitHub Copilot for Azure: A Beginner’s Guide
Deploying an app for the first time can feel overwhelming. You may find yourself switching between tutorials, scanning documentation, and wondering if you missed a step. But what if you could do it all in one place? Now you can! With GitHub Copilot for Azure, you can receive real time deployment guidance without leaving the Visual Studio Code. While it won’t fully automate deployments, it serves as a step-by-step AI powered assistant, helping you navigate the process with clear, actionable instructions. No more endless tab switching or searching for the right tutorial—simply type, deploy, and learn, all within your IDE i.e. Visual Studio Code. If you are a student, you have access to exclusive opportunities! Whether you are exploring new technologies or experimenting with them, platforms like GitHub Education and the Microsoft Learn Student Hub provide free Azure credits, structured learning paths, and certification opportunities. These resources can help you gain hands-on experience with GitHub Copilot for Azure and streamline your journey toward deploying applications efficiently. Prerequisites: Before we begin, ensure you have the following: Account in GitHub. Sign up with GitHub Copilot. Account in Azure (Claim free credits using Azure for Students) Visual Studio Code installed. Step 1: Installation How to install GitHub Copilot for Azure? Open VS Code, in the leftmost panel, click on Extensions, type – ‘GitHub Copilot for Azure’, and install the first result which is by Microsoft. After this installation, you will be prompted to install – GitHub Copilot, Azure Tools, and other required installations. Click on allow and install all required extensions from the same method, as used above. Step 2: Enable How to enable GitHub Copilot in GitHub? Open GitHub click on top rightmost Profile pic, a left panel will open. Click on Your Copilot. Upon opening, enable it for IDE, as shown in the below Figure. Step 3: Walkthrough Open VSCode, and click on the GitHub Copilot icon from topmost right side. This will open the GitHub Copilot Chat. From here, you can customize the model type and Send commands. Type azure to work with Azure related tasks. Below figure will help to locate the things smoothly: Step 4: Generate Boilerplate Code with GitHub Copilot Let’s start by creating a simple HTML website that we will deploy to Azure Static Web Apps Service. Prompt for GitHub Copilot: Create a simple "Hello, World!" code with HTML. Copilot will generate a basic structure like this: Then, click on "Edit with Copilot." It will create an index.html file and add the code to it. Then, click on "Accept" and modify the content and style if needed before moving forward. Step 5: Deploy Your App Using Copilot Prompts Instead of searching for documentation, let’s use Copilot to generate deployment instructions directly within Visual Studio Code. Trigger Deployment Prompts Using azure To get deployment related suggestions, use azure in GitHub Copilot’s chat. In the chat text box at the bottom of the pane, type the following prompt after azure, then select Send (paper airplane icon) or press Enter on your keyboard: Prompt: azure How do I deploy a static website? Copilot will provide two options: deploying via Azure Blob Storage or Azure Static Web App Service. We will proceed with Azure Static Web Apps, so we will ask Copilot to guide us through deploying our app using this service. We will use the following prompt: azure I would like to deploy a site using Azure Static Web Apps. Please provide a step-by-step guide. Copilot will then return steps like: You will receive a set of instructions to deploy your website. To make it simpler, you can ask Copilot for a more detailed guide. To get a detailed guide, we will use the following prompt: azure Can you provide a more detailed guide and elaborate on GitHub Actions, including the steps to take for GitHub Actions? Copilot will then return steps like: See? That’s how you can experiment, ask questions, and get step-by-step guidance. Remember, the better the prompt, the better the results will be. Step 6: Learn as You Deploy One of the best features of Copilot is that you can ask follow-up questions if anything is unclear—all within Visual Studio Code, without switching tabs. Examples of Useful Prompts: What Azure services should I use with my app? What is GitHub Actions, and how does it work? What are common issues when deploying to Azure, and how can I fix them? Copilot provides contextual responses, guiding you through troubleshooting and best practices. You can learn more about this here. Conclusion: With GitHub Copilot for Azure, deploying applications is now more intuitive than ever. Instead of memorizing complex commands, you can use AI powered prompts to generate deployment steps in real time and even debug the errors within Visual Studio Code. 🚀 Next Steps: Experience with different prompts and explore how Copilot assists you. Try deploying more advanced applications, like Node.js or Python apps. GitHub Copilot isn’t just an AI assistant, it’s a learning tool. The more you engage with it, the more confident you’ll become in deploying and managing applications on Azure! Learn more about GitHub Copilot for Azure: Understand what GitHub Copilot for Azure Preview is and how it works. See example prompts for learning more about Azure and understanding your Azure account, subscription, and resources. See example prompts for designing and developing applications for Azure. See example prompts for deploying your application to Azure. See example prompts for optimizing your applications in Azure. See example prompts for troubleshooting your Azure resources. That's it, folks! But the best part? You can become part of a thriving community of learners and builders by joining the Microsoft Learn Student Ambassadors Community. Connect with like-minded individuals, explore hands-on projects, and stay updated with the latest in cloud and AI. 💬 Join the community on Discord here.963Views2likes1CommentVisual Studio AI Toolkit : Building Phi-3 GenAI Applications
Port Forwarding, a valuable feature within the AI Toolkit, serves as a crucial gateway for seamless communication with the GenAI model. Whether it's through a straightforward API call or leveraging the SDKs, this functionality greatly enhances our ability to harness the power of the LLM/SLM. By enabling Port Forwarding, a plethora of new scenarios unfold, unlocking the full potential of our interactions with the model.9.6KViews2likes0CommentsMicrosoft Student Summit March 2023 - Start and Accelerate Your Career in Tech
Student Summit - Start and Accelerate Your Career in Tech In partnership with Microsoft Reactor this exciting 90-minute event will help build your confidence and motivation to skill on the Microsoft Cloud, and coach you on the next steps to continue your learning on topics including - Application Development and Developer Tools, Low Code/ No-Code / Fusion Development, and AI, Data and Machine Learning.10KViews2likes1CommentHow to Customize Visual Studio Code Chat with GitHub Copilot and Semantic Kernel
Discover how to customize Visual Studio Code Chat to revolutionize your development workflow with AI. By leveraging GitHub Copilot, Semantic Kernel, and Azure AI Agent Service, you can create chat participants tailored to tasks like project creation, requirement analysis, and code orchestration. Learn to integrate models like o1-mini for reasoning and .NET Aspire for distributed application management. Empower your IDE with AI to streamline complex workflows and boost efficiency.1.6KViews1like0CommentsRecipe Generator Application with Phi-3 Vision on AI Toolkit Locally
In today's data-driven world, images have become a ubiquitous source of information. From social media feeds to medical imaging, we encounter and generate images constantly. Extracting meaningful insights from these visual data requires sophisticated analysis techniques. In this blog post let’s build an Image Analysis Application using the cutting-edge Phi-3 Vision model completely free of cost and on-premise environment using the VS Code AI Toolkit. We'll explore the exciting possibilities that this powerful combination offers. The AI Toolkit for Visual Studio Code (VS Code) is a VS Code extension that simplifies generative AI app development by bringing together cutting-edge AI development tools and models. I would recommend going through the following blogs for getting started with VS Code AI Toolkit. 1. Visual Studio Code AI Toolkit: How to Run LLMs locally 2. Visual Studio AI Toolkit : Building Phi-3 GenAI Applications 3. Building Retrieval Augmented Generation on VSCode & AI Toolkit 4. Bring your own models on AI Toolkit - using Ollama and API keys Setup VS Code AI Toolkit: Launch the VS Code application and Click on the VS Code AI Toolkit extension. Login to the GitHub account if not already done. Once ready, click on model catalog. In the model catalog there are a lot of models, broadly classified into two categories, Local Run (with CPU and with GPU) Remote Access (Hosted by GitHub and other providers) For this blog, we will be using a Local Run model. This will utilize the local machine’s hardware to run the Language model. Since it involves analyzing images, we will be using the language model which supports vision operations and hence Phi-3-Vision will be a good fit as its light and supports local run. Download the model and then further it will be loaded it in the playground to test. Once downloaded, Launch the “Playground” tab and load the Phi-3 Vision model from the dropdown. The Playground also shows that Phi-3 vision allows image attachments. We can try it out before we start developing the application. Let’s upload the image using the “Paperclip icon” on the UI. I have uploaded image of Microsoft logo and prompted the language model to Analyze and explain the image. Phi-3 vision running on local premise boasts an uncanny ability to not just detect but unerringly pinpoint the exact Company logo and decipher the name with astonishing precision. This is a simple use case, but it can be built upon with various applications to unlock a world of new possibilities. Port Forwarding: Port Forwarding, a valuable feature within the AI Toolkit, serves as a crucial gateway for seamless communication with the GenAI model. To do this, launch the terminal and navigate to the “Ports” section. There will be button “Forward a Port”, click on that and select any desired port, in this blog we will use 5272 as the port. The Model-as-a-server is now ready, where the model will be available on the port 5272 to respond to the API calls. It can be tested with any API testing application. To know more click here. Creating Application with Python using OpenAI SDK: To follow this section, Python must be installed on the local machine. Launch the new VS Code window and set the working directory. Create a new Python Virtual environment. Once the setup is ready, open the terminal on VS Code, and install the libraries using “pip”. pip install openai pip install streamlit Before we build the streamlit application, lets develop the basic program and check the responses in the VSCode terminal and then further develop a basic webapp using the streamlit framework. Basic Program Import libraries: import base64 from openai import OpenAI base64: The base64 module provides functions for encoding binary data to base64-encoded strings and decoding base64-encoded strings back to binary data. Base64 encoding is commonly used for encoding binary data in text-based formats such as JSON or XML. OpenAI: The OpenAI package is a Python client library for interacting with OpenAI's API. The OpenAI class provides methods for accessing various OpenAI services, such as generating text, performing natural language processing tasks, and more. Initialize Client: Initialize an instance of the OpenAI class from the openai package, client = OpenAI( base_url="http://127.0.0.1:5272/v1/", api_key="xyz" # required by API but not used ) OpenAI (): Initializes a OpenAI model with specific parameters, including a base URL for the API, an API key, a custom model name, and a temperature setting. This model is used to generate responses based on user queries. This instance will be used to interact with the OpenAI API. base_url = "http://127.0.0.1:5272/v1/": Specifies the base URL for the OpenAI API. In this case, it points to a local server running on 127.0.0.1 (localhost) at port 5272. api_key = "ai-toolkit": The API key used to authenticate requests to the OpenAI API. In case of AI Toolkit usage, we don’t have to specify any API key. The image analysis application will frequently deal with images uploaded by users. But to send these images to GenAI model, we need them in a format it understands. This is where the encode_image function comes in. # Function to encode the image def encode_image(image_path): with open(image_path, "rb") as image_file: return base64.b64encode(image_file.read()).decode("utf-8") Function Definition: def encode_image(image_path): defines a function named encode_image that takes a single argument, image_path. This argument represents the file path of the image we want to encode. Opening the Image: with open(image_path, "rb") as image_file: opens the image file specified by image_path in binary reading mode ("rb"). This is crucial because we're dealing with raw image data, not text. Reading Image Content: image_file.read() reads the entire content of the image file into a byte stream. Remember, images are stored as collections of bytes representing color values for each pixel. Base64 Encoding: base64.b64encode(image_file.read()) encodes the byte stream containing the image data into base64 format. Base64 encoding is a way to represent binary data using a combination of printable characters, which makes it easier to transmit or store the data. Decoding to UTF-8: .decode("utf-8") decodes the base64-encoded data into a UTF-8 string. This step is necessary because the OpenAI API typically expects text input, and the base64-encoded string can be treated as text containing special characters. Returning the Encoded Image: return returns the base64-encoded string representation of the image. This encoded string is what we'll send to the AI model for analysis. In essence, the encode_image function acts as a bridge, transforming an image file on your computer into a format that the AI model can understand and process. Path for the Image: We will use an image stored on our local machine for this section, while we develop the webapp, we will change this to accept it to what the user uploads. image_path = "C:/img.jpg" #path of the image here This line of code is crucial for any program that needs to interact with an image file. It provides the necessary information for the program to locate and access the image data. Base64 String: # Getting the base64 string base64_image = encode_image(image_path) This line of code is responsible for obtaining the base64-encoded representation of the image specified by the image_path. Let's break it down: encode_image(image_path): This part calls the encode_image function, which we've discussed earlier. This function takes the image_path as input and performs the following: Reads the image file from the specified path. Converts the image data into a base64-encoded string. Returns the resulting base64-encoded string. base64_image = ...: This part assigns the return value of the encode_image function to the variable base64_image. This section effectively fetches the image from the given location and transforms it into a special format (base64) that can be easily handled and transmitted by the computer system. This base64-encoded string will be used subsequently to send the image data to the AI model for analysis. Invoking the Language Model: This code tells the AI model what to do with the image. response = client.chat.completions.create( model="Phi-3-vision-128k-cpu-int4-rtn-block-32-acc-level-4-onnx", messages=[ { "role": "user", "content": [ { "type": "text", "text": "What's in the Image?", }, { "type": "image_url", "image_url": {"url": f"data:image/jpeg;base64,{base64_image}"}, }, ], } ], ) response = client.chat.completions.create(...): This line sends instructions to the AI model we're using (represented by client). Here's a breakdown of what it's telling the model: chat.completions.create: We're using a specific part of the OpenAI API designed for having a conversation-like interaction with the model. The ... part: This represents additional details that define what we want the model to do, which we'll explore next. Let's break down the details (...) sent to the model: 1) model="Phi-3-vision-128k-cpu-int4-rtn-block-32-acc-level-4-onnx": This tells the model exactly which AI model to use for analysis. In our case, it's the "Phi-3-vision" model. 2) messages: This defines what information we're providing to the model. Here, we're sending two pieces of information: role": "user": This specifies that the first message comes from a user (us). The content: This includes two parts: "What's in the Image?": This is the prompt we're sending to the model about the image. "image_url": {"url": f"data:image/jpeg;base64,{base64_image}"}: This sends the actual image data encoded in base64 format (stored in base64_image). In a nutshell, this code snippet acts like giving instructions to the AI model. We specify the model to use, tell it we have a question about an image, and then provide the image data itself. Printing the response on the console: print(response.choices[0].message.content) We asked the AI model "What's in this image?" This line of code would then display the AI's answer. Console response: Finally, we can see the response on the terminal. Now to make things more interesting, let’s convert this into a webapp using the streamlit framework. Recipe Generator Application with Streamlit: Now we know how to interact with the Vision model offline using a basic console. Let’s make things even more exciting by applying all this to a use-case which probably will be most loved by all those who are cooking enthusiasts!! Yes, let’s create an application which will assist in cooking by looking what’s in the image of ingredients! Create a new file and name is as “app.py” select the same. venv that was used earlier. Make sure the Visual studio toolkit is running and serving the Phi-3 Vision model through the port 5272. First step is importing the libraries, import streamlit as st import base64 from openai import OpenAI base64 and OpenAI is the same as we had used in the earlier section. Streamlit: This part imports the entire Streamlit library, which provides a powerful set of tools for creating user interfaces (UIs) with Python. Streamlit simplifies the process of building web apps by allowing you to write Python scripts that directly translate into interactive web pages. client = OpenAI( base_url="http://127.0.0.1:5272/v1/", api_key="xyz" # required by API but not used ) As discussed in the earlier section, initializing the client and configuring the base_url and api_key. st.title('Recipe Generator 🍔') st.write('This is a simple recipe generator application.Upload images of the Ingridients and get the recipe by Chef GenAI! 🧑🍳') uploaded_file = st.file_uploader("Choose a file") if uploaded_file is not None: st.image(uploaded_file, width=300) st.title('Recipe Generator 🍔'): This line sets the title of the Streamlit application as "Recipe Generator" with a visually appealing burger emoji. st.write(...): This line displays a brief description of the application's functionality to the user. uploaded_file = st.file_uploader("Choose a file"): This creates a file uploader component within the Streamlit app. Users can select and upload an image file (likely an image of ingredients). if uploaded_file is not None: : This conditional block executes only when the user has actually selected and uploaded a file. st.image(uploaded_file, width=300): If an image is uploaded, this line displays the uploaded image within the Streamlit app with a width of 300 pixels. In essence, this code establishes the basic user interface for the Recipe Generator app. It allows users to upload an image, and if an image is uploaded, it displays the image within the app. preference = st.sidebar.selectbox( "Choose your preference", ("Vegetarian", "Non-Vegetarian") ) cuisine = st.sidebar.selectbox( "Select for Cuisine", ("Indian","Chinese","French","Thai","Italian","Mexican","Japanese","American","Greek","Spanish") ) We use Streamlit's sidebar and selectbox features to create interactive user input options within a web application: st.sidebar.selectbox(...): This line creates a dropdown menu (selectbox) within the sidebar of the Streamlit application.The first argument, "Choose your preference", sets the label or title for the dropdown.The second argument, ("Vegetarian", "Non-Vegetarian"), defines the list of options available for the user to select (in this case, dietary preferences). cuisine = st.sidebar.selectbox(...): This line creates another dropdown menu in the sidebar, this time for selecting the desired cuisine.The label is "Select for Cuisine".The options provided include "Indian", "Chinese", "French", and several other popular cuisines. In essence, this code allows users to interact with the application by selecting their preferred dietary restrictions (Vegetarian or Non-Vegetarian) and desired cuisine from the dropdown menus in the sidebar. def encode_image(uploaded_file): """Encodes a Streamlit uploaded file into base64 format""" if uploaded_file is not None: content = uploaded_file.read() return base64.b64encode(content).decode("utf-8") else: return None base64_image = encode_image(uploaded_file) The same function of encode_image as discussed in the earlier section is being used here. if st.button("Ask Chef GenAI!"): if base64_image: response = client.chat.completions.create( model="Phi-3-vision-128k-cpu-int4-rtn-block-32-acc-level-4-onnx", messages=[ { "role": "user", "content": [ { "type": "text", "text": f"STRICTLY use the ingredients in the image to generate a {preference} recipe and {cuisine} cuisine.", }, { "type": "image_url", "image_url": {"url": f"data:image/jpeg;base64,{base64_image}"}, }, ], } ], ) print(response.choices[0].message.content) st.write(response.choices[0].message.content) else: st.write("Please upload an image with any number of ingridients and instantly get a recipe.") Above code block implements the core functionality of the Recipe Generator app, triggered when the user clicks a button labeled "Ask Chef GenAI!": if st.button("Ask Chef GenAI!"): This line checks if the user has clicked the button. If they have, the code within the if block executes. if base64_image: This inner if condition checks if a variable named base64_image has a value. This variable likely stores the base64 encoded representation of the uploaded image (containing ingredients). If base64_image has a value (meaning an image is uploaded), the code proceeds. client.chat.completions.create(...): Client that had been defined earlier interacts with the API . Here, it calls a to generate text completions, thereby invoking a small language model. The arguments provided specify the model to be used ("Phi-3-vision-128k-cpu-int4-rtn-block-32-acc-level-4-onnx") and the message to be completed. The message consists of two parts within a list: User Input: The first part defines the user's role ("user") and the content they provide. This content is an instruction with two key points: Dietary Preference: It specifies to "STRICTLY use the ingredients in the image" to generate a recipe that adheres to the user's preference (vegetarian or non-vegetarian, set using the preference dropdown). Cuisine Preference: It mentions the desired cuisine type (Indian, Chinese, etc., selected using the cuisine dropdown). Image Data: The second part provides the image data itself. It includes the type ("image_url") and the URL, which is constructed using the base64_image variable containing the base64 encoded image data. print(response.choices[0].message.content) & st.write(...): The response will contain a list of possible completions. Here, the code retrieves the first completion (response.choices[0]) and extracts its message content. This content is then printed to the console like before and displayed on the Streamlit app using st.write. else block: If no image is uploaded (i.e., base64_image is empty), the else block executes. It displays a message reminding the user to upload an image to get recipe recommendations. The above code block is the same as before except the we have now modified it to accept few inputs and also have made it compatible with streamlit. The coding is now completed for our streamlit application! It's time to test the application. Navigate to the terminal on Visual Studio Code and enter the following command, (if the file is named as app.py) streamlit run app.py Upon successful run, it will redirect to default browser and a screen with the Recipe generator will be launched, Upload an image with ingredients, select the recipe, cuisine and click on “Ask Chef GenAI”. It will take a few moments for delightful recipe generation. While generating we can see the logs on the terminal and finally the recipe will be shown on the screen! Enjoy your first recipe curated by Chef GenAI powered by Phi-3 vision model on local prem using Visual Studio AI Toolkit! The code is available on the following GitHub Repository. In the upcoming series we will explore more types of Gen AI implementations with AI toolkit. Resources: 1. Visual Studio Code AI Toolkit: Run LLMs locally 2. Visual Studio AI Toolkit : Building Phi-3 GenAI Applications 3. Building Retrieval Augmented Generation on VSCode & AI Toolkit 4. Bring your own models on AI Toolkit - using Ollama and API keys 5. Expanded model catalog for AI Toolkit 6. Azure Toolkit Samples GitHub Repository488Views1like1CommentFestival Web
Microsoft anuncia una nueva iniciativa para ayudarte a impulsar tu carrera en desarrollo web llamada el Festival Web, es una serie de charlas en vivo que comienza desde el 31 de octubre y finaliza el 14 de noviembre. En estas charlas podrás aprender de la mano de expertos y conocer diferentes herramientas como VSCode y GitHub. ¡Regístrate en las charlas en vivo del Festival Web y comienza tu viaje en el mundo de la programación web con expertos de la industria! Además, hay una oportunidad gratuita para quienes desean iniciarse en este campo. ¡Descubre más información sobre esta iniciativa de Microsoft en este blog!4KViews1like0Comments