vscode

41 Topics

Build an AI-Powered Space Invaders Game
Build an AI-Powered Space Invaders Game: Integrating LLMs into HTML5 Games with Microsoft Foundry Local Introduction What if your game could talk back to you? Imagine playing Space Invaders while an AI commander taunts you during battle, delivers personalized mission briefings, and provides real-time feedback based on your performance. This isn't science fiction it's something you can build today using HTML, JavaScript, and a locally-running AI model. In this tutorial, we'll explore how to create an HTML5 game with integrated Large Language Model (LLM) features using Microsoft Foundry Local. You'll learn how to combine classic game development with modern AI capabilities, all running entirely on your own machine—no cloud services, no API costs, no internet connection required during gameplay. We'll be working with the Space Invaders - AI Commander Edition project, which demonstrates exactly how to architect games that leverage local AI. Whether you're a student learning game development, exploring AI integration patterns, or building your portfolio, this guide provides practical, hands-on experience with technologies that are reshaping how we build interactive applications. What You'll Learn By the end of this tutorial, you'll understand how to combine traditional web development with local AI inference. These skills transfer directly to building chatbots, interactive tutorials, AI-enhanced productivity tools, and any application where you want intelligent, context-aware responses. Set up Microsoft Foundry Local for running AI models on your machine Understand the architecture of games that integrate LLM features Use GitHub Copilot CLI to accelerate your development workflow Implement AI-powered game features like dynamic commentary and adaptive feedback Extend the project with your own creative AI features Why Local AI for Games? Before diving into the code, let's understand why running AI locally matters for game development. Traditional cloud-based AI services have limitations that make them impractical for real-time gaming experiences. Latency is the first challenge. Cloud API calls typically take 500ms to several seconds, an eternity in a game running at 60 frames per second. Local inference can respond in tens of milliseconds, enabling AI responses that feel instantaneous and natural. When an enemy ship appears, your AI commander can taunt you immediately, not three seconds later. Cost is another consideration. Cloud AI services charge per token, which adds up quickly when generating dynamic content during gameplay. Local models have zero per-use cost, once installed, they run entirely on your hardware. This frees you to experiment without worrying about API bills. Privacy and offline capability complete the picture. Local AI keeps all data on your machine, perfect for games that might handle player information. And since nothing requires internet connectivity, your game works anywhere, on planes, in areas with poor connectivity, or simply when you want to play without network access. Understanding Microsoft Foundry Local Microsoft Foundry Local is a runtime that enables you to run small language models (SLMs) directly on your computer. It's designed for developers who want to integrate AI capabilities into applications without requiring cloud infrastructure. Think of it as having a miniature AI assistant living on your laptop. Foundry Local handles the complex work of loading AI models, managing memory, and processing inference requests through a simple API. You send text prompts, and it returns AI-generated responses, all happening locally on your CPU or GPU. The models are optimized to run efficiently on consumer hardware, so you don't need a supercomputer. For our Space Invaders game, Foundry Local powers the "AI Commander" feature. During gameplay, the game sends context about what's happening, your score, accuracy, current level, enemies remaining and receives back contextual commentary, taunts, and encouragement. The result feels like playing alongside an AI companion who actually understands the game. Setting Up Your Development Environment Let's get your machine ready for AI-powered game development. We'll install Foundry Local, clone the project, and verify everything works. The entire setup takes about 10-15 minutes. Step 1: Install Microsoft Foundry Local Foundry Local installation varies by operating system. Open your terminal and run the appropriate command: # Windows (using winget) winget install Microsoft.FoundryLocal # macOS (using Homebrew) brew install microsoft/foundrylocal/foundrylocal These commands download and install the Foundry Local runtime along with a default small language model. The installation includes everything needed to run AI inference locally. Verify the installation by running: foundry --version If you see a version number, Foundry Local is ready. If you encounter errors, ensure you have administrator/sudo privileges and that your package manager is up to date. Step 2: Install Node.js (If Not Already Installed) Our game's AI features require a small Node.js server to communicate between the browser and Foundry Local. Check if Node.js is installed: node --version If you see a version number (v16 or higher recommended), you're set. Otherwise, install Node.js: # Windows winget install OpenJS.NodeJS.LTS # macOS brew install node # Linux sudo apt install nodejs npm Node.js provides the JavaScript runtime that powers our proxy server, bridging browser code with the local AI model. Step 3: Clone the Project Get the Space Invaders project onto your machine: git clone https://github.com/leestott/Spaceinvaders-FoundryLocal.git cd Spaceinvaders-FoundryLocal This downloads all game files, including the HTML interface, game logic, AI integration module, and server code. Step 4: Install Dependencies and Start the Server Install the Node.js packages and launch the AI-enabled server: npm install npm start The first command downloads required packages (primarily for the proxy server). The second starts the server, which listens for AI requests from the game. You should see output indicating the server is running on port 3001. Step 5: Play the Game Open your browser and navigate to: http://localhost:3001 You should see Space Invaders with "AI: ONLINE" displayed in the game HUD, indicating that AI features are active. Use arrow keys or A/D to move, SPACE to fire, and P to pause. The AI Commander will start providing commentary as you play! Understanding the Project Architecture Now that the game is running, let's explore how the different pieces fit together. Understanding this architecture will help you modify the game and apply these patterns to your own projects. The project follows a clean separation of concerns, with each file handling a specific responsibility: Spaceinvaders-FoundryLocal/ ├── index.html # Main game page and UI structure ├── styles.css # Retro arcade visual styling ├── game.js # Core game logic and rendering ├── llm.js # AI integration module ├── sound.js # Web Audio API sound effects ├── server.js # Node.js proxy for Foundry Local └── package.json # Project configuration index.html: Defines the game canvas and UI elements. It's the entry point that loads all other modules. game.js: Contains the game loop, physics, collision detection, scoring, and rendering logic. This is the heart of the game. llm.js: Handles all communication with the AI backend. It formats game state into prompts and processes AI responses. server.js: A lightweight Express server that proxies requests between the browser and Foundry Local. sound.js: Synthesizes retro sound effects using the Web Audio API—no audio files needed! How the AI Integration Works The magic of the AI Commander happens through a simple but powerful pattern. Let's trace the flow from gameplay event to AI response. When something interesting happens in the game, you clear a wave, achieve a combo, or lose a life, the game logic in game.js triggers an AI request. This request includes context about the current game state: your score, accuracy percentage, current level, lives remaining, and what just happened. The llm.js module formats this context into a prompt. For example, when you clear a wave with 85% accuracy, it might construct: You are an AI Commander in a Space Invaders game. The player just cleared wave 3 with 85% accuracy. Score: 12,500. Lives: 3. Provide a brief, enthusiastic comment (1-2 sentences). This prompt travels to server.js , which forwards it to Foundry Local. The AI model processes the prompt and generates a response like: "Impressive accuracy, pilot! Wave 3 didn't stand a chance. Keep that trigger finger sharp!" The response flows back through the server to the browser, where llm.js passes it to the game. The game displays the message in the HUD, creating the illusion of playing alongside an AI companion. This entire round trip typically completes in 50-200 milliseconds, fast enough to feel responsive without interrupting gameplay. Using GitHub Copilot CLI to Explore and Modify the Code GitHub Copilot CLI accelerates your development workflow by letting you ask questions and generate code directly in your terminal. Let's use it to understand and extend the Space Invaders project. Installing Copilot CLI If you haven't installed Copilot CLI yet, here's the quick setup: # Install GitHub CLI winget install GitHub.cli # Windows brew install gh # macOS # Authenticate with GitHub gh auth login # Add Copilot extension gh extension install github/gh-copilot # Verify installation gh copilot --help With Copilot CLI ready, you can interact with AI directly from your terminal while working on the project. Exploring Code with Copilot CLI Use Copilot to understand unfamiliar code. Navigate to the project directory and try: gh copilot explain "How does llm.js communicate with the server?" Copilot analyzes the code and explains the communication pattern, helping you understand the architecture without reading every line manually. You can also ask about specific functions: gh copilot explain "What does the generateEnemyTaunt function do?" This accelerates onboarding to unfamiliar codebases, a valuable skill when working with open source projects or joining teams. Generating New Features Want to add a new AI feature? Ask Copilot to help generate the code: gh copilot suggest "Create a function that asks the AI to generate a mission briefing at the start of each level, including the level number and a random mission objective" Copilot generates starter code that you can customize and integrate. This combination of AI-powered development tools and AI-integrated gameplay demonstrates how LLMs are transforming both how we build games and how games behave. Customizing the AI Commander The default AI Commander provides generic gaming commentary, but you can customize its personality and responses. Open llm.js to find the prompt templates that control AI behavior. Changing the AI's Personality The system prompt defines who the AI "is." Find the base prompt and modify it: // Original const systemPrompt = "You are an AI Commander in a Space Invaders game."; // Customized - Drill Sergeant personality const systemPrompt = `You are Sergeant Blaster, a gruff but encouraging drill sergeant commanding space cadets. Use military terminology, call the player "cadet," and be tough but fair.`; // Customized - Supportive Coach personality const systemPrompt = `You are Coach Nova, a supportive and enthusiastic gaming coach. Use encouraging language, celebrate small victories, and provide gentle guidance when players struggle.`; These personality changes dramatically alter the game's feel without changing any gameplay code. It's a powerful example of how AI can add variety to games with minimal development effort. Adding New Commentary Triggers Currently the AI responds to wave completions and game events. You can add new triggers in game.js : // Add AI commentary when player achieves a kill streak if (killStreak >= 5 && !streakCommentPending) { requestAIComment('killStreak', { count: killStreak }); streakCommentPending = true; } // Add AI reaction when player narrowly avoids death if (nearMissOccurred) { requestAIComment('nearMiss', { livesRemaining: lives }); } Each new trigger point adds another opportunity for the AI to engage with the player, making the experience more dynamic and personalized. Understanding the Game Features Beyond AI integration, the Space Invaders project demonstrates solid game development patterns worth studying. Let's explore the key features. Power-Up System The game includes eight different power-ups, each with unique effects: SPREAD (Orange): Fires three projectiles in a spread pattern LASER (Red): Powerful beam with high damage RAPID (Yellow): Dramatically increased fire rate MISSILE (Purple): Homing projectiles that track enemies SHIELD (Blue): Grants an extra life EXTRA LIFE (Green): Grants two extra lives BOMB (Red): Destroys all enemies on screen BONUS (Gold): Random score bonus between 250-750 points Power-ups demonstrate state management, tracking which power-up is active, applying its effects to player actions, and handling timeouts. Study the power-up code in game.js to understand how temporary state modifications work. Leaderboard System The game persists high scores using the browser's localStorage API: // Saving scores localStorage.setItem('spaceInvadersScores', JSON.stringify(scores)); // Loading scores const savedScores = localStorage.getItem('spaceInvadersScores'); const scores = savedScores ? JSON.parse(savedScores) : []; This pattern works for any data you want to persist between sessions—game progress, user preferences, or accumulated statistics. It's a simple but powerful technique for web games. Sound Synthesis Rather than loading audio files, the game synthesizes retro sound effects using the Web Audio API in sound.js . This approach has several benefits: no external assets to load, smaller project size, and complete control over sound parameters. Examine how oscillators and gain nodes combine to create laser sounds, explosions, and victory fanfares. This knowledge transfers directly to any web project requiring audio feedback. Extending the Project: Ideas for Students Ready to make the project your own? Here are ideas ranging from beginner-friendly to challenging, each teaching valuable skills. Beginner: Customize Visual Theme Modify styles.css to create a new visual theme. Try changing the color scheme from green to blue, or create a "sunset" theme with orange and purple gradients. This builds CSS skills while making the game feel fresh. Intermediate: Add New Enemy Types Create a new enemy class in game.js with different movement patterns. Perhaps enemies that move in sine waves, or boss enemies that take multiple hits. This teaches object-oriented programming and game physics. Intermediate: Expand AI Interactions Add new AI features like: Pre-game mission briefings that set up the story Dynamic difficulty hints when players struggle Post-game performance analysis and improvement suggestions AI-generated names for enemy waves Advanced: Multiplayer Commentary Modify the game for two-player support and have the AI provide play-by-play commentary comparing both players' performance. This combines game networking concepts with advanced AI prompting. Advanced: Voice Integration Use the Web Speech API to speak the AI Commander's responses aloud. This creates a more immersive experience and demonstrates browser speech synthesis capabilities. Troubleshooting Common Issues If something isn't working, here are solutions to common problems. "AI: OFFLINE" Displayed in Game This means the game can't connect to the AI server. Check that: The server is running ( npm start shows no errors) You're accessing the game via http://localhost:3001 , not directly opening the HTML file Foundry Local is installed correctly ( foundry --version works) Server Won't Start If npm start fails: Ensure you ran npm install first Check that port 3001 isn't already in use by another application Verify Node.js is installed ( node --version ) AI Responses Are Slow Local AI performance depends on your hardware. If responses feel sluggish: Close other resource-intensive applications Ensure your laptop is plugged in (battery mode may throttle CPU) Consider that first requests may be slower as the model loads Key Takeaways Local AI enables real-time game features: Microsoft Foundry Local provides fast, free, private AI inference perfect for gaming applications Clean architecture matters: Separating game logic, AI integration, and server code makes projects maintainable and extensible AI personality is prompt-driven: Changing a few lines of prompt text completely transforms how the AI interacts with players Copilot CLI accelerates learning: Use it to explore unfamiliar code and generate new features quickly The patterns transfer everywhere: Skills from this project apply to chatbots, assistants, educational tools, and any AI-integrated application Conclusion and Next Steps You've now seen how to integrate AI capabilities into a browser-based game using Microsoft Foundry Local. The Space Invaders project demonstrates that modern AI features don't require cloud services or complex infrastructure, they can run entirely on your laptop, responding in milliseconds. More importantly, you've learned patterns that extend far beyond gaming. The architecture of sending context to an AI, receiving generated responses, and integrating them into user experiences applies to countless applications: customer support bots, educational tutors, creative writing tools, and accessibility features. Your next step is experimentation. Clone the repository, modify the AI's personality, add new commentary triggers, or build an entirely new game using these patterns. The combination of GitHub Copilot CLI for development assistance and Foundry Local for runtime AI gives you powerful tools to bring intelligent applications to life. Start playing, start coding, and discover what you can create when your games can think. Resources Space Invaders - AI Commander Edition Repository - Full source code and documentation Play Space Invaders Online - Try the basic version without AI features Microsoft Foundry Local Documentation - Official installation and API guide GitHub Copilot CLI Documentation - Installation and usage guide GitHub Education - Free developer tools for students Web Audio API Documentation - Learn about browser sound synthesis Canvas API Documentation - Master HTML5 game rendering
Lee_Stott
Feb 09, 2026 Place Educator Developer Blog
295Views
0likes
1Comment
Model Mondays S2:E4 Understanding AI Developer Experiences with Leo Yao
This week in Model Mondays, we put the spotlight on the AI Toolkit for Visual Studio Code - and explore the tools and workflows that make building generative AI apps and agents easier for developers. Read on for my recap. This post was generated with AI help and human revision & review. To learn more about our motivation and workflows, please refer to this document in our website. About Model Mondays Model Mondays is a weekly series designed to help you grow your Azure AI Foundry Model IQ step by step. Each week includes: 5-Minute Highlights – Quick news and updates about Azure AI models and tools on Monday 15-Minute Spotlight – Deep dive into a key model, protocol, or feature on Monday 30-Minute AMA on Friday – Live Q&A with subject matter experts from the Monday livestream If you're looking to grow your skills with the latest in AI model development, this series is a great place to begin. Useful links: Register for upcoming livestreams Watch past episodes Join the AMA on AI Developer Experiences Visit the Model Mondays forum Spotlight On: AI Developer Experiences 1. What is this topic and why is it important? AI Developer Experiences focus on making the process of building, testing, and deploying AI models as efficient as possible. With the right tools—such as the AI Toolkit and Azure AI Foundry extensions for Visual Studio Code—developers can eliminate unnecessary friction and focus on innovation. This is essential for accelerating the real-world impact of generative AI. 2. What is one key takeaway from the episode? The integration of Azure AI Foundry with Visual Studio Code allows developers to manage models, run experiments, and deploy applications directly from their preferred development environment. This unified workflow enhances productivity and simplifies the AI development lifecycle. 3. How can I get started? Here are a few resources to explore: Install the AI Toolkit for VS Code Explore Azure AI Foundry Documentation Join the Microsoft Tech Community to follow and contribute to discussions 4. What’s New in Azure AI Foundry? Azure AI Foundry continues to evolve to meet developer needs with more power, flexibility, and productivity. Here are some of the latest updates highlighted in this week’s episode: AI Toolkit for Visual Studio Code Now with deeper integration, allowing developers to manage models, run experiments, and deploy applications directly within their editor—streamlining the entire workflow. Prompt Shields Enhanced security capabilities designed to protect generative AI applications from prompt injection and unsafe content, improving reliability in production environments. Model Router A new intelligent routing system that dynamically directs model requests to the most suitable model available—enhancing performance and efficiency at scale. Expanded Model Catalog The catalog now includes more open-source and proprietary models, featuring the latest from Hugging Face, OpenAI, and other leading providers. Improved Documentation and Sample Projects Newly added guides and ready-to-use examples to help developers get started faster, understand workflows, and build confidently. My A-Ha Moment Before watching this episode, setting up an AI development environment always felt like a challenge. There were so many moving parts—configurations, integrations, and dependencies—that it was hard to know where to begin. Seeing the AI Toolkit in action inside Visual Studio Code changed everything for me. It was a realization moment: “That’s it? I can explore models, test prompts, and deploy apps—without ever leaving my editor?” This episode made it clear that building with AI doesn’t have to be complex or intimidating. With the right tools, experimentation becomes faster and far more enjoyable. Now, I’m genuinely excited to build, test, and explore new generative AI solutions because the process finally feels accessible. Coming Up Next Week In the next episode, we’ll be exploring Fine-Tuning and Distillation with Dave Voutila. This session will focus on how to adapt Azure OpenAI models to your unique use cases and apply best practices for efficient knowledge transfer. Register here to reserve your spot and be part of the conversation. Join the Community Building in AI is better when we do it together. That’s why the Azure AI Developer Community exists—to support your journey and provide resources every step of the way. Join the Discord for real-time discussions, events, and peer learning Explore the Forum to catch up on AMAs, ask questions, and connect with other developers About Me I'm Sharda, a Gold Microsoft Learn Student Ambassador passionate about cloud technologies and artificial intelligence. I enjoy learning, building, and helping others grow in tech. Connect with me: LinkedIn GitHub Dev.to Microsoft Tech Community
Sharda_Kaur
Sep 12, 2025 Place Educator Developer Blog
289Views
0likes
0Comments
Part 1 - Develop a VS Code Extension for Your Capstone Project
API Guardian - My Capstone Project As software and APIs evolve, developers encounter significant difficulties in maintaining and updating API endpoints. Breaking changes can lead to system instability, while outdated or unclear documentation makes maintenance less efficient. These challenges are further compounded by the time-consuming nature of updating dependencies and the tendency to prioritize new features over maintenance tasks. The absence of effective tools and processes to tackle these issues reduces overall productivity and developer efficiency. To address this, API Guardian was created as a Visual Studio Code extension that identifies API endpoints in a project and checks their functionality before deployment. This solution was developed to help developers save time spent fixing issues caused by breaking or non-breaking changes and to alleviate the difficulties in performing maintenance due to unclear or outdated documentation. Features and Capabilities This extension has 3 main features: Feature 1. Developers can decide if the extension will scan or skip specified files in the project. Press “Enter” to scan/skip all files. Type the file name (e.g., main.py) and press “Enter” to scan/skip a single file. Type file names with a delimiter (e.g., main.py | pythonFile.py) and press “Enter” to scan/skip multiple files. Feature 2. Custom hover messages when developers mouse over identified APIs This hover message will vary based on the status of the APIs. If the API returns a success status, the hover message will only show the completed API and its status. However, if an error occurs, the hover message will include this additional information: (1) API Name, (2) Official API Link, (3) Error Message, (4) Title of Recommended Fix and (5) Link to the Recommended Fix. Feature 3. Excel Report with Details of Identified APIs After all the identified APIs have been tested, an excel report will exported with the following information to allow developers to easily identify the APIs in the project. What Technology and Products does it involved? Building a Visual Studio Code extension and publishing it to the Visual Studio Marketplace involves a mix of technologies and tools. The project was initiated using the NPM package, generator-code, to set up a JavaScript project for developing the extension. All the extension's logic will be developed and managed within the "extension.js" file generated during the setup process. Once ready for deployment, we will package the extension using "vsce" to generate a ".vsix" file, which will then be used for deployment to the Visual Studio Code Marketplace. The deployment process involves requiring the user to create a publishing account and using tools like vsce to upload and manage the extension's version, updates, and metadata. As part of this process, you would need to create a Personal Access Token (PAT) from Azure DevOps. This token is used to verify your identity and authenticate the publishing tool, allowing you to securely upload your extension to the Visual Studio Marketplace. The PAT provides the necessary permissions for tasks such as version management, publishing new releases, and updating the extension metadata. What did I learn? Throughout this journey, I learned not just about the technical stack but also about the value of detailed project setup and secure publishing processes. While the technical steps can be challenging, they’re incredibly rewarding, and I’m excited to dive deeper into it moving forward. I’m looking forward to exploring how the extension can be further improved and enhanced. If you're interested in learning more about how my API guidance was built, keep an eye out for my next post! API Guardian https://marketplace.visualstudio.com/items?itemName=APIGuardian-vsc.api About the Authors Main Author - Ms Joy Cheng Yee Shing, BSc (Hon) Computing Science Academic Supervisor - Dr Peter Yau, Microsoft MVP
drpeteryau
May 07, 2025 Place Educator Developer Blog
703Views
0likes
0Comments
Deploy Your First App Using GitHub Copilot for Azure: A Beginner’s Guide
Deploying an app for the first time can feel overwhelming. You may find yourself switching between tutorials, scanning documentation, and wondering if you missed a step. But what if you could do it all in one place? Now you can! With GitHub Copilot for Azure, you can receive real time deployment guidance without leaving the Visual Studio Code. While it won’t fully automate deployments, it serves as a step-by-step AI powered assistant, helping you navigate the process with clear, actionable instructions. No more endless tab switching or searching for the right tutorial—simply type, deploy, and learn, all within your IDE i.e. Visual Studio Code. If you are a student, you have access to exclusive opportunities! Whether you are exploring new technologies or experimenting with them, platforms like GitHub Education and the Microsoft Learn Student Hub provide free Azure credits, structured learning paths, and certification opportunities. These resources can help you gain hands-on experience with GitHub Copilot for Azure and streamline your journey toward deploying applications efficiently. Prerequisites: Before we begin, ensure you have the following: Account in GitHub. Sign up with GitHub Copilot. Account in Azure (Claim free credits using Azure for Students) Visual Studio Code installed. Step 1: Installation How to install GitHub Copilot for Azure? Open VS Code, in the leftmost panel, click on Extensions, type – ‘GitHub Copilot for Azure’, and install the first result which is by Microsoft. After this installation, you will be prompted to install – GitHub Copilot, Azure Tools, and other required installations. Click on allow and install all required extensions from the same method, as used above. Step 2: Enable How to enable GitHub Copilot in GitHub? Open GitHub click on top rightmost Profile pic, a left panel will open. Click on Your Copilot. Upon opening, enable it for IDE, as shown in the below Figure. Step 3: Walkthrough Open VSCode, and click on the GitHub Copilot icon from topmost right side. This will open the GitHub Copilot Chat. From here, you can customize the model type and Send commands. Type azure to work with Azure related tasks. Below figure will help to locate the things smoothly: Step 4: Generate Boilerplate Code with GitHub Copilot Let’s start by creating a simple HTML website that we will deploy to Azure Static Web Apps Service. Prompt for GitHub Copilot: Create a simple "Hello, World!" code with HTML. Copilot will generate a basic structure like this: Then, click on "Edit with Copilot." It will create an index.html file and add the code to it. Then, click on "Accept" and modify the content and style if needed before moving forward. Step 5: Deploy Your App Using Copilot Prompts Instead of searching for documentation, let’s use Copilot to generate deployment instructions directly within Visual Studio Code. Trigger Deployment Prompts Using azure To get deployment related suggestions, use azure in GitHub Copilot’s chat. In the chat text box at the bottom of the pane, type the following prompt after azure, then select Send (paper airplane icon) or press Enter on your keyboard: Prompt: azure How do I deploy a static website? Copilot will provide two options: deploying via Azure Blob Storage or Azure Static Web App Service. We will proceed with Azure Static Web Apps, so we will ask Copilot to guide us through deploying our app using this service. We will use the following prompt: azure I would like to deploy a site using Azure Static Web Apps. Please provide a step-by-step guide. Copilot will then return steps like: You will receive a set of instructions to deploy your website. To make it simpler, you can ask Copilot for a more detailed guide. To get a detailed guide, we will use the following prompt: azure Can you provide a more detailed guide and elaborate on GitHub Actions, including the steps to take for GitHub Actions? Copilot will then return steps like: See? That’s how you can experiment, ask questions, and get step-by-step guidance. Remember, the better the prompt, the better the results will be. Step 6: Learn as You Deploy One of the best features of Copilot is that you can ask follow-up questions if anything is unclear—all within Visual Studio Code, without switching tabs. Examples of Useful Prompts: What Azure services should I use with my app? What is GitHub Actions, and how does it work? What are common issues when deploying to Azure, and how can I fix them? Copilot provides contextual responses, guiding you through troubleshooting and best practices. You can learn more about this here. Conclusion: With GitHub Copilot for Azure, deploying applications is now more intuitive than ever. Instead of memorizing complex commands, you can use AI powered prompts to generate deployment steps in real time and even debug the errors within Visual Studio Code. 🚀 Next Steps: Experience with different prompts and explore how Copilot assists you. Try deploying more advanced applications, like Node.js or Python apps. GitHub Copilot isn’t just an AI assistant, it’s a learning tool. The more you engage with it, the more confident you’ll become in deploying and managing applications on Azure! Learn more about GitHub Copilot for Azure: Understand what GitHub Copilot for Azure Preview is and how it works. See example prompts for learning more about Azure and understanding your Azure account, subscription, and resources. See example prompts for designing and developing applications for Azure. See example prompts for deploying your application to Azure. See example prompts for optimizing your applications in Azure. See example prompts for troubleshooting your Azure resources. That's it, folks! But the best part? You can become part of a thriving community of learners and builders by joining the Microsoft Learn Student Ambassadors Community. Connect with like-minded individuals, explore hands-on projects, and stay updated with the latest in cloud and AI. 💬 Join the community on Discord here.
MuhammadSamiullah
Mar 19, 2025 Place Educator Developer Blog
1.2KViews
2likes
1Comment
Prompt Engineering Simplified: AI Toolkit's Prompt Builder
In the age of generative AI, crafting effective prompts is no longer a nice-to-have, it's a must-have. Understanding how to communicate with these underlying models is the key to unlocking their true potential and getting the results we need. What are Prompts? Every time we want to communicate to the language model, we give set of instructions to these models, we refer to these inputs as Prompts. Prompts play a very crucial role while working with the GenAI models. The quality of a prompt directly impacts the output of GenAI models. Precise and well-crafted prompts are crucial for achieving desired results. What factors crafts an optimal Prompt? Crafting an optimal requires balancing clarity, specificity and context. Besides these, constraints are a critical factor in crafting effective prompts. Specificity Clearly define the expectations. The prompt should leave no room for misinterpretation. Precise language is the key. Avoid vague language. Vague prompts lead to vague or irrelevant responses. e.g., “Tell me about history” ➔ “Explain the economic causes of the French Revolution”. Clarity Use simple, unambiguous language. Avoid jargon unless your audience expects it, recommended to use action verbs like "write," "summarize," "explain," "translate". Context Provide background for e.g., “As a beginner in coding, how do I write a Python loop?”. Give the LLM enough context to understand the situation. Include relevant details, keywords, and background information Conciseness Trim unnecessary words (e.g., “Describe photosynthesis” vs. “Can you tell me about how plants use sunlight?”). Ensure the prompt remains relevant to the desired output Tone & Audience Alignment Match the tone to the goal (formal, casual, instructive). Example: For kids, “Explain how rainbows form in simple terms.” Explicit Instructions Directly state what is needed e.g., “Compare X and Y”, “List pros and cons,” “Write a poem about…”. Guiding Constraints Limit scope to avoid overly broad answers e.g., “Focus on environmental impacts, not economic ones”. Constraints reduce ambiguity, focus responses, and improve relevance. Few example constraints, Format: “Summarize in 3 bullet points.” Length: “Explain in 2 sentences.” Scope: “Focus on environmental impacts, not economic ones.” Style/Tone: “Write a casual email,” or “Use non-technical terms.” Technical limits: “Keep code examples under 50 lines.” Few advanced Considerations for AI/LLM Prompts Examples or Demonstrations Include examples to set expectations for e.g., “Write a limerick like this: There once was a cat from Peru…”. Step-by-Step Guidance Break complex tasks into steps for e.g., “First analyze the Python code, then suggest solutions”. Role Assignment Assign roles to guide the AI for e.g., “Act as a historian explaining World War 2”. Avoid Bias Neutral phrasing ensures fair responses for e.g., “Discuss pros and cons of renewable energy” vs. “Why is solar energy bad?” the former is a well-formed prompt. Prompt engineering is an iterative process. Experiment with different phrasings and structures to see what works best. Analyze the LLM's responses and refine prompts accordingly. Make adjustments to improve the accuracy and relevance of the output. Prompt Builder: From the above section, we know that crafting effective prompts is essential for robust AI engagement. Prompt Builder tool on AI Toolkit helps in this enhancement by streamlining the whole process of crafting prompts. Prompt builder helps the users by helping in the following areas, o Prompt Creation, Modification, and Evaluation: Customize prompts through an accessible and straightforward interface. o AI-Assisted Prompt Generation: Articulate the project concept using everyday language, and the AI-powered feature will produce prompts for your exploration. o Organized Output Capability: Craft the prompts to yield outputs in a consistent, standardized and predictable manner. o Automated Code Generation for Prompt Usage: Following model and prompt experimentation, transition to coding immediately by accessing automatically generated, executable Python code. This tool has three sections on the UI. Prompt configuration Response History Prompt Configuration Section: In the Prompt configuration section, there are 4 major sub sections, Model System Prompt User Prompt Add Prompt Model: The Model section is the first subsection of the Prompt Configuration. Here, we select the model to use. The AI Toolkit offers a wide range of models, including remote models served from GitHub and those from providers such as OpenAI, Google, Anthropic, and Nvidia. For this tutorial we will be using OpenAI GPT-4o mini via GitHub System Prompt: In System prompt section, we provide instructions with relevant context to guide the system response. We can think of a system prompt as the "role" we give an AI before we ask it anything, like telling an actor what character to play. Generate Prompt: Upon choosing cloud-based / GitHub / Remote models, a new tool called as “Generate Prompt” is enabled, this is an AI Powered tool especially useful for crafting AI Powered well defined prompts which can be used in the “System Prompt” Section. Upon clicking on the “Generate Prompt” we can see a small window that pops up and asks for the input prompt. This can generate a prompt template by sharing basic details about the task. In this tutorial, let’s ask the LLM to generate prompt about “Professor in university teaching math”. Once the message is updated click on “Generate” button, and in a few seconds, we will have a well-structured prompt in the “System Prompt” section. The prompt that we generated is as follows Provide a detailed syllabus for a university-level mathematics course, including course objectives, weekly topics, assessment methods, and required materials. The syllabus should cover all essential components such as the course title, description, prerequisites, learning outcomes, weekly schedules, and any relevant policies regarding attendance, grading, and participation. # Steps 1. **Course Title and Description**: Clearly state the title of the course and provide a brief description of what the course will cover. 2. **Prerequisites**: List any required courses or knowledge necessary for students to enroll. 3. **Learning Outcomes**: Define what students are expected to learn by the end of the course. 4. **Weekly Schedule**: Outline topics for each week, along with any associated readings or assignments. 5. **Assessment Methods**: Describe how students will be evaluated (e.g., exams, quizzes, projects). 6. **Required Materials**: Include information on textbooks and other resources needed for the course. 7. **Course Policies**: State attendance, grading, and participation rules. # Output Format The output should be formatted as a structured syllabus, presented in clear sections with headings for each part. The document should be detailed yet concise, ideally around 3-5 pages in length. # Examples **Example 1** **Input:** Create a syllabus for a Calculus I course. **Output:** - **Course Title**: Calculus I - **Description**: An introduction to limits, derivatives, and integrals. - **Prerequisites**: Pre-Calculus or equivalent. - **Learning Outcomes**: Students will be able to calculate limits, differentiate basic functions, and understand the Fundamental Theorem of Calculus. - **Weekly Schedule**: - Week 1: Introduction to Limits - Week 2: Continuity - Week 3: Derivatives - ... - **Assessment Methods**: Midterm exam (30%), Final exam (40%), Weekly quizzes (20%), Participation (10%). - **Required Materials**: "Calculus: Early Transcendentals" by James Stewart. - **Course Policies**: Attendance required, late assignments will incur a penalty. **Example 2** **Input:** Design a syllabus for a Linear Algebra course. **Output:** - **Course Title**: Linear Algebra - **Description**: Study vector spaces, matrices, and linear transformations. - **Prerequisites**: None. - **Learning Outcomes**: Mastery of matrix operations and ability to solve systems of linear equations. - **Weekly Schedule**: - Week 1: Introduction to Vector Spaces - Week 2: Matrix Operations - Week 3: Determinants - ... - **Assessment Methods**: Two midterms (50%), Homework assignments (30%), Attendance (20%). - **Required Materials**: "Linear Algebra Done Right" by Sheldon Axler. - **Course Policies**: Participation in class discussions is mandatory. # Notes Ensure that the syllabus is comprehensive and tailored to the specific course topic. Consider including any unique teaching methods or technologies that will be employed during the course. User Prompt: User prompt is the specific question, instruction, or request that a person provides to the AI to elicit a response. It's the direct input from the user that initiates the AI's processing and generation of text. In AI Toolkit for a few models that support the multimodal feature, we can also upload images in this section. For this tutorial let’s input “Explain to me the Fourier equation in simple terms” Add Prompt: If any additional prompt needs to be added, we can configure more User or assistant prompt. So, in a conversation, we have: User Prompt: What the human says. Assistant Prompt: What the AI says. The major configuration part is now completed through this window, its now time to test the responses based on the LLM’s knowledge, in this case how well does GPT 4o mini behave in the role as university-level mathematics professor. In order to test it, we navigate to the next window, the Response section. Response Section: The Response section is where we finally get to see the responses. This section has the “Run” and “View Code” buttons. We can also choose the type of response we need. It can be a simple text or json schema. Upon choosing Json Schema, user will be prompted to “Prepare Schema”. Users can define their own schema or select from example. There are a few examples for the user to choose from. For this tutorial we will be using the simple text format. As we have our setup ready, we can directly click on the “Run” button, In a few seconds we have our well formatted and accurate answer on the screen, AI Toolkit‘s markdown capability can neatly format all the mathematical signs and equations. We can also add this to the “Assistant Prompt” by using the button provided. It provides better example for the LLM in the code later. The result from the LLM now seems very satisfactory with our well-crafted prompt. We can now proceed with the Code generation feature of the Prompt Builder tool of AI Toolkit. Upon clicking the “View Code” button, user is prompted to choose the SDK of their choice. This SDK lets us communicate with the API from the code. For this tutorial, we will use Azure AI Inference SDK. For more details on this SDK refer here. The code requires azure-ai-inference. Install the library by pip install azure-ai-inference """Run this model in Python > pip install azure-ai-inference """ import os from azure.ai.inference import ChatCompletionsClient from azure.ai.inference.models import AssistantMessage, SystemMessage, UserMessage from azure.ai.inference.models import ImageContentItem, ImageUrl, TextContentItem from azure.core.credentials import AzureKeyCredential # To authenticate with the model you will need to generate a personal access token (PAT) in your GitHub settings. # Create your PAT token by following instructions here: https://docs.github.com/en/authentication/keeping-your-account-and-data-secure/managing-your-personal-access-tokens client = ChatCompletionsClient( endpoint = "https://models.inference.ai.azure.com", credential = AzureKeyCredential(os.environ["GITHUB_TOKEN"]), api_version = "2024-08-01-preview", ) response = client.complete( messages = [ SystemMessage(content = "Provide a detailed syllabus for a university-level mathematics course, including course objectives, weekly topics, assessment methods, and required materials.\n \nThe syllabus should cover all essential components such as the course title, description, prerequisites, learning outcomes, weekly schedules, and any relevant policies regarding attendance, grading, and participation.\n \n# Steps\n \n1. **Course Title and Description**: Clearly state the title of the course and provide a brief description of what the course will cover.\n2. **Prerequisites**: List any required courses or knowledge necessary for students to enroll.\n3. **Learning Outcomes**: Define what students are expected to learn by the end of the course.\n4. **Weekly Schedule**: Outline topics for each week, along with any associated readings or assignments.\n5. **Assessment Methods**: Describe how students will be evaluated (e.g., exams, quizzes, projects).\n6. **Required Materials**: Include information on textbooks and other resources needed for the course.\n7. **Course Policies**: State attendance, grading, and participation rules.\n \n# Output Format\n \nThe output should be formatted as a structured syllabus, presented in clear sections with headings for each part. The document should be detailed yet concise, ideally around 3-5 pages in length.\n \n# Examples\n \n**Example 1** \n**Input:** \nCreate a syllabus for a Calculus I course. \n**Output:** \n- **Course Title**: Calculus I \n- **Description**: An introduction to limits, derivatives, and integrals. \n- **Prerequisites**: Pre-Calculus or equivalent. \n- **Learning Outcomes**: Students will be able to calculate limits, differentiate basic functions, and understand the Fundamental Theorem of Calculus. \n- **Weekly Schedule**: \n - Week 1: Introduction to Limits \n - Week 2: Continuity \n - Week 3: Derivatives \n - ... \n- **Assessment Methods**: Midterm exam (30%), Final exam (40%), Weekly quizzes (20%), Participation (10%). \n- **Required Materials**: \"Calculus: Early Transcendentals\" by James Stewart. \n- **Course Policies**: Attendance required, late assignments will incur a penalty.\n \n**Example 2** \n**Input:** \nDesign a syllabus for a Linear Algebra course. \n**Output:** \n- **Course Title**: Linear Algebra \n- **Description**: Study vector spaces, matrices, and linear transformations. \n- **Prerequisites**: None. \n- **Learning Outcomes**: Mastery of matrix operations and ability to solve systems of linear equations. \n- **Weekly Schedule**: \n - Week 1: Introduction to Vector Spaces \n - Week 2: Matrix Operations \n - Week 3: Determinants \n - ... \n- **Assessment Methods**: Two midterms (50%), Homework assignments (30%), Attendance (20%). \n- **Required Materials**: \"Linear Algebra Done Right\" by Sheldon Axler. \n- **Course Policies**: Participation in class discussions is mandatory. \n \n# Notes\n \nEnsure that the syllabus is comprehensive and tailored to the specific course topic. Consider including any unique teaching methods or technologies that will be employed during the course."), UserMessage(content = [ TextContentItem(text = "Explain to me the Fourier equation in simple terms"), ]), ], model = "gpt-4o-mini", response_format = "text", max_tokens = 4096, temperature = 1, top_p = 1, ) print(response.choices[0].message.content) This Python code is ready to be modified and used in any Generative AI application. It can be modified with any Orchestration framework like Semantic Kernel to add more features or even make an agentic application. History Section: We also have the “History” and “New Prompt”. History shows all the previous sessions; we can revisit and resume working or perhaps check the output or regenerate the code. History” and “New Prompt” In essence, the Prompt Builder tool significantly streamlines the process of crafting effective prompts, saving developers valuable time. Beyond prompt creation, it also facilitates output evaluation, model behavior analysis, and generates quality code to accelerate application development. Stay tuned for upcoming blog posts, where we'll delve into even more advanced techniques for building powerful generative AI applications. You can also join our AI Sparks series to learn more about the capabilities of the AI Toolkit for Visual Studio Code.
shreyanfern
Mar 06, 2025 Place Educator Developer Blog
3.4KViews
3likes
0Comments
Fine-Tuning Language Models with Azure AI Foundry: A Detailed Guide
What is Azure AI Foundry? Azure AI Foundry is a comprehensive platform designed to simplify the development, deployment, and management of AI models. It provides a user-friendly interface and powerful tools that enable developers to create custom AI solutions without needing extensive machine learning expertise. Key Features of Azure AI Foundry One-Button Fine-Tuning: A streamlined process that allows users to fine-tune models with minimal configuration. Integration with Development Tools: Seamless integration with popular development environments, particularly Visual Studio Code. Support for Multiple Models: Access to a variety of pre-trained models, including the Phi family of models. Understanding Fine-Tuning Fine-tuning is the process of taking a pre-trained model and adapting it to a specific dataset or task. This is particularly useful when the base model has been trained on a large corpus of general data but needs to perform well on a narrower domain. Why Fine-Tune? Improved Performance: Fine-tuning can significantly enhance the model's accuracy and relevance for specific tasks. Reduced Training Time: Starting with a pre-trained model reduces the amount of data and time required for training. Customization: Tailor the model to meet the unique needs of your application or business. One-Button Fine-Tuning in Azure AI Foundry Step-by-Step Process Select the Model: Log in to Azure AI Foundry and navigate to the model selection interface. Choose Phi-3 or another small language model from the available options. Prepare Your Data: Ensure your dataset is formatted correctly. Typically, this involves having a set of input-output pairs that the model can learn from. Upload your dataset to Azure AI Foundry. The platform supports various data formats, making it easy to integrate your existing data. Initiate Fine-Tuning: Locate the one-button fine-tuning feature within the Azure AI Foundry interface. Click the button to start the fine-tuning process. The platform will handle the configuration and setup automatically. Monitor Progress: After initiating fine-tuning, you can monitor the process through the Azure portal. The portal provides real-time updates on training metrics, allowing you to track the model's performance as it learns. Evaluate the Model: Once fine-tuning is complete, evaluate the model's performance using a validation dataset. Azure AI Foundry provides tools for assessing accuracy, precision, recall, and other relevant metrics. Deploy the Model: After successful evaluation, you can deploy the fine-tuned model directly from Azure AI Foundry. The platform supports various deployment options, including REST APIs and integration with other Azure services. Using the AI Toolkit in Visual Studio Code Overview of the AI Toolkit The AI Toolkit for Visual Studio Code enhances the development experience by providing tools specifically designed for AI model management and fine-tuning. This integration allows developers to work within a familiar environment while leveraging powerful AI capabilities. Key Features of the AI Toolkit 1) Model Management: Easily manage and switch between different models, including Phi-3 and Ollama models. 2) Data Handling: Simplified data upload and preprocessing tools to prepare datasets for training. 3) Real-Time Collaboration: Collaborate with team members in real-time, sharing insights and progress on AI projects. How to Use the AI Toolkit 1) Install the AI Toolkit: Open Visual Studio Code and navigate to the Extensions Marketplace. Search for "AI Toolkit" and install the extension. 2) Connect to Azure AI Foundry: Once installed, configure the toolkit to connect to your Azure AI Foundry account. This will allow you to access your models and datasets directly from Visual Studio Code. 3) Fine-Tune Models: Use the toolkit to initiate fine-tuning processes directly from your development environment. Monitor training progress and view logs without leaving Visual Studio Code. 4) Consume Ollama Models: The AI Toolkit supports the consumption of Ollama models, providing additional flexibility in your AI projects. This feature allows you to integrate various models seamlessly, enhancing your application's capabilities. Microsoft ONNX Live for Fine-Tuning What is Microsoft ONNX Live? Microsoft ONNX Live is a platform that allows developers to deploy and optimize AI models using the Open Neural Network Exchange (ONNX) format. ONNX is an open-source format that enables interoperability between different AI frameworks, making it easier to deploy models across various environments. Key Features of Microsoft ONNX Live Model Optimization: ONNX Live provides tools to optimize models for performance, ensuring they run efficiently in production environments. Cross-Framework Compatibility: Models trained in different frameworks (like PyTorch or TensorFlow) can be converted to ONNX format, allowing for greater flexibility in deployment. Real-Time Inference: ONNX Live supports real-time inference, enabling applications to utilize AI models for immediate predictions. Fine-Tuning with ONNX Live Model Conversion: If you have a model trained in a different framework, you can convert it to ONNX format using tools provided by Microsoft. This conversion allows you to leverage the benefits of ONNX Live for deployment and optimization. Integration with Azure AI Foundry: Once your model is in ONNX format, you can integrate it with Azure AI Foundry for fine-tuning. The one-button fine-tuning feature can be used to adapt the ONNX model to your specific dataset. Optimization Techniques: After fine-tuning, you can apply various optimization techniques available in ONNX Live to enhance the model's performance. Techniques such as quantization and pruning can significantly reduce the model size and improve inference speed. Deployment: Once optimized, the model can be deployed directly from Azure AI Foundry or ONNX Live. This deployment can be done as a REST API, allowing easy integration with web applications and services. Additional Resources To further enhance your understanding and capabilities in fine-tuning language models, consider exploring the following resources: Phi-3 Cookbook: This comprehensive guide provides insights into getting started with Phi models, including best practices for fine-tuning and deployment. Explore the Phi-3 Cookbook. Ignite Fine-Tuning Workshop: This workshop offers a hands-on approach to learning about fine-tuning techniques and tools. It includes real-world scenarios to help you understand the practical applications of fine-tuning. Visit the GitHub Repository. Conclusion Fine-tuning language models like Phi-3 using Azure AI Foundry, combined with the AI Toolkit in Visual Studio Code and Microsoft ONNX Live, provides a powerful and efficient workflow for developers. The one-button fine-tuning feature simplifies the process, while the integration with ONNX Live allows for optimization and deployment flexibility. By leveraging these tools, you can enhance your AI applications, ensuring they are tailored to meet specific needs and perform optimally in production environments. Whether you are a seasoned AI developer or just starting, Azure AI Foundry and its associated tools offer a robust ecosystem for building and deploying advanced AI solutions. References Microsoft Docs Links Fine-Tuning Models in Azure OpenAI Azure AI Services Documentation Azure Machine Learning Documentation Microsoft Learn Links Develop Generative AI Apps in Azure Fine-Tune a Language Model Azure AI Foundry Overview Get started with AI Toolkit for Visual Studio Code
Sharda_Kaur
Jan 29, 2025 Place Educator Developer Blog
1.9KViews
0likes
0Comments
Recipe Generator Application with Phi-3 Vision on AI Toolkit Locally
In today's data-driven world, images have become a ubiquitous source of information. From social media feeds to medical imaging, we encounter and generate images constantly. Extracting meaningful insights from these visual data requires sophisticated analysis techniques. In this blog post let’s build an Image Analysis Application using the cutting-edge Phi-3 Vision model completely free of cost and on-premise environment using the VS Code AI Toolkit. We'll explore the exciting possibilities that this powerful combination offers. The AI Toolkit for Visual Studio Code (VS Code) is a VS Code extension that simplifies generative AI app development by bringing together cutting-edge AI development tools and models. I would recommend going through the following blogs for getting started with VS Code AI Toolkit. 1. Visual Studio Code AI Toolkit: How to Run LLMs locally 2. Visual Studio AI Toolkit : Building Phi-3 GenAI Applications 3. Building Retrieval Augmented Generation on VSCode & AI Toolkit 4. Bring your own models on AI Toolkit - using Ollama and API keys Setup VS Code AI Toolkit: Launch the VS Code application and Click on the VS Code AI Toolkit extension. Login to the GitHub account if not already done. Once ready, click on model catalog. In the model catalog there are a lot of models, broadly classified into two categories, Local Run (with CPU and with GPU) Remote Access (Hosted by GitHub and other providers) For this blog, we will be using a Local Run model. This will utilize the local machine’s hardware to run the Language model. Since it involves analyzing images, we will be using the language model which supports vision operations and hence Phi-3-Vision will be a good fit as its light and supports local run. Download the model and then further it will be loaded it in the playground to test. Once downloaded, Launch the “Playground” tab and load the Phi-3 Vision model from the dropdown. The Playground also shows that Phi-3 vision allows image attachments. We can try it out before we start developing the application. Let’s upload the image using the “Paperclip icon” on the UI. I have uploaded image of Microsoft logo and prompted the language model to Analyze and explain the image. Phi-3 vision running on local premise boasts an uncanny ability to not just detect but unerringly pinpoint the exact Company logo and decipher the name with astonishing precision. This is a simple use case, but it can be built upon with various applications to unlock a world of new possibilities. Port Forwarding: Port Forwarding, a valuable feature within the AI Toolkit, serves as a crucial gateway for seamless communication with the GenAI model. To do this, launch the terminal and navigate to the “Ports” section. There will be button “Forward a Port”, click on that and select any desired port, in this blog we will use 5272 as the port. The Model-as-a-server is now ready, where the model will be available on the port 5272 to respond to the API calls. It can be tested with any API testing application. To know more click here. Creating Application with Python using OpenAI SDK: To follow this section, Python must be installed on the local machine. Launch the new VS Code window and set the working directory. Create a new Python Virtual environment. Once the setup is ready, open the terminal on VS Code, and install the libraries using “pip”. pip install openai pip install streamlit Before we build the streamlit application, lets develop the basic program and check the responses in the VSCode terminal and then further develop a basic webapp using the streamlit framework. Basic Program Import libraries: import base64 from openai import OpenAI base64: The base64 module provides functions for encoding binary data to base64-encoded strings and decoding base64-encoded strings back to binary data. Base64 encoding is commonly used for encoding binary data in text-based formats such as JSON or XML. OpenAI: The OpenAI package is a Python client library for interacting with OpenAI's API. The OpenAI class provides methods for accessing various OpenAI services, such as generating text, performing natural language processing tasks, and more. Initialize Client: Initialize an instance of the OpenAI class from the openai package, client = OpenAI( base_url="http://127.0.0.1:5272/v1/", api_key="xyz" # required by API but not used ) OpenAI (): Initializes a OpenAI model with specific parameters, including a base URL for the API, an API key, a custom model name, and a temperature setting. This model is used to generate responses based on user queries. This instance will be used to interact with the OpenAI API. base_url = "http://127.0.0.1:5272/v1/": Specifies the base URL for the OpenAI API. In this case, it points to a local server running on 127.0.0.1 (localhost) at port 5272. api_key = "ai-toolkit": The API key used to authenticate requests to the OpenAI API. In case of AI Toolkit usage, we don’t have to specify any API key. The image analysis application will frequently deal with images uploaded by users. But to send these images to GenAI model, we need them in a format it understands. This is where the encode_image function comes in. # Function to encode the image def encode_image(image_path): with open(image_path, "rb") as image_file: return base64.b64encode(image_file.read()).decode("utf-8") Function Definition: def encode_image(image_path): defines a function named encode_image that takes a single argument, image_path. This argument represents the file path of the image we want to encode. Opening the Image: with open(image_path, "rb") as image_file: opens the image file specified by image_path in binary reading mode ("rb"). This is crucial because we're dealing with raw image data, not text. Reading Image Content: image_file.read() reads the entire content of the image file into a byte stream. Remember, images are stored as collections of bytes representing color values for each pixel. Base64 Encoding: base64.b64encode(image_file.read()) encodes the byte stream containing the image data into base64 format. Base64 encoding is a way to represent binary data using a combination of printable characters, which makes it easier to transmit or store the data. Decoding to UTF-8: .decode("utf-8") decodes the base64-encoded data into a UTF-8 string. This step is necessary because the OpenAI API typically expects text input, and the base64-encoded string can be treated as text containing special characters. Returning the Encoded Image: return returns the base64-encoded string representation of the image. This encoded string is what we'll send to the AI model for analysis. In essence, the encode_image function acts as a bridge, transforming an image file on your computer into a format that the AI model can understand and process. Path for the Image: We will use an image stored on our local machine for this section, while we develop the webapp, we will change this to accept it to what the user uploads. image_path = "C:/img.jpg" #path of the image here This line of code is crucial for any program that needs to interact with an image file. It provides the necessary information for the program to locate and access the image data. Base64 String: # Getting the base64 string base64_image = encode_image(image_path) This line of code is responsible for obtaining the base64-encoded representation of the image specified by the image_path. Let's break it down: encode_image(image_path): This part calls the encode_image function, which we've discussed earlier. This function takes the image_path as input and performs the following: Reads the image file from the specified path. Converts the image data into a base64-encoded string. Returns the resulting base64-encoded string. base64_image = ...: This part assigns the return value of the encode_image function to the variable base64_image. This section effectively fetches the image from the given location and transforms it into a special format (base64) that can be easily handled and transmitted by the computer system. This base64-encoded string will be used subsequently to send the image data to the AI model for analysis. Invoking the Language Model: This code tells the AI model what to do with the image. response = client.chat.completions.create( model="Phi-3-vision-128k-cpu-int4-rtn-block-32-acc-level-4-onnx", messages=[ { "role": "user", "content": [ { "type": "text", "text": "What's in the Image?", }, { "type": "image_url", "image_url": {"url": f"data:image/jpeg;base64,{base64_image}"}, }, ], } ], ) response = client.chat.completions.create(...): This line sends instructions to the AI model we're using (represented by client). Here's a breakdown of what it's telling the model: chat.completions.create: We're using a specific part of the OpenAI API designed for having a conversation-like interaction with the model. The ... part: This represents additional details that define what we want the model to do, which we'll explore next. Let's break down the details (...) sent to the model: 1) model="Phi-3-vision-128k-cpu-int4-rtn-block-32-acc-level-4-onnx": This tells the model exactly which AI model to use for analysis. In our case, it's the "Phi-3-vision" model. 2) messages: This defines what information we're providing to the model. Here, we're sending two pieces of information: role": "user": This specifies that the first message comes from a user (us). The content: This includes two parts: "What's in the Image?": This is the prompt we're sending to the model about the image. "image_url": {"url": f"data:image/jpeg;base64,{base64_image}"}: This sends the actual image data encoded in base64 format (stored in base64_image). In a nutshell, this code snippet acts like giving instructions to the AI model. We specify the model to use, tell it we have a question about an image, and then provide the image data itself. Printing the response on the console: print(response.choices[0].message.content) We asked the AI model "What's in this image?" This line of code would then display the AI's answer. Console response: Finally, we can see the response on the terminal. Now to make things more interesting, let’s convert this into a webapp using the streamlit framework. Recipe Generator Application with Streamlit: Now we know how to interact with the Vision model offline using a basic console. Let’s make things even more exciting by applying all this to a use-case which probably will be most loved by all those who are cooking enthusiasts!! Yes, let’s create an application which will assist in cooking by looking what’s in the image of ingredients! Create a new file and name is as “app.py” select the same. venv that was used earlier. Make sure the Visual studio toolkit is running and serving the Phi-3 Vision model through the port 5272. First step is importing the libraries, import streamlit as st import base64 from openai import OpenAI base64 and OpenAI is the same as we had used in the earlier section. Streamlit: This part imports the entire Streamlit library, which provides a powerful set of tools for creating user interfaces (UIs) with Python. Streamlit simplifies the process of building web apps by allowing you to write Python scripts that directly translate into interactive web pages. client = OpenAI( base_url="http://127.0.0.1:5272/v1/", api_key="xyz" # required by API but not used ) As discussed in the earlier section, initializing the client and configuring the base_url and api_key. st.title('Recipe Generator 🍔') st.write('This is a simple recipe generator application.Upload images of the Ingridients and get the recipe by Chef GenAI! 🧑‍🍳') uploaded_file = st.file_uploader("Choose a file") if uploaded_file is not None: st.image(uploaded_file, width=300) st.title('Recipe Generator 🍔'): This line sets the title of the Streamlit application as "Recipe Generator" with a visually appealing burger emoji. st.write(...): This line displays a brief description of the application's functionality to the user. uploaded_file = st.file_uploader("Choose a file"): This creates a file uploader component within the Streamlit app. Users can select and upload an image file (likely an image of ingredients). if uploaded_file is not None: : This conditional block executes only when the user has actually selected and uploaded a file. st.image(uploaded_file, width=300): If an image is uploaded, this line displays the uploaded image within the Streamlit app with a width of 300 pixels. In essence, this code establishes the basic user interface for the Recipe Generator app. It allows users to upload an image, and if an image is uploaded, it displays the image within the app. preference = st.sidebar.selectbox( "Choose your preference", ("Vegetarian", "Non-Vegetarian") ) cuisine = st.sidebar.selectbox( "Select for Cuisine", ("Indian","Chinese","French","Thai","Italian","Mexican","Japanese","American","Greek","Spanish") ) We use Streamlit's sidebar and selectbox features to create interactive user input options within a web application: st.sidebar.selectbox(...): This line creates a dropdown menu (selectbox) within the sidebar of the Streamlit application.The first argument, "Choose your preference", sets the label or title for the dropdown.The second argument, ("Vegetarian", "Non-Vegetarian"), defines the list of options available for the user to select (in this case, dietary preferences). cuisine = st.sidebar.selectbox(...): This line creates another dropdown menu in the sidebar, this time for selecting the desired cuisine.The label is "Select for Cuisine".The options provided include "Indian", "Chinese", "French", and several other popular cuisines. In essence, this code allows users to interact with the application by selecting their preferred dietary restrictions (Vegetarian or Non-Vegetarian) and desired cuisine from the dropdown menus in the sidebar. def encode_image(uploaded_file): """Encodes a Streamlit uploaded file into base64 format""" if uploaded_file is not None: content = uploaded_file.read() return base64.b64encode(content).decode("utf-8") else: return None base64_image = encode_image(uploaded_file) The same function of encode_image as discussed in the earlier section is being used here. if st.button("Ask Chef GenAI!"): if base64_image: response = client.chat.completions.create( model="Phi-3-vision-128k-cpu-int4-rtn-block-32-acc-level-4-onnx", messages=[ { "role": "user", "content": [ { "type": "text", "text": f"STRICTLY use the ingredients in the image to generate a {preference} recipe and {cuisine} cuisine.", }, { "type": "image_url", "image_url": {"url": f"data:image/jpeg;base64,{base64_image}"}, }, ], } ], ) print(response.choices[0].message.content) st.write(response.choices[0].message.content) else: st.write("Please upload an image with any number of ingridients and instantly get a recipe.") Above code block implements the core functionality of the Recipe Generator app, triggered when the user clicks a button labeled "Ask Chef GenAI!": if st.button("Ask Chef GenAI!"): This line checks if the user has clicked the button. If they have, the code within the if block executes. if base64_image: This inner if condition checks if a variable named base64_image has a value. This variable likely stores the base64 encoded representation of the uploaded image (containing ingredients). If base64_image has a value (meaning an image is uploaded), the code proceeds. client.chat.completions.create(...): Client that had been defined earlier interacts with the API . Here, it calls a to generate text completions, thereby invoking a small language model. The arguments provided specify the model to be used ("Phi-3-vision-128k-cpu-int4-rtn-block-32-acc-level-4-onnx") and the message to be completed. The message consists of two parts within a list: User Input: The first part defines the user's role ("user") and the content they provide. This content is an instruction with two key points: Dietary Preference: It specifies to "STRICTLY use the ingredients in the image" to generate a recipe that adheres to the user's preference (vegetarian or non-vegetarian, set using the preference dropdown). Cuisine Preference: It mentions the desired cuisine type (Indian, Chinese, etc., selected using the cuisine dropdown). Image Data: The second part provides the image data itself. It includes the type ("image_url") and the URL, which is constructed using the base64_image variable containing the base64 encoded image data. print(response.choices[0].message.content) & st.write(...): The response will contain a list of possible completions. Here, the code retrieves the first completion (response.choices[0]) and extracts its message content. This content is then printed to the console like before and displayed on the Streamlit app using st.write. else block: If no image is uploaded (i.e., base64_image is empty), the else block executes. It displays a message reminding the user to upload an image to get recipe recommendations. The above code block is the same as before except the we have now modified it to accept few inputs and also have made it compatible with streamlit. The coding is now completed for our streamlit application! It's time to test the application. Navigate to the terminal on Visual Studio Code and enter the following command, (if the file is named as app.py) streamlit run app.py Upon successful run, it will redirect to default browser and a screen with the Recipe generator will be launched, Upload an image with ingredients, select the recipe, cuisine and click on “Ask Chef GenAI”. It will take a few moments for delightful recipe generation. While generating we can see the logs on the terminal and finally the recipe will be shown on the screen! Enjoy your first recipe curated by Chef GenAI powered by Phi-3 vision model on local prem using Visual Studio AI Toolkit! The code is available on the following GitHub Repository. In the upcoming series we will explore more types of Gen AI implementations with AI toolkit. Resources: 1. Visual Studio Code AI Toolkit: Run LLMs locally 2. Visual Studio AI Toolkit : Building Phi-3 GenAI Applications 3. Building Retrieval Augmented Generation on VSCode & AI Toolkit 4. Bring your own models on AI Toolkit - using Ollama and API keys 5. Expanded model catalog for AI Toolkit 6. Azure Toolkit Samples GitHub Repository
shreyanfern
Jan 28, 2025 Place Educator Developer Blog
698Views
1like
1Comment
How to Customize Visual Studio Code Chat with GitHub Copilot and Semantic Kernel
Discover how to customize Visual Studio Code Chat to revolutionize your development workflow with AI. By leveraging GitHub Copilot, Semantic Kernel, and Azure AI Agent Service, you can create chat participants tailored to tasks like project creation, requirement analysis, and code orchestration. Learn to integrate models like o1-mini for reasoning and .NET Aspire for distributed application management. Empower your IDE with AI to streamline complex workflows and boost efficiency.
kinfey
Jan 20, 2025 Place Educator Developer Blog
2.1KViews
1like
0Comments
Using Visual Studio Notebooks for learning C#
Getting Started Install Notebook Editor Extension: Notebook Editor - Visual Studio Marketplace C# 101 GitHub Repo dotnet/csharp-notebooks: Get started learning C# with C# notebooks powered by .NET Interactive and VS Code. (github.com) Machine Learning and .NET dotnet/csharp-notebooks: Get started learning C# with C# notebooks powered by .NET Interactive and VS Code. (github.com) .NET Interactive Notebooks for C# dotnet/csharp-notebooks: Get started learning C# with C# notebooks powered by .NET Interactive and VS Code. (github.com)
Lee_Stott
Jan 17, 2025 Place Educator Developer Blog
16KViews
0likes
4Comments
Festival Web
Microsoft anuncia una nueva iniciativa para ayudarte a impulsar tu carrera en desarrollo web llamada el Festival Web, es una serie de charlas en vivo que comienza desde el 31 de octubre y finaliza el 14 de noviembre. En estas charlas podrás aprender de la mano de expertos y conocer diferentes herramientas como VSCode y GitHub. ¡Regístrate en las charlas en vivo del Festival Web y comienza tu viaje en el mundo de la programación web con expertos de la industria! Además, hay una oportunidad gratuita para quienes desean iniciarse en este campo. ¡Descubre más información sobre esta iniciativa de Microsoft en este blog!
abrilurena
Nov 08, 2024 Place Educator Developer Blog
4.1KViews
1like
0Comments