rag

28 Topics

Level up your Python + AI skills with our complete series
We've just wrapped up our live series on Python + AI, a comprehensive nine-part journey diving deep into how to use generative AI models from Python. The series introduced multiple types of models, including LLMs, embedding models, and vision models. We dug into popular techniques like RAG, tool calling, and structured outputs. We assessed AI quality and safety using automated evaluations and red-teaming. Finally, we developed AI agents using popular Python agents frameworks and explored the new Model Context Protocol (MCP). To help you apply what you've learned, all of our code examples work with GitHub Models, a service that provides free models to every GitHub account holder for experimentation and education. Even if you missed the live series, you can still access all the material using the links below! If you're an instructor, feel free to use the slides and code examples in your own classes. If you're a Spanish speaker, check out the Spanish version of the series. Python + AI: Large Language Models 📺 Watch recording In this session, we explore Large Language Models (LLMs), the models that power ChatGPT and GitHub Copilot. We use Python to interact with LLMs using popular packages like the OpenAI SDK and LangChain. We experiment with prompt engineering and few-shot examples to improve outputs. We also demonstrate how to build a full-stack app powered by LLMs and explain the importance of concurrency and streaming for user-facing AI apps. Slides for this session Code repository with examples: python-openai-demos Python + AI: Vector embeddings 📺 Watch recording In our second session, we dive into a different type of model: the vector embedding model. A vector embedding is a way to encode text or images as an array of floating-point numbers. Vector embeddings enable similarity search across many types of content. In this session, we explore different vector embedding models, such as the OpenAI text-embedding-3 series, through both visualizations and Python code. We compare distance metrics, use quantization to reduce vector size, and experiment with multimodal embedding models. Slides for this session Code repository with examples: vector-embedding-demos Python + AI: Retrieval Augmented Generation 📺 Watch recording In our third session, we explore one of the most popular techniques used with LLMs: Retrieval Augmented Generation. RAG is an approach that provides context to the LLM, enabling it to deliver well-grounded answers for a particular domain. The RAG approach works with many types of data sources, including CSVs, webpages, documents, and databases. In this session, we walk through RAG flows in Python, starting with a simple flow and culminating in a full-stack RAG application based on Azure AI Search. Slides for this session Code repository with examples: python-openai-demos Python + AI: Vision models 📺 Watch recording Our fourth session is all about vision models! Vision models are LLMs that can accept both text and images, such as GPT-4o and GPT-4o mini. You can use these models for image captioning, data extraction, question answering, classification, and more! We use Python to send images to vision models, build a basic chat-with-images app, and create a multimodal search engine. Slides for this session Code repository with examples: openai-chat-vision-quickstart Python + AI: Structured outputs 📺 Watch recording In our fifth session, we discover how to get LLMs to output structured responses that adhere to a schema. In Python, all you need to do is define a Pydantic BaseModel to get validated output that perfectly meets your needs. We focus on the structured outputs mode available in OpenAI models, but you can use similar techniques with other model providers. Our examples demonstrate the many ways you can use structured responses, such as entity extraction, classification, and agentic workflows. Slides for this session Code repository with examples: python-openai-demos Python + AI: Quality and safety 📺 Watch recording This session covers a crucial topic: how to use AI safely and how to evaluate the quality of AI outputs. There are multiple mitigation layers when working with LLMs: the model itself, a safety system on top, the prompting and context, and the application user experience. We focus on Azure tools that make it easier to deploy safe AI systems into production. We demonstrate how to configure the Azure AI Content Safety system when working with Azure AI models and how to handle errors in Python code. Then we use the Azure AI Evaluation SDK to evaluate the safety and quality of output from your LLM. Slides for this session Code repository with examples: ai-quality-safety-demos Python + AI: Tool calling 📺 Watch recording In the final part of the series, we focus on the technologies needed to build AI agents, starting with the foundation: tool calling (also known as function calling). We define tool call specifications using both JSON schema and Python function definitions, then send these definitions to the LLM. We demonstrate how to properly handle tool call responses from LLMs, enable parallel tool calling, and iterate over multiple tool calls. Understanding tool calling is absolutely essential before diving into agents, so don't skip over this foundational session. Slides for this session Code repository with examples: python-openai-demos Python + AI: Agents 📺 Watch recording In the penultimate session, we build AI agents! We use Python AI agent frameworks such as the new agent-framework from Microsoft and the popular LangGraph framework. Our agents start simple and then increase in complexity, demonstrating different architectures such as multiple tools, supervisor patterns, graphs, and human-in-the-loop workflows. Slides for this session Code repository with examples: python-ai-agent-frameworks-demos Python + AI: Model Context Protocol 📺 Watch recording In the final session, we dive into the hottest technology of 2025: MCP (Model Context Protocol). This open protocol makes it easy to extend AI agents and chatbots with custom functionality, making them more powerful and flexible. We demonstrate how to use the Python FastMCP SDK to build an MCP server running locally and consume that server from chatbots like GitHub Copilot. Then we build our own MCP client to consume the server. Finally, we discover how easy it is to connect AI agent frameworks like LangGraph and Microsoft agent-framework to MCP servers. With great power comes great responsibility, so we briefly discuss the security risks that come with MCP, both as a user and as a developer. Slides for this session Code repository with examples: python-mcp-demo
Pamela_Fox
Oct 29, 2025 Place Educator Developer Blog
567Views
0likes
0Comments
Make your own private ChatGPT
Introduction Creating your own private ChatGPT allows you to leverage AI capabilities while ensuring data privacy and security. This guide walks you through building a secure, customized chatbot using tools like Azure OpenAI, Cosmos DB and Azure App service. Why Build a Private ChatGPT? With the rise of AI-driven applications, organizations, people often face challenges related to data privacy, customization, and integration. Building a private ChatGPT addresses these concerns by: Maintaining Data Privacy: Keep sensitive information within your infrastructure. Customizing Responses: Tailor the chatbot’s behavior and language to suit your requirements. Ensuring Security: Leverage enterprise-grade security protocols. Avoiding Data Sharing: Prevent your data from being used to train external models. If organizations do not take these measures their data may go into future model training and can leak your sensitive data to public. Eg: Chatgpt collects personal data mentioned in their privacy policy Prerequisites Before you begin, ensure you have: Access to Azure OpenAI Service. A development environment set up with Python. Basic knowledge of FastAPI and MongoDB. An Azure account with necessary permissions. If you do not have Azure subscription, try Azure for students for FREE. Step 1: Set Up Azure OpenAI Log in to the Azure Portal and create an Azure OpenAI resource. Deploy a model, such as GPT-4o (multimodal), and note down the endpoint and API key. Note there is also an option of keyless authentication. Configure permissions to control access. Step 2: Use Chatgpt like app sample You can select any repository to be as base template for your app, in this I will be using the third option AOAIchat. It is developed by me. GitHub - mckaywrigley/chatbot-ui: AI chat for any model. Azure-Samples/azure-search-openai-demo: A sample app for the Retrieval-Augmented Generation pattern running in Azure, using Azure AI Search for retrieval and Azure OpenAI large language models to power ChatGPT-style and Q&A experiences. sourabhkv/AOAIchat: Azure OpenAI chat This architecture diagram represents a typical flow for a private ChatGPT application with the following components: App UX (User Interface): This is the front-end application (mobile, web, or desktop) where users interact with the chatbot. It sends the user's input (prompt) and displays the AI's responses. App Service: Acts as the backend application, handling user requests and coordinating with other services. Functions: Receives user inputs and prepares them for processing by the Azure OpenAI service. Streams AI responses back to the App UX. Reads from and writes to Cosmos DB to manage chat history. Azure OpenAI Service: This is the core AI service, processing the user input and generating responses using models like GPT-4o. The App Service sends the user input (along with context) to this service and receives the AI-generated responses. Cosmos DB: A NoSQL database used to store and manage chat history. Operations: Writes user messages and AI-generated responses for future reference or analysis. Reads chat history to provide context for AI responses, enabling more intelligent and contextual conversations. Data Flow: User inputs are sent from the App UX to the App Service. The App Service forwards the input (with additional context, if needed) to Azure OpenAI. Azure OpenAI generates a response, which is streamed back to the App UX via the App Service. The App Service writes user inputs and AI responses to Cosmos DB for persistence. This architecture ensures scalability, secure data handling, and the ability to provide contextual responses by integrating database and AI services. What can you do with my template? AOAIchat supports personal, enterprise chat enabled by RAG People can enable RAG mode if they want to search within their database, else it behaves like normal ChatGPT. It supports multimodality, (supports image, text input) also depends on model deployed in Azure AI foundry. Step 3: Deploy to Azure Deploy a Cosmos DB account in nearest region Deploy Azure OpenAI model (gpt-4o, gpt-4o-mini recommended) Deploy Azure App service, try using container I would recommend B1plan to your nearest region, select docker registry sourabhkv/aoaichatdb:0.1 startup command uvicorn app:app --host 0.0.0.0 --port 80 After app service starts, put all environment variables The application requires the following environment variables to be set for proper configuration: Environment Variable Description AZURE_OPENAI_ENDPOINT The endpoint for Azure OpenAI API. AZURE_OPENAI_API_KEY API key for accessing Azure OpenAI. DEPLOYMENT_NAME Azure OpenAI deployment name. API_VERSION API version for Azure OpenAI. MAX_TOKENS Maximum tokens for API responses. MONGO_DETAILS MongoDB connection string. AZURE_OPENAI_ENDPOINT=<your_azure_openai_endpoint> AZURE_OPENAI_API_KEY=<your_azure_openai_api_key> DEPLOYMENT_NAME=<your_deployment_name> API_VERSION=<your_api_version> MAX_TOKENS=<max_tokens> MONGO_DETAILS=<your_mongo_connection_string> Optional feature: implement authentication to secure access. Within app service select Authentication and select service providers. I went with Entra based authentication with single tenant. There is option of multi-tenant, personal accounts as well. Restart App service and within 2 minutes your private ChatGPT is ready. Pricing Pricing may depend on the plan you have deployed resources and region. Check Azure calculator for price estimation. My estimate for pricing I deployed all my resources in Sweden central Cosmos DB config - Cosmos DB for MongoDB (RU) serverless config with single write master, 2 GB transactional storage, 2 backup plan (FREE) ~ 0.75$ Azure OpenAI service - plan S0, model gpt-4o-mini global deployment, Input 20000 tokens, Output 10000 tokens ~ 9.00$ App service plan - OS Linux, Tier B1, instance count 1 ~13.14$ Total monthly cost = 22.89$ This price may vary in future, in region I calculated my configuration in Azure calculator Governance Azure OpenAI provides content filters to block any kind of input that violates responsible AI practices. Categories include Hate and Fairness Sexual Violence Self-harm User Prompt Attacks (direct and indirect) The content filtering system detects and takes action on specific categories of potentially harmful content in both input prompts and output completions. Azure OpenAI Service includes default safety settings applied to all models set as medium. Content filters can be modified to different level depending on use case. It supports RAG, I have provided detailed solution for it in my GitHub. Practical implementation GE Aerospace, in partnership with Microsoft and Accenture, has launched a company-wide generative AI platform, leveraging Microsoft Azure and Azure OpenAI Service. This solution aims to transform asset tracking and compliance in aviation, enabling quick access to maintenance records and reducing manual processing time from days to minutes. It supports informed decision-making by providing insights into aircraft leasing, compliance gaps, and asset health. For enterprises implementing private ChatGPT solutions, this illustrates the potential of generative AI for streamlining document-intensive processes while ensuring data security and compliance through cloud-based infrastructure like Azure. GE Aerospace Launches Company-wide Generative AI Platform for Employees | GE Aerospace News Build your own private ChatGPT style app with enterprise-ready architecture - By Microsoft Mechanics How to make private ChatGPT for FREE? It can be FREE if all of the setup is running locally on your hardware. Cosmos DB <-> MongoDB. Azure OpenAI <-> Ollama / LM studio Refer this NOTE : I have used gpt-4o, gpt-4o-mini these values are hardcoded in webpage, if you are using other models, you might have to change them in index.html. App Service <-> Local machine Register for Github models to access API for FREE. Note: GitHub models have rate limit for different models. Useful links sourabhkv/AOAIchat: Azure OpenAI chat What is RAG? Get started with Azure OpenAI API Chat with Azure OpenAI models using your own data
sourabhkv
Jun 05, 2025 Place Educator Developer Blog
13KViews
1like
1Comment
Building Custom Chat AI: A Comprehensive Guide for Developers
In today's rapidly evolving digital landscape, the integration of artificial intelligence (AI) into business operations has become a pivotal strategy for companies aiming to enhance their customer engagement and streamline their processes. This article delves into the foundational steps and considerations for developers embarking on the journey of building a custom chat AI for their company website. From understanding the core concepts of AI to selecting the right models and implementing effective prompt engineering techniques, this guide provides a comprehensive overview to help developers navigate the complexities of AI development. Whether you are a beginner or have some experience in the field, the insights shared here will equip you with the knowledge and tools needed to create a robust and efficient chat AI tailored to your business needs. A discussion will be held with Nitya Narasimhan, Senior Cloud Advocate at Microsoft specializing in AI, and Wey Gu, a Chinese AI MVP, to delve into these critical topics. What are the first steps a developer should take when starting to build a custom chat AI for their company website? Nitya: If you are new to AI, start by familiarizing yourself with the core concepts and usage of AI models. A course like Generative AI for Beginnerscan be a great starting point. Next, get hands-on experience with models by trying out GitHub Models, which are free to use with just a GitHub account. This will help you build your intuition for model selection and prompt engineering. If you already have some experience, the initial steps to building a custom chat AI are as follows: Identify the use case and requirements (e.g., typical questions asked and valid responses). Choose a model to start prototyping (test the question with various models and compare results). If your chat AI is grounded in your data, identify the data sources and formats (where and what). Select an AI app template to jumpstart development and customize it with your model and data choices. How does understanding model choice impact the development of a custom chat AI? Nitya: Understanding model choice is crucial for developing a custom chat AI. It involves evaluating models based on three key factors: cost, customization, and performance. Customization: Start by identifying the task you want to execute (e.g., chat, image, embeddings, agents). Filter models that support this capability and validate them with a test prompt to ensure they fit your requirements. This process will narrow down your options from thousands to a few suitable models. Cost: Consider whether the model supports serverless deployments (pay-as-you-go, per token) or managed deployments (subscription-based, per VM). Evaluate costs not just for usage (chat completion) but also for end-to-end development (evaluations, iterative ideation). Performance: Assess models based on latency (e.g., chat completions vs. reasoning models) and the quality and safety of responses. Understand default model characteristics (model card) and perform custom evaluations to ensure quality for your desired prompts dataset. Can you explain the concept of prompt engineering and how it can be applied using GitHub models? Nitya: Prompt engineering involves guiding the model on how to process questions and generate responses to improve quality. Think of developers as teachers and models as students being taught to answer exam questions. Prompt engineering provides a rubric to guide models in giving relevant answers. This includes providing examples, creating personas (e.g., "answer politely using formal language"), defining output formats (e.g., "answer in 1-2 sentences", "reply with results in JSON format"), and configuring model parameters (e.g., temperature, stop-words, top-p, max tokens). When working with GitHub models, you can configure models using the Playground (UI) or move to an IDE with the Azure AI Inference API, offering both low-code and code-first options for prompt engineering. What is retrieval augmented generation (RAG), and how does it enhance the ability to chat with data? Wei: RAG involves grounding user questions in retrieved knowledge from private data sources to ensure responses are relevant to the application scenario. It works by wrapping the initial user prompt in a prompt template to create the final model prompt sent to the model. The RAG workflow includes retrieval of knowledge, augmentation of the prompt, and generation of the response. This dynamic process provides relevant grounding data and instructions to contextualize user questions for app-required responses. What are some practical tips for developers to streamline their end-to-end journey from catalog to cloud? Nitya: Here are three tips to get started: Model Selection: Use GitHub Models with diverse test prompts to build intuition for prompt engineering and model capabilities. Compare models side-by-side. Copilot Development: Start with an Azure AI app template. Deploy it to understand the application and its architecture before customizing it to your needs. Validate your development environment and get familiar with tools. Safety & Evaluation: Explore built-in content safety filters and evaluators in the Azure AI platform to understand metrics and effectiveness of your prompt engineering or RAG strategy. Use tracing and App Insights to monitor performance and cost. What are some common challenges developers might face when building a custom chat AI, and how can they overcome them? Nitya and Wei: There are many challenges we can think of - here are three that are important: App Architecture: Understand the app architecture for your scenario (e.g., RAG, multi-agent). Explore existing AI app templates to build intuition and customize one that fits your requirements. Model Choice: Choose models based on cost, quota availability, and flexibility for future configuration. Use the Azure AI model inference API to abstract provider-specific SDKs and decouple your code from your choice, allowing for easier model swaps later. Observability: Debug issues in app development or execution performance. Use platforms and tools that bring observability to the end-to-end workflow. Activate App Insights and use tracing tools to generate telemetry for insights locally or in production. What resources and samples are available for further exploration into this subject? Wei: Explore Azure AI App Templates, For Beginners Curricula, RAG Chat Workshop, AI Tour Workshops and Generative AI for Beginners. For more workshops and talks, visit https://aka.ms/aitour/repos. Feel free to check out opensource projects like AutoGen, LlamaIndex, LangChain, and CamelAI's documentation. As we conclude this exploration into building a custom chat AI for your company website, it's clear that the journey is both challenging and rewarding. By understanding the core concepts of AI, selecting the right models, and mastering prompt engineering, developers can create a powerful tool that enhances customer engagement and streamlines business operations. The insights and practical tips shared in this article provide a solid foundation for embarking on this journey. Remember, the key to success lies in continuous learning and adaptation. As AI technology evolves, you should also adapt your approach to developing and refining your chat AI. Stay curious, stay innovative, and most importantly, stay committed to delivering the best possible experience for your users.
RochelleSonnenberg
Apr 29, 2025 Place Microsoft MVP Program Blog
233Views
1like
0Comments
AI Agents: Metacognition for Self-Aware Intelligence - Part 9
This blog post, Part 9 in a series on AI agents, introduces the concept of metacognition, or "thinking about thinking," and its application to AI agents. It explains how metacognition enables agents to self-evaluate, adapt, and improve their performance. The post outlines the key components of an AI agent and illustrates metacognition with a travel agent example, demonstrating how it can enhance planning, error correction, and personalization. The post also discusses the Corrective RAG approach and demonstrates code snippets.
ShivamGoyal03
Apr 28, 2025 Place Educator Developer Blog
673Views
0likes
0Comments
AI Agents: Mastering Agentic RAG - Part 5
This blog post, Part 5 of a series on AI agents, explores Agentic RAG (Retrieval-Augmented Generation), a paradigm shift in how LLMs interact with external data. Unlike traditional RAG, Agentic RAG allows LLMs to autonomously plan their information retrieval process through an iterative loop of actions and evaluations. The post highlights the importance of the LLM "owning" the reasoning process, dynamically selecting tools and refining queries. It covers key implementation details, including iterative loops, tool integration, memory management, and handling failure modes. Practical use cases, governance considerations, and code examples demonstrating Agentic RAG with AutoGen, Semantic Kernel, and Azure AI Agent Service are provided. The post concludes by emphasizing the transformative potential of Agentic RAG and encourages further exploration through linked resources and previous blog posts in the series.
ShivamGoyal03
Mar 31, 2025 Place Educator Developer Blog
2.7KViews
1like
0Comments
Create your own QA RAG Chatbot with LangChain.js + Azure OpenAI Service
Demo: Mpesa for Business Setup QA RAG Application In this tutorial we are going to build a Question-Answering RAG Chat Web App. We utilize Node.js and HTML, CSS, JS. We also incorporate Langchain.js + Azure OpenAI + MongoDB Vector Store (MongoDB Search Index). Get a quick look below. Note: Documents and illustrations shared here are for demo purposes only and Microsoft or its products are not part of Mpesa. The content demonstrated here should be used for educational purposes only. Additionally, all views shared here are solely mine. What you will need: An active Azure subscription, get Azure for Student for free or get started with Azure for 12 months free. VS Code Basic knowledge in JavaScript (not a must) Access to Azure OpenAI, click here if you don't have access. Create a MongoDB account (You can also use Azure Cosmos DB vector store) Setting Up the Project In order to build this project, you will have to fork this repository and clone it. GitHub Repository link: https://github.com/tiprock-network/azure-qa-rag-mpesa . Follow the steps highlighted in the README.md to setup the project under Setting Up the Node.js Application. Create Resources that you Need In order to do this, you will need to have Azure CLI or Azure Developer CLI installed in your computer. Go ahead and follow the steps indicated in the README.md to create Azure resources under Azure Resources Set Up with Azure CLI. You might want to use Azure CLI to login in differently use a code. Here's how you can do this. Instead of using az login. You can do az login --use-code-device OR you would prefer using Azure Developer CLI and execute this command instead azd auth login --use-device-code Remember to update the .env file with the values you have used to name Azure OpenAI instance, Azure models and even the API Keys you have obtained while creating your resources. Setting Up MongoDB After accessing you MongoDB account get the URI link to your database and add it to the .env file along with your database name and vector store collection name you specified while creating your indexes for a vector search. Running the Project In order to run this Node.js project you will need to start the project using the following command. npm run dev The Vector Store The vector store used in this project is MongoDB store where the word embeddings were stored in MongoDB. From the embeddings model instance we created on Azure AI Foundry we are able to create embeddings that can be stored in a vector store. The following code below shows our embeddings model instance. //create new embedding model instance const azOpenEmbedding = new AzureOpenAIEmbeddings({ azureADTokenProvider, azureOpenAIApiInstanceName: process.env.AZURE_OPENAI_API_INSTANCE_NAME, azureOpenAIApiEmbeddingsDeploymentName: process.env.AZURE_OPENAI_API_DEPLOYMENT_EMBEDDING_NAME, azureOpenAIApiVersion: process.env.AZURE_OPENAI_API_VERSION, azureOpenAIBasePath: "https://eastus2.api.cognitive.microsoft.com/openai/deployments" }); The code in uploadDoc.js offers a simple way to do embeddings and store them to MongoDB. In this approach the text from the documents is loaded using the PDFLoader from Langchain community. The following code demonstrates how the embeddings are stored in the vector store. // Call the function and handle the result with await const storeToCosmosVectorStore = async () => { try { const documents = await returnSplittedContent() //create store instance const store = await MongoDBAtlasVectorSearch.fromDocuments( documents, azOpenEmbedding, { collection: vectorCollection, indexName: "myrag_index", textKey: "text", embeddingKey: "embedding", } ) if(!store){ console.log('Something wrong happened while creating store or getting store!') return false } console.log('Done creating/getting and uploading to store.') return true } catch (e) { console.log(`This error occurred: ${e}`) return false } } In this setup, Question Answering (QA) is achieved by integrating Azure OpenAI’s GPT-4o with MongoDB Vector Search through LangChain.js. The system processes user queries via an LLM (Large Language Model), which retrieves relevant information from a vectorized database, ensuring contextual and accurate responses. Azure OpenAI Embeddings convert text into dense vector representations, enabling semantic search within MongoDB. The LangChain RunnableSequence structures the retrieval and response generation workflow, while the StringOutputParser ensures proper text formatting. The most relevant code snippets to include are: AzureChatOpenAI instantiation, MongoDB connection setup, and the API endpoint handling QA queries using vector search and embeddings. There are some code snippets below to explain major parts of the code. Azure AI Chat Completion Model This is the model used in this implementation of RAG, where we use it as the model for chat completion. Below is a code snippet for it. const llm = new AzureChatOpenAI({ azTokenProvider, azureOpenAIApiInstanceName: process.env.AZURE_OPENAI_API_INSTANCE_NAME, azureOpenAIApiDeploymentName: process.env.AZURE_OPENAI_API_DEPLOYMENT_NAME, azureOpenAIApiVersion: process.env.AZURE_OPENAI_API_VERSION }) Using a Runnable Sequence to give out Chat Output This shows how a runnable sequence can be used to give out a response given the particular output format/ output parser added on to the chain. //Stream response app.post(`${process.env.BASE_URL}/az-openai/runnable-sequence/stream/chat`, async (req,res) => { //check for human message const { chatMsg } = req.body if(!chatMsg) return res.status(201).json({ message:'Hey, you didn\'t send anything.' }) //put the code in an error-handler try{ //create a prompt template format template const prompt = ChatPromptTemplate.fromMessages( [ ["system", `You are a French-to-English translator that detects if a message isn't in French. If it's not, you respond, "This is not French." Otherwise, you translate it to English.`], ["human", `${chatMsg}`] ] ) //runnable chain const chain = RunnableSequence.from([prompt, llm, outPutParser]) //chain result let result_stream = await chain.stream() //set response headers res.setHeader('Content-Type','application/json') res.setHeader('Transfer-Encoding','chunked') //create readable stream const readable = Readable.from(result_stream) res.status(201).write(`{"message": "Successful translation.", "response": "`); readable.on('data', (chunk) => { // Convert chunk to string and write it res.write(`${chunk}`); }); readable.on('end', () => { // Close the JSON response properly res.write('" }'); res.end(); }); readable.on('error', (err) => { console.error("Stream error:", err); res.status(500).json({ message: "Translation failed.", error: err.message }); }); }catch(e){ //deliver a 500 error response return res.status(500).json( { message:'Failed to send request.', error:e } ) } }) To run the front end of the code, go to your BASE_URL with the port given. This enables you to run the chatbot above and achieve similar results. The chatbot is basically HTML+CSS+JS. Where JavaScript is mainly used with fetch API to get a response. Thanks for reading. I hope you play around with the code and learn some new things. Additional Reads Introduction to LangChain.js Create an FAQ Bot on Azure Build a basic chat app in Python using Azure AI Foundry SDK
theophilusO
Mar 12, 2025 Place Educator Developer Blog
586Views
0likes
0Comments
Introducing BioAgents: Advancing Bioinformatics with Multi-Agent Systems
BioAgents is a multi-agent system designed to improve bioinformatics analysis by leveraging specialized agents fine-tuned on bioinformatics data and enhanced with retrieval-augmented generation.
Venkat_Malladi
Jan 14, 2025 Place Healthcare and Life Sciences Blog
2.3KViews
3likes
0Comments
Tiny But Mighty: Unleashing the Power of Small Language Models 🚀
While Large Language Models (LLMs) like GPT-4 dominate headlines with their extensive capabilities, they often come at the cost of high computational requirements and complexity. For developers and organizations looking to implement AI solutions on edge devices or with limited resources, Small Language Models (SLMs) are emerging as a practical alternative. SLMs are not just "smaller" versions of their larger counterparts—they're designed to be faster, more efficient, and adaptable for specific tasks. With fewer parameters and lower computational needs, SLMs open the door to deploying AI on mobile devices, IoT systems, and edge environments without compromising performance. What You Stand to Learn 🧠 Introduction to Microsoft's AI Ecosystem Discover Microsoft's end-to-end AI development tools, from Azure AI Services to ONNX Runtime, enabling efficient and secure deployment of AI models across cloud and edge environments. The Advantages of SLMs over LLMs SLMs are game-changers for edge AI applications, providing faster training and inference times, reduced energy costs, and scalability across diverse devices. Hands-On with Phi-3 and ONNX Runtime Experience live demonstrations of SLMs in action with tools like Phi-3 and ONNX Runtime, showcasing how to fine-tune and deploy models on mobile devices, IoT, and hybrid cloud environments. Responsible AI Practices Understand how to safeguard your AI applications with Microsoft's Responsible AI toolkit, ensuring ethical and trustworthy deployments. Watch the Full Session 👨‍💻 📅 Date: December 12, 2024 ⏰ Time: 4 PM GMT | 5 PM CEST | 8 AM PT | 11 AM ET | 7 PM EAT A session packed with live demos, practical examples, and Q&A opportunities. Register NOW | Events | Microsoft Reactor Agenda 🔍 Introduction (5 min) A brief overview of the session and its focus on SLMs and LLMs. Microsoft AI Tooling (5 min) Explore the latest tools like Azure AI Services, Azure Machine Learning, and Responsible AI Tooling. How to Choose the Right Model (10 min) Key considerations such as performance, customizability, and ethical implications. Comparing SLMs vs LLMs (10 min) The strengths, weaknesses, and best use cases for both Small and Large Language Models. Deploying Models at the Edge (10 min) Insights into optimizing AI for mobile, IoT, and edge devices. Q&A Addressing participant questions about AI development and deployment.
RayanPopat
Dec 05, 2024 Place Educator Developer Blog
450Views
2likes
0Comments
What is retrieval-augmented generation (RAG)?
Discover how Retrieval-Augmented Generation (RAG) is transforming AI by combining data retrieval with language generation, delivering smarter and more precise responses in an information-rich world.
sourabhkv
Nov 12, 2024 Place Educator Developer Blog
4.9KViews
0likes
0Comments
Building Intelligent Applications with Local RAG in .NET and Phi-3: A Hands-On Guide
Let's learn how to do Retrieval Augmented Generation (RAG) using local resources in .NET! In this post, we’ll show you how to combine the Phi-3 language model, Local Embeddings, and Semantic Kernel to create a RAG scenario.
elbruno
Sep 19, 2024 Place Educator Developer Blog
18KViews
5likes
13Comments