ollama
Using DeepSeek-R1 on Azure with JavaScript
The pace at which innovative AI models are being developed is outstanding! DeepSeek-R1 is one such model: it focuses on complex reasoning tasks, providing a powerful tool for developers to build intelligent applications. This week, we announced its availability on GitHub Models as well as on Azure AI Foundry. In this article, we'll take a look at how you can deploy and use the DeepSeek-R1 models in your JavaScript applications.

TL;DR key takeaways

- DeepSeek-R1 models focus on complex reasoning tasks and are not designed for general conversation.
- You can quickly switch your configuration to use Azure AI, GitHub Models, or even local models with Ollama.
- You can use the OpenAI Node SDK or LangChain.js to interact with DeepSeek models.

What you'll learn here

- Deploying the DeepSeek-R1 model on Azure.
- Switching between Azure, GitHub Models, or local (Ollama) usage.
- Code patterns to start using DeepSeek-R1 with various libraries in TypeScript.

Reference links

- DeepSeek on Azure - JavaScript demos repository
- Azure AI Foundry
- OpenAI Node SDK
- LangChain.js
- Ollama

Requirements

- GitHub account. If you don't have one, you can create a free GitHub account. You can optionally use GitHub Copilot Free to help you write code and ship your application even faster.
- Azure account. If you're new to Azure, get an Azure account for free to receive free Azure credits to get started. If you're a student, you can also get free credits with Azure for Students.

Getting started

We'll use GitHub Codespaces to get started quickly, as it provides a preconfigured Node.js environment for you. Alternatively, you can set up a local environment using the instructions found in the GitHub repository.

Click on the button below to open our sample repository in a web-based VS Code, directly in your browser. Once the project is open, wait a bit to ensure everything has loaded correctly, then open a terminal and run the following command to install the dependencies:

npm install

Running the samples

The repository contains several TypeScript files under the samples directory that demonstrate how to interact with DeepSeek-R1 models. You can run a sample using the following command:

npx tsx samples/<sample>.ts

For example, let's start with the first one:

npx tsx samples/01-chat.ts

Wait a bit, and you should see the response from the model in your terminal. You'll notice that it may take longer than usual to respond, and that the response starts with an unusual <think> tag. This is because DeepSeek-R1 is designed for tasks that need complex reasoning, like solving problems or answering math questions, not for your usual chat interactions.

Model configuration

By default, the repository is configured to use GitHub Models, so you can run any example in Codespaces without additional setup. While that's great for quick experimentation, GitHub Models limits the number of requests you can make in a day and the amount of data you can send in a single request. If you want to use the model more extensively, you can switch to Azure AI or even use a local model with Ollama. You can take a look at samples/config.ts to see how the different configurations are set up. We won't cover using Ollama models in this article, but you can find more information in the repository documentation.
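To make the configuration concrete, here is a minimal sketch of what a GitHub Models setup looks like with the OpenAI Node SDK. The endpoint URL and model identifier reflect GitHub Models documentation at the time of writing, and the variable names are illustrative rather than the exact contents of samples/config.ts:

```ts
import OpenAI from "openai";

// Sketch only: endpoint and model name are assumptions based on GitHub
// Models documentation, not the repository's actual config.ts.
const client = new OpenAI({
  baseURL: "https://models.inference.ai.azure.com",
  apiKey: process.env.GITHUB_TOKEN, // a GitHub personal access token
});

const response = await client.chat.completions.create({
  model: "DeepSeek-R1",
  messages: [{ role: "user", content: "How many R's are in 'strawberry'?" }],
});

console.log(response.choices[0].message.content);
```

Switching providers then comes down to changing the baseURL, apiKey, and model values, which is exactly what the config file centralizes.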
Deploying DeepSeek-R1 on Azure

To experiment with the full capabilities of DeepSeek-R1, you can deploy it on Azure AI Foundry. Azure AI Foundry is a platform that allows you to deploy, manage, and develop with AI models quickly. To use it, you need an Azure account.

Let's start by deploying the model on Azure AI Foundry. First, follow this tutorial to deploy a serverless endpoint with the model. When it's time to choose the model, make sure to select the DeepSeek-R1 model in the catalog. Once your endpoint is deployed, you should be able to see your endpoint details and retrieve the URL and API key.

[Screenshot: the endpoint details in Azure AI Foundry]

Then create a .env file in the root of the project and add the following content:

AZURE_AI_BASE_URL="https://<your-deployment-name>.<region>.models.ai.azure.com/v1"
AZURE_AI_API_KEY="<your-api-key>"

Tip: if you're copying the endpoint from the Azure AI Foundry portal, make sure to add the /v1 at the end of the URL.

Open the samples/config.ts file and update the default export to use Azure:

export default AZURE_AI_CONFIG;

Now all samples will use the Azure configuration.

Explore reasoning with DeepSeek-R1

Now that you have the model deployed, you can start experimenting with it. Open the samples/08-reasoning.ts file to see how the model handles more complex tasks, like helping us understand a well-known weird piece of code:

```ts
const prompt = `
float fast_inv_sqrt(float number) {
  long i;
  float x2, y;
  const float threehalfs = 1.5F;

  x2 = number * 0.5F;
  y  = number;
  i  = *(long*)&y;
  i  = 0x5f3759df - ( i >> 1 );
  y  = *(float*)&i;
  y  = y * ( threehalfs - ( x2 * y * y ) );

  return y;
}

What is this code doing? Explain the magic behind it.
`;
```

Now run this sample with the command:

npx tsx samples/08-reasoning.ts

You should see the model's response streaming piece by piece in the terminal, describing its thought process before providing the actual answer to our question.

[Screenshot: the model's response streaming in the terminal]

Brace yourself, as it might take a while to get the full response! At the end of the process, you should see the model's detailed explanation of the code, along with some context around it.
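If you want to handle that streamed output in your own code, here is a minimal sketch of how it might look with the OpenAI Node SDK, separating the <think> segment from the final answer. The environment variables match the .env file above; the model identifier and the parsing logic are illustrative assumptions, not the repository's actual sample code:

```ts
import OpenAI from "openai";

// Assumes the .env values described above have been loaded into the environment.
const client = new OpenAI({
  baseURL: process.env.AZURE_AI_BASE_URL,
  apiKey: process.env.AZURE_AI_API_KEY,
});

const stream = await client.chat.completions.create({
  model: "DeepSeek-R1", // may be ignored by a single-model serverless endpoint
  messages: [{ role: "user", content: "Explain the fast inverse square root trick." }],
  stream: true,
});

let text = "";
for await (const chunk of stream) {
  const delta = chunk.choices[0]?.delta?.content ?? "";
  text += delta;
  process.stdout.write(delta); // print tokens as they arrive
}

// DeepSeek-R1 emits its chain of thought in a <think>...</think> block
// before the final answer; split the two once streaming is done.
const match = text.match(/<think>([\s\S]*?)<\/think>([\s\S]*)/);
const reasoning = match?.[1].trim() ?? "";
const answer = match?.[2].trim() ?? text;
console.log("\n\nAnswer only:\n" + answer);
```

Separating the reasoning from the answer like this is useful if you only want to show end users the final response while logging the chain of thought for debugging.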
Leveraging frameworks

Most examples in this repository are built with the OpenAI Node SDK, but you can also use LangChain.js to interact with the model. This might be especially interesting if you need to integrate other sources of data or want to build a more complex application. Open the file samples/07-langchain.ts to have a look at the setup, and see how you can reuse the same configuration we used with the OpenAI SDK.

Going further

Now it's your turn to experiment and discover the full potential of DeepSeek-R1! You can try more advanced prompts, integrate it into a larger application, or even build agents to make the most out of the model. To continue your learning journey, you can check out the following resources:

- Generative AI with JavaScript (GitHub): code samples and resources to learn generative AI with JavaScript.
- Build a serverless AI chat with RAG using LangChain.js (GitHub): a next-step code example to build an AI chatbot using Retrieval-Augmented Generation and LangChain.js.

Build AI Agents with MCP Tool Use in Minutes with AI Toolkit for VS Code

We're excited to announce Agent Builder, the newest evolution of what was formerly known as Prompt Builder, now reimagined and supercharged for intelligent app development. This powerful tool in AI Toolkit enables you to create, iterate, and optimize agents, from prompt engineering to tool integration, all in one seamless workflow. Whether you're designing simple chat interactions or complex task-performing agents with tool access, Agent Builder simplifies the journey from idea to integration.

Why Agent Builder?

Agent Builder is designed to empower developers and prompt engineers to:

🚀 Generate starter prompts with natural language
🔁 Iterate and refine prompts based on model responses
🧩 Break down tasks with prompt chaining and structured outputs
🧪 Test integrations with real-time runs and tool use such as MCP servers
💻 Generate production-ready code for rapid app development

And a lot of features are coming soon, so stay tuned for:

📝 Variables in prompts
✅ Agent runs with test cases, to test your agent easily
📊 Evaluation of your agent's accuracy and performance with built-in or custom metrics
☁️ Deployment of your agent to the cloud

Build Smart Agents with Tool Use (MCP Servers)

Agents can now connect to external tools through MCP (Model Context Protocol) servers, enabling them to perform real-world actions like querying a database, accessing APIs, or executing custom logic.

Connect to an Existing MCP Server

To use an existing MCP server in Agent Builder:

1. In the Tools section, select + MCP Server.
2. Choose a connection type: Command (stdio), which runs a local command that implements the MCP protocol, or HTTP (server-sent events), which connects to a remote server implementing the MCP protocol.
3. If the MCP server supports multiple tools, select the specific tool you want to use.
4. Enter your prompts and click Run to test the agent's interaction with the tool.

This integration allows your agents to fetch live data or trigger custom backend services as part of the conversation flow.

Build and Scaffold a New MCP Server

Want to create your own tool? Agent Builder helps you scaffold a new MCP server project:

1. In the Tools section, select + MCP Server.
2. Choose MCP server project.
3. Select your preferred programming language: Python or TypeScript.
4. Pick a folder to create your server project in.
5. Name your project and click Create.

Agent Builder generates a scaffolded implementation of the MCP protocol that you can extend. Use the built-in VS Code debugger: press F5 or click Debug in Agent Builder, then test with prompts like:

System: You are a weather forecast professional that can tell weather information based on given location.
User: What is the weather in Shanghai?

Agent Builder will automatically connect to your running server and show the response, making it easy to test and refine the tool-agent interaction.
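To give a feel for what such a server contains, here is a minimal sketch of a weather tool in TypeScript using the official MCP SDK. The tool name, response text, and structure are illustrative assumptions, not the exact code Agent Builder generates:

```ts
import { McpServer } from "@modelcontextprotocol/sdk/server/mcp.js";
import { StdioServerTransport } from "@modelcontextprotocol/sdk/server/stdio.js";
import { z } from "zod";

// Illustrative tool: a real server would call a weather API here.
const server = new McpServer({ name: "weather", version: "0.1.0" });

server.tool(
  "get_weather",
  { location: z.string().describe("City to get the forecast for") },
  async ({ location }) => ({
    content: [{ type: "text", text: `It is sunny and 25°C in ${location}.` }],
  }),
);

// Expose the server over stdio so Agent Builder can launch it as a command.
const transport = new StdioServerTransport();
await server.connect(transport);
```

Because the server speaks MCP over stdio, the same code works with any MCP-capable client, not just Agent Builder.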
AI Sparks: from Prototype to Production with AI Toolkit

Building AI-powered applications from scratch, or infusing intelligence into existing systems? AI Sparks is your go-to webinar series for mastering the AI Toolkit (AITK), from foundational concepts to cutting-edge techniques. In this bi-weekly, hands-on series, we'll cover:

🚀 SLMs & Local Models – Test and deploy AI models and applications efficiently on your own terms, whether locally, on edge devices, or in the cloud.
🔍 Embedding Models & RAG – Supercharge retrieval for smarter applications using existing data.
🎨 Multi-Modal AI – Work with images, text, and beyond.
🤖 Agentic Frameworks – Build autonomous, decision-making AI systems.

Watch on Demand

Share your feedback

Get started with the latest version, share your feedback, and let us know how these new features help you in your AI development journey. As always, we're here to listen, collaborate, and grow alongside our amazing user community. Thank you for being a part of this journey; let's build the future of AI together! Join our Microsoft Azure AI Foundry Discord channel to continue the discussion 🚀
The Startup Stage: Powered by Microsoft for Startups at European AI & Cloud Summit

🚀 Take center stage in the AI and Cloud Startup Program, designed to showcase groundbreaking solutions and foster collaboration between ambitious startups and influential industry leaders. Whether you're looking to engage with potential investors, connect with clients, or share your boldest ideas, this is the platform to shine.

Why Join the Startup Stage?

- Pitch to Top Investors: Present your ideas and products to key decision-makers in the tech world.
- Gain Visibility: Showcase your startup in a vibrant space dedicated to innovation, and prove that you are the next game-changer.
- Learn from the Best: Hear from visionary thought leaders and Microsoft AI experts about the latest trends and opportunities in AI and cloud.

AI Competition: Propel Your Startup

Stand out from the crowd by participating in the European AI & Cloud Startup Stage competition, exclusively designed for startups leveraging Microsoft AI and Azure Cloud services. Compete for prestigious awards, including:

- $25,000 in Microsoft Azure credits.
- A mentoring session with Marco Casalaina, VP of Products at Azure AI.
- Fast-track access to exclusive resources through the Microsoft for Startups Program.

Get ready to deliver a pitch in front of a live audience and an expert panel on 28 May 2025!

How to Apply

- Ensure your startup solution runs on Microsoft AI and Azure Cloud.
- Register as a conference attendee at the European Cloud and AI Summit and submit your competition application form before the deadline: 14 April 2025.

Be Part of Something Bigger

This isn't just an exhibition; it's a thriving community where innovation meets opportunity. Don't miss out! With tickets already 70% sold out, now's the time to secure your spot. Join the European AI and Cloud Startup Area with a booth or launchpad, and accelerate your growth in the tech ecosystem.

Visit the [European AI and Cloud Summit](https://ecs.events) website to learn more, purchase tickets, or apply for the AI competition. Download the sponsorship brochure for detailed insights into this once-in-a-lifetime event. Together, let's shape the future of cloud technology. See you in Düsseldorf! 🎉
Building a DeepSeek Extension for GitHub Copilot in VS Code

DeepSeek has been getting a lot of buzz lately, and with a little setup, you can start using it today in GitHub Copilot within VS Code. In this post, I'll walk you through how to install and run a VS Code extension I built, so you can take advantage of DeepSeek right on your machine. With this extension, you can use "@deepseek" to explore the deepseek-coder model. It's powered by Ollama, enabling seamless, fully offline interactions with DeepSeek models, giving you a local coding assistant that prioritizes privacy and performance.

In a future post I'll walk you through the extension code and explain how to call models hosted locally using Ollama. Feel free to subscribe to get notified.

Features and Benefits

- Open-Source and Extendable: As an open-source project, the DeepSeek for GitHub Copilot extension is fully customizable. Advanced users can modify and extend its functionality, build from source, tweak configurations, and even integrate additional AI capabilities.
- Local AI Processing: All interactions are processed locally on your machine, ensuring complete data privacy and eliminating latency issues. This makes it an ideal solution for developers working on sensitive projects or in restricted environments.
- Seamless Integration with GitHub Copilot Chat: The extension integrates natively with GitHub Copilot Chat, allowing you to invoke DeepSeek models effortlessly. If you're already familiar with GitHub Copilot, you'll find the workflow intuitive: simply type "@deepseek" followed by your question to get started.
- Powered by Ollama: Ollama, a lightweight AI model runtime, powers the execution of DeepSeek models. It simplifies model management by handling downloads and execution, so you can focus on coding.
- Customizable Model Selection: You can configure the extension to use different DeepSeek models through a simple setting adjustment. This flexibility allows you to choose the right model size and capability for your hardware. Note that bigger models might not run well on your local system; you can take advantage of Azure's infrastructure to run them instead.

Installation Guide

1. Install and run Ollama: DeepSeek for GitHub Copilot requires Ollama to function properly. Ollama is an AI model runtime that allows you to run and manage large language models efficiently on your local machine. Download the installer from the Ollama website, install it, and start Ollama.
2. Install from the Visual Studio Code Marketplace: The simplest way to get started is by installing the extension directly from the Visual Studio Code Marketplace. Open Visual Studio Code, navigate to the Extensions panel (Ctrl + Shift + X), search for DeepSeek for GitHub Copilot, and click Install.
3. Use the extension: Once installed, open the GitHub Copilot Chat panel and type @deepseek followed by your prompt to interact with the model. Note: on the first run, the extension will automatically download the DeepSeek model. This may take a few minutes, depending on your internet connection.

Configuration and Customization

DeepSeek for GitHub Copilot allows users to configure the AI model through Visual Studio Code settings. To change the DeepSeek model, update the settings.json file:

```json
{
    "deepseek.model.name": "deepseek-coder:1.3b"
}
```

A list of available models can be found on the Ollama website.
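While the extension handles this for you, the underlying call to Ollama is plain HTTP. As a preview of the future post mentioned above, here is a minimal hedged sketch in TypeScript of how a local DeepSeek model can be queried through Ollama's REST API; the prompt and model tag are illustrative, and this is not the extension's actual source:

```ts
// Ollama listens on http://localhost:11434 by default once the app is running.
const response = await fetch("http://localhost:11434/api/chat", {
  method: "POST",
  headers: { "Content-Type": "application/json" },
  body: JSON.stringify({
    model: "deepseek-coder:1.3b", // should match the model set in settings.json
    messages: [{ role: "user", content: "Write a binary search in TypeScript." }],
    stream: false, // return one JSON object instead of a stream of chunks
  }),
});

const data = await response.json();
console.log(data.message.content); // the model's full reply
```

Because everything goes over localhost, no code or prompts ever leave your machine, which is the core of the privacy story described above.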
Limitations and Workarounds

Current limitations:

- The extension does not have access to your files in this version, meaning it cannot provide context-aware completions. This is because DeepSeek models don't support function calling.
- Performance is limited by your local machine: larger models may require more RAM and CPU power.

Workarounds:

- To provide context for completions, manually copy and paste the relevant code into the chat.
- Optimize performance by selecting smaller DeepSeek models (such as deepseek-coder:1.3b) if you experience lag.

System Requirements

To run DeepSeek for GitHub Copilot Chat, ensure you have the following:

- Visual Studio Code (latest version recommended)
- The Ollama app installed and running (download from ollama.com)
- Sufficient system resources: minimum 8GB RAM and a multi-core CPU; recommended 16GB RAM and GPU acceleration (if available)

Conclusion

The DeepSeek for GitHub Copilot Chat extension provides an excellent way to get privacy, low-latency responses, and offline capabilities.

🔗 Get Started Today: Install the DeepSeek for GitHub Copilot Chat extension and supercharge your GitHub Copilot Chat experience with AI, entirely offline! 🚀

Co-authored with Copilot
AI Toolkit for VS Code January Update

AI Toolkit is a VS Code extension that aims to empower AI engineers to transform their curiosity into advanced generative AI applications. The toolkit, featuring both local-enabled and cloud-accelerated inner-loop capabilities, is set to ease model exploration, prompt engineering, and the creation and evaluation of generative applications. We are pleased to announce the January update to the toolkit, with support for OpenAI's o1 model and enhancements to the Model Playground and Bulk Run features.

What's New?

January's update brings several exciting new features to boost your productivity in AI development. Here's a closer look at what's included:

- Support for OpenAI's new o1 model: We've added access to the GitHub-hosted version of OpenAI's latest o1 model. This new model replaces o1-preview and offers even better performance in handling complex tasks. You can start interacting with the o1 model within VS Code for free by using the latest AI Toolkit update.
- Chat history support in Model Playground: We have heard your feedback that tracking past model interactions is crucial. The Model Playground has been updated to include support for chat history. This feature saves chat history as individual files stored entirely on your local machine, ensuring privacy and security.
- Bulk Run with prompt templating: The Bulk Run feature, introduced in the AI Toolkit December release, now supports prompt templating with variables. This lets you create templates for prompts, insert variables, and run them in bulk, which simplifies testing multiple scenarios and models (see the sketch below for the general idea).

Stay tuned for more updates and enhancements as we continue to innovate and support your journey in AI development. Try out the AI Toolkit for Visual Studio Code, share your thoughts, and file issues and suggest features in our GitHub repo. Thank you for being a part of this journey with us!
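To illustrate the general idea behind prompt templating (this is a hypothetical sketch, not the AI Toolkit's actual implementation, which is handled entirely in the UI), a bulk run conceptually expands one template over many variable sets:

```ts
// Hypothetical illustration: one template, many variable sets, many prompts.
const template = "Summarize the following review in one sentence: {{review}}";

const rows: Record<string, string>[] = [
  { review: "Great battery life, but the screen scratches easily." },
  { review: "Setup took five minutes and it just works." },
];

const prompts = rows.map((row) =>
  template.replace(/\{\{(\w+)\}\}/g, (_, name: string) => row[name] ?? ""),
);

console.log(prompts); // each prompt would be sent to the model as one bulk-run item
```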
Bring your own models on AI Toolkit - using Ollama and API keys

As we have seen in past blog posts, AI Toolkit supports a range of models through the GitHub Marketplace of models. However, you might need external models hosted by Google, Anthropic, or OpenAI that are not available in the GitHub catalog, or you might want to use models served by Ollama. We will cover both of these scenarios in this blog post.

OpenAI, Anthropic, and Google hosted models

Open the Model catalog window and select a model hosted by Google, Anthropic, or OpenAI. You can add your API key to a model as follows: click "Try it in playground", just below the model name on the model card, and a dialog box appears in the top search bar of the VS Code window. Here, I have clicked the "Try it in playground" link for the Anthropic Claude 3.5 Sonnet model. Enter your API key and you are good to go; as the dropdown text indicates, you can also edit or change the value later. The same steps apply to the Google and OpenAI hosted models. Once you have done this, you are free to use these models in the playground and with the other features of the AI Toolkit extension.

Using models served by Ollama

Several developers also use Ollama to experiment and play with models from the command line. Ollama is an open-source AI tool that allows users to run large language models (LLMs) on their local systems. It's a valuable tool for industries that require data privacy, such as healthcare, finance, and government, which might need locally hosted models. AI Toolkit already supports some locally downloadable models, such as those in the Phi series by Microsoft or those by Mistral, but Ollama supports a wider variety of models, especially those from Meta's Llama series of LLMs and SLMs. The complete list of models currently supported by Ollama can be found in the Ollama library.

We will run Ollama on Windows; running ollama with the help command shows the available subcommands. Once you have selected a model from the library, use ollama pull or ollama run to download it. The run command downloads the model and then runs it if it's not already downloaded, while the pull command just downloads the model from the repository. Since I want to show you a multimodal model that can be run locally, I will go to the command line and download and run the llama3.2-vision model (the commands are sketched below). You can also list the models already downloaded with ollama list; as you can see, I have downloaded and tried a bunch of models. Downloading may take a while depending on the speed of your internet connection, as some of the models are quite large. Also, make sure you have enough RAM on your laptop or desktop to run the models.
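For reference, the commands involved look like the following; the model tag matches the Ollama library at the time of writing:

```
ollama pull llama3.2-vision   # download the model from the Ollama library
ollama run llama3.2-vision    # download it if needed, then start an interactive session
ollama list                   # show the models already downloaded locally
```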
Now that we have downloaded and run the models, let's see how we can access them in AI Toolkit for VS Code:

1. Go to the My models window.
2. Click on the '+' symbol. A dropdown appears in the search bar; click "Add an Ollama model".
3. You will see the choice to either select a model from the Ollama library or use a custom Ollama endpoint. For the purposes of this tutorial, we will select models from the library. We should see the models listed by the ollama list command earlier.
4. Select the checkbox alongside llama3.2-vision:latest and click OK.

The model now appears in the My Models window. You can right-click it and start using it by loading it in the playground. Since this is a multimodal model, you can use it to generate text; the screenshot shows the model loaded via Ollama, with the clip (attachment) symbol activated in the window. Because the model is multimodal, we can give it an image and ask it questions, which is what we will do next.

Next, let's attach an image and ask the model questions about it. This might take a bit more time, as it needs to analyze the image before answering. Since it is a generative AI model, it can give slightly different outputs when given the same image. I can also ask the model questions about the image, and it will use the information from the image (the objects shown and the relationships between them) together with the world knowledge from its training to answer them.

As you can see, AI Toolkit can be a fantastic place to try out different models from various sources. Ollama is also a great tool to try pre-built models locally and securely, without sending your data to the cloud, which makes it suitable for air-gapped environments and data-privacy-sensitive industries like fintech, healthcare, and government. You have greater control over the models, the environment, and the data they run on. It is also possible to customize models and then serve them via Ollama. This helps you choose the best model for your AI application.

Resources

- AI Toolkit for VS Code - https://marketplace.visualstudio.com/items?itemName=ms-windows-ai-studio.windows-ai-studio
- AI Toolkit for VS Code on GitHub - https://github.com/microsoft/vscode-ai-toolkit
- Ollama - https://github.com/ollama/ollama
- Ollama library - https://ollama.com/library
- Azure AI Discord - https://aka.ms/AzureAI/Discord
AI Toolkit for Visual Studio Code: October 2024 Update Highlights

The AI Toolkit's October 2024 update brings game-changing features to Visual Studio Code for developers, researchers, and enthusiasts. Explore multi-model integration, including GitHub Models, ONNX, and Google Gemini, alongside custom model support. Dive into multi-modal capabilities for richer AI testing, with seamless multi-platform compatibility across Windows, macOS, and Linux. Tailored for productivity, the enhanced Model Catalog simplifies choosing the best tools for your projects. Try it now and share feedback to shape the future of AI in VS Code!
Getting Started - Generative AI with Phi-3-mini: A Guide to Inference and Deployment

Discover how Phi-3-mini, a new series of models from Microsoft, enables deployment of Large Language Models (LLMs) on edge and IoT devices. Learn how to use Semantic Kernel, Ollama/LlamaEdge, and ONNX Runtime to access and run inference with Phi-3-mini models, and explore the possibilities of generative AI in various application scenarios.