Generative AI is reaching greater heights every day, and the pace of innovation in the field is remarkable. Large Language Models (LLMs) are the foundation on which most Generative AI applications are built. OpenAI's GPT is undoubtedly one of the leading LLMs in use today. It is consumed through API calls: the model does not reside on the user's infrastructure, but is hosted in the cloud and consumed as a service. For example, Microsoft provides the Azure OpenAI Service through its own cloud, Microsoft Azure.
What if we want to run an LLM on our own infrastructure? Is there any method to achieve this? Certainly there is! With the advent of open-source LLMs, this is definitely a reality today. Hugging Face provides a lot of open-source language models, such as Mistral, Phi-2, Orca, and Llama, to name a few.
One might now wonder how exactly we can run these models on a local machine, and whether they will be compatible. To answer this, we need to consider several aspects; most importantly, the infrastructure should be able to cater to the model's needs. An on-premises GPU is definitely a benefit!
In this regard, there are solutions that provide an environment to readily run these models on our own infrastructure. Microsoft recently launched Windows AI Studio, which helps users do exactly that!
Windows AI Studio streamlines the development of generative AI applications by integrating advanced AI development tools and models from Azure AI Studio and other repositories like Hugging Face.
Developers using Windows AI Studio can refine, customize, and deploy cutting-edge Small Language Models (SLMs) for local use within their Windows applications. The platform offers a comprehensive guided workspace setup, complete with a model configuration user interface and step-by-step instructions for fine-tuning popular SLMs (such as Phi) and state-of-the-art models like Llama 2 and Mistral.
Developers can efficiently test their fine-tuned models by utilizing the integrated Prompt Flow and Gradio templates within the workspace of Windows AI Studio.
Now that we know the benefits of Windows AI Studio, let's get started with the installation of this powerful tool!
The best part of this tool is that it is available as a Visual Studio Code extension! We can add it directly from the Extensions tab in Visual Studio Code. But before doing that, we need to complete a basic checklist on our machine.
Navigate to the Windows AI Studio page. This should open a GitHub page with all the documentation required for installation. Please make sure to go through it once before proceeding.
Note: Windows AI Studio will run only on NVIDIA GPUs for the preview, so please check your device specs before installing it. A WSL Ubuntu distro (18.04 or greater) should be installed and set as the default before using Windows AI Studio.
Let's begin with the initial installation of the Windows Subsystem for Linux (WSL). This lets developers install a Linux distribution (such as Ubuntu, openSUSE, Kali, Debian, or Arch Linux) and use Linux applications, utilities, and Bash command-line tools directly on Windows, unmodified, without the overhead of a traditional virtual machine or dual-boot setup.
Navigate to the Start button on your machine and search for “Windows PowerShell”. Right-click it and select “Run as administrator”. Click “Yes” in the dialog box, and PowerShell should now launch on your machine.
Once it is ready, type in the following command:
wsl --install
Note: You must be running Windows 10 version 2004 or higher (Build 19041 and higher) or Windows 11 to use this command. If you are using a previous version, refer to the manual installation steps.
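Once the install command has run, a quick sanity check like the following can confirm that WSL is set up and which distro is the default (these `wsl` flags are standard, though the output formatting may vary by Windows build):

```shell
# List installed WSL distros along with their state and WSL version
wsl --list --verbose

# If Ubuntu is installed but not the default, make it the default
wsl --set-default Ubuntu
```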
Now navigate to the Start menu again and search for “Microsoft Store”. Once the Microsoft Store is launched, search for “Ubuntu”.
Download the application. Once this is completed, reboot the machine. The reboot step is crucial; otherwise, the changes will not be reflected.
Upon rebooting, you should be able to see Ubuntu in the Start menu. Type “Ubuntu” in the Start menu and launch it.
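Since Windows AI Studio expects Ubuntu 18.04 or greater, it is worth confirming the distro version from inside the Ubuntu terminal. A minimal check (assuming `lsb_release`, which ships with standard Ubuntu images):

```shell
# Print the Ubuntu release details (Distributor ID, Release, Codename)
lsb_release -a
```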
Now enter the following command in the Ubuntu terminal:
sudo apt update
Once this is completed, type in the following command:
sudo apt upgrade
This will prompt for Y/N; type “Y”, and the installation of packages should begin.
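As a convenience, the two apt steps above can be combined into a single non-interactive line; the `-y` flag answers the Y/N prompt automatically:

```shell
# Refresh the package index, then upgrade all installed packages without prompting
sudo apt update && sudo apt upgrade -y
```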
Now type in:

#!/bin/bash

Note: Make sure you have run sudo apt update and sudo apt upgrade to verify that the system is ready.
It's time to get CUDA installed now. Let's begin by adding the necessary repositories:
wget https://developer.download.nvidia.com/compute/cuda/repos/ubuntu2004/x86_64/cuda-ubuntu2004.pin
Next, type in the following command:
sudo mv cuda-ubuntu2004.pin /etc/apt/preferences.d/cuda-repository-pin-600
Once this is completed, the next task is to import the keys. To do so, use the following command:
sudo apt-key adv --fetch-keys https://developer.download.nvidia.com/compute/cuda/repos/ubuntu2004/x86_64/7fa2af80.pub
Now type the following command:
sudo add-apt-repository "deb http://developer.download.nvidia.com/compute/cuda/repos/ubuntu2004/x86_64/ /"
Press the “Enter” key to continue.
Note: If this doesn't work, run the following command:
sudo apt-key adv --fetch-keys https://developer.download.nvidia.com/compute/cuda/repos/ubuntu2004/x86_64/3bf863cc.pub
and then retry the above command.
Now run the following command again:
sudo apt update
It's now time to finally install CUDA! Use the following command:
sudo apt install cuda
Press “Y” when prompted.
This will install the CUDA components, and it might take a while to complete.
Once the process is completed, run the following command:
sudo reboot
This restarts the machine, and with that, the installation steps are complete. Once the machine has restarted, launch the Ubuntu terminal from the Start menu.
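After the reboot, a quick way to confirm the installation succeeded is to check that the driver sees the GPU and that the CUDA toolkit is present (depending on the install, `nvcc` may require adding /usr/local/cuda/bin to your PATH):

```shell
# Show the driver version and attached NVIDIA GPUs
nvidia-smi

# Show the installed CUDA compiler version
/usr/local/cuda/bin/nvcc --version
```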
Now in the Ubuntu terminal, type
code .
This should launch Visual Studio Code. Since this is the first launch, it will take a moment to set a few things up, and then the Visual Studio Code window will open.
On the Activity Bar of the Visual Studio Code window, there is an “Extensions” option. Click on it, search for “Windows AI Studio”, and install the extension. Once it is installed, an extra icon appears on the Activity Bar.
Click on the Windows AI Studio icon and let it run some tasks. Upon running the tasks, we can see a tick mark on all of them except one: “VS Code running in a local session. (Remote sessions are currently not supported when running the Windows AI Studio Actions. Please switch to a local session.)”
Note: For some users, there might be a cross mark on “Conda detected”, which means Conda could not be detected. In such cases, simply click “Setup WSL Environment” and let the process complete. Once it is done, restart Visual Studio Code.
To get this working, the last step is to close the remote connection. To do this, simply click on the session indicator “WSL: Ubuntu” highlighted at the bottom-left of the Visual Studio Code window.
A dropdown will now be shown. From the options, select “Close Remote Connection”.
This will open a local session of VS Code. Now click on the Windows AI Studio extension on the Activity Bar of VS Code again.
Note: You need to be signed in to a GitHub account and will be prompted to do so. Log in using your GitHub account.
Finally, the much-awaited step is here: we have our own studio, with language models ready to be used on the local machine!
We can currently see four sections:
- Model Fine-tuning
- RAG Project – Coming soon
- Phi-2 Playground – Coming soon
- Windows optimized models
For now, I will choose “Model Fine-tuning”. In the given form, give the project a title in the Project Name section, choose the project location, and select a model (in this case, I have used Phi-2). Once this is done, the “Configure Project” button will be enabled, and the project is created!
It takes a while to load the project settings. Once that is done, we select the model inference settings, fine-tune settings, and data settings.
Once these are done, click on “Generate Project”.
Provide the Hugging Face token when prompted.
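If you do not have a token handy, one can be created under Settings → Access Tokens on the Hugging Face website. The token can also be stored ahead of time from the Ubuntu terminal; a sketch assuming the `huggingface_hub` package, which provides the `huggingface-cli` tool:

```shell
# Install the Hugging Face Hub CLI (assumes pip is available in the WSL environment)
pip install -U huggingface_hub

# Log in interactively; paste the access token when prompted
huggingface-cli login
```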
Once it is completed, relaunch the window in the workspace.
The README file guides you through the project details.
That's how an LLM can be made ready to serve our purposes on a local machine! Windows AI Studio is surely a great tool with some impressive capabilities.