Get excited: fine tuning is now available for GPT-3.5-Turbo, Babbage-002, and Davinci-002 in Public Preview! This update lets developers customize their favorite OpenAI models with their own data and easily deploy their new custom models, all within an easy-to-use managed service. We launched Azure OpenAI Service in January, and it's been amazing to watch developers bring the power of generative AI to their applications. Today marks a new chapter in that journey: we're making it possible to customize models with your own data, to solve your own problems.
In this blog we'll talk about the new models, when (and when not) to fine tune, and how training and inference work in Azure OpenAI Service.
Today, we’re launching two new base inference models (Babbage-002 and Davinci-002) and fine-tuning capabilities for three models (Babbage-002, Davinci-002, and GPT-3.5-Turbo).
New models: Babbage-002 and Davinci-002 are GPT-3 base models, intended for completion use cases. They can generate natural language or code, but they're not trained for instruction following. Babbage-002 replaces the deprecated Ada and Babbage models, while Davinci-002 replaces Curie and Davinci. These models support the completion API.
Fine tuning: You'll now be able to use Azure OpenAI Service to fine tune Babbage-002, Davinci-002, and GPT-3.5-Turbo. Babbage-002 and Davinci-002 support completion, while GPT-3.5-Turbo supports conversational interactions. You'll be able to specify your base model, provide your data, train, and deploy – all with a few commands.
Fine tuning is one of the methods available to developers and data scientists looking to customize large language models for specific tasks. While approaches like Retrieval Augmented Generation (RAG) and prompt engineering work by injecting the right information and instructions into your prompt, fine tuning operates by customizing the large language model itself.
Azure OpenAI Service offers Supervised Fine Tuning, which allows you to provide custom data (prompt/completion or conversational chat, depending on the model) to teach the base model new skills.
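To make the two data shapes concrete, here's a minimal sketch of how training examples are typically serialized as JSONL: prompt/completion pairs for Babbage-002 and Davinci-002, and conversational chat examples for GPT-3.5-Turbo. The helper names and example text below are hypothetical illustrations, not part of the service API – consult the Azure OpenAI documentation for the exact file requirements.

```python
import json

# Hypothetical helpers illustrating the two training-data shapes.

def completion_example(prompt: str, completion: str) -> str:
    """One prompt/completion pair (Babbage-002 / Davinci-002) as a JSONL line."""
    return json.dumps({"prompt": prompt, "completion": completion})

def chat_example(user_text: str, assistant_text: str) -> str:
    """One conversational example (GPT-3.5-Turbo) as a JSONL line."""
    return json.dumps({
        "messages": [
            {"role": "system", "content": "You are a helpful assistant."},
            {"role": "user", "content": user_text},
            {"role": "assistant", "content": assistant_text},
        ]
    })

# Placeholder training examples -- your real data goes here.
lines = [
    chat_example("What models support fine tuning?",
                 "Babbage-002, Davinci-002, and GPT-3.5-Turbo."),
]
```

Each line of the training file is one complete example; a few hundred high-quality examples is a common starting point, though your use case will dictate how much data you need.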
Think of fine tuning as an "expert mode" feature: super powerful, but requiring a solid foundation built on the basics. Fine tuning can make good models better, but you need an appropriate use case, high quality data, and the right models and prompts to succeed.
Before you start fine tuning, we recommend starting with prompt engineering or Retrieval Augmented Generation (RAG) to develop a baseline – it's the fastest way to get started, and we make it easy with tools like Prompt Flow and On Your Data. That baseline gives you something to compare against in scenarios where you do need to fine tune a model, and most fine-tuned models in production incorporate prompt engineering as well – so no effort is wasted!
Need help deciding when (or if) you should be fine tuning? A few ground rules can help guide you!
Don't start with fine tuning if:
- You haven't yet established a baseline with prompt engineering or RAG – those techniques are faster and cheaper to iterate on.
- You don't have high quality training data for your use case.

Consider fine tuning if:
- Prompt engineering and RAG alone can't reach the quality bar your scenario requires.
- You want to teach the model a skill, style, or format that's hard to convey in a prompt.
Fine tuning with Azure OpenAI Service gets you the best of both worlds: the ability to customize advanced OpenAI LLMs, while deploying on Azure's secure, enterprise-ready cloud services. One of the risks of fine tuning is inadvertently introducing harmful data into your model; our content moderation allows you to fine tune with the data you need, while still filtering out harmful responses.
If you’re new to Azure OpenAI Service and LLMs: welcome! We offer a super simple API to train and deploy your models – or if you’re more comfortable with a GUI, try out Azure OpenAI Studio. If you’re migrating to Azure from OpenAI, our APIs are compatible!
There are two parts to fine tuning: training your fine-tuned model and using your newly customized model for inference.
Training: Specify your base model, your training and validation data, and set any hyperparameters – and you’re ready to go! You can use the Azure OpenAI Studio for a simple GUI, or more advanced users can use our REST APIs or the OpenAI Python SDK.
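For the SDK route, the sketch below shows roughly what submitting a job looks like. The endpoint, key, file IDs, and helper functions are placeholders we've invented for illustration – check the Azure OpenAI reference documentation for the exact API surface of your SDK version.

```python
from typing import Optional

def job_arguments(model: str, training_file: str,
                  validation_file: Optional[str] = None,
                  n_epochs: Optional[int] = None) -> dict:
    """Assemble arguments for a fine-tuning job (hypothetical helper)."""
    kwargs = {"model": model, "training_file": training_file}
    if validation_file:
        kwargs["validation_file"] = validation_file
    if n_epochs is not None:
        kwargs["hyperparameters"] = {"n_epochs": n_epochs}
    return kwargs

def submit_job(endpoint: str, api_key: str, **kwargs):
    """Submit the job with the OpenAI Python SDK against an Azure resource."""
    from openai import AzureOpenAI  # imported lazily; requires the openai package
    client = AzureOpenAI(azure_endpoint=endpoint, api_key=api_key,
                         api_version="2024-02-01")
    return client.fine_tuning.jobs.create(**kwargs)

# Placeholder file ID -- upload your JSONL training file first to get a real one.
args = job_arguments("gpt-35-turbo", training_file="file-YOUR-TRAINING-FILE",
                     n_epochs=2)
```

Once submitted, you can poll the job status (or watch it in Azure OpenAI Studio) until training completes.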
When you’ve finished fine tuning, your completed job will return evaluation metrics like training and validation loss.
We offer fine tuning as a managed service, so you don't have to worry about managing compute resources or capacity. When you submit a job, all you'll be paying for is the active training time, billed in 15-minute intervals, for successful fine-tuning runs.
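To make the billing model concrete, here's a small sketch. Note that rounding partial intervals up is our assumption for illustration – check the pricing page for the exact behavior.

```python
import math

MINUTES_PER_BILLING_INTERVAL = 15

def billed_intervals(active_training_minutes: float) -> int:
    """Number of 15-minute intervals billed for a successful run.

    Assumes partial intervals round up -- verify against the official
    Azure OpenAI pricing page.
    """
    return math.ceil(active_training_minutes / MINUTES_PER_BILLING_INTERVAL)
```

Under this assumption, a 50-minute training run would be billed as 4 intervals.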
Inference in Azure OpenAI Service: When the training job has succeeded, your new model will be available within your resource. When you’re ready to start using your model for inferencing, your customized model can be deployed just like any other OpenAI LLM!
Fine-tuned models are subject to an hourly hosting charge, as well as token-based pricing for input and output data. Check out the Azure OpenAI pricing page for the latest pricing and more details. If you don't need to use your model right away, there's no charge for storing trained models without deploying them.
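As an illustration of how those cost components add up, the estimator below combines the hourly hosting charge with per-token input and output charges. All rates in the example are made-up placeholders – the real per-model rates are on the Azure OpenAI pricing page.

```python
def estimated_monthly_cost(hosting_hours: float,
                           input_tokens: int, output_tokens: int,
                           hourly_hosting_rate: float,
                           input_rate_per_1k: float,
                           output_rate_per_1k: float) -> float:
    """Hourly hosting charge plus token-based input/output charges.

    Rates are placeholders -- substitute the current figures from the
    Azure OpenAI pricing page.
    """
    return (hosting_hours * hourly_hosting_rate
            + input_tokens / 1000 * input_rate_per_1k
            + output_tokens / 1000 * output_rate_per_1k)
```

For example, with placeholder rates of $1.00/hour hosting, $0.0015 per 1K input tokens, and $0.002 per 1K output tokens, a month of continuous hosting (720 hours) serving 1M input and 500K output tokens would come to $722.50 under this sketch.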