
Azure AI services Blog

Fine Tuning: now available with Azure OpenAI Service

AliciaFrame
Microsoft
Oct 16, 2023

Get excited: fine tuning is now available for GPT-3.5-Turbo, Babbage-002, and Davinci-002 in Public Preview! This update lets developers customize their favorite OpenAI models with their own data and easily deploy their new custom models, all within an easy-to-use managed service. We launched Azure OpenAI Service in January, and it’s been amazing to watch developers bring the power of generative AI to their applications; today marks a new chapter in this journey – we’re making it possible to customize models using your data, to solve your problems.

 

In this blog we’ll talk about...

  • What’s new with Azure OpenAI Service – new models and capabilities.
  • Why developers like you are fine tuning their models, and some tips and tricks for success along the way.
  • How you can get started today with Azure OpenAI Service.
     

What’s new with Azure OpenAI Service?

Today, we’re launching two new base inference models (Babbage-002 and Davinci-002) and fine-tuning capabilities for three models (Babbage-002, Davinci-002, and GPT-3.5-Turbo).

 

New models: Babbage-002 and Davinci-002 are GPT-3 base models, intended for completion use cases. They can generate natural language or code, but they’re not trained for instruction following. Babbage-002 replaces the deprecated Ada and Babbage models, while Davinci-002 replaces Curie and Davinci. These models support the completions API.

 

Fine tuning: You’ll now be able to use Azure OpenAI Service to fine tune Babbage-002, Davinci-002, and GPT-3.5-Turbo. Babbage-002 and Davinci-002 support completions, while GPT-3.5-Turbo supports conversational interactions. You’ll be able to specify your base model, provide your data, train, and deploy – all with a few commands.

 

 

Tell me more about fine tuning!

Fine tuning is one of the methods available to developers and data scientists looking to customize large language models for specific tasks. While approaches like Retrieval Augmented Generation (RAG) and prompt engineering work by injecting the right information and instructions into your prompt, fine tuning operates by customizing the large language model itself.

 

Azure OpenAI Service offers Supervised Fine Tuning, which allows you to provide custom data (prompt/completion or conversational chat, depending on the model) to teach the base model new skills.
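As a rough sketch of what that custom data looks like: the training file is JSON Lines, with one example per line, and the shape depends on the model family. The field names below follow the service’s documented conventions, but the file name and example contents are made up for illustration.

```python
import json

# Conversational chat format, used to fine tune GPT-3.5-Turbo:
# each line of the JSONL file is one complete example conversation.
chat_examples = [
    {
        "messages": [
            {"role": "system", "content": "You are a support bot that answers in one sentence."},
            {"role": "user", "content": "How do I reset my password?"},
            {"role": "assistant", "content": "Open Settings > Account and choose Reset password."},
        ]
    },
]

# Prompt/completion format, used to fine tune Babbage-002 and Davinci-002:
completion_example = {
    "prompt": "Classify the sentiment: 'I love this product!' ->",
    "completion": " positive",
}

# Serialize one example per line (JSON Lines); use one format per file,
# matching the base model you plan to fine tune.
with open("training_data.jsonl", "w", encoding="utf-8") as f:
    for example in chat_examples:
        f.write(json.dumps(example) + "\n")
```

You upload a file like this to the service before kicking off a training job.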

 

Think of fine tuning as an “expert mode” feature: super powerful, but requiring a solid foundation built on the basics. Fine tuning can make good models better, but you need an appropriate use case, high-quality data, and the right models and prompts to succeed.

 

What fine tuning means for developers like you

Before you start fine tuning, we recommend starting with prompt engineering or RAG (Retrieval Augmented Generation) to develop a baseline – it’s the fastest way to get started, and we make it easy with tools like Prompt Flow or On Your Data. That baseline gives you something to compare against if you do need to fine tune a model, and most fine-tuned models in production incorporate both prompt engineering and fine tuning – so no effort is wasted!

 

Need help deciding when (or if) you should be fine tuning? A few ground rules can help guide you!

 

Don’t start with fine tuning if:

  • You want a simple and fast result: fine tuning is going to take a lot of data and time to train and evaluate your new model. If you’re short on time, you can usually get pretty far with just prompt engineering!
  • You need up-to-date or out-of-domain data: this is a perfect use case for RAG and prompt engineering!
  • You want to make sure your model is well grounded and avoid hallucinations: this is another area where RAG shines!

Consider fine tuning if:

  • You want to teach the model a new skill so it’s good at one specific task like classification, summarization, or always responding in a specific format or tone. Sometimes you can fine tune a smaller model to perform just as well at a specific task as a bigger model!
  • You want to show the model how to do something with examples, where it’s too hard to explain in the prompt – or there are too many examples to fit in the context window. These are scenarios with lots of edge cases, like natural language to query, or teaching a model to speak in a specific voice or tone.
  • You want to reduce latency. Long prompts can take longer to process, and fine tuning lets you move those long prompts into the model itself.
     

Getting started with fine tuning on Azure

Fine tuning with Azure OpenAI Service gets you the best of both worlds: the ability to customize advanced OpenAI LLMs, while deploying on Azure’s secure, enterprise-ready cloud services. One of the risks of fine tuning is inadvertently introducing harmful data into your model; our content moderation allows you to fine tune with the data you need, while still filtering out any harmful responses.

 

If you’re new to Azure OpenAI Service and LLMs: welcome!  We offer a super simple API to train and deploy your models – or if you’re more comfortable with a GUI, try out Azure OpenAI Studio. If you’re migrating to Azure from OpenAI, our APIs are compatible!

 

There are two parts to fine tuning: training your fine-tuned model and using your newly customized model for inference.

 

Training:  Specify your base model, your training and validation data, and set any hyperparameters – and you’re ready to go! You can use the Azure OpenAI Studio for a simple GUI, or more advanced users can use our REST APIs or the OpenAI Python SDK.
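For the REST route, submitting a job boils down to one POST with a small JSON body. This is a sketch only – the resource name, file IDs, and api-version below are hypothetical placeholders, and the endpoint shape reflects the preview-era documentation, so check the current REST reference before relying on it.

```python
import json

# Sketch of a fine-tuning job submission body for the Azure OpenAI REST API.
resource = "my-resource"               # hypothetical Azure OpenAI resource name
api_version = "2023-10-01-preview"     # assumption; use the current version

url = (
    f"https://{resource}.openai.azure.com/openai/fine_tuning/jobs"
    f"?api-version={api_version}"
)

job_body = {
    "model": "gpt-35-turbo-0613",      # base model to customize
    "training_file": "file-abc123",    # ID returned when you upload your JSONL
    "validation_file": "file-def456",  # optional held-out data
    "hyperparameters": {"n_epochs": 3} # optional; the service picks defaults otherwise
}

payload = json.dumps(job_body)
# POST `payload` to `url` with your resource's api-key header,
# then poll the job until it reports success.
```

The Studio GUI and the OpenAI Python SDK wrap exactly these parameters, so the same three choices – base model, data, hyperparameters – apply whichever interface you use.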

 

 

When you’ve finished fine tuning, your completed job will return evaluation metrics like training and validation loss.

 

We offer fine tuning as a managed service, so you don’t have to worry about managing compute resources or capacity. When you submit a job, all you’ll be paying for is the active training time, billed in 15-minute intervals, for successful fine-tuning runs.
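As a back-of-the-envelope sketch of what that billing looks like – assuming time is rounded up to the next 15-minute interval, which is an assumption on our part; the pricing page is authoritative:

```python
import math

def billed_hours(training_minutes: float, interval_minutes: int = 15) -> float:
    """Estimate billed hours, rounding active training time up to the
    next whole billing interval (assumed behavior, not a guarantee)."""
    intervals = math.ceil(training_minutes / interval_minutes)
    return intervals * interval_minutes / 60

# A 100-minute run rounds up to 7 intervals = 105 minutes = 1.75 hours.
print(billed_hours(100))  # 1.75
```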

 

Inference in Azure OpenAI Service: When the training job has succeeded, your new model will be available within your resource. When you’re ready to start using your model for inference, your customized model can be deployed just like any other OpenAI LLM!
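Because a deployed fine-tuned model is addressed by its deployment name, calling it looks identical to calling any other deployment. A sketch of the chat completions request – the resource name, deployment name, and api-version are hypothetical placeholders:

```python
import json

resource = "my-resource"               # hypothetical Azure OpenAI resource name
deployment = "my-custom-gpt35"         # the deployment name you gave your fine-tuned model
api_version = "2023-10-01-preview"     # assumption; use the current version

# The chat completions endpoint is keyed by deployment name, so a
# fine-tuned deployment and a base-model deployment are called the same way.
url = (
    f"https://{resource}.openai.azure.com/openai/deployments/"
    f"{deployment}/chat/completions?api-version={api_version}"
)

request_body = json.dumps({
    "messages": [{"role": "user", "content": "How do I reset my password?"}],
    "max_tokens": 50,
})
# POST `request_body` to `url` with your resource's api-key header.
```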

 

Fine-tuned models are subject to an hourly hosting charge, as well as token-based pricing for input and output data. Check out the Azure OpenAI pricing page for the latest pricing and more details. If you don’t need to use your model right away, there’s no charge for storing trained models without deploying them.

Ready to get started?

Want to learn more?

 

 

Updated May 09, 2024
Version 3.0
  • This is exciting news for developers and data scientists! The availability of fine-tuning for GPT-3.5-Turbo, Babbage-002, and Davinci-002 in Public Preview is a significant step forward. It empowers developers to customize their favorite OpenAI models with their own data, offering a higher level of control and adaptability.

     

    The addition of new base inference models (Babbage-002 and Davinci-002) and the fine-tuning capabilities for three models mark a substantial enhancement to the Azure OpenAI Service. These new models, with their support for completion and conversational interactions, provide a broad spectrum of applications.

  • DmitriyBorodiy

    When will Azure OpenAI services be available for all developers?

    Will there be a free plan for use in startups or applications in development?

  • agreca

    Is there any timeframe for when this will be available for more regions? Currently we are seeing the following message for East US 2.

     

  • Hi.

    I have a question. If I fine tune a GPT-3.5-Turbo model to do something specific, for example responding in a local dialect, does that mean my fine-tuned model will only be good at that new task and no longer good at other tasks, like answering code questions?

  • MarcosInc001

    DiegoZumarragaMera No. Even after fine tuning, or when modified via RAG techniques, the models remain almost as good*, just as good*, or even better* than the underlying base model at application-specific tasks like “answering code”.

    *An exception is techniques that deliberately restrict capabilities – for example so that the model does not answer code – in order to fulfill the intended use case.

    On top of all that, slight changes in behavior are always possible.

    Bottom line: yes, despite the customization it should behave the same.