Skip to content

Lesson 18

Open Source Models

Fine-Tuning Your LLM

Using large language models to build generative AI applications comes with new challenges. A key issue is ensuring response quality (accuracy and relevance) in content generated by the model for a given user request. In previous lessons, we discussed techniques like prompt engineering and retrieval-augmented generation that try to solve the problem by modifying the prompt input to the existing model.

In today's lesson, we discuss a third technique, fine-tuning, which tries to address the challenge by retraining the model itself with additional data. Let's dive into the details.

Learning Objectives

This lesson introduces the concept of fine-tuning for pre-trained language models, explores the benefits and challenges of this approach, and provides guidance on when and how to use fine tuning to improve the performance of your generative AI models.

By the end of this lesson, you should be able to answer the following questions:

  • What is fine tuning for language models?
  • When, and why, is fine tuning useful?
  • How can I fine-tune a pre-trained model?
  • What are the limitations of fine-tuning?

Ready? Let's get started.

Illustrated Guide

Want to get the big picture of what we'll cover before we dive in? Check out this illustrated guide that describes the learning journey for this lesson - from learning the core concepts and motivation for fine-tuning, to understanding the process and best practices for executing the fine-tuning task. This is a fascinating topic for exploration, so don't forget to check out the Resources page for additional links to support your self-guided learning journey!

Illustrated Guide to Fine Tuning Language Models

What is fine-tuning for language models?

By definition, large language models are pre-trained on large quantities of text sourced from diverse sources including the internet. As we've learned in previous lessons, we need techniques like prompt engineering and retrieval-augmented generation to improve the quality of the model's responses to the user's questions ("prompts").

A popular prompt-engineering technique involves giving the model more guidance on what is expected in the response either by providing instructions (explicit guidance) or giving it a few examples (implicit guidance). This is referred to as few-shot learning but it has two limitations:

  • Model token limits can restrict the number of examples you can give, and limit the effectiveness.
  • Model token costs can make it expensive to add examples to every prompt, and limit flexibility.

Fine-tuning is a common practice in machine learning systems where we take a pre-trained model and retrain it with new data to improve its performance on a specific task. In the context of language models, we can fine-tune the pre-trained model with a curated set of examples for a given task or application domain to create a custom model that may be more accurate and relevant for that specific task or domain. A side-benefit of fine-tuning is that it can also reduce the number of examples needed for few-shot learning - reducing token usage and related costs.

When and why should we fine-tune models?

In this context, when we talk about fine-tuning, we are referring to supervised fine-tuning where the retraining is done by adding new data that was not part of the original training dataset. This is different from an unsupervised fine-tuning approach where the model is retrained on the original data, but with different hyperparameters.

The key thing to remember is that fine-tuning is an advanced technique that requires a certain level of expertise to get the desired results. If done incorrectly, it may not provide the expected improvements, and may even degrade the performance of the model for your targeted domain.

So, before you learn "how" to fine-tune language models, you need to know "why" you should take this route, and "when" to start the process of fine-tuning. Start by asking yourself these questions:

  • Use Case: What is your use case for fine-tuning? What aspect of the current pre-trained model do you want to improve upon?
  • Alternatives: Have you tried other techniques to achieve the desired outcomes? Use them to create a baseline for comparison.
  • Prompt engineering: Try techniques like few-shot prompting with examples of relevant prompt responses. Evaluate the quality of responses.
  • Retrieval Augmented Generation: Try augmenting prompts with query results retrieved by searching your data. Evaluate the quality of responses.
  • Costs: Have you identified the costs for fine-tuning?
  • Tunability - is the pre-trained model available for fine-tuning?
  • Effort - for preparing training data, evaluating & refining model.
  • Compute - for running fine-tuning jobs, and deploying fine-tuned model
  • Data - access to sufficient quality examples for fine-tuning impact
  • Benefits: Have you confirmed the benefits for fine-tuning?
  • Quality - did fine-tuned model outperform baseline?
  • Cost - does it reduce token usage by simplifying prompts?
  • Extensibility - can you repurpose base model for new domains?

By answering these questions, you should be able to decide if fine-tuning is the right approach for your use case. Ideally, the approach is valid only if the benefits outweigh the costs. Once you decide to proceed, it's time to think about how you can fine tune the pre-trained model.

Want to get more insights on the decision-making process? Watch To fine-tune or not to fine-tune

How can we fine-tune a pre-trained model?

To fine-tune a pre-trained model, you need to have:

  • a pre-trained model to fine-tune
  • a dataset to use for fine-tuning
  • a training environment to run the fine-tuning job
  • a hosting environment to deploy fine-tuned model

Fine-Tuning In Action

The following resources provide step-by-step tutorials to walk you through a real example using a selected model with a curated dataset. To work through these tutorials, you need an account on the specific provider, along with access to the relevant model and datasets.

Provider Tutorial Description
OpenAI How to fine-tune chat models Learn to fine-tune a gpt-35-turbo for a specific domain ("recipe assistant") by preparing training data, running the fine-tuning job, and using the fine-tuned model for inference.
Azure OpenAI GPT 3.5 Turbo fine-tuning tutorial Learn to fine-tune a gpt-35-turbo-0613 model on Azure by taking steps to create & upload training data, run the fine-tuning job. Deploy & use the new model.
Hugging Face Fine-tuning LLMs with Hugging Face This blog post walks you fine-tuning an open LLM (ex: CodeLlama 7B) using the transformers library & Transformer Reinforcement Learning (TRL) with open datasets on Hugging Face.
🤗 AutoTrain Fine-tuning LLMs with AutoTrain AutoTrain (or AutoTrain Advanced) is a python library developed by Hugging Face that allows finetuning for many different tasks including LLM finetuning. AutoTrain is a no-code solution and finetuning can be done in your own cloud, on Hugging Face Spaces or locally. It supports both a web-based GUI, CLI and training via yaml config files.

Assignment

Select one of the tutorials above and walk through them. We may replicate a version of these tutorials in Jupyter Notebooks in this repo for reference only. Please use the original sources directly to get the latest versions.

Great Work! Continue Your Learning.

After completing this lesson, check out our Generative AI Learning collection to continue leveling up your Generative AI knowledge!

Congratulations!! You have completed the final lesson from the v2 series for this course! Don't stop learning and building. **Check out the RESOURCES page for a list of additional suggestions for just this topic.

Our v1 series of lessons have also been updated with more assignments and concepts. So take a minute to refresh your knowledge - and please share your questions and feedback to help us improve these lessons for the community.


Resources For Self-Guided Learning

The lesson was built using a number of core resources from OpenAI and Azure OpenAI as references for the terminology and tutorials. Here is a non-comprehensive list, for your own self-guided learning journeys.

1. Primary Resources

Title/Link Description
Fine-tuning with OpenAI Models Fine-tuning improves on few-shot learning by training on many more examples than can fit in the prompt, saving you costs, improving response quality, and enabling lower-latency requests. Get an overview of fine-tuning from OpenAI.
What is Fine-Tuning with Azure OpenAI? Understand what fine-tuning is (concept), why you should look at it (motivating problem), what data to use (training) and measuring the quality
Customize a model with fine-tuning Azure OpenAI Service lets you tailor our models to your personal datasets using fine-tuning. Learn how to fine-tune (process) select models using Azure AI Studio, Python SDK or REST API.
Recommendations for LLM fine-tuning LLMs may not perform well on specific domains, tasks, or datasets, or may produce inaccurate or misleading outputs. When should you consider fine-tuning as a possible solution to this?
Continuous Fine Tuning Continuous fine-tuning is the iterative process of selecting an already fine-tuned model as a base model and fine-tuning it further on new sets of training examples.
Fine-tuning and function calling Fine-tuning your model with function calling examples can improve model output by getting more accurate and consistent outputs - with similarly-formatted responses & cost-savings
Fine-tuning Models: Azure OpenAI Guidance Look up this table to understand what models can be fine-tuned in Azure OpenAI, and which regions these are available in. Look up their token limits and training data expiry dates if needed.
To Fine Tune or Not To Fine Tune? That is the Question This 30-min Oct 2023 episode of the AI Show discusses benefits, drawbacks and practical insights that help you make this decision.
Getting Started With LLM Fine-Tuning This AI Playbook resource walks you through data requirements, formatting, hyperparameter fine-tuning and challenges/limitations you should know.
Tutorial: Azure OpenAI GPT3.5 Turbo Fine-Tuning Learn to create a sample fine-tuning dataset, prepare for fine-tuning, create a fine-tuning job, and deploy the fine-tuned model on Azure.
Tutorial: Fine-tune a Llama 2 model in Azure AI Studio Azure AI Studio lets you tailor large language models to your personal datasets using a UI-based workflow suitable for low-code developers. See this example.
Tutorial:Fine-tune Hugging Face models for a single GPU on Azure This article describes how to fine-tune a Hugging Face model with the Hugging Face transformers library on a single GPU with Azure DataBricks + Hugging Face Trainer libraries
Training: Fine-tune a foundation model with Azure Machine Learning The model catalog in Azure Machine Learning offers many open source models you can fine-tune for your specific task. Try this module is from the AzureML Generative AI Learning Path
Tutorial: Azure OpenAI Fine-Tuning Fine-tuning GPT-3.5 or GPT-4 models on Microsoft Azure using W&B allows for detailed tracking and analysis of model performance. This guide extends the concepts from the OpenAI Fine-Tuning guide with specific steps and features for Azure OpenAI.

2. Secondary Resources

This section captures additional resources that are worth exploring, but that we did not have time to cover in this lesson. They may be covered in a future lesson, or as a secondary assignment option, at a later date. For now, use them to build your own expertise and knowledge around this topic.

Title/Link Description
OpenAI Cookbook: Data preparation and analysis for chat model fine-tuning This notebook serves as a tool to preprocess and analyze the chat dataset used for fine-tuning a chat model. It checks for format errors, provides basic statistics, and estimates token counts for fine-tuning costs. See: Fine-tuning method for gpt-3.5-turbo.
OpenAI Cookbook: Fine-Tuning for Retrieval Augmented Generation (RAG) with Qdrant The aim of this notebook is to walk through a comprehensive example of how to fine-tune OpenAI models for Retrieval Augmented Generation (RAG). We will also be integrating Qdrant and Few-Shot Learning to boost model performance and reduce fabrications.
OpenAI Cookbook: Fine-tuning GPT with Weights & Biases Weights & Biases (W&B) is the AI developer platform, with tools for training models, fine-tuning models, and leveraging foundation models. Read their OpenAI Fine-Tuning guide first, then try the Cookbook exercise.
Community Tutorial Phinetuning 2.0 - fine-tuning for Small Language Models Meet Phi-2, Microsoft’s new small model, remarkably powerful yet compact. This tutorial will guide you through fine-tuning Phi-2, demonstrating how to build a unique dataset and fine-tune model using QLoRA.
Hugging Face Tutorial How to Fine-Tune LLMs in 2024 with Hugging Face This blog post walks you thorugh how to fine-tune open LLMs using Hugging Face TRL, Transformers & datasets in 2024. You define a use case, setup a dev environment, prepare a dataset, fine tune the model, test-evaluate it, then deploy it to production.
Hugging Face: AutoTrain Advanced Brings faster and easier training and deployments of state-of-the-art machine learning models. Repo has Colab-friendly tutorials with YouTube video guidance, for fine-tuning. Reflects recent local-first update . Read the AutoTrain documentation