Notes on Fine-tuning LLMs

machine-learning, AI, NLP, GPT, fine-tuning
Generated by Stable Diffusion


I recently spent some time watching the new course on fine-tuning large language models offered by DeepLearning.AI. I took notes while watching and would like to share them here.

What is fine-tuning? #

Fine-tuning specializes an LLM: it teaches the model a new skill rather than expanding its knowledge base. For example, GPT-4 is fine-tuned to be a code assistant for Copilot, much as a doctor is trained to be a dermatologist.

What does fine-tuning do? #

It lets the model learn from the data rather than just access it.

Pros & Cons #


Benefits #

Pretraining #

Finetuning #

Ways of Fine-tuning #

Instruction Fine-tuning #

A specific way of fine-tuning that teaches models to follow instructions like a chatbot. It gave ChatGPT the ability to chat.

Data resources for instruction fine-tuning:

This is what the model tries to predict during fine-tuning:

sample = """
### Instructions:

### Input:

### Response:
"""
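
As a minimal sketch of how a template like the one above might be filled in, the snippet below builds one training example and separates the prompt from the text the model should learn to produce. The field names (`instruction`, `input`, `response`), the `format_example` helper, and the example record are all hypothetical and not taken from the course.

```python
# Hypothetical template mirroring the "### Instructions / Input / Response" layout.
PROMPT_TEMPLATE = (
    "### Instructions:\n{instruction}\n\n"
    "### Input:\n{input}\n\n"
    "### Response:\n"
)


def format_example(record: dict) -> dict:
    """Split one record into the prompt and the target the model should predict."""
    prompt = PROMPT_TEMPLATE.format(
        instruction=record["instruction"],
        input=record.get("input", ""),  # the input field may be empty for some tasks
    )
    return {"prompt": prompt, "target": record["response"]}


# Made-up record, for illustration only.
record = {
    "instruction": "Translate the sentence to French.",
    "input": "Good morning.",
    "response": "Bonjour.",
}
example = format_example(record)
print(example["prompt"] + example["target"])
```

Keeping the prompt and the target separate makes it easy to decide later whether the loss should cover the whole sequence or only the response.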

Data Preparation #

“Your model is what your data is.”

Things to consider when creating a dataset:

Preparation steps:

Training (Fun Part) #

As with regular LLM training, selecting the appropriate model to fine-tune is crucial. It is recommended to begin with models of around 1 billion parameters for typical tasks, but in my personal experience, smaller models can be incredibly effective for specific tasks. So do not assume that bigger is always better in the LLM realm, and do your own research before picking a model.

In addition, there are specific techniques, such as LoRA, that can make your training more efficient. If you have limited computing resources, you can incorporate one of these methods into your fine-tuning process. Doing so can substantially reduce the amount of computing power needed, often with little or no loss in performance.
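
To get a rough feel for where LoRA's savings come from, here is a small arithmetic sketch. It assumes the usual LoRA setup, where a frozen weight matrix of shape `d_out x d_in` receives a trainable low-rank update made of two matrices of shapes `d_out x rank` and `rank x d_in`. The layer size and rank below are hypothetical values picked for illustration, and the helper `lora_param_counts` is not part of any library.

```python
def lora_param_counts(d_out: int, d_in: int, rank: int) -> tuple[int, int]:
    """Return (full, lora) trainable-parameter counts for one weight matrix."""
    full = d_out * d_in            # updating the weight matrix directly
    lora = rank * (d_out + d_in)   # updating B (d_out x rank) and A (rank x d_in) instead
    return full, lora


# Hypothetical layer size and rank, chosen only for illustration.
full, lora = lora_param_counts(d_out=4096, d_in=4096, rank=8)
print(f"full: {full:,}  LoRA: {lora:,}  ratio: {lora / full:.2%}")
# prints "full: 16,777,216  LoRA: 65,536  ratio: 0.39%"
```

This only counts trainable parameters for a single matrix. Actual memory and compute savings depend on the model, the optimizer, and which layers get adapters.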

Evaluation #

The evaluation process marks the start of the fine-tuning process rather than its conclusion. The objective is to continuously improve our model by conducting error analysis after each iteration. We assess our current model (at first, the base model) and identify errors and recurring issues. In the subsequent iteration, we check whether these issues have been resolved. If not, we add more targeted data to address them.

Common issues might be misspellings, long responses, repetitions, inconsistency, or hallucinations.
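
Some of these issues can be flagged automatically during error analysis. The sketch below covers only two of them, long responses and repetitions, using simple heuristics. The `flag_issues` helper, its thresholds, and the sample outputs are hypothetical. Misspellings, inconsistency, and hallucinations generally need other tools or manual review.

```python
def flag_issues(response: str, max_words: int = 100, ngram: int = 3) -> list[str]:
    """Return labels for simple, automatically detectable issues in one response."""
    issues = []
    words = response.split()
    if len(words) > max_words:
        issues.append("long_response")
    # A repeated n-gram is a cheap signal that the model may be looping.
    ngrams = [tuple(words[i:i + ngram]) for i in range(len(words) - ngram + 1)]
    if len(ngrams) != len(set(ngrams)):
        issues.append("repetition")
    return issues


# Made-up model outputs, for illustration only.
outputs = [
    "The capital of France is Paris.",
    "I am not sure I am not sure I am not sure.",
]
for text in outputs:
    print(flag_issues(text), "<-", text)
```

Counting how often each label appears before and after an iteration gives a quick, if crude, way to see whether the added data helped.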

There are different formats of evaluation:

We can also use common benchmark datasets to compare our system: