Review of Chapter 6 of *Build an LLM from Scratch*

힘센캥거루
December 18, 2025 (edited)

Chapter 6 is about fine-tuning for classification.

The example used is building a spam classifier.

A spam classifier determines whether something is spam or not spam, so the output needs to be values like 0 and 1.

1. Steps of Fine-tuning

The fine-tuning process is similar to the process of training a model.

You prepare the dataset, load the pretrained weights, then train and evaluate.

The main difference is the output layer: it is replaced with one that produces two values, one for class 0 (not spam) and one for class 1 (spam).

The prediction is based on the output for the last token. Because each token only attends to the tokens before it, the last token's output is the one that carries information about the entire input.
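
As a rough sketch of what "using the final tensor" means, suppose the model returns one row of two logits per input token. The snippet below is plain Python rather than PyTorch so that it runs without dependencies. It picks the last row and takes the index of the larger logit as the label. The logit values and the function name `predict_label` are made up for illustration and are not taken from the book.

```python
# Hypothetical model output for one 3-token message:
# one row per token, two logits per row (index 0 = not spam, index 1 = spam).
token_logits = [
    [0.2, -0.1],
    [0.5, 0.3],
    [-1.2, 2.4],  # last token: the only row that has seen the whole input
]

def predict_label(token_logits):
    """Return 0 (not spam) or 1 (spam) using only the last token's logits."""
    last = token_logits[-1]
    # argmax over the two class logits
    return max(range(len(last)), key=lambda i: last[i])

label = predict_label(token_logits)
print("spam" if label == 1 else "not spam")
```

Only the last row is used here; the earlier rows are ignored when making the prediction.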

Finally, the loss is computed using cross-entropy.
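
To make the loss concrete, here is a minimal sketch of cross-entropy for a single example, again in plain Python. In practice a framework function would do this over a whole batch. The logits are invented numbers, and `cross_entropy` is a hypothetical helper written for this post.

```python
import math

def cross_entropy(logits, target):
    """Cross-entropy for one example: -log(softmax(logits)[target])."""
    # Subtract the max logit before exponentiating for numerical stability.
    m = max(logits)
    log_sum_exp = m + math.log(sum(math.exp(x - m) for x in logits))
    return log_sum_exp - logits[target]

logits = [-1.2, 2.4]  # made-up last-token logits (index 1 = spam)

# The loss is small when the true label matches the larger logit,
# and large when the model confidently picks the wrong class.
print(cross_entropy(logits, target=1))
print(cross_entropy(logits, target=0))
```

Training then adjusts the weights to push this value down on the labeled examples.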

2. Fine-tuning the Model with Supervised Learning Data

The data is split into training and validation sets, and the model is trained over multiple epochs.
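
A simple way to picture the split is shown below. This is only a sketch: the 80/20 ratio, the seed, the toy messages, and the helper name `split_dataset` are my own assumptions, not the book's exact setup.

```python
import random

def split_dataset(examples, train_frac=0.8, seed=123):
    """Shuffle a copy of the examples and cut it into train/validation parts."""
    shuffled = examples[:]                 # leave the caller's list untouched
    random.Random(seed).shuffle(shuffled)  # fixed seed for a repeatable split
    cut = int(len(shuffled) * train_frac)
    return shuffled[:cut], shuffled[cut:]

# Toy (text, label) pairs; label 1 = spam, 0 = not spam.
messages = [(f"message {i}", i % 2) for i in range(10)]

train, val = split_dataset(messages)
print(len(train), len(val))
```

Shuffling before cutting matters when the file is sorted by label, since otherwise one of the sets could end up with only one class.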

When training accuracy and validation accuracy stay close, the model is doing about as well on data it has not seen as on the data it was trained on.

In other words, there are no signs of overfitting.
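
The check itself is just a comparison of two numbers per epoch. Here is a small sketch with made-up accuracy values. The 0.05 threshold is an arbitrary choice of mine for illustration, not a rule from the book.

```python
def accuracy(predictions, labels):
    """Fraction of predictions that match the true labels."""
    correct = sum(p == y for p, y in zip(predictions, labels))
    return correct / len(labels)

def looks_overfit(train_acc, val_acc, max_gap=0.05):
    """Flag an epoch when training accuracy runs well ahead of validation."""
    return (train_acc - val_acc) > max_gap

# Invented (train_acc, val_acc) pairs, one per epoch.
history = [(0.70, 0.68), (0.90, 0.89), (0.97, 0.96)]

for epoch, (train_acc, val_acc) in enumerate(history, start=1):
    status = "possible overfitting" if looks_overfit(train_acc, val_acc) else "ok"
    print(f"epoch {epoch}: train={train_acc:.2f} val={val_acc:.2f} -> {status}")
```

If training accuracy kept climbing while validation accuracy stalled or dropped, the gap would grow and the flag would trip.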

With the fine-tuned model, we can now classify new messages as spam or not spam.

3. Thoughts

Running a 1.2B model is a stretch even on my Mac mini, but it makes me think that if possible, I’d like to train an LLM and try various things with it.

I’m even considering using this approach when I write a paper next year.

I should hurry up and finish the book, and then start learning PyTorch in earnest.
