Review of Chapter 2 of Learning LLM from Scratch

힘센캥거루
October 19, 2025 (edited)

It's already the second week of the challenge.

As of yesterday I still hadn't finished Chapter 2, but I managed to catch up by coding until midnight while away at a two-day retreat.

1. Content

The main focus of Chapter 2 was tokenization, encoding, decoding, and embedding vectors.

I was already familiar with the other topics because I had built a one-hot encoder before, but the concept of embedding vectors was new to me.

A one-hot encoder represents each word as a vector as long as the vocabulary, with a 1 at that word's index and 0 everywhere else, whereas an embedding represents each word as a short vector of real-valued coordinates, like a point (x, y, z) in three-dimensional space.
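A minimal sketch in plain Python may make the difference concrete. The five-word vocabulary, the embedding size of 3, and the helper names (`one_hot`, `lookup`, `one_hot_times_table`) are all assumptions made for illustration; they are not taken from the book, and a real model would use a deep learning framework rather than nested lists.

```python
import random

# Hypothetical toy vocabulary; a real tokenizer has tens of thousands of entries.
vocab = ["the", "cat", "sat", "on", "mat"]
word_to_id = {word: i for i, word in enumerate(vocab)}

def one_hot(token_id, vocab_size):
    """Return a vector of length vocab_size with a single 1 at token_id."""
    return [1 if i == token_id else 0 for i in range(vocab_size)]

# Embedding table: one row per word, embedding_dim columns.
# The table is a 2-D grid of numbers; each row is one point in 3-D space.
embedding_dim = 3
rng = random.Random(123)
embedding_table = [
    [rng.uniform(-1.0, 1.0) for _ in range(embedding_dim)]
    for _ in vocab
]

def lookup(token_id):
    """Embedding lookup: simply pick the row that belongs to the token."""
    return embedding_table[token_id]

def one_hot_times_table(token_id):
    """Multiply the one-hot vector by the table; this selects the same row."""
    hot = one_hot(token_id, len(vocab))
    return [
        sum(hot[row] * embedding_table[row][col] for row in range(len(vocab)))
        for col in range(embedding_dim)
    ]

cat_id = word_to_id["cat"]
cat_one_hot = one_hot(cat_id, len(vocab))
cat_vector = lookup(cat_id)

print(cat_one_hot)  # [0, 1, 0, 0, 0]
print(cat_vector)   # three random coordinates for "cat"
```

The one-hot vector grows with the vocabulary, while the embedding stays at three numbers per word. Looking up a row gives the same result as multiplying the one-hot vector by the table, which is one way to see the two representations as connected.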

2. Questions

Questions arose when dealing with embedding vectors.

  • Why are embeddings initialized using seeds to create non-overlapping random numbers?

  • Why is the matrix itself called three-dimensional when it seems two-dimensional?

  • What is the reason for adding token embeddings and positional embeddings?

I resolved these questions with the help of ChatGPT.

An embedding table acts like a dictionary: you look up a word and get its vector back.

Initializing the embedding with a seeded random number generator scatters the words to different starting positions in the coordinate system.

Creating the embedding again with the same seed produces exactly the same initial values, so every word starts at the same position and the results are reproducible.
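The effect of the seed can be checked with a small sketch. This uses Python's standard `random` module as a stand-in for a framework's seeded initializer; the function name `init_embedding` and the seed values 123 and 456 are assumptions chosen for illustration.

```python
import random

def init_embedding(vocab_size, dim, seed):
    """Create a vocab_size x dim table of random starting values.

    A dedicated generator is seeded so the result depends only on `seed`.
    """
    rng = random.Random(seed)
    return [
        [rng.gauss(0.0, 1.0) for _ in range(dim)]
        for _ in range(vocab_size)
    ]

first = init_embedding(vocab_size=4, dim=3, seed=123)
again = init_embedding(vocab_size=4, dim=3, seed=123)
other = init_embedding(vocab_size=4, dim=3, seed=456)

# Same seed: every word starts at exactly the same coordinates.
same_seed_matches = first == again
# Different seed: the words are scattered to different starting positions.
different_seed_differs = first != other

print(same_seed_matches, different_seed_differs)
```

The seed does not make the values meaningful; it only makes the random starting point repeatable, so a run can be compared against an earlier one.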

Finally, adding token embeddings and positional embeddings lets a single vector represent both the word's characteristics and its context (where it sits in the sequence) at the same time.
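As a rough sketch of that addition, the example below keeps one table indexed by token ID and another indexed by position, then sums the two rows element by element. The table sizes, the token IDs, and the name `embed_sequence` are hypothetical, and plain Python lists stand in for framework tensors.

```python
import random

rng = random.Random(123)
vocab_size, context_length, dim = 6, 4, 3  # assumed toy sizes

# One row per token ID, and one row per position in the sequence.
token_table = [[rng.uniform(-1.0, 1.0) for _ in range(dim)]
               for _ in range(vocab_size)]
pos_table = [[rng.uniform(-1.0, 1.0) for _ in range(dim)]
             for _ in range(context_length)]

def embed_sequence(token_ids):
    """Add each token's embedding to the embedding of its position."""
    return [
        [t + p for t, p in zip(token_table[tok], pos_table[pos])]
        for pos, tok in enumerate(token_ids)
    ]

# The same token (ID 2) appears at position 0 and at position 2.
token_ids = [2, 5, 2]
inputs = embed_sequence(token_ids)

for pos, vec in enumerate(inputs):
    print(pos, vec)
```

Because the positional rows differ, the two occurrences of token 2 end up with different input vectors even though they share the same token embedding. Without the positional term, the model would receive identical vectors for both.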

3. Review

I had only a vague idea of embedding vectors from using the Vercel AI SDK, but now I have a clear picture of them.

Expressing them mathematically is still quite challenging, but understanding what they mean makes them much more approachable.

I plan to keep working through the book steadily.
