Embeddings – Part 2

This is the 9th post in my series on building a toy GPT. For better understanding, I recommend reading my earlier posts first. Word embeddings convert words into fixed-length numerical arrays. Each number in these arrays corresponds to a specific characteristic of the word, such as its association with a place, person, gender, or concept…
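The idea above can be sketched with a toy lookup table. The words, vectors, and dimension labels here are all made up for illustration; real embeddings are learned by the model, not written by hand.

```python
# A minimal sketch of a word-embedding table. Each word maps to a
# fixed-length array of numbers; the dimension labels in the comments
# are hypothetical, chosen only to illustrate the idea.
embeddings = {
    "king":  [0.9, 0.8, 0.1, 0.7],  # royalty, maleness, femaleness, person-ness
    "queen": [0.9, 0.1, 0.9, 0.7],
    "apple": [0.0, 0.1, 0.1, 0.0],  # not royal, not a person
}

def embed(word):
    """Look up the fixed-length vector for a word."""
    return embeddings[word]

print(embed("king"))   # a 4-number array standing in for the word "king"
```

Note that every vector has the same length, so downstream arithmetic (comparing words, feeding them into a network) works uniformly regardless of the word.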

Embeddings – Part 1

This is the 8th post in my series on building a toy GPT. For better understanding, I recommend reading my earlier posts first. I love playing and watching cricket. The dominance India showed in the recently concluded World Cup is astounding. I have never seen anything like it in the four decades I’ve been following…

Neural Networks – Part 3

This is the seventh post in my series on making a toy GPT. For better understanding, I recommend reading my earlier posts first. The MNIST dataset is the “hello world” of machine learning: a collection of handwritten-digit images used to train and evaluate models. It includes 60,000 training images and 10,000 test images…

Neural Networks – Part 2

This is my sixth post in a series on building a toy GPT. I recommend that you read my previous posts before reading this one. With linear and logistic regression, which each involve just a single linear equation, calculating derivatives is easy: there is only one equation to work with. Those derivatives help us…
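The single-equation case can be sketched concretely. For linear regression with loss L = (1/n) · Σ(w·x + b − y)², the derivative with respect to w is (2/n) · Σ(w·x + b − y) · x. The data below is made up so that w = 2, b = 0 is the optimum.

```python
# A minimal sketch of the derivative used in plain linear regression.
# Toy data generated by y = 2x, so the loss is minimized at w=2, b=0.
xs = [1.0, 2.0, 3.0]
ys = [2.0, 4.0, 6.0]

def grad_w(w, b):
    """dL/dw for mean squared error loss L = (1/n) * sum((w*x + b - y)^2)."""
    n = len(xs)
    return (2 / n) * sum((w * x + b - y) * x for x, y in zip(xs, ys))

print(grad_w(0.0, 0.0))  # negative: the loss falls as w moves up toward 2
print(grad_w(2.0, 0.0))  # → 0.0, since every residual is zero at the optimum
```

With only one equation there is one derivative per parameter, computed directly; the next posts' point is that stacking many equations into a network makes this bookkeeping harder, which is what backpropagation organizes.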