LLMs: Self Attention
Hello! Today we will be covering one of the most fascinating mechanisms that changed AI forever: "Self Attention". We will first build some intuition for why it works before jumping directly to the implementation. I hope you have read my previous blog on Positional Embeddings, where we understood why "environment" information is also needed alongside "positional" information when generating a sentence embedding. Let me start with an analogy. Analogy Let's say you have
17 hours ago · 10 min read


LLMs: Positional Embeddings
Hello readers, in our last blog, we learned about Token Embeddings and how they are actually implemented. Today, we will be extending that concept to generate embeddings for a whole sentence. Okay, so what we learned about token embeddings is that they are meaningful n-dimensional projections of a single-dimensional token, i.e.: word -> dog, token -> 13, embedding -> [0.14, 0.573, -0.345 .... ] Now, let's try to create embeddings or "projections" of a sentence itself, a.k.a. S
4 days ago · 9 min read
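The word -> token -> embedding mapping described in the excerpt above can be sketched in a few lines of Python. This is a hypothetical illustration: the vocabulary, the token id 13, and the vector values are made-up stand-ins, not a real tokenizer or trained embedding table.

```python
# Hypothetical vocabulary mapping words to token ids (values are illustrative).
vocab = {"dog": 13, "cat": 7}

# One n-dimensional embedding vector per token id (here n = 4, values made up).
embedding_table = {
    13: [0.14, 0.573, -0.345, 0.92],
    7:  [-0.21, 0.08, 0.77, -0.15],
}

word = "dog"
token_id = vocab[word]                 # word -> token (13)
embedding = embedding_table[token_id]  # token -> n-dimensional embedding

print(token_id, embedding)
```

In a real model the embedding table is a learned weight matrix indexed by token id, but the lookup itself works exactly like this dictionary access.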


LLMs: Token Embeddings
Today, we will be covering another building block in our LLM series. I hope you have built a strong intuition around neural networks; if not, I would highly recommend reading my previous blog, Neural Networks, before proceeding. Okay, so until now we have understood that any input we supply to an LLM or a model is converted into a numeric representation via Tokenization, and these numerics are then fed to a mathematical equation, a.k.a. a neural network. Sample Neural Netwo
Apr 3 · 8 min read


LLMs: Neural Networks
I hope you have read my previous blogs covering the basic intuition and tokenization aspects of LLMs. If not, I highly recommend reading them: LLMs: Build Intuition and LLMs: Tokenization. So, as you remember, the whole “magic” behind LLMs eventually boils down to a state-of-the-art mathematical equation. Today, we are going to dig a little deeper into this mathematical equation, a.k.a. Neural Networks. You have most probably seen an image like this showcasing a complex ne
Mar 28 · 9 min read


LLMs: Tokenization
If you’ve gone through the previous post, you should now have a basic intuition about how Large Language Models (LLMs) work. If not, I’d strongly recommend reading LLMs: Build Intuition before continuing. At a high level, everything in an LLM eventually reduces to mathematical computations. But before any computation can happen, text needs to be converted into numbers. This post focuses on exactly that: how text gets converted to a numerical representation. Tokenized Text L
Mar 19 · 3 min read


LLMs: Build Intuition
There’s no denying that Large Language Models (LLMs) have revolutionized not only the tech industry but also everyday life. But have you ever wondered how these seemingly magical systems actually work? The good news is, you don’t need to dive deeply into their internals to use them effectively in your applications. However, having a rough understanding or intuition of how they work (instead of treating them as a complete black box) can make a huge difference. It helps us u
Jan 30 · 2 min read


