LLMs: Self Attention
Hello! Today we will be covering one of the most fascinating mechanisms that changed AI forever: "Self Attention". We will first build some intuition for why it works before jumping directly to the implementation. I hope you have read my previous blog on Positional Embeddings, where we understood why "environment" information is also needed alongside "positional" information when generating a sentence embedding. Let me start with an analogy. Analogy Let's say you have
17 hours ago · 10 min read


LLMs: Positional Embeddings
Hello readers, in our last blog, we learned about Token Embeddings and how they are actually implemented. Today, we will be extending that concept to generate embeddings for a whole sentence. Okay, so what we learned about token embeddings is that they are meaningful n-dimensional projections of a single-dimensional token, i.e.: word -> dog, token -> 13, embedding -> [0.14, 0.573, -0.345 .... ] Now, let's try to create embeddings or "projections" of a sentence itself, a.k.a. S
4 days ago · 9 min read
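The word -> token -> embedding mapping described in the excerpt above can be sketched in a few lines of Python. This is a hypothetical illustration: the vocabulary, the token id 13, and the vector values are made-up stand-ins, not a real tokenizer or trained embedding table.

```python
# Hypothetical vocabulary mapping words to token ids (values are illustrative).
vocab = {"dog": 13, "cat": 7}

# One n-dimensional embedding vector per token id (here n = 4, values made up).
embedding_table = {
    13: [0.14, 0.573, -0.345, 0.92],
    7:  [-0.21, 0.08, 0.77, -0.15],
}

word = "dog"
token_id = vocab[word]                 # word -> token (13)
embedding = embedding_table[token_id]  # token -> n-dimensional embedding

print(token_id, embedding)
```

In a real model the embedding table is a learned weight matrix indexed by token id, but the lookup itself works exactly like this dictionary access.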


LLMs: Token Embeddings
Today, we will be covering another building block in our LLM series. I hope you have built a strong intuition around neural networks; if not, I would highly recommend reading my previous blog, Neural Networks, before proceeding. Okay, so until now we have understood that any input we supply to an LLM or a model is converted into a numeric representation via Tokenization, and these numerics are then fed to a mathematical equation, a.k.a. a neural network. Sample Neural Netwo
Apr 3 · 8 min read


LLMs: Neural Networks
I hope you have read my previous blogs covering the basic intuition and tokenization aspects of LLMs. If not, I highly recommend reading them: LLMs: Build Intuition and LLMs: Tokenization. So, as you remember, the whole “magic” behind LLMs eventually boils down to a state-of-the-art mathematical equation. Today, we are going to dig a little deeper into this mathematical equation, a.k.a. Neural Networks. You have most probably seen an image like this showcasing a complex ne
Mar 28 · 9 min read


LLMs: Tokenization
If you’ve gone through the previous post, you should now have a basic intuition about how Large Language Models (LLMs) work. If not, I’d strongly recommend reading LLMs: Build Intuition before continuing. At a high level, everything in an LLM eventually reduces to mathematical computations. But before any computation can happen, text needs to be converted into numbers. This post focuses on exactly that: how text gets converted to a numerical representation. Tokenized Text L
Mar 19 · 3 min read


LLMs: Build Intuition
There’s no denying that Large Language Models (LLMs) have revolutionized not only the tech industry but also everyday life. But have you ever wondered how these seemingly magical systems actually work? The good news is, you don’t need to dive deeply into their internals to use them effectively in your applications. However, having a rough understanding or intuition of how they work (instead of treating them as a complete black box) can make a huge difference. It helps us u
Jan 30 · 2 min read


