DeepMind’s New AI Remembers 10,000,000 Tokens!


We are living in the age of AI, where incredible assistants like ChatGPT are popping up almost every day. They can do amazing things, like being a personalized teacher and helping you with assignments and homework; they can write computer code; they can be your assistant for daily planning; you name it. But Google DeepMind's Gemini 1.5 Pro had something up its sleeve that others didn't. We are going to talk about that and have a look at how your Fellow Scholars are already using it for lifting weights (yes, lifting weights with an AI) and so much more. We will talk about what the trouble is with this technique and whether there are solutions. And, as I promised, we will look at the paper too and find out how it learned a language that is almost impossible to learn.

So what is Gemini 1.5 Pro's trick? The trick is that it remembers. What does that mean? It means that it has something called a long context window, which lets it remember very long passages. It can read a whole book, and then we can talk to the book. It can look at a huge codebase and remember most, if not all, of it. And finally, this context window is so long that we can even chuck a movie into it, make a crude drawing of a scene, and it will tell you exactly where that scene appears in the movie. This truly feels like an AI assistant that came back from the future. You Fellow Scholars are already using it for amazing things, like having it watch lectures and write lecture notes, or asking it to look at your weightlifting session and summarize all the exercises you've done, and, get this, even the number of sets and reps you did. Wow! Or it can look at your bookshelf and immediately give you a list of the books you have. And you can give it prompts that are even a thousand times longer than this video. Goodness! And if you wish to ask a question about a thousand-page legal correspondence and find an obscure detail, easy! And it can often recall every single detail, really. And hold on to your papers, Fellow Scholars, because it gets even better! That was just the 1-million-token window, whereas in the paper, they note that it can do up to 10 million tokens. Ten movies, if you will. This is going to be a movie historian soon.

So cool! But it's not all sunshine and happiness. We have a problem. What is the problem? Well, formally, we say that the transformer neural network's self-attention mechanism has quadratic computational and memory complexity. Okay, but what does all this mean? This means trouble. Big trouble. Look, it takes a while. If you have one movie in there, it takes a minute. But if you had 10 movies, that would take 10 minutes, right? That would be reasonable, but unfortunately, that is not the case. The quadratic complexity means that a 10 times bigger query might take not 10 times longer but a hundred times longer. Yes, it may take more than an hour and a half. Ouch. That is not practical. And it gets even worse: this property is inherent to the structure of transformer neural networks, which are at the heart of the majority of AI assistants out there. So, does that mean we won't get these amazing AI assistants in the future? Is everything lost? Well, not so fast. Just based on the fact that Google DeepMind put this out there in the wild for testing, a solution is hopefully possible and might already be on the way. If it appears, I'll be here to tell you all about it. Make sure to subscribe and hit the bell icon to not miss out on it.
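To see where that "10 times the tokens becomes 100 times the work" intuition comes from, here is a minimal sketch of naive self-attention in plain NumPy. The sizes are toy assumptions for illustration, not anything from the Gemini paper, and real systems use heavily optimized kernels; still, the n-by-n attention score matrix is exactly the quadratic part.

```python
# A minimal sketch of why self-attention cost grows quadratically with context length.
# Toy sizes and plain NumPy only; production systems use far more optimized kernels,
# but the n-by-n score matrix is still the source of the quadratic cost.
import time
import numpy as np

def naive_attention(q, k, v):
    # scores has shape (n, n): every token attends to every other token,
    # so compute and memory both grow with n * n.
    scores = q @ k.T / np.sqrt(q.shape[-1])
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)
    return weights @ v

d = 64  # toy head dimension, chosen only for illustration
for n in (500, 5_000):  # think "1 movie" vs. "10 movies" worth of tokens
    q = k = v = np.random.randn(n, d).astype(np.float32)
    start = time.perf_counter()
    naive_attention(q, k, v)
    elapsed = time.perf_counter() - start
    # 10x more tokens -> 100x more score-matrix entries, and roughly 100x the work
    print(f"n = {n:>5}: {n * n:>12,} score entries, {elapsed:.3f} s")
```

Running it, the second case holds a hundred times more score entries than the first, which is the same scaling that turns a one-minute query into one that takes well over an hour.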

Now, there are a few really cool tidbits from the paper that I loved. When we go from 1 million tokens to 10 million, let's say from 1 movie to 10 movies, how does the accuracy degrade? It cannot be 100% accurate, right? Well, it's not 100% accurate. It is only 99.7% accurate. Even the incredible GPT-4 Turbo cannot do that. Wow! And here is another absolutely incredible insight from the paper. It learned about Kalamang. If that language does not sound familiar to you, don't despair; it is an endangered language with fewer than 200 speakers in the world. And, after being given a book about this language, it learned how to translate to it at a level similar to a person who learned from the same materials. My goodness, it can immortalize cultures and languages! I am getting goosebumps. Truly an AI assistant from the future! What a time to be alive! And just imagine what we will be able to do with just two more papers down the line. Wait, what is this? This is not Gemini; this is Gemma, a small, open model related to Gemini that you can try right now; a short sketch of how you might do that follows below. Note that it is much smaller in size and context length; don't expect the million-token thing from it. Not even close. However, it builds on a similar architecture to Gemini, and you can likely even run it on the phone in your pocket. So what is the verdict on Gemini 1.5 Pro? I have been lucky to have had access to it for a while now, and the fact that it remembers so much is incredible. That is the good part.
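Here is the sketch I mentioned: one way you might try Gemma yourself, written against the Hugging Face transformers library. The model name google/gemma-2b-it and the exact access requirements are assumptions on my part; the checkpoint is gated behind Google's terms, so check the model card before running.

```python
# A minimal sketch of trying Gemma via the Hugging Face transformers library.
# Assumptions: the "google/gemma-2b-it" checkpoint is available to you and you have
# accepted its terms on the Hugging Face model page (it may also need an access token).
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "google/gemma-2b-it"  # assumed instruction-tuned 2B Gemma checkpoint

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id)

prompt = "Explain what a long context window lets a language model do, in two sentences."
inputs = tokenizer(prompt, return_tensors="pt")

# Generate a short reply; Gemma's context is far smaller than Gemini 1.5 Pro's,
# so keep prompts modest.
outputs = model.generate(**inputs, max_new_tokens=80)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```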

Now, there are issues and limitations. You heard about the quadratic complexity; that is one. Two: if we are using the long context window to ask one thing, that is one needle in a big haystack, and there it performs nearly perfectly. But if we are looking for 50 or 100 needles, we see a bit of a degradation in its accuracy. Still quite good, though. Three: in my personal opinion, for the crazy calculations that I like to do, GPT-4 Turbo still has more horsepower, and Claude seems a great deal better at coding. It is interesting how differentiated these major large language models seem to be. Each of them is good at different things, so take that into consideration whenever you try them. But I hope this gives you the warm and tingly feeling that this is an incredible world that we live in today. Goodness, I love my job! And, if you wish to run your own experiments in the cloud, make sure to check out Microsoft Azure AI. Azure AI is a powerful cloud platform that offers you the best tools for your AI projects, with responsible AI built in.
