This has been a very interesting week in AI and I’m back to present you with another episode of AI News You Can Use. Because we have new features and updates from ChatGPT, a bunch of new open source news and apps that you will actually want to show your friends after you watch this episode.
Welcome to AI News You Can Use. So hey, let's not waste any more time and dive right into this. First things first, let's talk about ChatGPT updates, because for most people watching this, including me, it is still the primary AI we return to regularly. And the new feature here is quite simple but actually very impactful. You might be familiar with the image generation capabilities of ChatGPT with the DALL·E 3 model: if you have the Plus plan, you can generate an image of pretty much anything in a chat with GPT-4, and it will do it. But now we have an additional feature. We can do so-called inpainting, which means we can actually click this picture of the happy alpaca, and you will see a new button up here saying Select. And this is so powerful, because we can now increase or decrease our brush size. Let's just say we want to change the eye color: I'm just gonna paint over these two eyes and say, make them blue. And it's going to regenerate the image, but not the entire image. As you might already guess, it changes the eyes of the alpaca to blue, like so. And you can do this with any image, multiple times.
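Quick aside for the technically curious: as far as I know, the ChatGPT selection tool itself isn't exposed through the API, but OpenAI's older images.edit endpoint does a very similar masked edit with DALL·E 2. You upload the image plus a mask whose transparent pixels mark the region to repaint. A minimal sketch, assuming you have the openai Python package installed, an API key set, and a pre-made mask file:

```python
# Minimal sketch of masked inpainting via OpenAI's images.edit endpoint.
# Assumptions: OPENAI_API_KEY is set, alpaca.png is a square PNG, and
# mask.png is the same image with the eye region made fully transparent,
# which is how the API knows which pixels to regenerate.
from openai import OpenAI

client = OpenAI()

result = client.images.edit(
    model="dall-e-2",  # the edits endpoint targets DALL-E 2, not DALL-E 3
    image=open("alpaca.png", "rb"),
    mask=open("mask.png", "rb"),
    prompt="A happy alpaca with bright blue eyes",
    n=1,
    size="1024x1024",
)

print(result.data[0].url)  # link to the edited image
```

Note that the prompt describes the full desired image, not just the patch, which is roughly how the ChatGPT selection feature behaves too.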
Back in ChatGPT, you can also go up here and say add a sun, or whatever it might be. This is something that other tools like Midjourney or Leonardo already have, and it is one of the most important features, because look, you might just want a sun in your picture, and there was no good way of doing that until now without leaving the program. Not everybody knows Photoshop or wants to spend time with it. One thing I was immediately excited about here is the ability to edit text, but after testing it a little, I found out that it's not really good at that. Here I tried to remove this text and replace the main text with something else; it doesn't really work at all, so you'll still need external image editing tools for that.

Okay, and a second quick update from ChatGPT: it's now accessible without logging in. Now, I'm in Europe, so this has not rolled out to me yet, but team members from the US report that you can just go to chat.openai.com and start using GPT-3.5 for free. This puts it on par with other free offerings, and we have so many open source models now that are outright better than GPT-3.5 and freely available across the internet, so they kind of had to do this at some point. Nevertheless, if you want somebody new to try ChatGPT, they can just go to the site, and from here on out, they shouldn't have to log in anymore.

Okay, moving on to the next one, which is Stability AI's Stable Audio 2. And they made this completely free, which was not the case with their previous model. If you're not aware, they recently had a leadership shift, and the direction of the company is changing: they're trying to generate more revenue.
So a lot of their stuff is behind their membership, but Stable Audio you can just use straight up. What does it do? It generates music without lyrics. How good is it? Really good, very solid. If you need background music, this is a fantastic tool; I'm sure there are many more use cases, that's just the one that comes to mind for me. So what are the key points here? Up to three minutes of audio length, and you get this interface where you do it all. You can try it for free: just a quick Google log-in later, and I have 20 tracks that I can generate here. A few important points are made here. First of all, this is commercially usable, because it builds on Stable Audio 1 and was trained only on a dataset that they actually licensed, so you can fully use these tracks. Second of all, and most interestingly, it's not just text-to-audio; it's also audio-to-audio. So I could record something here, and it would turn that into a track. And that's exactly what I'll do in this demo. Okay, exclusively for this, I'll dust off my rusty beatboxing skills from 10 years ago or so. I never took it seriously, but it might just come in handy here. So let's see, I'll record, and let's try to create some background beats here. Okay, that's something; I'll upload it, and you know, if I can do this, you can do this too. That's the magic of it: it's just going to transform this into proper music. So let's have a look. Let's pick something from the prompt library, maybe a drum solo. I like that. Using the 2.0 model, perfect. Do an 11-second-long clip, and I'd have a drum solo for the intro to my YouTube video, maybe. Let's see, give me a result. Oh, let's have a listen. Wow, that's a nice little jazzy drum solo right there. There you go. It's free, it's fun, you can just put random noises into the mic, and it's going to turn them into a song. Yeah, thank you so much, Stable Audio. Moving on.

All right, so as mentioned before, this weekly series is all about AI news that you can use, which essentially means that I'll be telling you about technologies that will enhance your skills to achieve your goals. But here's the thing: you need a good baseline of skills to be enhanced, because a lot of these tools don't do things from scratch; you need some sort of input. Take me, for example: if I didn't have any coding skills whatsoever, I wouldn't have been able to create this video. I guess my point is that these tools are often just extensions of yourself.
So that raises the question: how do you acquire some of these base skills? Now let's talk about the open source space. I get a lot of comments saying, hey, you're not covering open source enough, but there's a good reason for that. I try to show you stuff that you can and should use, and open source is for people who build apps, or for people who really care about privacy; most people just want the best results possible, and that's where you get to the closed-source models like GPT-4, Claude 3, or Gemini. But nevertheless, I try to cover all the important open source releases, because they are relevant to a lot of people. And we got a brand new model this week: DBRX from Databricks. This one is the new best-in-class open source model. But wait a minute, it's not fully open source, because they released it under the Databricks Open Model License, which is almost open source but not quite. Similar to Llama, it has a clause saying that if you have over 700 million monthly active users, you must request a license from Databricks. If you want to look at the details, I'll link it below, but I'm aware that most of you care about how well this performs. And for that, we shall have a look at the benchmarks over here, though over time only real usage will show the true picture. As you can see, on the popular MMLU benchmark it is actually better than Llama, Mixtral, and Grok-1, while being way, way smaller. On programming it's absolutely amazing; this is its main claim to fame, as it is in math. And the big advantage on top of that, and probably also the reason why this has been the most popular space on Hugging Face over the last week by a long shot, is that it's two times faster at inference and costs nearly four times less compute to train. So it's not just better than all of these; it's also way more efficient to train, and it responds twice as fast as some of these other models; that's what inference refers to.
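For the builders among you, DBRX is on Hugging Face, so in principle it loads like any other transformers model. A minimal sketch, assuming the databricks/dbrx-instruct checkpoint and, realistically, a multi-GPU machine, since this is a roughly 132-billion-parameter mixture-of-experts model:

```python
# Minimal sketch of running DBRX Instruct with transformers.
# Assumptions: `pip install transformers torch`, access to the
# databricks/dbrx-instruct repo on Hugging Face, and enough GPU memory
# (this is not a laptop model; device_map="auto" shards it across GPUs).
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "databricks/dbrx-instruct"
tokenizer = AutoTokenizer.from_pretrained(model_id, trust_remote_code=True)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    trust_remote_code=True,  # DBRX ships custom model code
    device_map="auto",
    torch_dtype="auto",
)

messages = [{"role": "user", "content": "Write me an essay about penguins."}]
input_ids = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

output = model.generate(input_ids, max_new_tokens=200)
print(tokenizer.decode(output[0][input_ids.shape[1]:], skip_special_tokens=True))
```

If you just want to poke at it, though, the hosted Hugging Face space I'm about to show you is the zero-setup option.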
There is a Hugging Face space up where you can try this out today. I'll just go with the obligatory "write me an essay about penguins", and look at this inference speed: it's right there, it's super fast. There's a limit on the token output, but you can test it in here, and apparently it's really good at coding tasks. Over time, though, I've learned to give these releases a little time, get them into users' hands, and get the community's feedback on how they're actually performing. And to round out this little coverage of the open source space, there is a brand new Dolphin 2.8 Mistral model, and this one is a fully uncensored model. We talked about these in previous episodes, and I showed you how to use one in a Replicate space. If you're not aware, these Dolphin models are completely uncensored, meaning literally every single question that you could think of, this thing will answer, and this is the newest version. When we covered it, I believe it was Dolphin 2.2, but this one has been further refined and is supposed to perform better. I'll be playing with it, but unfortunately, I haven't found a super easy way to run it yourself without downloading it and running it locally. I'll keep an eye on this, as a lot of people really enjoy using these uncensored models; if I find a simple way to use it, I'll report back next week.

Alright, so moving on to the next piece of AI news that you can use. This is less of a use case and more of a piece of knowledge that you should absolutely be aware of. Namely, Ethan Mollick is tweeting about a brand new paper that studies how well AI detectors perform, and the findings are sobering, to say the least. Basically, they tested all these different detectors that claim to reliably detect AI. This table right here is what I want you to pay attention to, because it looks at GPT-4 and how accurately detectors flag its text as AI-generated. And the one thing I want you to notice is how all over the place this is, okay? All the way from detection of Bard text completely falling apart when you use some of these techniques, to GPT text reading as more AI-like with certain techniques and less AI-like with others. In other words, it depends on what model you're using, and there are all these techniques that can fool these detectors pretty reliably. On top of that, the paper also talks about how people from a foreign background, where English might not be their first language, often get flagged as AI. So just imagine you're doing your semester abroad at university, you submit your English paper, the teacher runs it through an AI detector and goes, hey, you used AI. And you're just like, what? No, English simply isn't my first language, what are you talking about? But it won't matter, because they feel like their tool is delivering an accurate result, which is obviously not the case. So this just confirms what we already covered on the channel when it happened: OpenAI withdrew their own AI detection software because it just did not work reliably no matter what they did. It was too easy to fool.
So just using AI detectors is not going to be the solution to this whole problem of how we identify AI-written text. AI-written text is just a part of society now, and we have to find other ways of dealing with it. I thought this was really significant because, whatever you're using AI for, this is a factor in all of it, right? And now we know that these detectors aren't something anybody should be relying on, because adding a few spelling errors is something everybody could do, right?

Okay, moving on, here's a fun little app; I showed you a similar one for Mac a few weeks ago. It essentially takes all the images on your computer and names them with GPT Vision. Now, the Mac one was even pricier than this, and you do have to use your own API key to name each and every one of those images. It's not a service where you pay $5 and it renames your entire iCloud photo library, but if you cover the API costs, you could totally do that. And this is the Windows version of it, which is really easy to use now. At the time of recording, it costs $25, which I think is actually quite reasonable, and then, as mentioned, you use the API to name all the different image files. So if you have messy hard drives and a bunch of unnamed pictures and you don't know how to organize them, this might just be a great way. Look, you just select them, hit RenAI, and all of a sudden all of them will be named according to what is actually inside the picture. Perfect for a messy desktop like mine, which I will not show you because I'm ashamed of it. Cut that out. Nope. If you take a lot of pictures or screenshots, RenAI might just be useful to you; I'll show a rough code sketch of the general idea in a moment. And nope, this is not sponsored. I just found it really interesting, and I figured it could be really interesting to many of you.

Okay, so next up, we have a very, very interesting one. We've talked about AI avatars on the show a lot, and a lot of different ones have come out, but I think everybody I've talked to recently agrees that HeyGen is kind of the leader of the pack there. And yet again, they're pulling ahead by soon releasing a new feature where the virtual avatar is actually in motion: the person is walking while presenting the words that you give them, okay? So what you get to do today is go to this link, demo.heygen.com/avatar-in-motion, link in the description below, give it a quick phrase, and fill out your email; after a little bit of waiting, it's going to email you the video. So what I did was let the avatar introduce this weekly show and give him a German name, because that's a little tricky and I wanted to see how they handle it. Let's have a look at how it went. "Welcome to AI News You Can Use. I am your host for today, Rüdiger Schulz. This has been a demo of HeyGen's Avatar in Motion." What do you think? I think this is extremely impressive. If you're in a rush, scrolling through your Instagram feed, you will not pick up on the fact that this is AI-generated. And we're just getting started; this is literally the first version of this that we see in production. There's no other company that does this well that I'm aware of, but as per usual, the comment section is there: if you know of one, please leave a comment below.
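Quick detour, as promised: here's the rough sketch of that image-renaming idea from a minute ago. To be clear, this is not RenAI's actual code; the folder name, prompt, and helper are all mine, it's just the general pattern: loop over a folder, ask GPT-4o for a short description of each image, and rename the file accordingly. Every call is billed to your own API key.

```python
# Rough sketch of GPT-Vision-based file renaming (not the RenAI app's code).
# Assumptions: `pip install openai`, OPENAI_API_KEY is set, and a folder
# called "screenshots" full of PNGs. Each image is sent to GPT-4o, which
# costs a small amount per call.
import base64
from pathlib import Path
from openai import OpenAI

client = OpenAI()

def describe(path: Path) -> str:
    """Ask GPT-4o for a short, filename-safe description of the image."""
    b64 = base64.b64encode(path.read_bytes()).decode()
    resp = client.chat.completions.create(
        model="gpt-4o",
        messages=[{
            "role": "user",
            "content": [
                {"type": "text", "text": "Describe this image in 3-5 lowercase "
                 "words joined by underscores. Reply with the name only."},
                {"type": "image_url",
                 "image_url": {"url": f"data:image/png;base64,{b64}"}},
            ],
        }],
    )
    name = resp.choices[0].message.content.strip()
    return "".join(c for c in name if c.isalnum() or c == "_")  # sanitize

for img in sorted(Path("screenshots").glob("*.png")):
    new_name = describe(img) + img.suffix
    img.rename(img.with_name(new_name))
    print(f"{img.name} -> {new_name}")
```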
Okay, back to HeyGen. Basically, you can get your own little custom demo like this for free, and then you can download it and send it to your friends. You can have the avatar say pretty much anything that doesn't go against their policies. And with that, let's look at our future use case, something we like to do at the end of the show to give you a glimpse of what's coming up. This week, I'll be showing you one of the most exotic ways to evaluate an AI, right here in this GitHub repo called LLM Colosseum. As we talk about all these open source models and new releases, it's getting increasingly hard to benchmark them, because six months ago, as a user, you could just ask a new large language model, hey, create me a game of Snake, and then see if it actually completed it or got stuck halfway. These were very simple but quite effective little benchmarks for real-world use cases, but over time the model makers adjusted and started adding such examples to their training data. From what it seems like, some of them are just trying to get a model that performs well on all the benchmarks and handles all the test cases that people across YouTube and Twitter throw at it. Then it looks really good on paper, but in practice people look at it and go: okay, who cares that it benchmarks better than GPT-4? It's not nearly as good. So inside our AI Advantage community, we had a very interesting discussion on how these things will be benchmarked moving forward.
And Daniel from the team brought my attention to this LLM Colosseum GitHub repo, which is essentially a brand new way of benchmarking and measuring how good these models are. And hear me out: the way it does it is by letting two large language models play Street Fighter turn by turn, okay? It uses vision to analyze the frame, decides on the next move, then does the same for the second large language model, and whoever wins Street Fighter is the better LLM. I mean, admittedly, this is a bit ridiculous, but I found the idea interesting enough to share with you. Who knows how these models will be evaluated in the future? If we're going to have personal AI assistants, at the end of the day it matters that they actually help our lives, not that they top some arbitrary benchmark. So as you can see, it looks at the game, it's asked what its next move should be, the LLM decides, the move is executed, and so on. So there you go, that was our little preview corner, where we look at what's coming up. I think both corporations and the community are going to keep coming up with these exotic benchmarks, and then you get to choose which ones you care about. I just wish we had a better measurement than these standardized benchmarks that everybody seems to be gaming these days; maybe letting the AI play a game of Street Fighter is part of the answer. Time will tell. Okay, now that you've checked out all the news that you can use, you might just be interested in the ChatGPT for beginners playlist I have on the channel. I've created a lot of ChatGPT tutorials over the years, and here I've collected all of them and organized them chronologically. So if you want to get more out of large language models, this is a great and completely free way of doing that.