OpenAI Sora: Beauty And Horror!


OpenAI’s text-to-video AI Sora took the world by storm just a few weeks ago. We all know these results, but good news: there are some new ones that I found absolutely stunning. And I learned a lot. Stay tuned, because here you will see beauty and imagination, and you will see horror. I was not ready to see the horror. Part one: let’s start with the beauty.

Here is a house tour. I absolutely love how well made this is, but what is perhaps even more impressive is that it really seems to have a model of the house itself. It does not feel like it is just making things up as it goes. And as I am a light transport researcher by trade (that is ray tracing, if you will), when I look at these beautiful reflections and refractions everywhere, this makes me really, really happy. And classy material models too. They are not perfect, but leaps and bounds beyond any AI-based text-to-video I have seen so far. And also, look, the resolution of some of the geometry is quite interesting; for instance, the bedsheets. This almost feels like being in a video game where the detail has been toned down a little. Perhaps it also looked at materials coming from video game engines; I don’t know for sure, this is just speculation. However, when we look at this colorful paper flower blooming, I see high-resolution textures. There are lots of tiny folds and imperfections everywhere. It is quite interesting how it combines low- and high-resolution materials. Now, how about a little more action? Oh yes, here, I love how it understands the physics of the race car, and the movement of the wheels is also incredible. Not perfect, but incredibly good.
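For the Fellow Scholars curious what "reflections and refractions" mean in light transport terms: a toy sketch of the two formulas every ray tracer evaluates at a surface. This is standard textbook math, not anything from Sora itself; the function names are my own.

```python
import math

def reflect(d, n):
    """Mirror reflection of direction d about unit normal n: r = d - 2(d.n)n."""
    dot = sum(di * ni for di, ni in zip(d, n))
    return tuple(di - 2.0 * dot * ni for di, ni in zip(d, n))

def refract(d, n, eta):
    """Refract unit direction d through a surface with unit normal n,
    where eta = n1 / n2 is the relative index of refraction (Snell's law).
    Returns None on total internal reflection."""
    cos_i = -sum(di * ni for di, ni in zip(d, n))
    sin_t2 = eta * eta * (1.0 - cos_i * cos_i)
    if sin_t2 > 1.0:
        return None  # total internal reflection: no transmitted ray
    cos_t = math.sqrt(1.0 - sin_t2)
    return tuple(eta * di + (eta * cos_i - cos_t) * ni for di, ni in zip(d, n))

# A ray heading down at 45 degrees bounces off a horizontal floor
# and leaves at 45 degrees on the other side.
r = reflect((1 / math.sqrt(2), -1 / math.sqrt(2), 0.0), (0.0, 1.0, 0.0))
```

The remarkable part is that a ray tracer computes this explicitly for every bounce, while Sora appears to have learned something with similar visual effect purely from video data.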

Also, a tiny little detail that only appears in the last few frames but is super important. Look. Oh my, the interaction of the wheels and the dust. Absolutely lovely. Part two: creativity. Oh boy, this is going to be amazing. It can reimagine Niagara Falls, but with colorful paint instead of water. Once again, it puts on a clinic with its understanding of physics. As an undergrad student, it took me months to understand the underlying hydrodynamics, and this one, after training, can create these in a matter of seconds. I am not jealous. Okay, just a little bit. But it gets better here, in an abstract animation, where drops of ink form lifelike creatures. This, dear Fellow Scholars, is not just fluid simulation but also fluid control. Once upon a time, I wrote a paper on how to bend the laws of physics to control fluids so that they take the different shapes that we are looking for, and it was extremely difficult. Almost impossible. Yet, once again, the AI can pull it off just fine. I am completely stunned by this. Once again, I am not jealous. Just a little. But the creativity gets crazier. Much crazier. Yeah, if you wish to see a horse on roller skates, that’s not a problem. Once again, it is not perfect. I keep saying that for a reason; I’ll get to it in just a moment. Or an elephant made out of leaves—not a problem. So much so that they even move properly, and the specular reflections off of them are absolutely incredible.
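To give a flavor of why fluid control is so hard: the core trick in control papers is to add a small guiding force that nudges the fluid toward a target shape without destroying its natural motion. Here is a deliberately minimal particle sketch of that idea; it is my own illustrative toy, not the method from my paper and certainly not what Sora does internally.

```python
def control_step(positions, velocities, targets, strength=0.5, dt=0.1):
    """One explicit-Euler step of a toy 'fluid control' force:
    each particle feels a spring-like pull toward its target position,
    so the ensemble gradually forms the desired shape."""
    new_p, new_v = [], []
    for p, v, t in zip(positions, velocities, targets):
        # guiding force proportional to the offset from the target
        force = tuple(strength * (ti - pi) for pi, ti in zip(p, t))
        v = tuple(vi + fi * dt for vi, fi in zip(v, force))
        p = tuple(pi + vi * dt for pi, vi in zip(p, v))
        new_p.append(p)
        new_v.append(v)
    return new_p, new_v

# Two particles drift toward a target line segment over many steps.
pos = [(0.0, 0.0), (0.0, 1.0)]
vel = [(0.0, 0.0), (0.0, 0.0)]
tgt = [(1.0, 0.0), (1.0, 1.0)]
for _ in range(200):
    pos, vel = control_step(pos, vel, tgt)
```

In a real simulator this force competes with pressure, viscosity, and incompressibility constraints, which is exactly what made the problem "almost impossible" by hand.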

Perhaps a bit too bright, but still, I love it. And now, may I interest you in a cosmic tea, Fellow Scholars? Interestingly, it is made out of a more viscous material, a bit more like honey or oil than regular tea. I am loving how not only the theme but also the movement of the fluid has some creativity. And so far, you have seen prompts, but you haven’t seen what else it can do. For instance, it can even mix the content of two videos together. It is able to use this city for most of the content and place this little winter wonderland in it. However, to do that, it has to be able to draw it from different viewpoints. And it not only does that, but it does it consistently. We talk a lot about neural radiance fields and other papers where researchers had to work extremely hard to do even a rudimentary version of this. It is hard, hard work. And once again, the AI just does it. I can’t believe it. I hope you appreciate this additional context from other papers. Now, part three. The horror. This previous unassuming robot video is doing pretty well. But then… are you ready for this? Are you sure? Okay? Ouch! The legs are attached incorrectly to both people, and when we switch camera angles, they have grown some new ones. So, I keep saying that these are not perfect, so how much of a problem is that? How is it possible that it can create beautiful, often almost impeccable, artistic videos with fantastic camera work, consistency, and the whole package? And yet, it almost feels like it sometimes imagines people as Ikea furniture and attaches some of the parts completely incorrectly.

I have a guess. Once again, this is just speculation. But, look here, when we don’t have a great deal of compute, we get something like this. As we add more computational power, the whole concept really comes to life, and our current compute capacity might very well be the before version that pales in comparison to what we will have just a few months from now, or maybe a year from now. We need just one more paper. And, as the technology gets cheaper and cheaper, everyone will be able to become a movie director. Soon you will be able to get 10,000 variants of the same prompt in a matter of seconds. What a time to be alive! I am also visiting the OpenAI lab in a few weeks, and I really hope that I’ll be able to run some prompts to show you in a video. You know, videos of scholars holding on to their papers need to be made. But, in any case, all this will likely be available later this year. I cannot wait. Wow.

If you’re looking for inexpensive cloud GPUs for AI, Lambda now offers the best prices in the world for GPU cloud compute, with no commitments or negotiation required. Just sign up and launch an instance. And hold on to your papers, because with the Lambda GPU cloud, you can now get on-demand H100 instances, and they are one of the first cloud providers to offer publicly available on-demand H100 access. Did I mention they also offer persistent storage? So join researchers at organizations like Apple, MIT, and Caltech in using Lambda cloud instances, workstations, or servers.

About Anushka Agrawal
