In 2022, OpenAI released DALL·E 2, a text-to-image AI system, to much excitement. In rapid succession, Stability AI released Stable Diffusion, an open-source text-to-image model that delivers similar results through a different process. A user can describe in text what they would like to see, for example, “3D render of a cute tropical fish in an aquarium on a dark blue background,” and both DALL·E 2 and Stable Diffusion can generate images matching that description, each drawing on its own training data.
This raises the question: what does this mean for video? Westworld fans may have glimpsed it this past season, when Dolores speaks a story and a 3D world is generated around her words. While we’re not fully there in the real world, AI has started to shake up video. There are already generative models that create high-quality 3D shapes with realistic textures. From a text description, existing models can also generate a panoramic scene with photorealistic lighting. Text alone can likewise produce 3D avatars with specific attributes.
You can imagine that if all the pieces above are used together, 3D world building and engagement are going to become much more interesting. Say you’re a baseball fan. You could generate an avatar of yourself as a player, place yourself in a realistic 3D replica of your team’s stadium, and generate a short video of yourself hitting a home run. Your team would be remiss not to play the best of this content at the upcoming game.
Compelling applications of this technology combine real action with imagined settings. Imagine an online class where a geography teacher explains volcanoes. The teacher can be recorded in real life, and the setting of that recording wouldn’t matter, because generative AI can replace the background with a realistic 3D replica: first a classroom, then a field next to an active volcano, and then a lab. Not only would this streamline the production process for educational content, it would also increase student engagement.
As AI matures to generate even more realistic multimedia worlds, the possibilities for creative applications are endless. We are excited to be part of this transformation!
We’re inviting 1000 early access users to help us build the future of virtual production. Join the waitlist.