OpenAI, a leading figure in artificial intelligence, has unveiled its latest AI model, Sora, which can craft “realistic” and “imaginative” videos up to 60 seconds long from brief text prompts. According to a blog post released on Wednesday, Sora can generate videos with multiple characters, specific types of motion, and intricate background details, all derived from textual instructions.
The blog post emphasized that Sora understands not only what the user asks for in a prompt but also how those things exist in the physical world. OpenAI says its goal is to train models that help people solve problems requiring real-world interaction, marking a significant step in the company’s commitment to advancing generative AI capabilities.
While similar “multi-modal models” and text-to-video models exist, Sora stands out for the claimed length and accuracy of its output, according to Reece Hayden, a senior analyst at ABI Research. Hayden anticipates an impact on digital entertainment markets, particularly in the creation of personalized content for various channels, for example scenes that support narratives in television.
Despite these advances, OpenAI acknowledges that Sora is a work in progress and identifies certain “weaknesses.” The model can confuse spatial details, such as left and right, and may struggle with cause-and-effect relationships. For instance, it may create a video of someone biting a cookie without showing a corresponding bite mark.
OpenAI emphasizes safety as a priority and plans to work with experts to evaluate the model, particularly in areas like misinformation, hateful content, and bias. The company is also developing tools to detect misleading content. Sora will initially be accessible to “red teamers,” specialists who probe the model for harms and risks, and to visual artists, designers, and filmmakers, who will provide feedback on creative applications.
This update follows OpenAI’s ongoing efforts with ChatGPT. A recent development involves testing a feature that allows users to control ChatGPT’s memory, enabling personalized conversations by instructing the platform to remember or forget specific information from past chats.