Omar H. Fares, Toronto Metropolitan University
OpenAI’s new generative Sora tool has sparked lively technology discussions over the past week, generating both enthusiasm and concern among fans and critics.
Sora is a text-to-video model that significantly advances the integration of deep learning, natural language processing and computer vision to transform textual prompts into detailed and coherent life-like video content.
In contrast to previous text-to-video technologies, like Meta’s Make-A-Video, Sora is able to overcome limitations related to the type of visual data it can interpret, video length and resolution.
From what OpenAI has demonstrated, Sora can generate videos of various lengths, from short clips to full-minute narratives, and in high definition, accommodating a wide range of creative needs.
Although no official release date has been announced, Sora will likely be available to the public in the coming months, judging by OpenAI’s typical pattern of public releases. For now, it’s only available to experts and a few artists and filmmakers.
At the heart of Sora’s innovation is a technique that transforms visual data into a format it can easily understand and manipulate, similar to how words are broken down into tokens for AI processing by text-based applications.
This process involves compressing video data into a more manageable form and breaking it down into patches or segments. These segments act like building blocks that Sora can rearrange to create new videos.
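To make the "patches" idea concrete, here is a minimal sketch in Python. OpenAI has not published Sora's code, so everything here is an illustrative assumption: the function names `video_to_patches` and `patches_to_video` are hypothetical, the video is a plain nested list of pixel values, and the compression step mentioned above is skipped entirely. The sketch only shows how a clip can be cut into small blocks spanning both space and time, and put back together.

```python
def video_to_patches(video, pt, ph, pw):
    """Cut a video into flat patches covering pt frames x ph rows x pw columns.

    `video` is a nested list indexed as video[frame][row][col].
    The patch sizes are hypothetical values chosen by the caller.
    """
    T, H, W = len(video), len(video[0]), len(video[0][0])
    if T % pt or H % ph or W % pw:
        raise ValueError("video dimensions must be divisible by the patch size")
    patches = []
    for t0 in range(0, T, pt):
        for y0 in range(0, H, ph):
            for x0 in range(0, W, pw):
                # Flatten one block of neighbouring pixels across space and time.
                patch = [video[t][y][x]
                         for t in range(t0, t0 + pt)
                         for y in range(y0, y0 + ph)
                         for x in range(x0, x0 + pw)]
                patches.append(patch)
    return patches


def patches_to_video(patches, T, H, W, pt, ph, pw):
    """Inverse of video_to_patches: place each patch back at its position."""
    video = [[[None] * W for _ in range(H)] for _ in range(T)]
    remaining = iter(patches)
    for t0 in range(0, T, pt):
        for y0 in range(0, H, ph):
            for x0 in range(0, W, pw):
                values = iter(next(remaining))
                for t in range(t0, t0 + pt):
                    for y in range(y0, y0 + ph):
                        for x in range(x0, x0 + pw):
                            video[t][y][x] = next(values)
    return video


# A tiny 4-frame, 4x4 "video" whose pixel values encode their own position.
video = [[[t * 100 + y * 10 + x for x in range(4)] for y in range(4)]
         for t in range(4)]
patches = video_to_patches(video, pt=2, ph=2, pw=2)
restored = patches_to_video(patches, 4, 4, 4, pt=2, ph=2, pw=2)
```

In this toy case the clip becomes eight patches of eight values each, and `restored` matches the original video. A real system would operate on compressed representations rather than raw pixels, and would feed the patches to a model instead of simply reassembling them.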
Sora combines deep learning, natural language processing and computer vision to achieve its capabilities. Deep learning helps it recognize and generate complex patterns in data, natural language processing interprets the text prompts that drive each video, and computer vision allows it to understand and generate visual content accurately.
Sora is built on a diffusion model, a type of model that's particularly good at generating high-quality images and videos. It starts from what looks like random noise and refines it step by step into clear, coherent video content.
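The denoising idea behind diffusion models can be sketched in a few lines of Python. This is a toy under loud assumptions, not Sora's method: the "video" is just a short list of numbers, the noise schedule `alpha_bar` is a made-up linear one, and the trained neural network is replaced by a hypothetical `make_oracle_denoiser` that cheats by already knowing the clean answer. What the sketch does show accurately is the shape of the loop: start from random noise, ask a denoiser what noise it sees, and use that estimate to step toward a cleaner signal (here with a deterministic, DDIM-style update).

```python
import math
import random


def alpha_bar(t, num_steps):
    """Hypothetical schedule: share of clean signal left at step t (1.0 at t=0)."""
    return 1.0 - t / (num_steps + 1)


def make_oracle_denoiser(target, num_steps):
    """Stand-in for a trained network; it peeks at `target` to 'predict' the noise."""
    def predict_noise(x_t, t):
        a = alpha_bar(t, num_steps)
        return [(x - math.sqrt(a) * x0) / math.sqrt(1.0 - a)
                for x, x0 in zip(x_t, target)]
    return predict_noise


def sample(predict_noise, length, num_steps, rng):
    """Turn random noise into a signal by repeated denoising steps."""
    x = [rng.gauss(0.0, 1.0) for _ in range(length)]  # start from pure noise
    start = list(x)
    for t in range(num_steps, 0, -1):
        a_t = alpha_bar(t, num_steps)
        a_prev = alpha_bar(t - 1, num_steps)
        eps = predict_noise(x, t)
        # Estimate of the clean signal implied by the predicted noise.
        x0_hat = [(xi - math.sqrt(1.0 - a_t) * e) / math.sqrt(a_t)
                  for xi, e in zip(x, eps)]
        # Deterministic step to the slightly less noisy level t - 1.
        x = [math.sqrt(a_prev) * c + math.sqrt(1.0 - a_prev) * e
             for c, e in zip(x0_hat, eps)]
    return start, x


target = [0.0, 0.25, 0.5, 0.75, 1.0]  # pretend these are pixel values
denoiser = make_oracle_denoiser(target, num_steps=50)
start, result = sample(denoiser, len(target), num_steps=50, rng=random.Random(0))
```

Because the stand-in denoiser knows the target, `result` lands on `target` up to floating-point error. In a real diffusion model the denoiser is a network trained on large amounts of data and guided by the text prompt, so it has to infer what a plausible clean video looks like rather than look it up.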
Sora’s approach differs from CGI character creation, which requires extensive manual effort, and from traditional deepfake technologies, which often lack ethical safeguards, by offering a scalable and adaptable method for generating video content based on textual input.
One of the most noteworthy aspects of Sora is its flexibility, as it supports various video formats and sizes, enhances framing and composition for a professional finish, and accepts text, images or videos as prompts for animating images or extending videos.
The emergence of Sora presents opportunities for businesses across different sectors. In the near future, two areas in particular are likely to see significant applications.
The first area is in marketing and advertising. Just as ChatGPT has become a marketing and content creation tool, we can expect businesses to use Sora for similar reasons.
With the public release of Sora, brands and companies will be able to create highly engaging and visually appealing video content for marketing campaigns, social media and advertisements.
The ability to generate custom videos based on textual prompts will allow for greater creativity and personalization, possibly helping brands stand out in a crowded market.
The second area Sora could impact is training and education. Companies could use Sora to develop educational and training videos that are tailored to specific topics or scenarios. This could enhance the learning experience for employees and customers, making complex information more accessible and engaging.
Other sectors, such as e-commerce, also hold promising potential for the future application of Sora. Retailers could create dynamic product demonstrations that effectively showcase products in a more engaging and interactive manner.
This would be especially beneficial for companies that want to highlight specific aspects of products that might not be easily conveyed through static images or text, or for advertising products that require a detailed explanation.
Sora could also significantly reduce the uncertainty associated with online shopping by facilitating virtual try-on experiences, allowing customers to visualize how a product, such as clothing or accessories, would look on them without the need for a physical fitting. This, in turn, could result in a better return on investment for retailers.
While there are clear opportunities ahead, OpenAI, regulators and users need to carefully consider several factors that could pose challenges, including copyright issues, ethical concerns and the consequences of increased digital noise.
With Sora’s ability to generate lifelike video content, there’s a risk of inadvertently creating videos that infringe on existing copyrights. OpenAI has already been sued several times over copyright infringement and intellectual property issues.
OpenAI hasn’t disclosed where the data used to train Sora comes from, but it did tell the New York Times that the training included both publicly available videos and videos licensed from copyright holders.
The technology also raises ethical questions, particularly around the creation of deepfake videos or misleading content.
Establishing guidelines and safeguards to prevent misuse will be essential for maintaining trust in the technology. In a post on its website, OpenAI stated it was working with experts to test the model before releasing it to the public.
As more businesses and individuals gain access to Sora, there’s a potential for an increase in low-quality or irrelevant video content, leading to increased “digital noise” that could overwhelm users. Finding ways to filter and curate content will become increasingly important for businesses looking to maintain their edge.
Last, but certainly not least, is the question of how Sora will impact the job market for content creators. Like ChatGPT, Sora has the potential to automate certain aspects of production work, in this case video production, but it’s unlikely to replace human creativity and insight anytime soon.
Instead, Sora could serve as a tool that enhances the capabilities of content creators, allowing them to produce higher-quality content more efficiently. As with any technological advancement, the key will be for professionals to adapt and find ways to integrate Sora into their workflows, leveraging its strengths to complement their own skills and creativity.
--
Omar H. Fares, Lecturer in the Ted Rogers School of Retail Management, Toronto Metropolitan University
This article is republished from The Conversation under a Creative Commons license. Read the original article.