The world has been wowed by the newest displays of text-to-image technology: DALL-E 2 from OpenAI and Imagen from Google. Both produce beautiful, amazingly creative compilations, all generated by artificial intelligence (AI) systems. This is possible because the AI has learned to connect natural language with visual content by looking at countless texts and images.
Today’s systems have been trained to output new images when text is entered alongside pictures, uniting two seemingly disparate things in unique ways, much to the delight of viewers. A traditional image, such as that of an oil painting, can be co-opted to express something new or evoke a completely different feeling. It’s a new way to create.
Imagine that with this AI technology, users no longer have to scroll through tons of image results to find the best content for their needs. In contrast with image search, people create something totally new, something that has never existed, something that perfectly suits their desires, whims or content direction. All they have to do is type in what they want, and the AI will draw images and construct photos as described in the given text. For example: “Please give me a photo of a restaurant with a VentureBeat sign on the window that is on Mars.” New systems will return such photos. In essence, the system is an AI designer.
Such creative power grows exponentially when it is also deployed to make videos from text that describes a situation and mood and incorporates virtual actors, or when AI composes music from text to accompany those videos and images.
Text-to-video and text-to-music technologies are already commercially available and continue to be refined, opening up the potential to add more creative processes. If AI can draw images or design, human designers’ roles could evolve. Having brilliant and inspiring ideas would become more important, and the ability to discriminate the best output from the worst will be crucial as drawing skills might be replaced by AI. Those less skilled could also develop their own creative products with less effort.
The new company creative
But it is not just about what can be done for the sake of being creative; it is about how such technology can be used to influence our world. There are certainly many entertaining or even heartwarming uses that might enrich our personal lives, but how will AI reshape creativity in the business world?
To start with, it can dramatically reduce the time, money and resources spent by marketing and ad teams. New campaigns could be rolled out in the blink of an eye. AI-based creative would enable teams to respond to changes in the landscape, react to news or trends, or proactively launch products and services in entirely new ways.
Additionally, such content and materials could be readily replicated in multiple languages, using AI to enable companies to quickly, easily and affordably reach global audiences, further aiding in international expansion. These are powerful reasons to invest in AI technology for creative uses.
The evolving role of humans
The biggest question on many people’s minds, though, is what the deployment of such technology could mean for the role of humans. If they are no longer creating the creative, are they even needed? I would argue that they are still creating the creative, just with different tools that make it easier and more cost-efficient.
In a world where AI systems could enhance creative processes, humans would still be expected to take on higher-level tasks, such as developing ideas, giving instructions, evaluating, revising and making final decisions, and they would have far more options at their disposal. They would be responsible for constructing and defining the elements of the composition but without the burden of putting it together. By using AI tools, people could perform various activities more easily, increasing both productivity and creativity.
Potential for misuse
As with any groundbreaking technology, there is the potential for misuse. We’ve all seen how images, hate speech and disinformation have spread on social media. What would make AI-generated content different? I believe that our society can find ethical consensus for using the technology in positive ways, and we have the option of regulating if necessary.
One potential issue is copyright infringement or plagiarism when an AI system is applied to creative development. DALL-E 2 was trained on a vast number of online images, and it is possible for it to return an image that is very similar to an existing one. Likewise, issues can surface with AI writers, AI music compositions and other types of AI-generated content.
Recently, for example, virtual humans with AI-generated voices and faces have become popular across the globe. Because these systems are built on large amounts of training data, a virtual human’s face or voice can closely resemble an actual person’s.
Applying human rules to AI’s creations
But our society has already come to a consensus about plagiarism of writing or composition by humans. For AI, similar guidelines could be applied to the creators, and if needed, an AI-based plagiarism checker could help review users’ decisions for absolute clarity. Humans are in control. The content creators define the rules for how text, images, videos and voice can be combined; they set the course.
AI for creative uses will be leveraged to elevate brands. As such, should more oversight be necessary, AI vendors that make these advances possible may also be selected based on the types of licensing relationships they have, the volume and quality of images they have access to, the range of voice actors under contract, their capabilities to combine such assets to create unique footprints and much more.
And if or when such measures fall short, new technologies are being developed rapidly that can preserve digital identities and the authenticity of images. For instance, every human voice and every face is composed of tens of thousands of characteristics. The same is true for images. This makes it very difficult to fully replicate them without permission.
Significant research is already underway on deepfake detection. Similarly, through the strategic and appropriate use of technology, researchers and data scientists are able to deconstruct the characteristics of a speaker’s voice to determine whether a single unique voice was used in a video or audio snippet or whether many voices were blended together.
And researchers are hard at work developing other preventative solutions. The technology industry is learning from past mistakes in order to safeguard the future, particularly when it comes to AI.
We sit at the precipice of a moment when creativity can make a big leap forward. Amazing things will be possible if we only open our human minds to what could be.
Taesu Kim is the CEO of Neosapience.