Cartoonists have an excellent understanding of how stories are shaped in a concise way with an eye for design. Recently, cartoonist extraordinaire Roz Chast appeared in the New Yorker prompting DALL-E images and I was immediately drawn to her prompts above and beyond the actual output of the machine.
The article’s title, “DALL-E, Make Me Another Picasso, Please” is a play on words like the old Lenny Bruce joke about a genie in a bottle giving an old man anything he wants. The old man asks the genie to “make me a malted” and poof! the genie turns him into a milkshake.
Like the genie’s gift, AIs are powerful but unruly and open to abuse, making the intercession of a prompt engineer a new and important job in the field of data science. These are people who understand that in constructing a request they will rely on artful skill and persistence to pull a good (and non-harmful) result from the mysterious soul of a machine. The best AI prompt engineers would be those who would actually consider whether there is a need for more derivative Picasso art, or what obligations should be considered before asking a machine to plagiarize the work of a famous painter.
Lately, concerns have centered around whether DALL-E will change the perpetually muddy definition of artistic genius. But asking who gets to be called a creative misses the point. What art is and who gets to claim the title of artist are philosophical (and infrequently ethical) questions that have been argued for millennia. They don’t address the fundamental fusion happening between data science and the humanities. Successful prompt craft, whether for DALL-E or GPT-3 or any future algorithm-driven image and language model, will come to require not only an engineer’s understanding of how machines learn, but an arcane knowledge of art history, literature and library science as well.
Artists and designers who claim that this kind of AI will end their careers are certainly invested in how this integration will progress. Vox recently published a video titled “What AI art means for human artists” that explores their anxiety in a way that acknowledges there is a very real evolution at hand despite the current dearth of “prompt craft” and wordsmithing involved. People are just starting to realize that we may reach a point where trademarking a word or phrase would not protect intellectual property in the same way that it does currently. What aspect of a prompt could we even copyright? How would derivative works be acknowledged? Could there be a metadata tag on every image stating whether it is “appropriate or permitted for AI consumption?” No one seems to be mentioning these speed bumps in the rush to get a personal MidJourney account.
Alex Shoop, an engineer at DataRobot and an expert in AI systems design, shared a few thoughts on this. “I think an important aspect of the ‘engineer’ part of ‘prompt engineer’ will include following best practices like robust testing, reproducible results and using technologies that are safe and secure,” he said. “For example, I can imagine a prompt engineer would set up many different prompt texts that are slightly varied, such as ‘cat holding red balloon in a backyard’ vs. ‘cat holding blue balloon in backyard’ in order to see how small changes would lead to different results even though DALL-E and generative AI models are unable to create deterministic or even reproducible results.” Despite this inability to create predictable artistic outcomes, Shoop says he feels that at least testing and tracking the experimentation setups should be one skill he would expect to see in a true “prompt engineer” job description.
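The testing and tracking Shoop describes might be sketched as follows. This is a minimal illustration in Python, not a description of any real DALL-E tooling: the template, the function names, the "trial" field and the model label are all assumptions made for the example, and no image model is actually called. It only shows how slightly varied prompts could be generated and logged so that each experimental setup is recorded, even when the generated images themselves cannot be reproduced.

```python
import hashlib
import itertools
import json

# Hypothetical prompt template based on the balloon example in the text.
TEMPLATE = "cat holding {color} balloon in a backyard"


def build_variants(template, colors):
    """Return one prompt per color, differing only in that single word."""
    return [template.format(color=color) for color in colors]


def experiment_record(prompt, model, trial):
    """Describe one experimental setup and give it a stable identifier.

    The identifier is a hash of the setup itself, so logging the same
    prompt, model label and trial number always yields the same id.
    """
    setup = {"prompt": prompt, "model": model, "trial": trial}
    encoded = json.dumps(setup, sort_keys=True).encode("utf-8")
    digest = hashlib.sha256(encoded).hexdigest()[:12]
    return {"id": digest, **setup}


def build_log(template, colors, model, trials):
    """Pair every prompt variant with every trial number."""
    prompts = build_variants(template, colors)
    return [
        experiment_record(prompt, model, trial)
        for prompt, trial in itertools.product(prompts, trials)
    ]


# "hypothetical-image-model" is a placeholder label, not a real model name.
log = build_log(TEMPLATE, ["red", "blue"], "hypothetical-image-model", [0, 1])
for record in log:
    print(json.dumps(record))
```

Because the output of a generative model may differ on every run, the log records several trials per prompt instead of assuming that one run is representative. The images produced for each record could then be stored under the record's identifier and compared by a human reviewer.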
Before the rise of high-end graphics and user interfaces, most science and engineering students saw little need to study visual art and product design. They weren’t as utilitarian as code. Now technology has created a symbiosis between these disciplines. The writer who contributed the original reference text descriptions, the cataloguer who constructed the metadata for the images as they were scraped and then dumped into a repository, the philosopher who evaluated the bias implicit in the dataset all provide necessary perspectives in this brave new world of image generation.
What results is a prompt engineer with a combination of similar skill sets, one who understands the repercussions if OpenAI uses more male artists than female, or if one country’s art is represented more than another’s. Ask a librarian about the complexities of cataloging and categorization as it has been done for centuries and they will tell you: it’s painstaking. Prompt engineering will require attention to relationships, subgroups and location, along with an ability to examine censorship and respect copyright laws. While DALL-E was being trained on representative images of the Mona Lisa, the humans in the loop with an awareness of these minutiae were critical to reducing bias and encouraging fairness in all outcomes.
It’s not just offensive abuses that can be easily imagined. In a fascinating turn of events, there are even multi-million-dollar art forgeries being reported by artists who use AI as their medium of choice. All enormous datasets or large networks of models contain, buried deep within the data, intrinsic biases, labeling gaps and outright fraud that challenge quick ethical solutions. OpenAI’s Natalie Summers, who runs OpenAI’s Instagram account and is the “human in the loop” responsible for enforcing the rules that are supposed to guard against output that could damage reputations or incite outrage, expresses similar concerns.
This leads me to conclude that to be a prompt engineer is to be someone not only responsible for creating art, but willing to serve as a gatekeeper to prevent misuse like forgeries, hate speech, copyright violations, pornography, deepfakes and the like. Sure, it’s nice to churn out dozens of odd, slightly disturbing surreal Dada art ‘products,’ but there should be something more compelling buried under the mound of dross that results from a toss-away visual experiment.
I believe DALL-E has brought us to an inflection point in AI art, where both artists and engineers will need to comprehend how data science manipulates and enables behavior while also being able to understand how machine learning models work. In order to design the output of these machine learning tools, we will need experience beyond engineering and design, in the same way that understanding the physics of light and aperture takes photographic art beyond the mundane.
This diagram is an abbreviation of Professor Neri Oxman’s “Cycle of Creativity.” Her work with the Mediated Matter research group at the MIT Media Lab explored the intersection of design, biology, computing and materials engineering with an eye on how all these fields optimally interact with one another. Likewise, in order to become a “prompt engineer” (a job title that has yet to be formally embraced by any discipline), you will need an awareness of these intersections that is as broad as hers. It’s a serious job with multiple specialties.
Future DALL-E artists, whether self-taught or schooled, will always need the ability to communicate and design an original point of view. The “prompt engineer” will resemble a librarian with image metadata and curation skills, an engineer able to structure and test reproducible results, and a historian able to connect Picasso’s influences with what was happening in the world as he was painting about war and beauty. It will be an artistic career of the future, requiring a blend of scientific and artistic talents that will guide the algorithm. It will continue to be humans who inject their ideas into machines in service of the newer and ever-changing language of creation.
Tori Orr is a member of DataRobot’s AI Ethics Communications team.