Illustration is hard, especially about the future
Note: This essay was originally posted on LinkedIn on October 5, 2023. I decided to stop using AI generators. To learn why, click here.
I talk a lot about the need for us to play around with generative AI tools so that educators with actual experience can shape the institutional policies and government regulations that will structure their use. I posted recently about my own experiment with Khanmigo, a conversation with its Thomas Jefferson chatbot about Sally Hemings. I have also experimented with text-to-image generators to produce the header images for this weekly newsletter, AI Log. In today's post, I want to talk about some of the problems I've encountered, both with using the tools and with the ethics of using AI-generated content.
I'm not a particularly visual learner. But using images to convey my ideas is important for the project of developing new conceptual models to help understand generative AI. I used DALL-E for my first post's header, and when ideogram.ai made it possible to include actual words in an image, I made it my first choice. Images with words help create header images that are more immediately relevant to the writing that follows, like the use of "No Robots!" in the header above.
My first attempt to use words in an image was the one below, the header for an article about generative AI as the new buzzword in higher education, an exploration of the ways bias and discrimination are embedded in the output from these tools, and questions about who makes up the user community for generative AI. I was pleased with the output, which was created in response to my first prompt, but it left out the "r" in "Generative." After dozens of attempts, most of which included robots and all of which misspelled "generative," I gave up and hoped readers would miss or forgive the misspelling.
The image below was for my very first article about developments over the summer. I used DALL-E 2 and hit the mark on the first try in terms of tone, color, and image. However, it also gave the professor a weird, dog-like face. Again, dozens of attempts later, I gave up and went with the dog-faced figure, hoping the storm and sunrise were eye-catching enough to pull the viewer to the horizon and its vibe of dangerous possibility.
My first try didn't work when it came to the image below, the first logo for my newsletter. I went through several attempts using prompts with "electric blue log" or "log with blue electricity flowing through it." I finally got something reasonably good by abandoning my original prompt and asking for a log with a blue non-robot figure on it. The humans all looked too weird, but I finally hit with a prompt asking for an electric blue bug on a log. It left out the bug but somehow produced an image I thought worked okay as an AI Log logo.
As frustrated as I've been with the process of creating header images using generative AI, I'm pleased that I am able to present illustrations of the concepts I'm writing about. I'm convinced that thinking about the future of AI requires more than words. Visualizing a future that isn't all about robots requires…well, it requires illustrations that don't include robots. And DALL-E 2 wasn't helpful. Ask it to create an image of AI teaching, and it returns images of a robot doing something teachery.
I asked it to create an image of AI teaching without a robot, and it returned a robot teaching with a human next to it.
This is not so surprising, as DALL-E is simply taking the words I gave it and using them to predict the image an average human wants to see. It was trained on millions of images created by humans who have imagined AI as robots and teaching as standing in front of a chalkboard talking. As Jade E. Davis writes: "Artificial Intelligence assumes the past has the answer to the future's questions."
The robot images above were all generated using DALL-E 2 in mid-September. DALL-E 3 is now available. I've been playing around with the new version and I'm hoping it will address some of the shortcomings of my past header images, such as Sally Hemings having two left hands and the weird left eye of my snake oil salesmen. However, I'm not feeling all that great about using a tool that has appropriated the work of artists.
In a rationalization that may not hold up, I have decided that since I don't use prompts that invoke specific artists and I'm not charging my readers, my use of image generators is ethically okay. I mean, who can get through the day without a few juicy rationalizations?
My frustrations with using the tools and their difficulty with human faces and hands made me realize that I need something more advanced if I want to visualize my core arguments about the future of generative AI and education. So, I decided to work with a human. But as I thought about human creative work, my rationalizations about using AI started to feel less juicy.
Kelly McKernan, whom I mentioned in a previous article's Link of the Week, has been talking to reporters about her experience discovering that people were prompting Midjourney by using her name to generate images similar to her original art. Here is a New Yorker article from February and a recent NBC News article about the lawsuit that McKernan and other artists are bringing. There are a number of important issues here about appropriation, intellectual property, and the relationship between creative labor and capital.
DALL-E 3 does not allow prompts using the names of living artists, and I hope other providers follow suit soon. And OpenAI has made an attempt to allow artists to remove their work from its training data, or, as they put it, to "opt out." I don't think this means what they think it means. Opting out is not really a thing you can do after the fact. I suppose this offer is better than ignoring the problem, but it seems like a preemptive move in response to the lawsuits. Their process boils down to "Thank you for pointing out that we stole your intellectual property. Please fill out this form and we will check to see if your property is in our inventory and maybe remove it." I suppose we'll see what the courts think of this measure.
These issues come up in music and writing as well, but so far there is no such form for authors whose work was pirated. It is worth distinguishing between creators like McKernan and Sarah Silverman, who are suing to stop the use of their work in training generative AI models, and celebrities like Grimes and Tyler Cowen who have been clear that they see fans prompting generative AI tools with their names and ideas as an extension of their brand. Not much of a Grimes fan myself…I'm more into Jackie Venson and Eileen Jewell. But I find Cowen's food opinions align with my own, so I took him up on his suggestion to prompt ChatGPT-4 with "Where would Tyler Cowen eat in Chicago?" to help pick a restaurant when I travel there for Educause next week.
For most artists and writers, especially freelancers who see work being sourced to generative AI as lost income, it is a different story. Maybe McKernan's recent notoriety will lead to additional work for her, but that's not a viable path for others looking for gigs. As the publisher of this newsletter, I have decided to stop using generators that allow using a living artist as a prompt, so I will be giving Getty Images' new AI generator a try.
The issue before us now is profound and we can't wait for the courts or regulators to provide guidance. Even then, my personal ethics about producing visual images may not align with what government authorities decide. The important question for me is what is right for AI Log. I chose Phil Scroggs of the eponymous Phillustrations to help visualize my ideas because I've worked with him in the past. His pop-inflected, hand-drawn style fits the open and curious voice I'm looking to create in writing these articles and I look forward to sharing the images he creates in the future. Disclosure: I consider Phil a personal friend as well as an excellent illustrator.
The labor practices and ethical standards of a free newsletter are small potatoes, I realize. What about large potatoes? How should institutions of higher education think about these questions? Colleges and universities spend billions on marketing and advertising each year. How much of that spending will be sourced to generative AI this year? What about in the near future when new product development brings specialized graphics services and text-to-video generators? Scroggs, who is based in Seattle, has done freelance work for the University of Washington. Would the marketing department at UW Medicine decide to have an intern prompt Midjourney with "in the style of Phil Scroggs" to create images of Dr. Dubs for their next campaign?
These are the kinds of questions that those who manage marketing budgets and create publications for colleges and edtech companies will be answering this year. What are your ideas for collectively considering how we should approach answers?