01-04-2025
Picture This: Big Changes With ChatGPT's Image Release
Artist's palettes, Anchor Studio, Newlyn, Cornwall, 2019. Detail showing two of artist Virginia Bounds' paint palettes on the paint-splattered floorboards of the artist's studio. Artist Steven Baker. (Photo by Historic England Archive/Heritage Images via Getty Images)
Some of the biggest news this past week was OpenAI's decision to build new image generation powers right into ChatGPT, the first model of its kind to really become a household name.
Now, ChatGPT can give us a new kind of visual iconography, along with all of the knowledge work it already does. It's interesting to think about how this will change media and a lot of other domains.
I'm going to go over some of the major changes mentioned by Nathaniel Whittemore on a recent AI Daily Brief podcast, as he quotes an X post by Balaji Srinivasan.
So these are third-hand points, but I'll add my own insight based on what I've heard from conferences and events, and in classes, and all around the MIT community this year.
One of the obvious fruits of this innovation is that there will be less of a burden on humans to craft code or painstakingly prompt AI in order to produce robust visuals.
Srinivasan mentions the example of Instagram filters, where all you need with the new technology is a simple keyword.
Another quote from Srinivasan is that 'the baseline quality of memes should rise.'
It's easy to imagine how this works – you take the text, and ChatGPT capably adds the image – and it's perfect. Presumably, we will get better visual memes, although why we need the quality of memes to increase is sort of unclear. Yes, it's a modern form of communication, but you'd think there will be alternatives, such as the following:
The idea of graphic novels, I thought, was extremely interesting. Essentially, a ChatGPT endowed with this image capacity will be able to go back and pull any piece of classic literature or other text, and add vibrant panel images in the style of a graphic novel or a comic book.
Imagine Dostoyevsky's Crime and Punishment brought to life, or Dickens' Bleak House with its unending accounting of decades-long lawsuits, or even just a restaurant menu from the 1700s. Or imagine all of that arcane religious text brought to visceral life, with new images accompanying words that are, frankly, in some cases fairly inflammatory.
A graphic version of Malleus Maleficarum, anyone?
As Solomon said: 'of the making of books, there will be no end.'
But the books will be able to interact with audiences in a new way.
Whittemore (again quoting the X post) also mentions children's story books, and this is an important point: new audiences will be introduced to whole new realms of literature and ancient texts.
Two other things that Srinivasan and Whittemore point out are slides, the fodder for professional presentations, and websites.
A slide deck is made for a particular audience to view in real time. A website is a digital storefront that should stand the test of time as users come and go.
In either case, this is going to make design and engineering almost mindless and immediate.
Instead of searching for images and curating them, humans can let AI do almost all of the work, to the point that all they have to do is press a few buttons.
Part of the enormous knowledge work that is saved here is the decision-making process. If I have to do 20 slides in a deck, and I have to type everything in by hand and add the images, I'm looking at a few hours of work. If, on the other hand, I can just say to ChatGPT: 'come up with 20 slides for this,' it's all done almost before I can blink.
For social media, I think the major takeaway has to do with deepfakes.
It will be ridiculously easy to put words in someone's mouth, or even show them saying things that they never said. Agentic AI may lead to people's agents posting things that they never would have posted on their own. This may not do much for the cohesion of social media, but it's certain to be extremely interesting.
Just like books, movies could get a makeover, but it's different when you're already dealing with visual presentation. I guess what you would have is just AI-driven makeovers of the classics, in which a film like Psycho might be presented in color, or for instance, with more gender-bending, or different music, or a little more dynamism on the part of the main characters…?
The possibilities are endless. Nobody has a claim to knowing exactly how these innovations are going to shake out. But the above use cases are good guesses, informed by professionals who understand this landscape pretty well.
I'll continue to bring you some of the best news in the digital world as we get used to all of these amazing things that the new models can do.