Mollick Presents The Meaning Of New Image Generation Models


Forbes, 08-04-2025

Image: a paintbrush illustrating the concept of generative AI art.
What does it mean when AI can build smarter pictures?
We found out a few weeks ago as both Google and OpenAI unveiled new image generation models that are fundamentally different than what has come before.
A number of important voices chimed in on how this is likely to work, but I hadn't yet covered this timely piece by Ethan Mollick at One Useful Thing, in which the MIT graduate looks at these new models in detail and evaluates how they work and what they're likely to mean to human users.
The Promise of Multimodal Image Generation
Essentially, Mollick explains that the traditional image generation systems were a handoff from one model to another.
'Previously, when a Large Language Model AI generated an image, it wasn't really the LLM doing the work,' he writes. 'Instead, the AI would send a text prompt to a separate image generation tool and show you what came back. The AI creates the text prompt, but another, less intelligent system creates the image.'
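To make the handoff concrete, here is a minimal sketch of that two-stage arrangement. The function names and return values are hypothetical stand-ins of my own, not any vendor's API; the only point is that the image tool receives a text string and nothing else.

```python
def llm_write_prompt(user_request: str) -> str:
    """Hypothetical stand-in for the language model: it only produces text."""
    return f"A detailed illustration of: {user_request}"


def image_tool_render(prompt: str) -> dict:
    """Hypothetical stand-in for the separate image generator.

    It sees only the prompt string, not the conversation or the user's intent.
    """
    return {"rendered_from": prompt}


def handoff_pipeline(user_request: str) -> dict:
    """Old-style flow: the LLM writes a prompt, another system draws the image."""
    prompt = llm_write_prompt(user_request)
    return image_tool_render(prompt)


result = handoff_pipeline("a room with no elephants")
print(result["rendered_from"])
```

Whatever the language model understood about the request has to survive as a single line of text, because that line is all the second system ever gets.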
Diffusion Models Are So 2021
The old models also mostly used diffusion to work.
How does diffusion work?
The traditional models have a single dimension that they use to generate images.
About a year ago, I was writing up for an audience an explanation of diffusion from my colleague Daniela Rus, who presented it at conferences.
It goes something like this – the diffusion model takes an image, introduces noise, and abstracts the image, before denoising it again to form a brand new image that resembles what the computer already knows from looking at images that match the prompt.
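For readers who like to see the arithmetic, here is a toy sketch of the noising and denoising steps described above. It assumes a standard linear noise schedule and treats four numbers as a stand-in "image"; the schedule values and function names are my own illustrative choices, and a real model would use a trained network to predict the noise rather than being handed it.

```python
import math
import random


def alpha_bar(t, num_steps, beta_start=1e-4, beta_end=0.02):
    """Fraction of the original signal left after t steps of a linear noise schedule."""
    prod = 1.0
    for s in range(1, t + 1):
        beta = beta_start + (beta_end - beta_start) * (s - 1) / (num_steps - 1)
        prod *= 1.0 - beta
    return prod


def add_noise(pixels, t, num_steps, rng):
    """Forward step: blend the clean pixels with Gaussian noise."""
    a = alpha_bar(t, num_steps)
    noise = [rng.gauss(0.0, 1.0) for _ in pixels]
    noisy = [math.sqrt(a) * x + math.sqrt(1.0 - a) * e for x, e in zip(pixels, noise)]
    return noisy, noise


def remove_noise(noisy, predicted_noise, t, num_steps):
    """Reverse estimate: subtract the predicted noise and rescale."""
    a = alpha_bar(t, num_steps)
    return [(y - math.sqrt(1.0 - a) * e) / math.sqrt(a)
            for y, e in zip(noisy, predicted_noise)]


rng = random.Random(0)
image = [0.1, 0.5, 0.9, 0.3]  # a tiny stand-in "image" of four pixel values
noisy, true_noise = add_noise(image, t=500, num_steps=1000, rng=rng)

# Assumption: we pass in the true noise as an oracle. A trained model would
# have to guess it from the noisy image and the prompt.
restored = remove_noise(noisy, true_noise, t=500, num_steps=1000)
```

With the true noise in hand, the original pixels come back almost exactly. A trained model only has a learned guess at that noise, which is why its output resembles the images it was trained on.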
Here's the thing – if that's all the model does, you're not going to get an informed picture. You're going to get a new picture that looks like a prior picture, or more accurately, thousands of pictures that the computer saw on the Internet, but you're not going to get a picture with actionable information that's reasoned and considered by the model itself.
Now we have multimodal control, and that's fundamentally different.
No Elephants?
Mollick gives the example of a prompt that asks the model for a room with no elephants in it, annotated to show why no elephants could be there.
Here's the prompt: 'show me a room with no elephants in it, make sure to annotate the image to show me why there are no possible elephants.'
When you hand this to a traditional model, it shows you some elephants, because it doesn't understand the context of the prompt, or what it means. Furthermore, a lot of the text that you'll get is complete nonsense, or even made-up characters. That's because the model didn't know what letters actually looked like – it was getting that from training data, too.
Mollick shows what happens when you hand the same prompt to a multimodal model: it gives you exactly what you want – a room with no elephants, and notes like 'the door is too small' showing why the elephants wouldn't be in there.
Challenges of Prompting Traditional Models
I know personally that this was how the traditional models worked. As soon as you asked them not to put something in, they would put it in, because they didn't understand your request.
Another major difference is that traditional models would change the fundamental image every time you asked for a correction or a tweak.
Suppose you had an image of a person, and you asked for a different hat. You might get an image of an entirely different person.
The multimodal image generation models know how to preserve the result that you wanted, and just change it in one single small way.
Preserving Habitats
Mollick gives another example of how this works: he shows an otter with a particular sort of display in its hands. Then the otter appears in different environments with different styles of background.
This also shows the detailed integration of multimodal image generators.
A Whole Pitch Deck
For a use case scenario, Mollick shows how you could take one of these multimodal models and have it design an entire pitch deck for guacamole or anything else.
All you have to do is say, 'come up with this type of deck,' and the model will get right to work, looking at what else is on the Internet, synthesizing it and giving you the result.
As Mollick mentions, this will make all sorts of human work obsolete very quickly.
We will need a well-considered framework.


