
Latest news with #Imagen3

Imagen 4 is Google's newest AI image generator

20-05-2025

  • Yahoo

Google is rolling out a new image-generating AI model, Imagen 4, that the company claims delivers higher-quality results than its previous image generator, Imagen 3. Unveiled at Google I/O 2025 on Tuesday, Imagen 4 can render "fine details" like fabrics, water droplets, and animal fur, Google says. The model handles both photorealistic and abstract styles, creating images in a range of aspect ratios at up to 2K resolution.

"Imagen 4 is [a] huge step forward in quality," Josh Woodward, who leads Google's Labs group, said during a press briefing. "We've also [paid] a lot of attention and fixes around how it generates text and typography, so it's wonderful for creating slides or invitations, or any other thing where you might need to blend imagery and text."

There's no shortage of AI image generators out there, from ChatGPT's viral tool to Midjourney's V7. They're all relatively sophisticated, customizable, and capable of creating high-quality AI artwork. So what makes Imagen 4 stand out from the crowd? According to Google, Imagen 4 is fast, faster than Imagen 3, and it'll soon get faster: in the near future, Google plans to release a variant of Imagen 4 that's up to 10x quicker than Imagen 3.

Imagen 4 is available as of this morning in the Gemini app, Google's Whisk and Vertex AI platforms, and across Google Slides, Vids, Docs, and more in Google Workspace.
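For developers, the most direct programmatic route to Imagen models is Vertex AI, one of the surfaces where Imagen 4 is rolling out. Below is a minimal sketch using the Vertex AI Python SDK; the project ID is a placeholder and the Imagen 4 model identifier is an assumption, so substitute whatever ID Google actually exposes to your project.

```python
# Minimal sketch: requesting an image from an Imagen model on Vertex AI.
# Assumes the google-cloud-aiplatform package is installed and the
# environment is authenticated (e.g. `gcloud auth application-default login`).
import vertexai
from vertexai.preview.vision_models import ImageGenerationModel

vertexai.init(project="my-gcp-project", location="us-central1")  # placeholder project

# The model ID below is an assumption for Imagen 4; check Vertex AI's
# model garden for the identifier available to your account.
model = ImageGenerationModel.from_pretrained("imagen-4.0-generate-preview")

result = model.generate_images(
    prompt="Macro shot of water droplets beading on dark blue fabric",
    number_of_images=1,
    aspect_ratio="16:9",
)
result.images[0].save(location="droplets.png")
```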

Hedra, the app used to make talking baby podcasts, raises $32M from a16z

15-05-2025

  • Business
  • Yahoo

People are using AI video generation tools to contribute to an unexpected new viral trend: podcasts featuring AI-generated talking babies. And one of the companies helping artists do this is Hedra. The startup, launched in 2023, offers a web-based video generation and editing suite powered by its Character-3 model, which lets users make videos with an AI-generated character as the focus, as well as transfer styles across images and audio. This is what people are using to make podcast videos such as one in which an AI-generated dog talks about what it's like to live with a new baby in the house.

We're not sure how much Hedra has benefited from this trend, but it's receiving ample investor attention nevertheless: the company on Thursday said it has raised $32 million in a Series A funding round led by Andreessen Horowitz's Infrastructure fund. Its previous investors are participating in the round, and a16z's Matt Bornstein will join the startup's board.

Michael Lingelbach, the company's founder and CEO, told TechCrunch the startup was inspired by the gap he noticed between companies like Synthesia, which let users superimpose AI-generated avatars over presentations, and startups like Runway, which provide video generation tools for creating short clips. "I thought, what if we did something at the intersection of video generation and 3D characters, with long dialogues and better controllability," he said.

Hedra launched its first video model in June 2024 and quickly attracted investor interest, securing $10 million in seed funding from Index Ventures, Abstract Ventures, and a16z speedrun. Earlier this year, Amazon also backed the company through its venture capital arm, the Alexa Fund. Lingelbach noted that the launch of the Character-3 model in March was a big inflection point (shortly after the company signed its term sheet with a16z) and is now driving a lot of user growth.

The startup wants to use the fresh cash to train its next model, which it says enables better customization, as well as to develop technology that lets its AI-generated characters interact with users. The company is now focusing on attracting creators and prosumers, and said it has received inbound interest from enterprise marketing departments as well. While Hedra's own model is centered on character movement and expression, the app lets you employ other models: Veo 2 and Kling for video generation; Flux, Imagen 3, Sana, and Ideogram V2 for image generation; and audio models from ElevenLabs and Cartesia for voice generation or cloning.

Hedra's competitors include Captions (also backed by a16z), which is focused more on smartphones; Greycroft-backed Cheehoo, which works with Hollywood studios to create animated features; Synthesia; and HeyGen. Hedra claims the videos generated with its platform have more expressive characters than those made with its competitors' tools.

a16z's Bornstein thinks that as the AI-powered video generation space evolves, we will see more tools focusing on characters, motion, voice, editing, and the like. "AI companies can produce amazing clips of environments and simple actions. But they can't generate meaningful dialogue or animation. It's not just about making a video, it's about making a story that resonates. This is largely down to the people and characters in the story. That's exactly what Hedra is building," he told TechCrunch in an emailed statement.
This article originally appeared on TechCrunch.

Adobe releases new Firefly image generation models and a redesigned Firefly web app

24-04-2025

  • Business
  • Yahoo

Adobe on Thursday launched the latest iteration of its Firefly family of image generation AI models, a model for generating vectors, and a redesigned web app that houses all its AI models, plus some from its competitors. There's also a mobile app for Firefly in the works.

The new Firefly Image Model 4, Adobe says, improves on its predecessors in quality, speed, and the amount of control over the structure and style of outputs, camera angles, and zoom. It can generate images at resolutions up to 2K. There's also a tweaked, more capable version of this model, called Image Model 4 Ultra, that can render complex scenes with small structures and lots of detail. Alexandru Costin, VP of Generative AI at Adobe, said the company trained the models with an order of magnitude more compute to enable them to generate more detailed images. He added that, compared with the previous generation, the new models improve text generation in images and let users supply reference images of their choice so the model generates pictures in that style.

The company is also making its Firefly video model, which launched in limited beta last year, available to everyone. It lets users generate video clips from a text prompt or image, control camera angles, specify start and end frames to control shots, generate atmospheric elements, and customize motion design elements. The model can generate video clips from text at resolutions up to 1080p. Meanwhile, the Firefly Vector Model can create editable vector-based artwork and iterate and generate variations of logos, product packaging, icons, scenes, and patterns.

Notably, Firefly's web app gives you access to all these models as well as a few image and video generation models from OpenAI (GPT image generation), Google (Imagen 3 and Veo 2), and Flux (Flux 1.1 Pro). Users can switch between any of these models at any point, and images generated by any model will have content credentials attached to them. The company hinted that other AI models may be added to the web app in the future.

Adobe is also publicly testing a new product called Firefly Boards, a canvas for ideation or moodboarding. It lets users generate or import images, remix them, and collaborate with others, similar to AI-based ideaboards from Visual Electric, Cove, or Kosmik. Boards is available through the Firefly web app. The company said these new models would soon be integrated into its product portfolio, but didn't give a timeline for the rollout.

Adobe is also making its Text-to-Image API and Avatar API generally available, and said a new Text-to-Video API is now available in beta. These APIs are offered through the company's Firefly Services collection of APIs, tools, and services. Adobe is also testing a web app called Adobe Content Authenticity that lets users attach credentials to their work to indicate ownership and attribution through metadata. Users can also indicate whether AI companies can use their images for AI model training.

This article originally appeared on TechCrunch.
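For a sense of how the generally available Text-to-Image API might be consumed, here is a hedged sketch of a REST call from Python. The endpoint path, header names, and payload fields are assumptions modeled on common Adobe API conventions, not the documented contract; Adobe's Firefly Services documentation is the source of truth.

```python
# Hypothetical sketch of a Firefly text-to-image request. The URL, headers,
# and payload shape are assumptions, not Adobe's documented contract.
import requests

ACCESS_TOKEN = "..."  # OAuth bearer token from Adobe's IMS authentication flow
CLIENT_ID = "..."     # API key for a Firefly Services project

response = requests.post(
    "https://firefly-api.adobe.io/v3/images/generate",  # assumed endpoint
    headers={
        "Authorization": f"Bearer {ACCESS_TOKEN}",
        "x-api-key": CLIENT_ID,
        "Content-Type": "application/json",
    },
    json={
        "prompt": "A flat vector logo of a paper crane, teal on white",
        "numVariations": 2,  # assumed field name
    },
    timeout=60,
)
response.raise_for_status()
print(response.json())  # expected to reference the generated image assets
```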

I tested the future of AI image generation. It's astoundingly fast.

23-03-2025

  • Yahoo

One of the core problems with AI is its notoriously high power and computing demand, especially for tasks such as media generation. On mobile phones, only a handful of pricey devices with powerful silicon can run such features natively, and even when implemented at scale in the cloud, it's a pricey affair. Nvidia may have quietly addressed that challenge in partnership with the Massachusetts Institute of Technology and Tsinghua University.

The team created a hybrid AI image generation tool called HART (hybrid autoregressive transformer) that combines two of the most widely used AI image creation techniques. The result is a blazing-fast tool with dramatically lower compute requirements. To give you an idea of just how fast it is, I asked it to create an image of a parrot playing a bass guitar. It returned the picture in about a second; I could barely even follow the progress bar. When I ran the same prompt through Google's Imagen 3 model in Gemini, it took roughly 9-10 seconds on a 200 Mbps internet connection.

When AI images first started making waves, the diffusion technique was behind it all, powering products such as OpenAI's DALL-E image generator, Google's Imagen, and Stable Diffusion. This method can produce images with an extremely high level of detail, but it builds each image over many steps, and as a result it is slow and computationally expensive. The second approach that has recently gained popularity is autoregressive models, which work in much the same fashion as chatbots, generating images by predicting the next piece of the picture. It is faster, but also a more error-prone way of creating images with AI.

The team at MIT fused both methods into a single package called HART. An autoregressive model predicts the compressed image as discrete tokens, while a small diffusion model handles the rest, compensating for the quality loss. The overall approach reduces the number of steps involved from over two dozen to eight.

The experts behind HART claim it can 'generate images that match or exceed the quality of state-of-the-art diffusion models, but do so about nine times faster.' HART combines an autoregressive model with roughly 700 million parameters and a small diffusion model with 37 million parameters. Interestingly, this hybrid tool was able to create images matching the quality of top-shelf models with 2 billion parameters, and it achieved that milestone while generating images nine times faster and requiring 31% less computation.

As per the team, the low-compute approach allows HART to run locally on phones and laptops, which is a huge win. So far, the most popular mass-market products, such as ChatGPT and Gemini, require an internet connection for image generation because the computing happens on cloud servers. In the test video, the team showcased HART running natively on an MSI laptop with an Intel Core series processor and an Nvidia GeForce RTX graphics card, a combination you can find on a majority of gaming laptops without spending a fortune. HART produces 1:1 aspect ratio images at a respectable 1024 x 1024 pixels resolution. The level of detail in these images is impressive, and so is the stylistic variation and scenery accuracy.
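To make that division of labor concrete, here is a toy sketch of the hybrid loop the researchers describe: an autoregressive model proposes discrete image tokens in a single pass, and a small diffusion model spends a handful of steps refining the residual detail. The two models below are random stubs, so only the control flow, not the math, mirrors HART.

```python
# Toy sketch of HART-style hybrid generation. The two "models" are random
# stubs; only the structure (one AR token pass, then a few diffusion
# refinement steps) reflects the approach described above.
import numpy as np

rng = np.random.default_rng(0)
TOKENS, DIM, REFINE_STEPS = 256, 64, 8   # 8 steps vs. ~30 for pure diffusion

def autoregressive_tokens(prompt_emb, length=TOKENS):
    """Stub for the ~700M-parameter AR model: predicts discrete image tokens."""
    logits = rng.normal(size=(length, 1024)) + prompt_emb.mean()
    return logits.argmax(axis=-1)            # one discrete token per patch

def decode_tokens(tokens):
    """Stub VQ decoder: maps discrete tokens to a coarse continuous latent."""
    codebook = rng.normal(size=(1024, DIM))
    return codebook[tokens]                  # (TOKENS, DIM) coarse latent

def diffusion_refine(latent, steps=REFINE_STEPS):
    """Stub for the ~37M-parameter diffusion model: adds back fine detail."""
    residual = rng.normal(size=latent.shape)          # start from pure noise
    for t in range(steps):
        predicted_noise = 0.5 * residual              # a real model predicts this
        residual = residual - predicted_noise / (steps - t)
    return latent + residual

prompt_emb = rng.normal(size=DIM)            # stand-in for a text embedding
coarse = decode_tokens(autoregressive_tokens(prompt_emb))
image_latent = diffusion_refine(coarse)      # cheap: only eight steps
print(image_latent.shape)                    # (256, 64)
```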
During their tests, the team noted that the hybrid tool was between three and six times faster than comparable approaches and offered over seven times higher throughput. The future potential is exciting, especially for integrating HART's image capabilities with language models. 'In the future, one could interact with a unified vision-language generative model, perhaps by asking it to show the intermediate steps required to assemble a piece of furniture,' says the team at MIT. They are already exploring that idea, and even plan to test the HART approach on audio and video generation. You can try it out on MIT's web dashboard.

Before we dive into the quality debate, keep in mind that HART is very much a research project in its early stages. On the technical side, the team highlights a few hassles, such as overheads during inference and training. These look minor in the bigger scheme of things; they can likely be fixed, and even if they persist, the sheer benefits HART delivers in computing efficiency, speed, and latency should outweigh them.

In my brief time prompt-testing HART, I was astonished by the pace of image generation. I barely ran into a scenario where the free web tool took more than two seconds to create an image. Even with prompts spanning three paragraphs (roughly 200 words), HART created images that adhered tightly to the description. Aside from descriptive accuracy, there was plenty of detail in the images.

However, HART suffers from the typical failings of AI image generators. It struggles with digits, basic depictions such as people eating food, character consistency, and perspective. Photorealism in a human context is one area where I noticed glaring failures. On a few occasions, it simply got basic objects wrong, like confusing a ring with a necklace. But overall, those errors were few and far between, and fundamentally expected; a healthy bunch of AI tools still can't get those right, despite having been around for a while now.

Overall, I am particularly excited by HART's immense potential. It will be interesting to see whether MIT and Nvidia build a product out of it or simply adopt the hybrid image generation approach in an existing product. Either way, it's a glimpse into a very promising future.

Series Entertainment Accelerates Game Development by 90% with Google Cloud's AI

17-03-2025

  • Entertainment
  • Associated Press

Series Entertainment, a fast-growing startup dedicated to making great games with the power of AI, today announced its collaboration with Google Cloud, leveraging Google's cutting-edge AI technology and scalable infrastructure to dramatically accelerate the creation of immersive, high-quality video games, reducing game development cycles by an estimated 90%.

In today's competitive landscape, game developers face increasing pressure to deliver high-quality, immersive experiences at an accelerated pace, and traditional content creation workflows often struggle to keep up with the demand for dynamic and engaging gameplay. Series Entertainment is addressing these challenges by using Google Cloud's modern AI infrastructure to streamline content creation at scale, enabling its teams to develop games faster than traditional workflows allow and to provide gameplay experiences never seen before.

Series Entertainment's 'Inference at Scale' platform is built with Google Cloud's AI and modern, scalable infrastructure, delivering the following benefits:

  • Vertex AI, Google Cloud's unified AI development platform, accelerates content creation and model evaluation with generative AI pipelines and rapid large language model (LLM) testing.
  • Google Kubernetes Engine (GKE) powers real-time AI inference for dynamic gameplay and scales to meet increasing feature demands.
  • Vision API, including Google's Imagen 3 and Veo 2 models, enhances art workflows with AI-generated concept art and consistent video assets.
  • Compute Engine manages large-scale AI workloads, while Cloud Storage and Networking ensure efficient asset management and delivery.

By using these technologies, Series Entertainment is driving a 90% acceleration in content creation, dramatically shortening development cycles and freeing creative teams to spend more time on rich narrative development, refined gameplay features, and deeper world-building. In addition, real-time, AI-driven non-player characters (NPCs) and accelerated art generation enhance player engagement and visual consistency, resulting in more immersive gameplay. For instance, environment artists can spin up new area concepts in a matter of hours, quickly integrating player feedback into live game updates so each location feels fresh, responsive, and engaging for its audience. Finally, Google Cloud's scalable AI infrastructure enables rapid prototyping, efficient model training, and smooth game asset delivery.

This enhanced efficiency has already contributed to the successful launch of a new game this week, with three more releases planned in the coming months. Notably, AI-generated dialogue has been integrated into certain branching story arcs within the popular 'Choices' series, showcasing a practical application of this technology (a hedged sketch of what such a dialogue call might look like follows this article).

'By seamlessly integrating Google Cloud's AI and infrastructure with our proprietary toolchain, we are achieving unprecedented levels of real-time AI capability,' said Josh English, chief technology officer, Series Entertainment. 'This integration not only streamlines developer workflows, empowering our teams to innovate more efficiently, but also delivers deeply immersive player experiences through dynamic, AI-driven non-playable characters that react and evolve within the game environment.'

'Series Entertainment's innovative use of AI demonstrates the transformative potential of Google Cloud's AI technology for both game development and new gameplay experiences,' said Jack Buser, global director for Games, Google Cloud.
'By leveraging Google Cloud's AI solutions, they are not only accelerating their development cycles but also creating deeply engaging experiences for players.'

Series Entertainment is drawing on the proven success of Pixelberry's narrative interactive fiction, including the popular Choices series, to anchor new AI-driven game experiences. The company is also aligning its strategy with robust analytics and continuous live-ops to ensure every new game or feature benefits from an active and engaged audience.

Series Entertainment is dedicated to making great games, powered by AI. Founded by former Kongregate and Snap executive Pany Haritatos, the team is on a mission to set the standard for game development processes using its AI-native platform. To learn more about Series, please visit and follow us on LinkedIn.
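To ground the NPC claim mentioned above, here is a hedged sketch of the kind of per-turn dialogue call a game backend might make through Vertex AI. The model choice, prompt format, and game-state fields are illustrative assumptions, not details Series Entertainment has published.

```python
# Hypothetical per-turn NPC dialogue request via Vertex AI. The model name,
# prompt structure, and game-state fields are illustrative assumptions.
import vertexai
from vertexai.generative_models import GenerativeModel

vertexai.init(project="my-game-project", location="us-central1")  # placeholders
npc_model = GenerativeModel("gemini-1.5-flash")  # a low-latency model choice

def npc_reply(persona: str, game_state: dict, player_line: str) -> str:
    """Builds one in-character dialogue turn from persona, state, and input."""
    prompt = (
        f"You are {persona}. Current scene state: {game_state}.\n"
        f"The player says: {player_line!r}\n"
        "Reply in character, in at most two sentences."
    )
    return npc_model.generate_content(prompt).text

print(npc_reply(
    persona="a wary blacksmith in a flooded harbor town",
    game_state={"quest_stage": 2, "player_reputation": "suspicious"},
    player_line="I heard you used to forge blades for the old guard.",
))
```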
