
Latest news with #Imagen3

Imagen 4 is Google's newest AI image generator

20-05-2025

  • Yahoo

Google is rolling out a new image-generating AI model, Imagen 4, that the company claims delivers higher-quality results than its previous image generator, Imagen 3. Unveiled at Google I/O 2025 on Tuesday, Imagen 4 can render "fine details" like fabrics, water droplets, and animal fur, Google says. The model handles both photorealistic and abstract styles, creating images in a range of aspect ratios at up to 2K resolution.

"Imagen 4 is [a] huge step forward in quality," Josh Woodward, who leads Google's Labs group, said during a press briefing. "We've also [paid] a lot of attention and fixes around how it generates text and typography, so it's wonderful for creating slides or invitations, or any other thing where you might need to blend imagery and text."

There's no shortage of AI image generators out there, from ChatGPT's viral tool to Midjourney's V7. They're all relatively sophisticated, customizable, and capable of creating high-quality AI artwork. So what makes Imagen 4 stand out from the crowd? According to Google, Imagen 4 is fast, faster than Imagen 3, and it'll soon get faster: in the near future, Google plans to release a variant of Imagen 4 that's up to 10x quicker than Imagen 3.

Imagen 4 is available as of this morning in the Gemini app, Google's Whisk and Vertex AI platforms, and across Google Slides, Vids, Docs, and more in Google Workspace.
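For developers, the most direct programmatic route to Imagen models is Vertex AI, one of the surfaces where Imagen 4 is rolling out. Below is a minimal sketch using the Vertex AI Python SDK; the project ID is a placeholder and the Imagen 4 model identifier is an assumption, so substitute whatever ID Google actually exposes to your project.

```python
# Minimal sketch: requesting an image from an Imagen model on Vertex AI.
# Assumes the google-cloud-aiplatform package is installed and the
# environment is authenticated (e.g. `gcloud auth application-default login`).
import vertexai
from vertexai.preview.vision_models import ImageGenerationModel

vertexai.init(project="my-gcp-project", location="us-central1")  # placeholder project

# The model ID below is an assumption for Imagen 4; check Vertex AI's
# model garden for the identifier available to your account.
model = ImageGenerationModel.from_pretrained("imagen-4.0-generate-preview")

result = model.generate_images(
    prompt="Macro shot of water droplets beading on dark blue fabric",
    number_of_images=1,
    aspect_ratio="16:9",
)
result.images[0].save(location="droplets.png")
```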

Hedra, the app used to make talking baby podcasts, raises $32M from a16z

15-05-2025

  • Business
  • Yahoo

People are using AI video generation tools to contribute to an unexpected new viral trend: podcasts featuring AI-generated talking babies. And one of the companies helping artists do this is Hedra. The startup, launched in 2023, offers a web-based video generation and editing suite powered by its Character-3 model, which lets users make videos with an AI-generated character as the focus, as well as transfer styles across images and audio. This is what people are using to make podcast videos such as one in which an AI-generated dog talks about what it's like to live with a new baby in the house.

We're not sure how much Hedra has benefited from this trend, but it's receiving ample investor attention nevertheless: the company on Thursday said it has raised $32 million in a Series A funding round led by Andreessen Horowitz's Infrastructure fund. Its previous investors are participating in the round, and a16z's Matt Bornstein will join the startup's board.

Michael Lingelbach, the company's founder and CEO, told TechCrunch the startup was inspired by the gap he noticed between companies like Synthesia, which let users superimpose AI-generated avatars over presentations, and startups like Runway, which provide video generation tools for creating short clips. "I thought, what if we did something at the intersection of video generation and 3D characters, with long dialogues and better controllability," he said.

Hedra launched its first video model in June 2024 and quickly attracted investor interest, securing $10 million in seed funding from Index Ventures, Abstract Ventures, and a16z speedrun. Earlier this year, Amazon also backed the company through its venture capital arm, the Alexa Fund. Lingelbach noted that the launch of the Character-3 model in March was a big inflection point (shortly after the company signed its term sheet with a16z) and is now driving a lot of user growth.

The startup wants to use the fresh cash to train its next model, which it says enables better customization, as well as to develop technology that lets its AI-generated characters interact with users. The company is now focusing on attracting creators and prosumers, and said it has received inbound interest from enterprise marketing departments as well. While Hedra's own model is centered on character movement and expression, the app lets you employ other models: Veo 2 and Kling for video generation; Flux, Imagen 3, Sana, and Ideogram V2 for image generation; and audio models from ElevenLabs and Cartesia for voice generation or cloning.

Hedra's competitors include Captions (also backed by a16z), which is focused more on smartphones; Greycroft-backed Cheehoo, which works with Hollywood studios to create animated features; Synthesia; and HeyGen. Hedra claims the videos generated with its platform have more expressive characters than those made with its competitors' tools.

a16z's Bornstein thinks that as the AI-powered video generation space evolves, we will see more tools focusing on characters, motion, voice, editing, and the like. "AI companies can produce amazing clips of environments and simple actions. But they can't generate meaningful dialogue or animation. It's not just about making a video, it's about making a story that resonates. This is largely down to the people and characters in the story. That's exactly what Hedra is building," he told TechCrunch in an emailed statement.
This article originally appeared on TechCrunch.

Adobe releases new Firefly image generation models and a redesigned Firefly web app

24-04-2025

  • Business
  • Yahoo

Adobe on Thursday launched the latest iteration of its Firefly family of image generation AI models, a model for generating vectors, and a redesigned web app that houses all its AI models, plus some from its competitors. There's also a mobile app for Firefly in the works.

The new Firefly Image Model 4, Adobe says, improves on its predecessors in quality, speed, and the amount of control over the structure and style of outputs, camera angles, and zoom. It can generate images at resolutions up to 2K. There's also a tweaked, more capable version of this model, called Image Model 4 Ultra, that can render complex scenes with small structures and lots of detail. Alexandru Costin, VP of Generative AI at Adobe, said the company trained the models with an order of magnitude more compute to enable them to generate more detailed images. He added that, compared with the previous generation, the new models improve text generation in images and let users supply reference images of their choice so the model generates pictures in that style.

The company is also making its Firefly video model, which launched in limited beta last year, available to everyone. It lets users generate video clips from a text prompt or image, control camera angles, specify start and end frames to control shots, generate atmospheric elements, and customize motion design elements. The model can generate video clips from text at resolutions up to 1080p. Meanwhile, the Firefly Vector Model can create editable vector-based artwork and iterate and generate variations of logos, product packaging, icons, scenes, and patterns.

Notably, Firefly's web app gives you access to all these models as well as a few image and video generation models from OpenAI (GPT image generation), Google (Imagen 3 and Veo 2), and Flux (Flux 1.1 Pro). Users can switch between any of these models at any point, and images generated by any model will have content credentials attached to them. The company hinted that other AI models may be added to the web app in the future.

Adobe is also publicly testing a new product called Firefly Boards, a canvas for ideation or moodboarding. It lets users generate or import images, remix them, and collaborate with others, similar to AI-based ideaboards from Visual Electric, Cove, or Kosmik. Boards is available through the Firefly web app. The company said these new models would soon be integrated into its product portfolio, but didn't give a timeline for the rollout.

Adobe is also making its Text-to-Image API and Avatar API generally available, and said a new Text-to-Video API is now available in beta. These APIs are offered through the company's Firefly Services collection of APIs, tools, and services. Adobe is also testing a web app called Adobe Content Authenticity that lets users attach credentials to their work to indicate ownership and attribution through metadata. Users can also indicate whether AI companies can use their images for AI model training.

This article originally appeared on TechCrunch.
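For a sense of how the generally available Text-to-Image API might be consumed, here is a hedged sketch of a REST call from Python. The endpoint path, header names, and payload fields are assumptions modeled on common Adobe API conventions, not the documented contract; Adobe's Firefly Services documentation is the source of truth.

```python
# Hypothetical sketch of a Firefly text-to-image request. The URL, headers,
# and payload shape are assumptions, not Adobe's documented contract.
import requests

ACCESS_TOKEN = "..."  # OAuth bearer token from Adobe's IMS authentication flow
CLIENT_ID = "..."     # API key for a Firefly Services project

response = requests.post(
    "https://firefly-api.adobe.io/v3/images/generate",  # assumed endpoint
    headers={
        "Authorization": f"Bearer {ACCESS_TOKEN}",
        "x-api-key": CLIENT_ID,
        "Content-Type": "application/json",
    },
    json={
        "prompt": "A flat vector logo of a paper crane, teal on white",
        "numVariations": 2,  # assumed field name
    },
    timeout=60,
)
response.raise_for_status()
print(response.json())  # expected to reference the generated image assets
```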

I tested the future of AI image generation. It's astoundingly fast.

23-03-2025

  • Yahoo

One of the core problems with AI is its notoriously high power and computing demand, especially for tasks such as media generation. On mobile phones, only a handful of pricey devices with powerful silicon can run such features natively, and even when implemented at scale in the cloud, it's a pricey affair. Nvidia may have quietly addressed that challenge in partnership with the Massachusetts Institute of Technology and Tsinghua University.

The team created a hybrid AI image generation tool called HART (hybrid autoregressive transformer) that combines two of the most widely used AI image creation techniques. The result is a blazing-fast tool with dramatically lower compute requirements. To give you an idea of just how fast it is, I asked it to create an image of a parrot playing a bass guitar. It returned the picture in about a second; I could barely even follow the progress bar. When I ran the same prompt through Google's Imagen 3 model in Gemini, it took roughly 9-10 seconds on a 200 Mbps internet connection.

When AI images first started making waves, the diffusion technique was behind it all, powering products such as OpenAI's DALL-E image generator, Google's Imagen, and Stable Diffusion. This method can produce images with an extremely high level of detail, but it builds each image over many steps, and as a result it is slow and computationally expensive. The second approach that has recently gained popularity is autoregressive models, which work in much the same fashion as chatbots, generating images by predicting the next piece of the picture. It is faster, but also a more error-prone way of creating images with AI.

The team at MIT fused both methods into a single package called HART. An autoregressive model predicts the compressed image as discrete tokens, while a small diffusion model handles the rest, compensating for the quality loss. The overall approach reduces the number of steps involved from over two dozen to eight.

The experts behind HART claim it can 'generate images that match or exceed the quality of state-of-the-art diffusion models, but do so about nine times faster.' HART combines an autoregressive model with roughly 700 million parameters and a small diffusion model with 37 million parameters. Interestingly, this hybrid tool was able to create images matching the quality of top-shelf models with 2 billion parameters, and it achieved that milestone while generating images nine times faster and requiring 31% less computation.

As per the team, the low-compute approach allows HART to run locally on phones and laptops, which is a huge win. So far, the most popular mass-market products, such as ChatGPT and Gemini, require an internet connection for image generation because the computing happens on cloud servers. In the test video, the team showcased HART running natively on an MSI laptop with an Intel Core series processor and an Nvidia GeForce RTX graphics card, a combination you can find on a majority of gaming laptops without spending a fortune. HART produces 1:1 aspect ratio images at a respectable 1024 x 1024 pixels resolution. The level of detail in these images is impressive, and so is the stylistic variation and scenery accuracy.
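To make that division of labor concrete, here is a toy sketch of the hybrid loop the researchers describe: an autoregressive model proposes discrete image tokens in a single pass, and a small diffusion model spends a handful of steps refining the residual detail. The two models below are random stubs, so only the control flow, not the math, mirrors HART.

```python
# Toy sketch of HART-style hybrid generation. The two "models" are random
# stubs; only the structure (one AR token pass, then a few diffusion
# refinement steps) reflects the approach described above.
import numpy as np

rng = np.random.default_rng(0)
TOKENS, DIM, REFINE_STEPS = 256, 64, 8   # 8 steps vs. ~30 for pure diffusion

def autoregressive_tokens(prompt_emb, length=TOKENS):
    """Stub for the ~700M-parameter AR model: predicts discrete image tokens."""
    logits = rng.normal(size=(length, 1024)) + prompt_emb.mean()
    return logits.argmax(axis=-1)            # one discrete token per patch

def decode_tokens(tokens):
    """Stub VQ decoder: maps discrete tokens to a coarse continuous latent."""
    codebook = rng.normal(size=(1024, DIM))
    return codebook[tokens]                  # (TOKENS, DIM) coarse latent

def diffusion_refine(latent, steps=REFINE_STEPS):
    """Stub for the ~37M-parameter diffusion model: adds back fine detail."""
    residual = rng.normal(size=latent.shape)          # start from pure noise
    for t in range(steps):
        predicted_noise = 0.5 * residual              # a real model predicts this
        residual = residual - predicted_noise / (steps - t)
    return latent + residual

prompt_emb = rng.normal(size=DIM)            # stand-in for a text embedding
coarse = decode_tokens(autoregressive_tokens(prompt_emb))
image_latent = diffusion_refine(coarse)      # cheap: only eight steps
print(image_latent.shape)                    # (256, 64)
```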
During their tests, the team noted that the hybrid tool was between three and six times faster than comparable approaches and offered over seven times higher throughput. The future potential is exciting, especially for integrating HART's image capabilities with language models. 'In the future, one could interact with a unified vision-language generative model, perhaps by asking it to show the intermediate steps required to assemble a piece of furniture,' says the team at MIT. They are already exploring that idea, and even plan to test the HART approach on audio and video generation. You can try it out on MIT's web dashboard.

Before we dive into the quality debate, keep in mind that HART is very much a research project in its early stages. On the technical side, the team highlights a few hassles, such as overheads during inference and training. These look minor in the bigger scheme of things; they can likely be fixed, and even if they persist, the sheer benefits HART delivers in computing efficiency, speed, and latency should outweigh them.

In my brief time prompt-testing HART, I was astonished by the pace of image generation. I barely ran into a scenario where the free web tool took more than two seconds to create an image. Even with prompts spanning three paragraphs (roughly 200 words), HART created images that adhered tightly to the description. Aside from descriptive accuracy, there was plenty of detail in the images.

However, HART suffers from the typical failings of AI image generators. It struggles with digits, basic depictions such as people eating food, character consistency, and perspective. Photorealism in a human context is one area where I noticed glaring failures. On a few occasions, it simply got basic objects wrong, like confusing a ring with a necklace. But overall, those errors were few and far between, and fundamentally expected; a healthy bunch of AI tools still can't get those right, despite having been around for a while now.

Overall, I am particularly excited by HART's immense potential. It will be interesting to see whether MIT and Nvidia build a product out of it or simply adopt the hybrid image generation approach in an existing product. Either way, it's a glimpse into a very promising future.

Series Entertainment Accelerates Game Development by 90% with Google Cloud's AI

17-03-2025

  • Entertainment
  • Associated Press

Series Entertainment, a fast-growing startup dedicated to making great games with the power of AI, today announced its collaboration with Google Cloud, leveraging Google's cutting-edge AI technology and scalable infrastructure to dramatically accelerate the creation of immersive, high-quality video games, reducing game development cycles by an estimated 90%.

In today's competitive landscape, game developers face increasing pressure to deliver high-quality, immersive experiences at an accelerated pace, and traditional content creation workflows often struggle to keep up with the demand for dynamic and engaging gameplay. Series Entertainment is addressing these challenges by using Google Cloud's modern AI infrastructure to streamline content creation at scale, enabling its teams to develop games faster than traditional workflows allow and to provide gameplay experiences never seen before.

Series Entertainment's 'Inference at Scale' platform is built with Google Cloud's AI and modern, scalable infrastructure, delivering the following benefits:

  • Vertex AI, Google Cloud's unified AI development platform, accelerates content creation and model evaluation with generative AI pipelines and rapid large language model (LLM) testing.
  • Google Kubernetes Engine (GKE) powers real-time AI inference for dynamic gameplay and scales to meet increasing feature demands.
  • Vision API, including Google's Imagen 3 and Veo 2 models, enhances art workflows with AI-generated concept art and consistent video assets.
  • Compute Engine manages large-scale AI workloads, while Cloud Storage and Networking ensure efficient asset management and delivery.

By using these technologies, Series Entertainment is driving a 90% acceleration in content creation, dramatically shortening development cycles and freeing creative teams to spend more time on rich narrative development, refined gameplay features, and deeper world-building. In addition, real-time, AI-driven non-player characters (NPCs) and accelerated art generation enhance player engagement and visual consistency, resulting in more immersive gameplay. For instance, environment artists can spin up new area concepts in a matter of hours, quickly integrating player feedback into live game updates so each location feels fresh, responsive, and engaging for its audience. Finally, Google Cloud's scalable AI infrastructure enables rapid prototyping, efficient model training, and smooth game asset delivery.

This enhanced efficiency has already contributed to the successful launch of a new game this week, with three more releases planned in the coming months. Notably, AI-generated dialogue has been integrated into certain branching story arcs within the popular 'Choices' series, showcasing a practical application of this technology (a hedged sketch of what such a dialogue call might look like follows this article).

'By seamlessly integrating Google Cloud's AI and infrastructure with our proprietary toolchain, we are achieving unprecedented levels of real-time AI capability,' said Josh English, chief technology officer, Series Entertainment. 'This integration not only streamlines developer workflows, empowering our teams to innovate more efficiently, but also delivers deeply immersive player experiences through dynamic, AI-driven non-playable characters that react and evolve within the game environment.'

'Series Entertainment's innovative use of AI demonstrates the transformative potential of Google Cloud's AI technology for both game development and new gameplay experiences,' said Jack Buser, global director for Games, Google Cloud.
'By leveraging Google Cloud's AI solutions, they are not only accelerating their development cycles but also creating deeply engaging experiences for players.'

Series Entertainment is drawing on the proven success of Pixelberry's narrative interactive fiction, including the popular Choices series, to anchor new AI-driven game experiences. The company is also aligning its strategy with robust analytics and continuous live-ops to ensure every new game or feature benefits from an active and engaged audience.

Series Entertainment is dedicated to making great games, powered by AI. Founded by former Kongregate and Snap executive Pany Haritatos, the team is on a mission to set the standard for game development processes using its AI-native platform. To learn more about Series, please visit and follow us on LinkedIn.
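To ground the NPC claim mentioned above, here is a hedged sketch of the kind of per-turn dialogue call a game backend might make through Vertex AI. The model choice, prompt format, and game-state fields are illustrative assumptions, not details Series Entertainment has published.

```python
# Hypothetical per-turn NPC dialogue request via Vertex AI. The model name,
# prompt structure, and game-state fields are illustrative assumptions.
import vertexai
from vertexai.generative_models import GenerativeModel

vertexai.init(project="my-game-project", location="us-central1")  # placeholders
npc_model = GenerativeModel("gemini-1.5-flash")  # a low-latency model choice

def npc_reply(persona: str, game_state: dict, player_line: str) -> str:
    """Builds one in-character dialogue turn from persona, state, and input."""
    prompt = (
        f"You are {persona}. Current scene state: {game_state}.\n"
        f"The player says: {player_line!r}\n"
        "Reply in character, in at most two sentences."
    )
    return npc_model.generate_content(prompt).text

print(npc_reply(
    persona="a wary blacksmith in a flooded harbor town",
    game_state={"quest_stage": 2, "player_reputation": "suspicious"},
    player_line="I heard you used to forge blades for the old guard.",
))
```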
