Latest news with #WebDevArena

Google unveils Gemini 2.5 upgrades for reasoning & security

Techday NZ

22-05-2025

Google has provided a series of updates to its Gemini 2.5 model series, with enhancements spanning advanced reasoning, developer capabilities and security safeguards.

The company reported that Gemini 2.5 Pro is now the leading model on the WebDev Arena coding leaderboard, holding an Elo score of 1415. It also leads across all leaderboards in LMArena, a platform that measures human preferences in multiple dimensions. Additionally, Gemini 2.5 Pro's 1 million-token context window was highlighted as supporting strong long context and video understanding performance.

Integration with LearnLM, a family of models developed with educational experts, resulted in Gemini 2.5 Pro apparently becoming the foremost model for learning. According to Google, in direct comparisons focusing on pedagogy and effectiveness, Gemini 2.5 Pro was favoured by educators and experts over other models in a wide range of scenarios. The model outperformed others based on the five principles of learning science used in AI system design for education.

Gemini 2.5 Pro introduced an experimental capability called Deep Think, which is being tested to enable enhanced reasoning by allowing the model to consider multiple hypotheses before responding. The company said, "2.5 Pro Deep Think gets an impressive score on 2025 USAMO, currently one of the hardest math benchmarks. It also leads on LiveCodeBench, a difficult benchmark for competition-level coding, and scores 84.0% on MMMU, which tests multimodal reasoning."

Safety and evaluation measures are being emphasised with Deep Think. "Because we're defining the frontier with 2.5 Pro DeepThink, we're taking extra time to conduct more frontier safety evaluations and get further input from safety experts. As part of that, we're going to make it available to trusted testers via the Gemini API to get their feedback before making it widely available," the company reported.

Google announced improvements to 2.5 Flash, describing it as the most efficient in the series, tailored for speed and cost efficiency. This version now reportedly uses 20-30% fewer tokens in evaluations and delivers improved performance across benchmarks for reasoning, multimodality, code, and long-context tasks. The updated 2.5 Flash is now available for preview in Google AI Studio, Vertex AI, and the Gemini app.

New features have also been added to the Gemini 2.5 series. The Live API now offers a preview version supporting audio-visual input and native audio output. This is designed to create more natural and expressive conversational experiences. According to Google, "It also allows the user to steer its tone, accent and style of speaking. For example, you can tell the model to use a dramatic voice when telling a story. And it supports tool use, to be able to search on your behalf."

Early features in this update include Affective Dialogue, where the model can detect and respond to emotions in a user's voice; Proactive Audio, which enables the model to ignore background conversations and determine when to respond; and enhanced reasoning in Live API use. Multi-speaker support has also been introduced for text-to-speech capabilities, allowing audio generation with two distinct voices and support for over 24 languages, including seamless transitions between them.

Project Mariner's computer use capabilities are being integrated into the Gemini API and Vertex AI, with multiple enterprises testing the tool. Google stated, "Companies like Automation Anywhere, UiPath, Browserbase, Autotab, The Interaction Company and Cartwheel are exploring its potential, and we're excited to roll it out more broadly for developers to experiment with this summer."

On the security front, Gemini 2.5 includes advanced safeguards against indirect prompt injections, which involve malicious instructions embedded into retrieved data. According to disclosures, "Our new security approach helped significantly increase Gemini's protection rate against indirect prompt injection attacks during tool use, making Gemini 2.5 our most secure model family to date."

Google is introducing new developer tools with thought summaries in the Gemini API and Vertex AI. These summaries convert the model's raw processing into structured formats with headers and action notes. Google stated, "We hope that with a more structured, streamlined format on the model's thinking process, developers and users will find the interactions with Gemini models easier to understand and debug."

Additional features include thinking budgets for 2.5 Pro, allowing developers to control the model's computation resources to balance quality and speed. This can also completely disable the model's advanced reasoning capability if desired. Model Context Protocol (MCP) support has been added for SDK integration, aiming to enable easier development of agentic applications using both open-source and hosted tools.

Google affirmed its intention to sustain research and development efforts as the Gemini 2.5 series evolves, stating, "We're always innovating on new approaches to improve our models and our developer experience, including making them more efficient and performant, and continuing to respond to developer feedback, so please keep it coming! We also continue to double down on the breadth and depth of our fundamental research, pushing the frontiers of Gemini's capabilities. More to come soon."

Google I/O Spotlights Gemini's Multimodal AI Surge

Yahoo

20-05-2025

Alphabet (NASDAQ:GOOG) flexes Gemini AI at I/O with record user growth and next-gen features.

Sundar Pichai touted Gemini 2.5 Pro as the top LLM on LMArena and WebDev Arena benchmarks, noting the model now serves 400 million monthly active users and powers AI Overviews used by over 1.5 billion people each month. He previewed Google Beam, a 2D-to-3D video experience launching later this year in partnership with HP, and unveiled the Ironwood TPU for customer deployment.

At the Shoreline Amphitheater, Google also expanded real-time speech translation in Meet with English and Spanish support (additional languages arrive soon) and confirmed enterprise rollout before year-end. Gemini Live's new Project Astra features bring camera and screen-sharing to Android and iOS starting today, while Project Mariner gains multitasking capabilities, overseeing up to 10 tasks with teach and repeat, now available to developers via the Gemini API. Chrome, Search and the Gemini app will get agentic upgrades, and the new Personal Context feature promises consent-based smart replies in Gmail.

Last week's early access to the Gemini 2.5 Pro Preview (I/O edition) set the stage for today's announcements, underscoring Google's push to embed AI across products. With AI Mode positioned as the next evolution of Search and Gemini's footprint growing aggressively, Google is betting on seamless, multimodal intelligence to deepen user engagement.

Why It Matters: Google's Gemini platform is scaling rapidly, and the infusion of new features, from 3D video to real-time translation, could redefine enterprise and consumer workflows. Investors will also watch developer uptake and TPU deployments when Alphabet reports next quarter's financials.

This article first appeared on GuruFocus.

Google is on the brink of achieving API

Phone Arena

20-05-2025

How is that possible?

Google's ongoing I/O conference is not lacking in humor: early in the presentation, CEO Sundar Pichai bragged that Google's next-level AI model, Gemini 2.5 Pro, has successfully completed Pokémon Blue. Depending on whom you ask, you might want to know that Pokémon Blue is not the easiest video game there is, so the achievement is not just funny, but notable, too. Sundar clarified that the AI model collected all 8 badges and thus Google is closer to achieving API, "Artificial Pokémon Intelligence", as he put it, drawing laughs and applause from the live audience.

In the game, players step into the shoes of a young trainer with a simple mission: catch, train, and battle the cute creatures known as Pokémon to become the Pokémon League Champion. The game is set in the Kanto region, where players travel through towns, forests, and caves, battling wild Pokémon and rival trainers in turn-based combat. Players can challenge eight gym leaders, thwart the villainous Team Rocket, and attempt to complete the Pokédex by catching all 151 Pokémon.

Gemini 2.5 Pro, Google's most advanced AI model to date, continues to set new benchmarks in performance, the company says, especially in complex reasoning and web development. The model now leads the WebDev Arena coding leaderboard and ranks first across all categories on LMArena, which measures human user preferences and so speaks volumes about its popularity. Thanks to its massive 1 million-token context window, Gemini 2.5 Pro also delivers state-of-the-art long-context and video understanding.

When we say a model has a 1 million-token context window, it means it can take in and understand up to 1 million tokens of text (or code, or data) at once. This is a huge amount, equivalent to hundreds of thousands of words, allowing the model to consider long conversations, documents, or video transcripts without losing context or forgetting earlier parts.

The model has seen significant updates recently, including new features like native audio output for more natural interactions, advanced security safeguards, and enhanced computer interaction abilities through Project Mariner. A Deep Think mode is also being added, designed to boost its capabilities in solving complex math and programming problems.

With the integration of LearnLM, Gemini 2.5 Pro has become a top tool for learning as well. It scored highest in educator evaluations based on five core principles of learning science, Google claims. Developers can benefit from expanded tools as well, including thought summaries, extended thinking budgets, and improved support for open-source tools.

Gemini 2.5 Flash is now publicly available, while wider releases of 2.5 Pro are expected soon through Google AI Studio and Vertex AI.
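
To make the 1 million-token context window described above more concrete, here is a minimal sketch that converts a token budget into an approximate word count. The ratio of 0.75 English words per token is a common rule of thumb and an assumption here, not a figure from the article; real tokenizers vary by language and content. The function names `estimate_words` and `fits_in_context` are hypothetical, chosen only for illustration.

```python
# Assumption: roughly 0.75 English words per token (a rule of thumb,
# not a number taken from the article or from any specific tokenizer).
WORDS_PER_TOKEN = 0.75


def estimate_words(tokens: int, words_per_token: float = WORDS_PER_TOKEN) -> int:
    """Estimate how many English words fit in a given token budget."""
    return int(tokens * words_per_token)


def fits_in_context(word_count: int,
                    context_tokens: int = 1_000_000,
                    words_per_token: float = WORDS_PER_TOKEN) -> bool:
    """Return True if a text of word_count words should fit in the window."""
    estimated_tokens = word_count / words_per_token
    return estimated_tokens <= context_tokens


if __name__ == "__main__":
    # Approximate word capacity of a 1 million-token window.
    print(estimate_words(1_000_000))
    # Would a 500,000-word document fit under this estimate?
    print(fits_in_context(500_000))
```

Under this assumed ratio, a 1 million-token window corresponds to roughly 750,000 words, which is consistent with the "hundreds of thousands of words" description.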

Google's Gemini 2.5 ranks first in coding charts, AI IQ tests

Coin Geek

20-05-2025

Google's (NASDAQ: GOOGL) Gemini 2.5 has come out on top across a range of artificial intelligence (AI) testing benchmarks, outperforming the rest of its peers. According to the rankings, the AI chatbot sits on top of the leaderboard on WebDev Arena, an AI ranking site for coding. A quick scan of WebDev Arena reveals that Gemini 2.5 ranks ahead of Claude and ChatGPT 4 in standardized coding tests for large language models (LLMs).

Apart from setting the pace in coding functionalities, Gemini 2.5 also clinched first place in creative writing and style control. When placed in standardized IQ tests, Gemini 2.5 outclassed its peers to achieve an IQ of 124 on the Mensa Norway test. However, the model scored 115 in offline mode, ranking in joint second place with OpenAI's ChatGPT. Gemini 2.5 scored 86.7% and 84% on the AIME 2025 math test and the GPQA science assessment, respectively. Despite scoring only 18.8% on Humanity's Last Exam, Gemini came first, outperforming Claude 3.7 Sonnet and OpenAI's o3.

Gemini 2.5's successes across the board are propped up by its context window, which allows up to 1 million tokens. Its closest competitors, Claude and ChatGPT's flagship models, are only designed to handle 128K tokens, with Gemini 2.5 ranking above the field. Google has unveiled plans to expand the context window to 2 million tokens.

'Gemini 2.5 models are thinking models, capable of reasoning through their thoughts before responding, resulting in enhanced performance and improved accuracy,' said Google in a statement during its commercial release.

A pro version pushes the frontiers for Gemini 2.5, with prices starting at $2.50 for input and $15.00 for output. Google's Gemini 2.5 is significantly cheaper than its peers, offering advanced enterprise functionalities, including blockchain-based smart contract audits.

Real-world AI use cases heat up

Apart from the academic discourse around AI models, real-world applications are rising. AI chatbots are changing the landscape in the workplace, offering automation and advanced personalization perks for consumers. Governments are also turning to AI to improve the scope of public services for citizens, but concerns over misuse remain. The UN has warned of the potential risks stemming from AI abuse, including censorship and the proliferation of fake news, while authorities are cracking down on misuse in financial markets.

Microsoft-funded Space and Time goes live with an array of major builders

Space and Time, a new blockchain project, has launched its mainnet to offer advanced data infrastructure using zero-knowledge (ZK) proofs. According to the press statement, the project will bring ZK-proven data infrastructure to digital asset service providers. Designed by MakeInfinite Labs, the project offers service providers decentralized and verifiable databases. Space and Time allows developers to access large datasets off-chain and verify their accuracy on-chain using smart contracts.

The Microsoft-backed Space and Time leans on major distributed ledgers to index data while providing a safe platform for developers to query gleaned data via its proprietary Proof of SQL. Space and Time's Proof of SQL mechanism verifies that an SQL query on a dataset is accurate despite off-chain computations. Each query result is wrapped as a ZK proof and submitted to a distributed ledger for smart contracts to verify the proof.

'Prior to Space and Time, onchain applications had no way to query basic user data from a database of blockchain activity without introducing security risks and tampering,' said Space and Time co-founder Scott Dykstra.

The use cases in digital asset verticals are broad, with decentralized finance (DeFi) applications racking up the most utility. Service providers and developers can confirm asset prices without exposing price feeds. Furthermore, DeFi protocols can update interest rates using ZK proof-based data infrastructures while leaning on the offering for Proof-of-Reserves. Video game developers can provide on-chain rewards based on in-game activities, while decentralized autonomous organizations (DAOs) can use the offering for automated treasury activities.

The project is off to a good start, with Dykstra confirming that major technology giants are building with Space and Time's expansive solutions. Google BigQuery and Azure are leaning on Space and Time as the project braces for an avalanche of users in the coming weeks.

Blockchain use cases outside DeFi continue to surge

Outside of DeFi, blockchain records myriad utilities across cybersecurity, artificial intelligence, Internet of Things (IoT), and supply chain. To stifle the trend of AI misinformation, researchers are turning to blockchain to fight bias and deepfakes, with one report tagging the technology as the missing link for trust. Blockchain is also being used in public services, with governments rolling out Web3-based solutions around subsidies and digital IDs. An integration of blockchain with IoT is tipped to solve a slew of climate change issues, with previous use cases in agriculture yielding benefits.

In order for artificial intelligence (AI) to work right within the law and thrive in the face of growing challenges, it needs to integrate an enterprise blockchain system that ensures data input quality and ownership, allowing it to keep data safe while also guaranteeing the immutability of data. Check out CoinGeek's coverage on this emerging tech to learn more about why enterprise blockchain will be the backbone of AI.
