Latest news with #Gemma31B


India Today
4 days ago
Google launches app to let anyone run AI models on their phone without internet
Google has quietly launched a new app called the AI Edge Gallery, and it aims to change how artificial intelligence works on smartphones. The app allows Android users to run powerful AI models directly on their devices, completely offline. In other words, you can now generate images, write code, or get smart answers without needing to connect to the internet.

The key benefit of this move is enhanced privacy and faster performance. Since everything runs on the device itself, there's no need to transmit data to cloud servers, reducing the risk of security breaches. It also means no waiting for a server to respond: AI answers arrive instantly.

At the core of this experience is a language model named Gemma 3 1B. Weighing in at just 529MB, this compact model can process up to 2,585 tokens per second, enabling rapid text generation and seamless interactions. Despite its small size, Gemma is powerful enough to support everything from custom content creation to document analysis and smart replies in messaging apps.

The app also draws from Hugging Face, one of the most trusted sources of open AI models, and is built on Google's AI Edge platform. That means it benefits from technologies like TensorFlow Lite and MediaPipe, which help optimise performance across a wide range of devices, even those with modest hardware. That said, Google has pointed out that performance may vary depending on the device. Older or mid-range phones might struggle with larger models, so opting for lighter models is advisable in such cases.

According to Google, users will find the interface refreshingly straightforward. Features like AI Chat and Ask Image offer intuitive access to AI tools, while the Prompt Lab allows people to experiment with short, single-turn prompts. The lab also includes preset templates and settings for tweaking how models behave.

While the app is still in what Google calls an 'experimental Alpha release,' it's fully open source under the Apache 2.0 licence. This means developers and companies alike are free to use it, modify it, or even integrate it into commercial products. An iOS version is also reportedly on the way.

The development comes amid ongoing scrutiny of Google's broader AI ambitions. Just last week, the US Department of Justice opened a civil antitrust investigation into the company's licensing deal with AI startup Character.AI. Critics have raised concerns that the agreement may have been designed to dodge a federal merger review.

Despite that, the release of AI Edge Gallery positions Google as a leader in making offline AI not only possible but also practical. And with the ability to run powerful models right from your pocket, it's a step toward making AI more personal, private, and always ready, no Wi-Fi required.
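For readers who want a sense of what the underlying stack looks like, the sketch below shows on-device text generation with MediaPipe's LLM Inference API on Android, the same technology the app is built on. It is a minimal illustration rather than code from AI Edge Gallery itself; the model path is a placeholder, and option names may vary between MediaPipe releases.

```kotlin
import android.content.Context
import com.google.mediapipe.tasks.genai.llminference.LlmInference

// Minimal sketch: generate text fully on-device with MediaPipe's LLM
// Inference API (Gradle dependency com.google.mediapipe:tasks-genai).
// A .task model bundle (e.g. Gemma 3 1B) must already be on the device;
// the path below is a placeholder, not one the app actually uses.
fun generateOffline(context: Context, prompt: String): String {
    val options = LlmInference.LlmInferenceOptions.builder()
        .setModelPath("/data/local/tmp/llm/gemma3-1b-it.task")
        .setMaxTokens(512) // combined limit for prompt + response tokens
        .build()

    val llm = LlmInference.createFromOptions(context, options)
    val response = llm.generateResponse(prompt) // no network call involved
    llm.close() // free the model's memory when finished
    return response
}
```

Because the model weights live on the phone, a call like this works even in airplane mode; generation speed depends entirely on the device's hardware, which is why Google recommends lighter models on older handsets.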
Yahoo
22-04-2025
- Science
Microsoft's New Compact 1-Bit LLM Needs Just 400MB of Memory
Microsoft's new large language model (LLM) puts significantly less strain on hardware than other LLMs, and it's free to experiment with. The 1-bit LLM (1.58-bit, to be more precise, since storing one of three values takes log2(3) ≈ 1.58 bits) uses only -1, 0, and 1 for its weights, which could be useful for running LLMs on small devices, such as smartphones. Microsoft put BitNet b1.58 2B4T on Hugging Face, a collaboration platform for the AI community.

'We introduce BitNet b1.58 2B4T, the first open-source, native 1-bit Large Language Model (LLM) at the 2-billion parameter scale,' the Microsoft researchers wrote. 'Trained on a corpus of 4 trillion tokens, the model has been rigorously evaluated across benchmarks covering language understanding, mathematical reasoning, coding proficiency, and conversational ability.'

The keys to b1.58 2B4T are the performance and efficiency it provides. Where other LLMs often use 16-bit (or 32-bit) floating-point formats for their weights, here each weight (parameter) is expressed using just the three values (-1, 0, 1). Although this isn't the first BitNet of its kind, its size makes it unique. As TechRepublic points out, this is the first 2-billion-parameter, 1-bit LLM.

An important goal when developing LLMs for less-powerful hardware is to reduce the model's memory needs. In the case of b1.58 2B4T, it requires only 400MB, a dramatic drop from previous record holders, like Gemma 3 1B, which uses 1.4GB. 'The core contribution of this work is to demonstrate that a native 1-bit LLM, when trained effectively at scale, can achieve performance comparable to leading open-weight, full-precision models of similar size across a wide range of tasks,' the researchers wrote in the report. One thing to keep in mind is that BitNet b1.58 2B4T delivers these gains only on Microsoft's own bitnet.cpp framework, not on traditional inference frameworks.

Training the LLM involves three phases. The first, pre-training, is broken into several steps of its own and (in the case of the researchers' testing) involves 'synthetically generated mathematical data,' along with data from large web crawls, educational web pages, and other 'publicly available' text. The next phase is supervised fine-tuning (SFT), for which the researchers used WildChat for conversational training. The last phase, direct preference optimization (DPO), is meant to improve the AI's conversational skills and to align it with the target audience's preferences.
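To make the 1.58-bit idea concrete, here is a small sketch of the 'absmean' ternary quantization scheme described in the BitNet papers: each weight is divided by the mean absolute value of the weight matrix, rounded, and clipped to {-1, 0, 1}. It illustrates the scheme rather than reproducing Microsoft's implementation; the function name and sample values are invented for the example.

```kotlin
import kotlin.math.abs
import kotlin.math.roundToInt

// Absmean ternary quantization, as described in the BitNet b1.58 paper:
// scale each weight by the mean absolute value, round, clip to {-1, 0, 1}.
fun ternaryQuantize(weights: DoubleArray): Pair<IntArray, Double> {
    val eps = 1e-8
    val gamma = weights.sumOf { abs(it) } / weights.size // mean |w|
    val quantized = IntArray(weights.size) { i ->
        (weights[i] / (gamma + eps)).roundToInt().coerceIn(-1, 1)
    }
    return Pair(quantized, gamma) // ternary weights plus the scale needed to dequantize
}

fun main() {
    val w = doubleArrayOf(0.42, -0.07, 1.3, -0.9, 0.02) // sample values
    val (q, gamma) = ternaryQuantize(w)
    println("scale gamma = $gamma")
    println("ternary weights = ${q.joinToString()}") // prints: 1, 0, 1, -1, 0
}
```

The headline memory number follows from the same arithmetic: 2 billion weights at roughly 1.58 bits each comes to about 2e9 × 1.58 / 8 ≈ 395MB, consistent with the reported 400MB footprint.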