
Free, offline ChatGPT on your phone? Technically possible, basically useless
Another day, another large language model, but news that OpenAI has released its first open-weight models (gpt-oss) with Apache 2.0 licensing is a bigger deal than most. Finally, you can run a version of ChatGPT offline and for free, giving developers and us casual AI enthusiasts another powerful tool to try out.
As usual, OpenAI makes some pretty big claims about gpt-oss's capabilities. The larger model can apparently outperform o4-mini and scores quite close to o3 — OpenAI's cost-efficient and most powerful reasoning models, respectively. However, that flagship gpt-oss model comes in at a colossal 120 billion parameters, requiring some serious computing kit to run. For you and me, though, there's also a highly capable 20 billion parameter model available.
Can you now run ChatGPT offline and for free? Well, it depends.
In theory, the 20 billion parameter model will run on a modern laptop or PC, provided you have bountiful RAM and a powerful CPU or GPU to crunch the numbers. Qualcomm even claims it's excited about bringing gpt-oss to its compute platforms — think PC rather than mobile. Still, this raises the question: Is it possible to now run ChatGPT entirely offline and on-device, for free, on a laptop or even your smartphone? Well, it's doable, but I wouldn't recommend it.
What do you need to run gpt-oss?
Edgar Cervantes / Android Authority
Despite shrinking gpt-oss from 120 billion to 20 billion parameters for more general use, the official quantized model still weighs in at a hefty 12.2GB. OpenAI specifies VRAM requirements of 16GB for the 20B model and 80GB for the 120B model. You need a machine capable of holding the entire thing in memory at once to achieve reasonable performance, which puts you firmly into NVIDIA RTX 4080 territory for sufficient dedicated GPU memory — hardly something we all have access to.
For PCs with less GPU VRAM, you'll want 16GB of system RAM, preferably paired with a GPU capable of crunching FP4 precision data so you can offload part of the model into GPU memory. For everything else, such as typical laptops and smartphones, 16GB is really cutting it fine, as you need room for the OS and apps too. Based on my experience, 24GB is what you really want: my 7th Gen Surface Laptop, complete with a Snapdragon X processor and 16GB of RAM, managed an admittedly pretty decent 10 tokens per second, but only barely held on even with every other application closed.
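A quick back-of-envelope calculation shows why those figures are what they are: at roughly 4 bits per weight (the low-precision quantization gpt-oss ships with), the weights alone dominate that 12.2GB download. The sketch below is an approximation — the bits-per-weight values are rules of thumb, not official specs, and the real file also carries embeddings and layers kept at higher precision.

```python
# Rough sketch: estimate the memory footprint of a model's weights
# at a given quantization level. Bits-per-weight values here are
# approximations, not official specs.

def weight_memory_gb(params_billion: float, bits_per_weight: float) -> float:
    """Approximate size of the weights alone, in gigabytes."""
    total_bits = params_billion * 1e9 * bits_per_weight
    return total_bits / 8 / 1e9

# gpt-oss at ~4-bit quantization: the 20B weights alone come to about
# 10GB (the shipping 12.2GB file adds higher-precision components).
print(f"20B  @ 4-bit:  {weight_memory_gb(20, 4):.1f} GB")   # ~10 GB
print(f"20B  @ 16-bit: {weight_memory_gb(20, 16):.1f} GB")  # why quantization matters
print(f"120B @ 4-bit:  {weight_memory_gb(120, 4):.1f} GB")  # firmly datacenter territory
```

The same arithmetic makes it obvious why an unquantized 16-bit version of even the smaller model would be hopeless on consumer hardware.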
Despite its smaller size, gpt-oss-20b still needs plenty of RAM and a powerful GPU to run smoothly.
Of course, with 24GB of RAM being the ideal, the vast majority of smartphones can't run it. Even AI leaders like the Pixel 9 Pro XL and Galaxy S25 Ultra top out at 16GB of RAM, and not all of that's accessible. Thankfully, my ROG Phone 9 Pro has a colossal 24GB — enough to get me started.
How to run gpt-oss on a phone
Robert Triggs / Android Authority
For my first attempt to run gpt-oss on my Android smartphone, I turned to the growing selection of LLM apps that let you run offline models, including PocketPal AI, LLaMA Chat, and LM Playground.
However, these apps either didn't have the model available or couldn't successfully load the version I downloaded manually, possibly because they're built on an older version of llama.cpp. Instead, I booted up a Debian partition on the ROG and installed Ollama to handle loading and interacting with gpt-oss. If you want to follow along, the steps are the same ones I used to run DeepSeek on a phone earlier in the year. The drawback is that performance isn't quite native, and there's no hardware acceleration, meaning you're reliant on the phone's CPU to do the heavy lifting.
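For the curious, once Ollama is running you can talk to it programmatically as well as through the interactive chat: it exposes a local HTTP API on port 11434. The sketch below only builds and prints the JSON body for its /api/generate endpoint; actually sending it assumes you have an Ollama server running locally with the model already pulled (e.g. via `ollama pull gpt-oss:20b`).

```python
import json

# Ollama exposes a local HTTP API (default: http://localhost:11434).
# This sketch builds the JSON body for its /api/generate endpoint;
# POSTing it requires a running Ollama server with the model pulled.

def build_generate_request(prompt: str, model: str = "gpt-oss:20b") -> str:
    """Return the JSON body for a non-streaming /api/generate call."""
    payload = {
        "model": model,
        "prompt": prompt,
        "stream": False,  # wait for the full reply rather than token-by-token
    }
    return json.dumps(payload)

body = build_generate_request("Why is the sky blue?")
print(body)
# With the server up, you could send it with:
#   curl http://localhost:11434/api/generate -d '<body>'
```

On a phone-grade CPU, `"stream": False` is the painful option — streaming at least lets you watch the tokens trickle in rather than staring at a blank screen.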
So, how well does gpt-oss run on a top-tier Android smartphone? Barely is the generous word I'd use. The ROG's Snapdragon 8 Elite might be powerful, but it's nowhere near my laptop's Snapdragon X, let alone a dedicated GPU for data crunching.
gpt-oss can just about run on a phone, but it's barely usable.
The token rate (the rate at which text is generated on screen) is barely passable and certainly slower than I can read. I'd estimate it's in the region of 2-3 tokens (about a word or so) per second. It's not entirely terrible for short requests, but it's agonising if you want to do anything more complex than say hello. Unfortunately, the token rate only gets worse as the size of your conversation increases, eventually taking several minutes to produce even a couple of paragraphs.
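To put those token rates in perspective, here's the arithmetic as a quick sketch. The tokens-per-word ratio is a rough rule of thumb for English text (around 1.3 tokens per word), not a measured figure, and the degraded rate is illustrative of what throttling and a growing conversation do to generation speed.

```python
# Quick arithmetic: how long a reply takes at a given token rate.
# ~1.3 tokens per English word is a rough rule of thumb.

TOKENS_PER_WORD = 1.3

def generation_time_s(words: int, tokens_per_second: float) -> float:
    """Seconds to generate a reply of the given length."""
    return words * TOKENS_PER_WORD / tokens_per_second

# A two-paragraph, ~200-word reply at various speeds:
print(f"Phone, fresh chat (2.5 tok/s): {generation_time_s(200, 2.5) / 60:.1f} min")
print(f"Phone, throttled  (1.0 tok/s): {generation_time_s(200, 1.0) / 60:.1f} min")
print(f"Laptop            (10 tok/s):  {generation_time_s(200, 10) / 60:.1f} min")
```

Even at the phone's best rate, a modest reply takes close to two minutes; once the rate sags towards one token per second, you're into the several-minute territory described above.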
Robert Triggs / Android Authority
Obviously, mobile CPUs really aren't built for this type of work, and certainly not for models approaching this size. The ROG is a nippy performer for my daily workloads, but it was maxed out here, with seven of its eight CPU cores pinned at 100% almost constantly, leaving the handset uncomfortably hot after just a few minutes of chat. Clock speeds quickly throttled, causing token speeds to fall further. It's not great.
With the model loaded, the phone's 24GB was stretched as well, with the OS, background apps, and additional memory required for the prompt and responses all vying for space. When I needed to flick in and out of apps, I could, but this brought already sluggish token generation to a virtual standstill.
Another impressive model, but not for phones
Calvin Wankhede / Android Authority
Running gpt-oss on your smartphone is pretty much out of the question, even if you have a huge pool of RAM to load it up. External models aimed primarily at the developer community don't support mobile NPUs and GPUs. The only way around that obstacle is for developers to leverage proprietary SDKs like Qualcomm's AI SDK or Apple's Core ML, which won't happen for this sort of use case.
Still, I was determined not to give up and tried gpt-oss on my aging PC, equipped with a GTX 1070 and 24GB of RAM. The results were definitely better, at around four to five tokens per second, but still slower than my Snapdragon X laptop running on just the CPU — yikes.
In both cases, the 20B parameter version of gpt-oss certainly seems impressive (after waiting a while), thanks to its configurable chain of reasoning, which lets the model 'think' for longer to help solve more complex problems. Compared to free options like Google's Gemini 2.5 Flash, gpt-oss is the more capable problem solver, much like DeepSeek R1, which is all the more impressive given it's free. However, it's still not as powerful as the mightier and more expensive cloud-based models — and it certainly doesn't run anywhere near as fast on any consumer gadget I own.
Still, advanced reasoning in the palm of your hand, without the cost, security concerns, or network compromises of today's subscription models, is the AI future I think laptops and smartphones should truly aim for. There's clearly a long way to go, especially when it comes to mainstream hardware acceleration, but as models become both smarter and smaller, that future feels increasingly tangible.
A few of my flagship smartphones have proven reasonably adept at running smaller 8 billion parameter models like Qwen 2.5 and Llama 3, with surprisingly quick and powerful results. If we ever see a similarly speedy version of gpt-oss, I'd be much more excited.