
AMD brings 128B LLMs to Windows PCs with Ryzen AI Max+ 395
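AMD has announced a free software update enabling 128 billion parameter Large Language Models (LLMs) to be run locally on Windows PCs powered by AMD Ryzen AI Max+ 395 128GB processors, a capability previously only accessible through cloud infrastructure.
With this update, users can run and deploy advanced AI models locally without relying on third-party infrastructure, which can provide greater control, lower ongoing costs, and improved privacy.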
The company says this shift addresses growing demand for scalable and private AI processing at the client device level.
Previously, models of this scale, such as those approaching the size of ChatGPT 3.0, were operable only within large-scale data centres. The new functionality comes through an upgrade to AMD Variable Graphics Memory, included with the upcoming Adrenalin Edition 25.8.1 WHQL drivers.
This upgrade leverages the 96GB Variable Graphics Memory available on the Ryzen AI Max+ 395 128GB machine, supporting the execution of memory-intensive LLM workloads directly on Windows PCs.
A broader deployment
This update also makes the AMD Ryzen AI Max+ 395 (128GB) the first Windows AI PC processor to run Meta's Llama 4 Scout 109B model with full vision and Model Context Protocol (MCP) support.
The processor can manage all 109 billion parameters in memory, although the mixture-of-experts (MoE) architecture means only 17 billion parameters are active at any given time. The company reports output rates of up to 15 tokens per second for this model.
According to AMD, the ability to handle such large models locally is important for users who require high-capacity AI assistants on the go. The system also supports flexible quantisation and can run a range of LLMs in the GGUF format, from compact 1B parameter models to Mistral Large. This isn't just about bringing cloud-scale compute to the desktop; it's about expanding the range of options for how AI can be used, built, and deployed locally.
The company further states that performance in MoE models like Llama 4 Scout correlates with the number of active parameters, while dense models depend on the total parameter count.
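The relationship between active parameters and speed can be illustrated with a rough, bandwidth-bound estimate. The sketch below assumes that generating each token requires streaming every active weight from memory once; the bandwidth and bits-per-weight values are placeholders chosen for illustration, not figures from AMD's announcement.

```python
def decode_tokens_per_second(active_params_billion, bits_per_weight, bandwidth_gb_s):
    """Rough upper bound on decode speed when memory bandwidth is the bottleneck."""
    # Bytes of weights that must be read from memory to produce one token.
    bytes_per_token = active_params_billion * 1e9 * bits_per_weight / 8
    return bandwidth_gb_s * 1e9 / bytes_per_token


# Assumed placeholders, not values taken from the article.
ASSUMED_BANDWIDTH_GB_S = 256.0
ASSUMED_BITS_PER_WEIGHT = 4.5

# MoE: only 17 billion of Llama 4 Scout's 109 billion parameters are active per token.
moe_rate = decode_tokens_per_second(17, ASSUMED_BITS_PER_WEIGHT, ASSUMED_BANDWIDTH_GB_S)
# A hypothetical dense model of the same total size reads all 109 billion.
dense_rate = decode_tokens_per_second(109, ASSUMED_BITS_PER_WEIGHT, ASSUMED_BANDWIDTH_GB_S)

print(f"MoE (17B active): up to {moe_rate:.1f} tokens/s")
print(f"Dense (109B):     up to {dense_rate:.1f} tokens/s")
print(f"Ratio:            {moe_rate / dense_rate:.1f}x")
```

Under these assumptions the ratio between the two rates is simply 109/17, whatever bandwidth is plugged in. Real throughput will be lower than this bound because of compute, cache and scheduling overheads.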
The memory capacity of the AMD Ryzen AI Max+ platform allows users to opt for higher-precision models, supporting up to 16-bit models through llama.cpp when trade-offs between quality and performance are warranted.
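The trade-off between precision and memory comes down to simple arithmetic. The sketch below estimates weight storage only, using nominal bit widths; real GGUF quantisation formats typically use slightly more bits per weight, and the KV cache and runtime buffers need additional room, so the results should be read as a lower bound.

```python
VGM_GB = 96  # Variable Graphics Memory available on the 128GB machine, per the article


def weight_footprint_gb(params_billion, bits_per_weight):
    """Approximate size of the model weights alone, in GB (10^9 bytes)."""
    # params_billion * 1e9 weights * (bits / 8) bytes each, divided by 1e9 bytes per GB.
    return params_billion * bits_per_weight / 8


# Parameter counts mentioned in the article; bit widths are nominal.
for name, params in [("Llama 4 Scout (109B)", 109), ("128B model", 128)]:
    for bits in (4, 8, 16):
        size = weight_footprint_gb(params, bits)
        verdict = "fits" if size <= VGM_GB else "does not fit"
        print(f"{name} at {bits}-bit: {size:.1f} GB, {verdict} in {VGM_GB} GB")
```

By the same arithmetic, a 16-bit model fits within 96GB only if it has roughly 48 billion parameters or fewer, before accounting for context.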
Context and workflow
AMD also highlights the importance of context size when working with LLMs. The AMD Ryzen AI Max+ 395 (128GB), equipped with the new driver, can run Meta's Llama 4 Scout at a context length of 256,000 tokens (with Flash Attention on and the KV cache quantised to Q8), significantly exceeding the 4,096-token default window in many applications.
Examples provided include demonstrations where an LLM summarises extensive documents, such as an SEC EDGAR filing, requiring over 19,000 tokens to be held in context. Another example is the summarisation of a research paper from the arXiv database, which needed more than 21,000 tokens from query initiation to final output. AMD notes that more complex workflows might require even greater context capacity, particularly for multi-tool and agentic scenarios.
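To see why long contexts are memory-hungry, and why quantising the KV cache helps, the following sketch uses the standard estimate of two tensors (keys and values) per layer. The layer count, KV head count and head dimension are illustrative placeholders, not confirmed Llama 4 Scout specifications, and treating Q8 as exactly one byte per element ignores the small overhead of quantisation scales.

```python
def kv_cache_gb(n_layers, n_kv_heads, head_dim, context_tokens, bytes_per_element):
    """Approximate KV cache size in GB (10^9 bytes) for a full-attention transformer."""
    # Factor of 2: one key tensor and one value tensor are stored per layer.
    total_bytes = (
        2 * n_layers * n_kv_heads * head_dim * context_tokens * bytes_per_element
    )
    return total_bytes / 1e9


# Hypothetical model shape, chosen only for illustration.
ASSUMED_SHAPE = dict(n_layers=48, n_kv_heads=8, head_dim=128)

for context in (4_096, 32_000, 256_000):
    q8 = kv_cache_gb(**ASSUMED_SHAPE, context_tokens=context, bytes_per_element=1)
    f16 = kv_cache_gb(**ASSUMED_SHAPE, context_tokens=context, bytes_per_element=2)
    print(f"{context:>7} tokens: about {q8:.1f} GB at Q8, {f16:.1f} GB at 16-bit")
```

The cache grows linearly with context length, so moving from a 4,096-token window to 256,000 tokens multiplies its size by 62.5, and storing it at 8-bit rather than 16-bit halves it.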
AMD states that while occasional users may manage with a context length of 32,000 tokens and a lightweight model, more demanding use cases will benefit from hardware and software that support expansive contexts, as offered by the AMD Ryzen AI Max+ 395 128GB.
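As a simple illustration of this budgeting, the sketch below checks the token counts quoted in the examples against the context lengths mentioned in the article. The function name and the optional output reserve are assumptions made for the example.

```python
def fits_in_context(required_tokens, context_length, reserve_for_output=0):
    """Return True if the workload fits in the context window with room to spare."""
    return required_tokens + reserve_for_output <= context_length


# Token counts quoted for AMD's examples (both stated as lower bounds).
workloads = {"SEC EDGAR filing summary": 19_000, "arXiv paper summary": 21_000}
# Context lengths mentioned in the article.
context_lengths = (4_096, 32_000, 256_000)

for name, tokens in workloads.items():
    for length in context_lengths:
        verdict = "fits" if fits_in_context(tokens, length) else "too large"
        print(f"{name} ({tokens} tokens) in a {length}-token window: {verdict}")
```

Neither example fits in a 4,096-token window, while both fit within 32,000 tokens, which is consistent with AMD's suggestion that 32,000 tokens is a workable baseline for lighter use.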
Looking ahead, AMD points to an expanding set of agentic workflows as LLMs and AI agents become more widely adopted for local inferencing. Industry trends indicate that model developers, including Meta, Google, and Mistral, are increasingly integrating tool-calling capabilities into their training runs to facilitate local personal assistant use cases.
AMD also provides guidance on maintaining caution when enabling tool access for large language models, noting the potential for unpredictable system behaviour and outcomes. Users are advised to install LLM implementations only from trusted sources.
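One common way to apply this kind of caution is to gate every tool call behind an explicit allowlist, with user confirmation for sensitive tools. The sketch below is a generic pattern rather than AMD's guidance or any particular runtime's API; all tool names and function names are hypothetical.

```python
# Hypothetical tool names; a real deployment would list its own.
ALLOWED_TOOLS = {"search_docs", "read_file"}
NEEDS_CONFIRMATION = {"read_file"}


def authorise_tool_call(tool_name, confirm=lambda name: False):
    """Decide whether a tool call requested by the model may run.

    `confirm` is a callback that asks the user for approval; by default
    it refuses, so sensitive tools stay blocked unless a prompt is wired in.
    """
    if tool_name not in ALLOWED_TOOLS:
        return False  # Anything not explicitly allowed is rejected.
    if tool_name in NEEDS_CONFIRMATION:
        return bool(confirm(tool_name))
    return True


for requested in ("search_docs", "read_file", "delete_files"):
    decision = "allowed" if authorise_tool_call(requested) else "blocked"
    print(f"{requested}: {decision}")
```

Denying by default means that a model requesting an unexpected tool simply has the call refused, rather than producing unpredictable effects on the system.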
The AMD Ryzen AI Max+ 395 (128GB) is now positioned to support most models available through llama.cpp and other tools, offering flexible deployment and model selection options for users with high-performance local AI requirements.

Techday NZ
4 days ago