
Score! RTX 5070 Ti OLED gaming laptop is $450 off for a limited time
The best deal I've seen so far is this Lenovo Legion Pro 7i with an RTX 5070 Ti for $2,399 at B&H, which knocks $450 off the asking price for this high-end gaming laptop packing one of the newest Nvidia GeForce RTX 50-series GPUs you can get.
This Lenovo Legion Pro 7i is a cutting-edge gaming laptop thanks to its Nvidia GeForce RTX 5070 Ti GPU, its Intel Core Ultra 9 275HX CPU, 32GB of RAM and 2TB of storage. That's more than enough power to make all your favorite games run great on the 16-inch 1600p 240Hz OLED display.
The Nvidia GeForce RTX 5070 Ti hit the market just a few months ago, and it looks to be the ideal value offering in the RTX 50-series lineup right now. And while it's not the highest-end 50-series card, it offers more than enough muscle to run even the best PC games well on this machine.
Plus, the laptop itself is a well-designed 16-inch gaming notebook that's equally good for gaming or productivity work. If you read our Lenovo Legion Pro 7i review, you can see how thin and elegant it is in person, along with shots of the plentiful port array and test results, which prove why it ranks among the best gaming laptops on the market.
That 16-inch (2560 x 1600) 240Hz OLED display looks lovely to boot, and it will make all your favorite games and movies look fantastic—and since it supports HDR and Dolby Vision, you can enjoy your media to the fullest.
Of course, we haven't had a chance to test this RTX 5070 Ti version yet, but it's sure to outperform its predecessors and run games well thanks to the power of Nvidia's latest laptop GPUs. Factor in the 32GB of DDR5 RAM and 2TB of SSD storage, and you see why you don't have to stress about this laptop running out of RAM or room for your favorite games anytime soon.
With Wi-Fi 7 and a full, comfy keyboard, you can cart this beast to the coffee shop when you want to work, and when you're done, you can lug it back to the living room and play PC games on your big screen via the HDMI 2.1, Thunderbolt 4 or USB-C ports. You also get USB-A and RJ-45 Ethernet ports, so you can count on being able to plug in old accessories and jack into wired Internet when gaming online.
Admittedly, this is a hefty beast that weighs over six pounds, so you'll probably want to keep it on your desk or coffee table most of the time. But that's true of most gaming laptops, and for my money, this is the best deal on an RTX 50-series machine I've seen all month.
Related Articles


CNBC
12 minutes ago
Asia markets set for a mixed open as U.S.-China tariff truce deadline looms
Asia-Pacific markets were set to open mixed as traders awaited an announcement on whether the Aug. 12 deadline for the U.S.-China tariff truce would be extended.

Happy Monday from Singapore. Asia markets were poised for a mixed open. Australia's S&P/ASX 200 was set to start the day lower, with futures tied to the benchmark at 8,768 compared with the index's last close of 8,807.1. Futures for Hong Kong's Hang Seng Index stood at 24,937, pointing to a higher open compared with the HSI's last close of 24,858.82. Japan's markets are closed for a holiday. — Lee Ying Shan

Chip giant Nvidia pushed back Sunday against allegations from Chinese state media that its H20 artificial intelligence chips are a national security risk for China. Earlier in the day, Reuters reported that Yuyuan Tantian, an account affiliated with Chinese state broadcaster CCTV, said in an article published on WeChat that the Nvidia H20 chips are not technologically advanced or environmentally friendly. "When a type of chip is neither environmentally friendly, nor advanced, nor safe, as consumers, we certainly have the option not to buy it," the article reportedly said, adding that the chips could achieve functions including "remote shutdown" through a hardware "backdoor." In response, an Nvidia spokesperson told CNBC that "cybersecurity is critically important to us. NVIDIA does not have 'backdoors' in our chips that would give anyone a remote way to access or control them." Nvidia rejected similar accusations on Tuesday. For more, read here. — Pia Singh


Forbes
12 minutes ago
Will Reinforcement Learning Take Us To AGI?
There aren't many truly new ideas in artificial intelligence. More often, breakthroughs in AI happen when concepts that have existed for years suddenly take on new power because underlying technology inputs—in particular, raw computing power—finally catch up to unlock those concepts' full potential.

Famously, Geoff Hinton and a small group of collaborators devoted themselves tirelessly to neural networks starting in the early 1970s. For decades, the technology didn't really work and the outside world paid little attention. It was not until the early 2010s—thanks to the arrival of sufficiently powerful Nvidia GPUs and internet-scale training data—that the potential of neural networks was finally unleashed for all to see. In 2024, more than half a century after he began working on neural networks, Hinton was awarded the Nobel Prize for pioneering the field of modern AI.

Reinforcement learning has followed a similar arc. Richard Sutton and Andrew Barto, the fathers of modern reinforcement learning, laid down the foundations of the field starting in the 1970s. Even before Sutton and Barto began their work, the basic principles underlying reinforcement learning—in short, learning by trial and error based on positive and negative feedback—had been developed by behavioral psychologists and animal researchers going back to the early twentieth century.

Yet in just the past year, advances in reinforcement learning (RL) have taken on newfound importance and urgency in the world of AI. It has become increasingly clear that the next leap in AI capabilities will be driven by RL. If artificial general intelligence (AGI) is in fact around the corner, reinforcement learning will play a central role in getting us there. Just a few years ago, when ChatGPT's launch ushered in the era of generative AI, almost no one would have predicted this.

Deep questions remain unanswered about reinforcement learning's capabilities and its limits. No field in AI is moving more quickly today than RL. It has never been more important to understand this technology, its history and its future.

Reinforcement Learning 101

The basic principles of reinforcement learning have remained consistent since Sutton and Barto established the field in the 1970s. The essence of RL is to learn by interacting with the world and seeing what happens. It is a universal and foundational form of learning; every human and animal does it.

In the context of artificial intelligence, a reinforcement learning system consists of an agent interacting with an environment. RL agents are not given direct instructions or answers by humans; instead, they learn through trial and error. When an agent takes an action in an environment, it receives a reward signal from the environment, indicating that the action produced either a positive or a negative outcome. The agent's goal is to adjust its behavior to maximize positive rewards and minimize negative rewards over time.

How does the agent decide which actions to take? Every agent acts according to a policy, which can be understood as the formula or calculus that determines the agent's action based on the particular state of the environment. A policy can be a simple set of rules, or even pure randomness, or it can be represented by a far more complex system, like a deep neural network.

One final concept that is important to understand in RL, closely related to the reward signal, is the value function.
The value function is the agent's estimate of how favorable a given state of the environment will be (that is, how many positive and negative rewards it will lead to) over the long run. Whereas reward signals are immediate pieces of feedback that come from the environment based on current conditions, the value function is the agent's own learned estimate of how things will play out in the long term. The entire purpose of value functions is to estimate reward signals, but unlike reward signals, value functions enable agents to reason and plan over longer time horizons. For instance, value functions can incentivize actions even when they lead to negative near-term rewards because the long-term benefit is estimated to be worth it. When RL agents learn, they do so in one of three ways: by updating their policy, updating their value function, or updating both together.

A brief example will help make these concepts concrete. Imagine applying reinforcement learning to the game of chess. In this case, the agent is an AI chess player. The environment is the chess board, with any given configuration of chess pieces representing a state of that environment. The agent's policy is the function (whether a simple set of rules, or a decision tree, or a neural network, or something else) that determines which move to make based on the current board state. The reward signal is simple: positive when the agent wins a game, negative when it loses a game. The agent's value function is its learned estimate of how favorable or unfavorable any given board position is—that is, how likely the position is to lead to a win or a loss. As the agent plays more games, strategies that lead to wins will be positively reinforced and strategies that lead to losses will be negatively reinforced via updates to the agent's policy and value function. Gradually, the AI system will become a stronger chess player.

In the twenty-first century, one organization has championed and advanced the field of reinforcement learning more than any other: DeepMind. Founded in 2010 as a startup devoted to solving artificial intelligence and then acquired by Google in 2014 for ~$600 million, London-based DeepMind made a big early bet on reinforcement learning as the most promising path forward in AI. And the bet paid off.

The second half of the 2010s were triumphant years for the field of reinforcement learning. In 2016, DeepMind's AlphaGo became the first AI system to defeat a human world champion at the ancient Chinese game of Go, a feat that many AI experts had believed was impossible. In 2017, DeepMind debuted AlphaZero, which taught itself Go, chess and Japanese chess entirely via self-play and bested every other AI and human competitor in those games. And in 2019, DeepMind unveiled AlphaStar, which mastered the video game StarCraft—an even more complex environment than Go given the vast action space, imperfect information, numerous agents and real-time gameplay. AlphaGo, AlphaZero, AlphaStar—reinforcement learning powered each of these landmark achievements.

As the 2010s drew to a close, RL seemed poised to dominate the coming generation of artificial intelligence breakthroughs, with DeepMind leading the way. But that's not what happened. Right around that time, a new AI paradigm unexpectedly burst into the spotlight: self-supervised learning for autoregressive language models. In 2019, a small nonprofit research lab named OpenAI released a model named GPT-2 that demonstrated surprisingly powerful general-purpose language capabilities.
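As a brief aside before the story continues: the vocabulary above (agent, environment, state, action, reward, policy, value function) can be made concrete in a few lines of code. The toy sketch below uses generic tabular Q-learning on a five-square corridor rather than chess, which is far too large for a table; every name and number is illustrative and not taken from the article or from any DeepMind system.

```python
# Illustrative sketch only: tabular Q-learning on a toy 5-square corridor.
# Reaching the rightmost square earns +1; falling back to square 0 earns -1.
import random

N_STATES = 5          # states 0..4; state 4 is the goal, state 0 is failure
ACTIONS = [0, 1]      # 0 = step left, 1 = step right
EPSILON = 0.1         # exploration rate: how often the policy tries a random action
ALPHA = 0.5           # learning rate for value updates
GAMMA = 0.9           # discount factor for long-run value

# The value function here is an action-value table Q[state][action]: the agent's
# learned estimate of long-run reward for each action in each state.
Q = [[0.0, 0.0] for _ in range(N_STATES)]

def step(state, action):
    """Environment dynamics: return (next_state, reward, done)."""
    nxt = state + 1 if action == 1 else state - 1
    if nxt >= N_STATES - 1:
        return N_STATES - 1, 1.0, True    # reached the goal: positive reward
    if nxt <= 0:
        return 0, -1.0, True              # fell back to the start: negative reward
    return nxt, 0.0, False                # otherwise, no immediate reward

def policy(state):
    """Epsilon-greedy policy: usually pick the highest-value action, sometimes explore."""
    if random.random() < EPSILON:
        return random.choice(ACTIONS)
    return max(ACTIONS, key=lambda a: Q[state][a])

for episode in range(500):
    state, done = 2, False                # each episode starts in the middle of the corridor
    while not done:
        action = policy(state)
        nxt, reward, done = step(state, action)
        # Temporal-difference update: nudge the value estimate toward the observed
        # reward plus the discounted value of the next state.
        target = reward + (0.0 if done else GAMMA * max(Q[nxt]))
        Q[state][action] += ALPHA * (target - Q[state][action])
        state = nxt

print(Q)  # after training, "step right" has the higher value in every interior state
```

Here the learned table Q plays the role of the value function, the epsilon-greedy rule plays the role of the policy, and the step function stands in for the environment.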
The following summer, OpenAI debuted GPT-3, whose astonishing abilities represented a massive leap in performance from GPT-2 and took the AI world by storm. In 2022 came ChatGPT. In short order, every AI organization in the world reoriented its research focus to prioritize large language models and generative AI.

These large language models (LLMs) were based on the transformer architecture and made possible by a strategy of aggressive scaling. They were trained on unlabeled datasets that were bigger than any previous AI training data corpus—essentially the entire internet—and were scaled up to unprecedented model sizes. (GPT-2 was considered mind-bogglingly large at 1.5 billion parameters; one year later, GPT-3 debuted at 175 billion parameters.)

Reinforcement learning fell out of fashion for half a decade. A widely repeated narrative during the early 2020s was that DeepMind had seriously misread technology trends by committing itself to reinforcement learning and missing the boat on generative AI.

Yet today, reinforcement learning has reemerged as the hottest field within AI. What happened? In short, AI researchers discovered that applying reinforcement learning to generative AI models was a killer combination. Starting with a base LLM and then applying reinforcement learning on top of it meant that, for the first time, RL could natively operate with the gift of language and broad knowledge about the world. Pretrained foundation models represented a powerful base on which RL could work its magic. The results have been dazzling—and we are just getting started.

RL Meets LLMs

What does it mean, exactly, to combine reinforcement learning with large language models? A key insight to start with is that the core concepts of RL can be mapped directly and elegantly to the world of LLMs. In this mapping, the LLM itself is the agent. The environment is the full digital context in which the LLM is operating, including the prompts it is presented with, its context window, and any tools and external information it has access to. The model's weights represent the policy: they determine how the agent acts when presented with any particular state of the environment. Acting, in this context, means generating tokens.

What about the reward signal and the value function? Defining a reward signal for LLMs is where things get interesting and complicated. It is this topic, more than any other, that will determine how far RL can take us on the path to superintelligence.

The first major application of RL to LLMs was reinforcement learning from human feedback, or RLHF. The frontiers of AI research have since advanced to more cutting-edge methods of combining RL and LLMs, but RLHF represents an important step on the journey, and it provides a concrete illustration of the concept of reward signals for LLMs.

RLHF was invented by DeepMind and OpenAI researchers back in 2017. (As a side note, given today's competitive and closed research environment, it is remarkable to remember that OpenAI and DeepMind used to conduct and publish foundational research together.) RLHF's true coming-out party, though, was ChatGPT. When ChatGPT debuted in November 2022, the underlying AI model on which it was based was not new; it had already been publicly available for many months. The reason that ChatGPT became an overnight success was that it was approachable, easy to talk to, helpful and good at following directions. The technology that made this possible was RLHF.
In a nutshell, RLHF is a method to adapt LLMs' style and tone to be consistent with human-expressed preferences, whatever those preferences may be. RLHF is most often used to make LLMs 'helpful, harmless and honest,' but it can equally be used to make them more flirtatious, or rude, or sarcastic, or progressive, or conservative.

How does RLHF work? The key ingredient in RLHF is 'preference data' generated by human subjects. Specifically, humans are asked to consider two responses from the model for a given prompt and to select which one of the two responses they prefer. This pairwise preference data is used to train a separate model, known as the reward model, which learns to produce a numerical rating of how desirable or undesirable any given output from the main model is.

This is where RL comes in. Now that we have a reward signal, an RL algorithm can be used to fine-tune the main model—in other words, the RL agent—so that it generates responses that maximize the reward model's scores. In this way, the main model comes to incorporate the style and values reflected in the human-generated preference data.

Circling back to reward signals and LLMs: in the case of RLHF, as we have seen, the reward signal comes directly from humans and human-generated preference data, which is then distilled into a reward model. What if we want to use RL to give LLMs powerful new capabilities beyond simply adhering to human preferences?

The Next Frontier

The most important development in AI over the past year has been language models' improved ability to engage in reasoning. What exactly does it mean for an AI model to 'reason'? Unlike first-generation LLMs, which respond to prompts using next-token prediction with no planning or reflection, reasoning models spend time thinking before producing a response. These models think by generating 'chains of thought,' enabling them to systematically break down a given task into smaller steps and then work through each step in order to arrive at a well-thought-through answer. They also know how and when to use external tools—like a calculator, a code interpreter or the internet—to help solve problems.

The world's first reasoning model, OpenAI's o1, debuted less than a year ago. A few months later, China-based DeepSeek captured world headlines when it released its own reasoning model, R1, that was near parity with o1, fully open and trained using far less compute.

The secret sauce that gives AI models the ability to reason is reinforcement learning—specifically, an approach to RL known as reinforcement learning from verifiable rewards (RLVR). Like RLHF, RLVR entails taking a base model and fine-tuning it using RL. But the source of the reward signal, and therefore the types of new capabilities that the AI gains, are quite different.

As its name suggests, RLVR improves AI models by training them on problems whose answers can be objectively verified—most commonly, math or coding tasks. First, a model is presented with such a task—say, a challenging math problem—and prompted to generate a chain of thought in order to solve the problem. The final answer that the model produces is then formally determined to be either correct or incorrect. (If it's a math question, the final answer can be run through a calculator or a more complex symbolic math engine; if it's a coding task, the model's code can be executed in a sandboxed environment.)
Because we now have a reward signal—positive if the final answer is correct, negative if it is incorrect—RL can be used to positively reinforce the types of chains of thought that lead to correct answers and to discourage those that lead to incorrect answers. The end result is a model that is far more effective at reasoning: that is, at accurately working through complex multi-step problems and landing on the correct solution. This new generation of reasoning models has demonstrated astonishing capabilities in math competitions like the International Math Olympiad and on logical tests like the ARC-AGI benchmark.

So—is AGI right around the corner? Not necessarily. A few big-picture questions about reinforcement learning and language models remain unanswered and loom large. These questions inspire lively debate and widely varying opinions in the world of artificial intelligence today. Their answers will determine how powerful AI gets in the coming months.

A Few Big Unanswered Questions

Today's cutting-edge RL methods rely on problems whose answers can be objectively verified as either right or wrong. Unsurprisingly, then, RL has proven exceptional at producing AI systems that are world-class at math, coding, logic puzzles and standardized tests. But what about the many problems in the world that don't have easily verifiable answers?

In a provocative essay titled 'The Problem With Reasoners', Aidan McLaughlin elegantly articulates this point: 'Remember that reasoning models use RL, RL works best in domains with clear/frequent reward, and most domains lack clear/frequent reward.' McLaughlin argues that most domains that humans actually care about are not easily verifiable, and we will therefore have little success using RL to make AI superhuman at them: for instance, giving career advice, managing a team, understanding social trends, writing original poetry, investing in startups.

A few counterarguments to this critique are worth considering. The first centers on the concepts of transfer learning and generalizability. Transfer learning is the idea that models trained in one area can transfer those learnings to improve in other areas. Proponents of transfer learning in RL argue that, even if reasoning models are trained only on math and coding problems, this will endow them with broad-based reasoning skills that will generalize beyond those domains and enhance their ability to tackle all sorts of cognitive tasks.

'Learning to think in a structured way, breaking topics down into smaller subtopics, understanding cause and effect, tracing the connections between different ideas—these skills should be broadly helpful across problem spaces,' said Dhruv Batra, cofounder/chief scientist at Yutori and former senior AI researcher at Meta. 'This is not so different from how we approach education for humans: we teach kids basic numeracy and literacy in the hopes of creating a generally well-informed and well-reasoning population.'

Put more strongly: if you can solve math, you can solve anything. Anything that can be done with a computer, after all, ultimately boils down to math. It is an intriguing hypothesis. But to date, there is no conclusive evidence that RL endows LLMs with reasoning capabilities that generalize beyond easily verifiable domains like math and coding. It is no coincidence that the most important advances in AI in recent months—both from a research and a commercial perspective—have occurred in precisely these two fields.
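To make 'easily verifiable' concrete, here is a minimal sketch of the kind of reward signal RLVR relies on for math and coding tasks as described above: +1 when the final answer checks out, -1 otherwise. The task format and helper names are assumptions for illustration, not any lab's actual training stack.

```python
# Illustrative verifiable-reward functions for math and coding tasks.
import subprocess
import sys

def math_reward(model_answer: str, reference_answer: str) -> float:
    """Reward a math task by comparing the model's final answer to a known-correct one."""
    try:
        return 1.0 if float(model_answer.strip()) == float(reference_answer) else -1.0
    except ValueError:
        return -1.0  # unparseable answers are treated as incorrect

def code_reward(model_code: str, test_snippet: str, timeout_s: int = 5) -> float:
    """Reward a coding task by running the model's code plus unit tests in a subprocess.
    (A real system would use a much stronger sandbox than a bare subprocess.)"""
    program = model_code + "\n" + test_snippet
    try:
        result = subprocess.run([sys.executable, "-c", program],
                                capture_output=True, timeout=timeout_s)
        return 1.0 if result.returncode == 0 else -1.0
    except subprocess.TimeoutExpired:
        return -1.0

print(math_reward("42", "42"))                               # 1.0
print(code_reward("def add(a, b):\n    return a + b",
                  "assert add(2, 2) == 4"))                  # 1.0
```

In a real pipeline, rewards like these would feed an RL update so that chains of thought ending in correct answers become more likely, which is exactly the dynamic the article describes.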
If RL can only give AI models superhuman powers in domains that can be easily verified, this represents a serious limit to how far RL can advance the frontiers of AI's capabilities. AI systems that can write code or do mathematics as well as or better than humans are undoubtedly valuable. But true general-purpose intelligence consists of much more than this.

Let us consider another counterpoint on this topic, though: what if verification systems can in fact be built for many (or even all) domains, even when those domains are not as clearly deterministic and checkable as a math problem? Might it be possible to develop a verification system that can reliably determine whether a novel, or a government policy, or a piece of career advice, is 'good' or 'successful' and therefore should be positively reinforced?

This line of thinking quickly leads us into borderline philosophical considerations. In many fields, determining the 'goodness' or 'badness' of a given outcome would seem to involve value judgments that are irreducibly subjective, whether on ethical or aesthetic grounds. For instance, is it possible to determine that one public policy outcome (say, reducing the federal deficit) is objectively superior to another (say, expanding a certain social welfare program)? Is it possible to objectively identify that a painting or a poem is or is not 'good'? What makes art 'good'? Is beauty not, after all, in the eye of the beholder? Certain domains simply do not possess a 'ground truth' to learn from, but rather only differing values and tradeoffs to be weighed.

Even in such domains, though, another possible approach exists. What if we could train an AI via many examples to instinctively identify 'good' and 'bad' outcomes, even if we can't formally define them, and then have that AI serve as our verifier? As Julien Launay, CEO/cofounder of RL startup Adaptive ML, put it: 'In bridging the gap from verifiable to non-verifiable domains, we are essentially looking for a compiler for natural language…but we already have built this compiler: that's what large language models are.'

This approach is often referred to as reinforcement learning from AI feedback (RLAIF) or 'LLM-as-a-Judge.' Some researchers believe it is the key to making verification possible across more domains. But it is not clear how far LLM-as-a-Judge can take us. The reason that reinforcement learning from verifiable rewards has led to such incisive reasoning capabilities in LLMs in the first place is that it relies on formal verification methods: correct and incorrect answers exist to be discovered and learned. LLM-as-a-Judge seems to bring us back to a regime more closely resembling RLHF, whereby AI models can be fine-tuned to internalize whatever preferences and value judgments are contained in the training data, arbitrary though they may be. This merely punts the problem of verifying subjective domains to the training data, where it may remain as unsolvable as ever.

We can say this much for sure: to date, neither OpenAI nor Anthropic nor any other frontier lab has debuted an RL-based system with superhuman capabilities in writing novels, or advising governments, or starting companies, or any other activity that lacks obvious verifiability. This doesn't mean that the frontier labs are not making progress on the problem. Indeed, just last month, leading OpenAI researcher Noam Brown shared on X: 'We developed new techniques that make LLMs a lot better at hard-to-verify tasks.'
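To illustrate the LLM-as-a-Judge idea in the same style, here is a minimal sketch in which a separate judge model scores an output in a domain with no formal verifier. The judge callable and the 0-10 grading prompt are placeholder assumptions; in practice this would wrap a call to whichever judge model an organization trusts, and the resulting reward is only as reliable as that judge.

```python
# Illustrative LLM-as-a-Judge reward: the "judge" is a placeholder callable,
# standing in for a real call to a separate language model.
from typing import Callable
import re

JUDGE_PROMPT = (
    "You are grading a piece of career advice for usefulness and soundness.\n"
    "Advice:\n{output}\n\n"
    "Reply with a single integer score from 0 (useless) to 10 (excellent)."
)

def judge_reward(output: str, judge: Callable[[str], str]) -> float:
    """Turn a judge model's 0-10 rating into a reward in [-1, 1]."""
    reply = judge(JUDGE_PROMPT.format(output=output))
    match = re.search(r"\d+", reply)
    if not match:
        return -1.0                       # unscorable replies count as failures
    score = min(max(int(match.group()), 0), 10)
    return (score - 5) / 5.0              # map 0..10 onto -1..+1

# Usage with a stand-in judge; a real judge would be an LLM call.
print(judge_reward("Build a portfolio of small projects before applying.",
                   judge=lambda prompt: "8"))   # -> 0.6
```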
Rumors have even begun to circulate that OpenAI has developed a so-called 'universal verifier,' which can provide an accurate reward signal in any domain. It is hard to imagine how such a universal verifier would work; no concrete details have been shared publicly. Time will tell how powerful these new techniques prove to be.

It is important to remember that we are still in the earliest innings of the reinforcement learning era in generative AI. We have just begun to scale RL. The total amount of compute and training data devoted to reinforcement learning remains modest compared to the level of resources spent on pretraining foundation models. This chart from a recent OpenAI presentation speaks volumes:

At this very moment, AI organizations are preparing to deploy vast sums to scale up their reinforcement learning efforts as quickly as they can. As the chart above depicts, RL is about to transition from a relatively minor component of AI training budgets to the main focus.

What does it mean to scale RL? 'Perhaps the most important ingredient when scaling RL is the environments—in other words, the settings in which you unleash the AI to explore and learn,' said Stanford AI researcher Andy Zhang. 'In addition to sheer quantity of environments, we need higher-quality environments, especially as model capabilities improve. This will require thoughtful design and implementation of environments to ensure diversity and goldilocks difficulty and to avoid reward hacking and broken tasks.'

When xAI debuted its new frontier model Grok 4 last month, it announced that it had devoted 'over an order of magnitude more compute' to reinforcement learning than it had with previous models. We have many more orders of magnitude to go.

Today's RL-powered models, while powerful, face shortcomings. The unsolved challenge of difficult-to-verify domains, discussed above, is one. Another critique is known as elicitation: the hypothesis that reinforcement learning doesn't actually endow AI models with greater intelligence but rather just elicits capabilities that the base model already possessed. Yet another obstacle that RL faces is its inherent sample inefficiency compared to other AI paradigms: RL agents must do a tremendous amount of work to receive a single bit of feedback. This 'reward sparsity' has made RL impracticable to deploy in many contexts.

It is possible that scale will be a tidal wave that washes all of these concerns away. If there is one principle that has defined frontier AI in recent years, after all, it is this: nothing matters more than scale. When OpenAI scaled from GPT-2 to GPT-3 to GPT-4 between 2019 and 2023, the models' performance gains and emergent capabilities were astonishing, far exceeding the community's expectations. At every step, skeptics identified shortcomings and failure modes with these models, claiming that they revealed fundamental weaknesses in the technology paradigm and predicting that progress would soon hit a wall. Instead, the next generation of models would blow past these shortcomings, advancing the frontier by leaps and bounds and demonstrating new capabilities that critics had previously argued were impossible.

The world's leading AI players are betting that a similar pattern will play out with reinforcement learning. If recent history is any guide, it is a good bet to make. But it is important to remember that AI 'scaling laws'—which predict that AI performance increases as data, compute and model size increase—are not actually laws in any sense of that word.
They are empirical observations that for a time proved reliable and predictive for pretraining language models and that have been preliminarily demonstrated in other data modalities. There is no formal guarantee that scaling laws will always hold in AI, nor how long they will last, nor how steep their slope will be. The truth is that no one knows for sure what will happen when we massively scale up RL. But we are all about to find out. Stay tuned for our follow-up article on this topic—or feel free to reach out directly to discuss!

Looking Forward

Reinforcement learning represents a compelling approach to building machine intelligence for one profound reason: it is not bound by human competence or imagination. Training an AI model on vast troves of labeled data (supervised learning) will make the model exceptional at understanding those labels, but its knowledge will be limited to the annotated data that humans have prepared. Training an AI model on the entire internet (self-supervised learning) will make the model exceptional at understanding the totality of humanity's existing knowledge, but it is not clear that this will enable it to generate novel insights that go beyond what humans have already put forth.

Reinforcement learning faces no such ceiling. It does not take its cues from existing human data. An RL agent learns for itself, from first principles, through first-hand experience. AlphaGo's 'Move 37' serves as the archetypal example here. In one of its matches against human world champion Lee Sedol, AlphaGo played a move that violated thousands of years of accumulated human wisdom about Go strategy. Most observers assumed it was a miscue. Instead, Move 37 proved to be a brilliant play that gave AlphaGo a decisive advantage over Sedol. The move taught humanity something new about the game of Go. It has forever changed the way that human experts play the game.

The ultimate promise of artificial intelligence is not simply to replicate human intelligence. Rather, it is to unlock new forms of intelligence that are radically different from our own—forms of intelligence that can come up with ideas that we never would have come up with, make discoveries that we never would have made, help us see the world in previously unimaginable ways. We have yet to see a 'Move 37 moment' in the world of generative AI. It may be a matter of weeks or months—or it may never happen. Watch this space.
Yahoo
an hour ago
OpenAI launches two new AI models ahead of GPT-5 - here's everything you need to know
When you buy through links on our articles, Future and its syndication partners may earn a commission.

OpenAI is once again doing side quests in the lead-up to the launch of GPT-5. As we wait for the big update, OpenAI is pausing to bring us not one, but two entirely separate models to play with.

Both of these new models are available to download for free, to anyone with some coding ability, via Hugging Face. They come in two sizes: the larger and more capable gpt-oss-120b, which can run on a single Nvidia GPU, and a smaller model called gpt-oss-20b, which can run on a consumer laptop with 16GB of memory.

These are the first open-weight models OpenAI has launched in years, and the company had been delaying their release for a while. While smaller AI companies like Mistral (maker of Le Chat), DeepSeek and Alibaba frequently release open-weight models, OpenAI has tended to keep its doors closed. Sam Altman, CEO of OpenAI, said at the start of the year that OpenAI felt it was on the wrong side of history here, suggesting the company would get back to launching some open models.

What are open-weight models?

Quite simply, an open-weight model is one where all of its trained parameters (weights) are made publicly available. Developers can access these, analyzing and fine-tuning them for their own projects. In such a competitive market, it seems strange for this to be a thing. And yet it is a very popular option, with some of the most powerful models on the market being open-weight. Of course, GPT-5 won't be, and neither are the top models from the likes of Grok and Claude. But that isn't to say that this new option from OpenAI isn't powerful.

When put through tests, OpenAI's two new models both performed ahead of DeepSeek's R1 and roughly in line with some of OpenAI's other reasoning models. In both models, the full chain of thought can be accessed, making for easier debugging of code and higher trust in the models.

What does this mean for you?

If you're a developer in the AI space, this will be big news for you. OpenAI took a long break from making its weights available to the public, and this release marks a clear shift in its thinking. For everybody else, this won't be of much importance. The big update for the average person will be GPT-5 when that launches in the next week or so. OpenAI did promise a lot of big updates in the next few weeks, with this just being the starter for the main course soon to come.
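For developers who want to try the smaller model locally, a minimal sketch using the Hugging Face transformers pipeline might look like the following. The model IDs are the ones OpenAI published on Hugging Face; the exact arguments depend on your transformers version, installed extras (such as accelerate) and hardware, so treat this as a starting point rather than official usage.

```python
# Illustrative sketch: load the open-weight gpt-oss-20b checkpoint from Hugging Face
# and generate a reply. Requires transformers (and accelerate for device_map="auto").
from transformers import pipeline

generator = pipeline(
    "text-generation",
    model="openai/gpt-oss-20b",   # swap in "openai/gpt-oss-120b" on datacenter-class GPUs
    torch_dtype="auto",           # let the library pick an appropriate precision
    device_map="auto",            # spread the model across available GPU/CPU memory
)

messages = [{"role": "user",
             "content": "Explain what an open-weight model is in one sentence."}]
outputs = generator(messages, max_new_tokens=128)
print(outputs[0]["generated_text"][-1])   # the final (assistant) message in the chat
```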