The Rise And Rise Of Reinforcement Learning: AI's Quiet Revolution

Forbes, 19-04-2025

A quiet revolution is reshaping artificial intelligence, and it's not the flashy one grabbing headlines. While chatbots and image generators dazzle, reinforcement learning, a method refined in academia over the past two decades, is powering the next generation of AI breakthroughs. Imagine a child learning to ride a bike: no manual, just trial, error, and the joy of balance. That's reinforcement learning: an algorithm that explores, adjusts, and learns from feedback, akin to an Easter egg hunt guided by 'warmer' or 'colder' hints. This approach isn't just changing how machines learn; it's redefining what intelligence means.
To grasp Reinforcement Learning's ascent, let's first look at the two pillars of traditional machine learning, supervised and unsupervised learning.
Both methods shine in their domains and are often used in combination, yet they falter where data is scarce or goals are vague. That's where Reinforcement Learning can help.
Reinforcement learning learns by doing, guided only by rewards or penalties from its environment. It's less about following a script and more about figuring things out. In 2015, Nature published a paper in which Google researchers demonstrated how an 'agent' trained with reinforcement learning mastered Atari games using just screen pixels and the scoreboard. Through countless trials, it learned to win at Space Invaders, Q*bert, Crazy Climber and dozens of other games, often with moves that stunned human players. A year later, in research also published in Nature, Google used similar techniques to topple the world's Go champion, a milestone once thought to be decades away. Reinforcement Learning thrives where explicit instructions don't exist. It doesn't need a mountain of labeled data, just a goal and a way to measure success.
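To make "learning by doing" concrete, here is a minimal sketch of tabular Q-learning, one classic reinforcement learning method. It is not the system used in the Atari or Go research; the six-cell corridor environment, the reward of 1.0 at the goal, and all names and parameter values below are assumptions chosen purely for illustration.

```python
import random

N_STATES = 6        # hypothetical corridor with cells 0..5; cell 5 is the goal
ACTIONS = (-1, +1)  # step left or step right


def step(state, action):
    """Toy environment: move, stay inside the corridor, reward only at the goal."""
    next_state = min(max(state + action, 0), N_STATES - 1)
    done = next_state == N_STATES - 1
    reward = 1.0 if done else 0.0
    return next_state, reward, done


def train(episodes=500, alpha=0.5, gamma=0.9, epsilon=0.2, seed=0):
    """Learn action values by trial and error, using only the reward signal."""
    rng = random.Random(seed)
    q = [[0.0, 0.0] for _ in range(N_STATES)]  # q[state][action_index]
    for _ in range(episodes):
        state, done = 0, False
        while not done:
            if rng.random() < epsilon:
                a = rng.randrange(2)  # explore: try a random action
            else:
                best = max(q[state])  # exploit: best known action, random tie-break
                a = rng.choice([i for i, v in enumerate(q[state]) if v == best])
            next_state, reward, done = step(state, ACTIONS[a])
            # Q-learning update: nudge the estimate toward reward + discounted future value
            target = reward + (0.0 if done else gamma * max(q[next_state]))
            q[state][a] += alpha * (target - q[state][a])
            state = next_state
    return q


def greedy_policy(q):
    """Best action in each non-goal cell according to the learned values."""
    return [ACTIONS[row.index(max(row))] for row in q[:-1]]


if __name__ == "__main__":
    print(greedy_policy(train()))
```

The agent is never told that "right" is the correct direction. It starts by wandering, and the single reward at the goal gradually propagates backward through the value table until stepping right looks best in every cell.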
Reinforcement Learning's edge lies in its efficiency and ingenuity:
While OpenAI, the creator of ChatGPT, remains a private company, NVIDIA has become the public face of the generative AI boom. This chipmaker's value surged from $200 billion to over $2 trillion in just two years. Many believed that advanced hardware, like that produced by NVIDIA, was essential for the massive data centers powering AI solutions from giants like OpenAI, Meta, Google, and Microsoft. NVIDIA's relationship with ChatGPT has been compared to the iconic "Wintel" partnership between Intel and Microsoft during the rise of Windows.
However, in January 2025, DeepSeek unveiled a new Large Language Model trained using Reinforcement Learning. This model rivals ChatGPT's performance while requiring significantly less computational power. The announcement hit NVIDIA hard, causing its stock to drop more than 10% and temporarily erasing over $500 billion in value. Investors began to see that advanced AI might not always depend on such resource-intensive hardware.
DeepSeek's research quickly gained traction. Their paper, 'DeepSeek-R1: Incentivizing Reasoning Capability in LLMs via Reinforcement Learning,' has been cited over 500 times, making it the most referenced reinforcement learning study of 2025. The work highlights how reinforcement learning can achieve high performance without relying on excessive computing resources.
Reinforcement Learning's story isn't just technical but also philosophical. Its trial-and-error mimics human learning, prompting big questions. If machines can replicate this, what defines intelligence? If they spot patterns we can't, what might we learn about our world?
Andrew Ng, an AI luminary and educator, touched on this in a chat with Toby Walsh at UNSW Sydney. Reflecting on his 2002 PhD thesis, Ng said, 'My PhD thesis was on reinforcement learning… and my team worked on a robot.' His early bets are paying off today.
Reinforcement Learning's potential is vast: think more efficient energy grids, tailored education, or smarter robotics. But its autonomy demands caution and careful thought about the incentives used to train the models. An agent tasked with easing traffic might reroute cars through quiet streets, trading efficiency for disruption. Transparency and ethics will be key. Done right, though, Reinforcement Learning could usher in an era where machines don't just mimic us but illuminate new paths forward.
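The traffic example can be sketched in a few lines. Everything here is hypothetical: the two routes, their travel times, the number of cars pushed onto residential streets, and the penalty weight are invented numbers meant only to show how the choice of reward changes what an agent prefers.

```python
# Hypothetical routes: (name, average travel time in minutes,
#                       cars sent down residential streets per hour)
ROUTES = [
    ("main_road", 14.0, 0),
    ("side_streets", 11.0, 120),
]


def reward_time_only(minutes, residential_cars):
    """Reward that only cares about speed."""
    return -minutes


def reward_with_disruption(minutes, residential_cars, penalty_per_car=0.05):
    """Reward that also charges a (made-up) cost for each car on a quiet street."""
    return -minutes - penalty_per_car * residential_cars


def best_route(reward_fn):
    """Route an agent maximizing the given reward would pick."""
    return max(ROUTES, key=lambda route: reward_fn(route[1], route[2]))[0]


if __name__ == "__main__":
    print(best_route(reward_time_only))
    print(best_route(reward_with_disruption))
```

With the speed-only reward the agent favors the side streets; once disruption carries a cost, the main road wins. The agent is behaving sensibly in both cases, which is why the incentive itself deserves the scrutiny.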
Reinforcement Learning isn't a footnote in AI's story; it's a pivot. The hunt for smarter, leaner intelligence is on, and reinforcement learning is leading the charge.


