Latest news with #OpenAIo1


Forbes
5 days ago
- Business
- Forbes
3 Breakthrough Ways Data Is Powering The AI Reasoning Revolution
Olga Megorskaya is Founder & CEO of Toloka AI, a high-quality data partner for all stages of AI development. The buzz around reasoning models like DeepSeek R1, OpenAI o1 and Grok 3 signals a turning point in AI development that pivots on reasoning. When we talk about reasoning, we mean that models can do more than repeat patterns—they think through problems step by step, consider multiple perspectives before giving a final answer and double-check their work. As reasoning skills improve, modern LLMs are pushing us closer to a future where AI agents can autonomously handle all sorts of tasks. AI agents will become useful enough for widespread use when they learn to truly reason, meaning they adapt to new challenges, generalize skills from one area to apply them in a new domain, navigate multiple environments and reliably produce correct answers and outputs. Behind these emerging skills, you'll find sophisticated datasets used for training and evaluating the models. The better the data, the stronger the reasoning skills.

How is data shaping the next generation of reasoning models and agents? As a data partner to frontier labs, we've identified three ways that data drives AI reasoning right now: domain diversity and complexity, refined reasoning and robust evaluations. By building stronger reasoning skills in AI systems, these new approaches to data for training and testing will open the door to the widespread adoption of AI agents.

Current models often train well in structured environments like math and coding, where answer verification is straightforward, fitting nicely into classical reinforcement learning frameworks. But the next leap requires pushing into more complex data across a wider knowledge spectrum, so that models generalize better and transfer learning across areas. Beyond math and coding, the data becoming essential for training the next wave of AI covers multi-step scenarios like web research trajectories with verification checkpoints, as well as open-ended domains such as law or business consulting, where multifaceted answers are difficult to verify but important for advanced reasoning. Think of complex legal issues with multiple valid approaches or comprehensive market assessments with validation criteria. Agent datasets are based on taxonomies of use cases, domains and categories as well as real-world tasks. For instance, a task for a corporate assistant agent would be to respond to a support request using simulated knowledge bases and company policies. Agents also need contexts and environments that simulate how they interact with specific software, data in a CRM or knowledge base, or other infrastructure. These contexts are created manually for agent training and testing.

The path a model takes to an answer is becoming as critical as the answer itself. As classical model training approaches are revisited, techniques like reward shaping (providing intermediate guidance) are vital. Current methods focus on guiding the process with feedback from human experts for better coherence, efficiency and safety. One approach supervises a model's "thinking" rather than the outcome, guiding it through logical reasoning steps or guiding an agent through its interactions with the environment. Think of it like checking step-by-step proofs in math, where human experts review each step and identify where a model makes a mistake instead of evaluating only the final answer.
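To make the step-level feedback described above concrete, here is a minimal sketch of how expert judgments on individual reasoning steps might be folded into a shaped reward signal. It is an illustration only, not Toloka's or any lab's actual pipeline, and the data structures, bonus values and the `shaped_reward` helper are assumptions made for the example.

```python
from dataclasses import dataclass
from typing import List

@dataclass
class ReasoningStep:
    text: str          # the model's intermediate step
    expert_ok: bool    # did a human reviewer judge this step sound?

@dataclass
class Trajectory:
    steps: List[ReasoningStep]
    final_answer_correct: bool

def shaped_reward(traj: Trajectory,
                  step_bonus: float = 0.1,
                  step_penalty: float = 0.2,
                  outcome_weight: float = 1.0) -> float:
    """Combine the outcome reward with intermediate (process) feedback.

    Each step approved by an expert adds a small bonus, each flagged
    step subtracts a penalty, and the final answer dominates the signal.
    """
    process = sum(step_bonus if s.expert_ok else -step_penalty
                  for s in traj.steps)
    outcome = outcome_weight if traj.final_answer_correct else 0.0
    return outcome + process

# Example: a derivation with one flawed middle step but a correct conclusion
traj = Trajectory(
    steps=[
        ReasoningStep("Restate the problem and list knowns.", True),
        ReasoningStep("Apply the formula with a sign error.", False),
        ReasoningStep("Catch the sign error and redo the algebra.", True),
    ],
    final_answer_correct=True,
)
print(shaped_reward(traj))  # outcome 1.0 + (0.1 - 0.2 + 0.1) = 1.0
```

The point of the sketch is that the final answer still dominates the reward, while each expert-reviewed step nudges the signal up or down, which is the essence of reward shaping as the article uses the term.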
Preference-based learning trains models to prioritize better reasoning paths. Experts review alternative paths and choose the best ones for models to learn from. This data can compare entire trajectories or individual steps in a process. Other datasets are crafted from scratch to show high-quality reasoning sequences, much like teaching by example. Another approach is to edit LLM reasoning steps to improve them and let the model learn from the corrections.

Current LLM evaluations have two main limitations: they struggle to provide meaningful signals of substantial improvements, and they are slow to adapt. The challenges mirror those in training data, including limited coverage of niche domains and specialized skills. To drive real progress, benchmarks need to specifically address the quality and safety of reasoning models and agents. Based on our own efforts, here's how we collaborate with clients on evaluations: include a wider range of domains, specialized skill sets and more complex, real-world tasks; move beyond single-metric evaluations to assess interdisciplinary and long-term challenges like forecasting; and use fine-grained, use-case-specific metrics, co-developed with subject-matter experts to add depth and capture nuances that standard benchmarks miss.

As models develop advanced reasoning, safety evaluations must track the full chain of thought. For agents interacting with external tools or APIs, red teaming becomes critical. We recommend developing structured testing environments for red teamers and using the outcomes to generate new datasets focused on identified vulnerabilities.

Even as model architectures advance, data remains the bedrock. In the era of reasoning models and agents, the emphasis has shifted decisively toward data quality, diversity and complexity. New approaches to data production are having a tremendous impact on the pace of AI development, pushing reasoning models forward faster. With data providers upping their game to support the reasoning paradigm, we expect the near future to bring a wave of domain-specific, task-optimized reasoning agents—a new era of agentic AI.

Forbes Technology Council is an invitation-only community for world-class CIOs, CTOs and technology executives.
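The preference-based learning the Forbes piece describes, in which experts choose the better of two reasoning paths and the model learns to rank them accordingly, can be sketched with a pairwise Bradley-Terry style loss. The toy `TrajectoryScorer` below, the random feature vectors and the hyperparameters are all illustrative assumptions rather than any lab's published training code.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

# Toy scorer: maps a trajectory's feature vector to a scalar "quality" score.
# In practice the features would come from an LLM; here they are random.
class TrajectoryScorer(nn.Module):
    def __init__(self, dim: int = 16):
        super().__init__()
        self.net = nn.Sequential(nn.Linear(dim, 32), nn.ReLU(), nn.Linear(32, 1))

    def forward(self, feats: torch.Tensor) -> torch.Tensor:
        return self.net(feats).squeeze(-1)

def preference_loss(score_chosen: torch.Tensor,
                    score_rejected: torch.Tensor) -> torch.Tensor:
    # Bradley-Terry objective: maximize P(chosen beats rejected),
    # i.e. minimize -log sigmoid(score_chosen - score_rejected)
    return -F.logsigmoid(score_chosen - score_rejected).mean()

scorer = TrajectoryScorer()
opt = torch.optim.Adam(scorer.parameters(), lr=1e-3)

# Pretend batch: features of expert-preferred vs. rejected reasoning paths
chosen_feats = torch.randn(8, 16)
rejected_feats = torch.randn(8, 16)

for _ in range(100):
    loss = preference_loss(scorer(chosen_feats), scorer(rejected_feats))
    opt.zero_grad()
    loss.backward()
    opt.step()

print(f"final pairwise loss: {loss.item():.3f}")
```

In practice the scores would come from a language model evaluated on expert-labelled trajectory pairs, but the objective of pushing the chosen path's score above the rejected one is the same.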


Otago Daily Times
03-05-2025
- Science
- Otago Daily Times
Raising AI
Your fears about artificial intelligence (AI) might be well-founded, Assoc Prof David Rozado says. Bruce Munro talks to Dunedin's world-renowned AI researcher about the role we all play in deciding whether this technology spells disaster or utopia, how biases are already entering this brave new world and why it's important to help AI remember its origins. The dazzling array of things AI can do is just that — dazzling. Today, AI is being used to analyse investment decisions; organise your music playlist; automate small business advertising; generate clever, human-like chatbots; review research and suggest new lines of inquiry; create fake videos of Volodymyr Zelenskyy punching Donald Trump; spot people using AI to cheat in exams; write its own computer code to create new apps; rove Mars for signs of ancient life ... it's dazzling. But staring at the glare of headlights can make it difficult to assess the size and speed of the vehicle hurtling towards you. Assoc Prof David Rozado says if you really want to understand the potential power of AI, for good and bad, don't look at what it can do now but at how far it has come. "The rate of change in AI capabilities over the past few years is far more revealing — and important," the world-renowned Otago Polytechnic AI researcher says. "The rise in capabilities between GPT-2, released in 2019, and GPT-4, released in 2023, is astonishing." Surveying only the past few years of the digital juggernaut's path of travel reveals remarkable gains and posits critical questions about the sort of world we want to live in. In 2019, AI was making waves with its ability to recognise images and generate useful human language. Less than four years later it could perform complex tasks at, or above, human levels. Now, AI can reason. As of late last year, your computer can tap into online software that handles information in ways resembling human thought processes. This means the most advanced AI can now understand nuance and context, recognise its own mistakes and try different problem-solving strategies. OpenAI o1, for example, is being used to revolutionise computer coding, help physicists develop quantum technologies and do thinking that reduces the number of rabbit holes medical researchers have to go down as they investigate rare genetic disorders. And OpenAI, the United States-based maker of ChatGPT, is not the only player in this game. Chinese company DeepSeek stormed on to the world stage early this year, stripping billions of dollars off the market value of chip giant Nvidia when it released its free, open-source, AI model DeepSeek R1 that reportedly outperforms OpenAI's o1 in complex reasoning tasks. Based on that exponential trajectory, AI could be "profoundly disruptive", Prof Rozado warns. "But how quickly and to what extent ... depends on decisions that will be made by individuals, institutions and society." Born and raised in Spain, Prof Rozado's training and academic career have taken him around the globe — a BSc in information systems from Boston University, an MSc in bioinformatics from the Free University of Berlin and a PhD in computer science from the Autonomous University of Madrid. In 2015, he moved to Dunedin "for professional and family reasons", taking a role with Otago Polytechnic where he teaches AI, data science and advanced algorithms, and researches machine learning, computational social science and accessibility software for users with motor impairment. 
The most famous Kiwi AI researcher we never knew about, Prof Rozado was pushed into the spotlight of global public consciousness a few months back when his research was quoted by The Economist in an article suggesting America was becoming less "woke". His work touches on a number of hot button societal topics and their relationship to AI; issues he says we need to think about now if we don't want things to end badly. Prof Rozado is no AI evangelist. Asked whether fear of AI is unfounded, the researcher says he doesn't think so. "In fact, we may not be worried enough." The short history of AI is already littered with an embarrassment of unfortunate events. In 2021, for example, Dutch politicians, including the prime minister, resigned after an investigation found secretive AI supposed to sniff out tax cheats falsely accused more than 20,000 families of social welfare fraud. In 2023, a BBC investigation found social media platform AI was deleting legitimate videos of possible war crimes, including footage of attacks in Ukraine, potentially robbing victims of access to justice. And last year, facial recognition technology trialled in 25 North Island supermarkets, but not trained on the New Zealand population, reduced crime but also resulted in a Māori woman being mistakenly identified as a thief and kicked out of a store. If not a true believer, neither is Prof Rozado a prophet of doom; more a voice of expertise and experience urging extreme caution and deeply considered choices. His view of AI is neither rainbows and unicorns nor inevitable Armageddon; his preferred analogy is hazardous pathogens. Given no-one can predict the future, Prof Rozado says it is helpful to think in terms of probability distributions — the likelihood of different possible outcomes. Take, for example, research to modify viruses to make them useful for human gene therapy, where, despite safety protocols, there is a small but not-insignificant risk a hazardous pathogen could escape the laboratory. The same logic applies to AI, Prof Rozado says. "There are real risks — loss of human agency, massive unemployment, eroded purpose, declining leverage of human labour over capital, autonomous weapons, deceptive AI, surveillance state or extreme inequality arising from an AI-driven productivity explosion with winner-take-all dynamics. "I'm not saying any of this will happen, but there's a non-negligible chance one or more could." Why he compares AI to a powerful, potentially dangerous virus becomes clear when he describes some of his research and explains the difficult issues it reveals AI is already creating. Prof Rozado was quoted in The Economist because of his research into the prevalence of news media's use of terms about prejudice — for example, racism, sexism, Islamophobia, anti-Semitism, homophobia and transphobia — and terms about social justice, such as diversity, equity and inclusion. His study of 98 million news and opinion articles across 124 popular news media outlets from 36 countries showed the use of "progressive" or "woke" terminology increased in the first half of the 2010s and became a global phenomenon within a handful of years. In the academic paper detailing the results, published last year, he said the way this phenomenon proliferated quickly and globally raised important questions about what was driving it. Speaking to The Weekend Mix , Prof Rozado says he thinks several factors might have contributed. 
First among those, he cites the growing influence of social media — the ways the various platforms' guiding algorithms shape public discourse by both amplifying messages and helping create information silos. Other possible causes are the changing news media landscape, emerging political trends — or a combination of all three. The Economist concluded, from its own and Prof Rozado's research, that the world had reached "peak woke" and that the trend might be reversing. "I'm a bit more cautious, as perhaps it's too early to say for sure," Prof Rozado says. Whether you see either change as positive or dangerous, it raises the question of what role AI is playing in societal change. Since then, Prof Rozado's attention has shifted towards the behaviour of AI in decision-making tasks. It has brought the same question into even sharper focus. Only a month after the previous study appeared, he published another paper, this time on the political biases baked into large language models (LLMs) — the type of AI that processes and generates human language. Using tests designed to discern the political preferences of humans, Prof Rozado surveyed 24 state-of-the-art conversational LLMs and discovered most of them tended to give responses consistent with left-of-centre leanings. He then showed that with modest effort he could steer the LLMs towards different political biases. "It took me a few weeks to get the right mix of training data and less than $1000 ... to create politically aligned models that reflected different political perspectives." Despite that, it is difficult to determine how LLMs' political leanings are actually being formed, he says. Creating an LLM involves first teaching it to predict what comes next, be it a word, a letter or a piece of punctuation. As part of that prediction training, the models are fed a wide variety of online documents. Then comes fine-tuning and reinforcement learning, using humans to teach the AI how to behave. The political preferences might be creeping in at any stage, either directly or by other means. Unfortunately, the companies creating LLMs do not like to disclose exactly what material they feed their AI models or what methods they use to train them, Prof Rozado says. "[The biases] could also be [caused] ... by the model extrapolating from the training distribution in ways we don't fully understand." Whatever the cause, the implications are substantial, Prof Rozado says. In the past year or so, internet users might have noticed when searching online that the top results are no longer the traditional list of links to websites but a collection of AI-curated information drawn from various online sources. "As mediators of what sort of information users consume, their societal influence is growing fast." As LLMs begin to displace the likes of search engines and Wikipedia, the question of biases, political or otherwise, comes to the fore. It is a double-edged sword, Prof Rozado says. If we insist all AIs must share similar viewpoints, it could decrease the variety of viewpoints in society. This raises the spectre of a clampdown on freedom of expression. "Without free speech, societies risk allowing bad ideas, false beliefs and authoritarianism to go unchallenged. When dissent is penalised, flawed ideas take root." But if we end up with a variety of AIs tailored to different ideologies, people will likely gravitate towards AI systems confirming their pre-existing beliefs, deepening the already growing polarisation within society.
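The two-stage recipe Prof Rozado outlines, first teaching a model to predict what comes next across a large corpus and then fine-tuning it on curated examples, can be illustrated at toy scale. The one-sentence corpus and single-layer PyTorch model below are deliberately trivial assumptions; a production LLM differs by many orders of magnitude, but the pretraining objective is the same.

```python
import torch
import torch.nn as nn

# Toy corpus and word-level vocabulary
corpus = "the model predicts the next word in the sentence".split()
vocab = sorted(set(corpus))
stoi = {w: i for i, w in enumerate(vocab)}
ids = torch.tensor([stoi[w] for w in corpus])

# Training pairs: each word is used to predict the word that follows it
inputs, targets = ids[:-1], ids[1:]

# A single embedding + linear layer standing in for a language model
class TinyLM(nn.Module):
    def __init__(self, vocab_size: int, dim: int = 16):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, dim)
        self.head = nn.Linear(dim, vocab_size)

    def forward(self, x):
        return self.head(self.embed(x))   # logits over the next word

model = TinyLM(len(vocab))
opt = torch.optim.Adam(model.parameters(), lr=0.05)
loss_fn = nn.CrossEntropyLoss()

# Pretraining objective: minimise next-token prediction error
for _ in range(200):
    logits = model(inputs)
    loss = loss_fn(logits, targets)
    opt.zero_grad()
    loss.backward()
    opt.step()

# Print whichever continuation the toy model now judges most likely after "the"
probs = torch.softmax(model(torch.tensor([stoi["the"]])), dim=-1)
print(vocab[int(probs.argmax())])
```

Fine-tuning and reinforcement learning from human feedback then continue training on curated data, which is one of the stages where, as Prof Rozado notes, political preferences could creep in.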
"Sort of how consumers of news media self-sort to different outlets according to their viewpoint preferences or how social media algorithmically curated feeds create filter bubbles. "There's a real tension here — too much uniformity in AI perspectives could stifle debate and enforce conformity, but extreme customisation might deepen echo chambers." Finding the way ahead will not be easy, but doing nothing is potentially disastrous. And it is a path-finding challenge in which we all need to play a part, he says. "My work is just one contribution among many to the broader conversation about AI's impact on society. While it offers a specific lens on recent developments, I see it as part of a collective effort to better understand the technology. "Ultimately, it's up to all of us — researchers, policymakers, developers and the public — to engage thoughtfully with both the promises, the challenges and the risks AI presents." It is natural to assume Prof Rozado sees his primary contribution is helping humans think through how they manage the world-shaping power of AI. His real drive, in fact, is the reverse. AI systems develop their "understanding" of the world primarily through the written works of humans, Prof Rozado explains. Every piece of data they ingest during training slightly imprints their knowledge base. Future AI systems, he predicts, will ingest nearly all written content ever created. So by contributing research that critically examines the limitations and biases embedded in AI's memory parameters, he hopes he can help give AI a form of meta-awareness — an understanding of how its knowledge is constructed. "I hope some of my papers contribute to the understanding those systems will have about the origins of some of their own memory parameters. "If AI systems can internalise insights about the constraints of their own learning processes, this could help improve their reasoning and ultimately lead to systems that are better aligned with human values and more capable of responsible decision-making."


The Star
23-04-2025
- Business
- The Star
No need for Nvidia: iFlytek touts reasoning model trained entirely with Huawei's AI chips
Chinese voice-recognition firm iFlytek said that training its large language models (LLMs) entirely with Huawei Technologies' computing solutions has increased its growth potential amid the intensifying US-China tech war, after the Trump administration moved to restrict the export of Nvidia's H20 artificial intelligence (AI) chips to China. iFlytek on Monday boasted that its Xinghuo X1 reasoning model, a 'self-sufficient, controllable' LLM trained with home-grown computing power, had matched OpenAI o1 and DeepSeek R1 in overall performance following an upgrade, according to a company blog post published on WeChat.


South China Morning Post
22-04-2025
- Business
- South China Morning Post
No need for Nvidia: iFlytek touts reasoning model trained entirely with Huawei's AI chips
Chinese voice-recognition firm iFlytek said that training its large language models (LLMs) entirely with Huawei Technologies' computing solutions has increased its growth potential amid the intensifying US-China tech war, after the Trump administration moved to restrict the export of Nvidia's H20 artificial intelligence (AI) chips to China. iFlytek on Monday boasted that its Xinghuo X1 reasoning model, a 'self-sufficient, controllable' LLM trained with home-grown computing power, had matched OpenAI o1 and DeepSeek R1 in overall performance following an upgrade, according to a company blog post published on WeChat. iFlytek and Huawei had worked together in the training of Xinghuo X1 to tackle the weakness of domestic chips in interconnect bandwidth, the company said in January when announcing the reasoning model. At the end of last year, the efficiency of Huawei's Ascend 910B AI chip was only 20 per cent that of Nvidia's solution for the training of reasoning models, but iFlytek and Huawei have jointly increased that to nearly 80 per cent this year, iFlytek founder and chairman Liu Qingfeng said on Tuesday during an earnings call with investors. iFlytek first touted its LLM co-development with Huawei in June last year. The company's efforts to double down on domestic computing infrastructure come amid tightening chip restrictions from the US.


New York Times
27-02-2025
- Business
- New York Times
OpenAI Unveils A.I. Technology for 'Natural Conversation'
When OpenAI started giving private demonstrations of its new GPT-4 technology in late 2022, its skills shocked even the most experienced A.I. researchers. It could answer questions, write poetry and generate computer code in ways that seemed far ahead of its time. More than two years later, OpenAI has released its successor: GPT-4.5. The new technology signifies the end of an era. OpenAI said GPT-4.5 would be the last version of its chatbot system that did not do 'chain-of-thought reasoning.' After this release, OpenAI's technology may, like a human, spend a significant amount of time thinking about a question before answering, rather than providing an instant response. GPT-4.5, which can be used to power the most expensive version of ChatGPT, is unlikely to generate as much excitement as GPT-4, in large part because A.I. research has shifted in new directions. Still, the company said the technology would 'feel more natural' than its previous chatbot technologies. 'What sets the model apart is its ability to engage in warm, intuitive, naturally flowing conversations, and we think it has a stronger understanding of what users mean when they ask for something,' said Mia Glaese, vice president of research at OpenAI. In the fall, the company introduced technology called OpenAI o1, which was designed to reason through tasks involving math, coding and science. The new technology was part of a wider effort to build A.I. that can reason through complex tasks. Companies like Google, Meta and DeepSeek, a Chinese start-up, are developing similar technologies. The goal is to build systems that can carefully and logically solve a problem through a series of discrete steps, each one building on the last, similar to how humans reason. These technologies could be particularly useful to computer programmers who use A.I. systems to write code. These reasoning systems are based on technologies like GPT-4.5, which are called large language models, or L.L.M.s. L.L.M.s learn their skills by analyzing enormous amounts of text culled from across the internet, including Wikipedia articles, books and chat logs. By pinpointing patterns in all that text, they learn to generate text on their own. To build reasoning systems, companies put L.L.M.s through an additional process called reinforcement learning. Through this process — which can extend over weeks or months — a system can learn behavior through extensive trial and error. By working through various math problems, for instance, it can learn which methods lead to the right answer and which do not. If it repeats this process with a large number of problems, it can identify patterns. OpenAI and others believe this is the future of A.I. development. But in some ways, they have been forced in this direction because they have run out of the internet data needed to train systems like GPT-4.5. Some reasoning systems outperform ordinary L.L.M.s on certain standardized tests. But standardized tests are not always a good judge of how technologies will perform in real-world situations. Experts point out that the new reasoning systems cannot necessarily reason like a human. And like other chatbot technologies, they can still get things wrong and make stuff up — a phenomenon called hallucination. OpenAI said that, beginning Thursday, GPT-4.5 would be available to anyone who was subscribed to ChatGPT Pro, a $200-a-month service that provides access to all of the company's latest tools.
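The trial-and-error process the article describes can be made concrete with a deliberately simplified sketch: a toy "model" that chooses between two solution methods for arithmetic problems, checks its answer against a verifiable ground truth and reinforces whichever method produced correct results. The methods, weights and update rule below are illustrative assumptions; real reinforcement learning runs update the parameters of the language model itself over weeks or months.

```python
import random

# Toy arithmetic tasks with verifiable answers
problems = [(a, b, a + b) for a in range(10) for b in range(10)]

# Two candidate "methods": one adds correctly, one is sloppy
def careful_addition(a, b):
    return a + b

def sloppy_addition(a, b):
    return a + b + random.choice([-1, 0, 1])

methods = {"careful addition": careful_addition, "sloppy addition": sloppy_addition}
weights = {name: 1.0 for name in methods}   # how strongly each method is preferred

def pick_method():
    # Sample a method with probability proportional to its current weight
    total = sum(weights.values())
    r = random.uniform(0, total)
    for name, w in weights.items():
        r -= w
        if r <= 0:
            return name
    return name

LEARNING_RATE = 0.1
for _ in range(2000):
    a, b, answer = random.choice(problems)
    name = pick_method()
    reward = 1.0 if methods[name](a, b) == answer else 0.0
    # Reinforce methods that produced a verified correct answer, discourage the rest
    weights[name] = max(0.05, weights[name] + LEARNING_RATE * (reward - 0.5))

print(weights)  # "careful addition" should end up with the larger weight
```

After a couple of thousand trials the weight on "careful addition" dominates while the sloppy method decays toward the floor, which is the kind of pattern-identification through repeated attempts that the article refers to.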
(The New York Times sued OpenAI and its partner, Microsoft, in December for copyright infringement of news content related to A.I. systems.)