Claude Opus 4 achieves record performance in AI coding capabilities

This story originally appeared on Calendar
Anthropic's latest AI model, Claude Opus 4, has surpassed OpenAI's GPT-4.1 in coding abilities, marking a significant shift in how AI systems can assist with software development tasks. The new model demonstrates capabilities that transform AI from a quick-response tool into a sustained collaborator, capable of working on complex coding projects for extended periods.
The most notable achievement of Claude Opus 4 is its ability to sustain autonomous coding sessions lasting up to seven hours. This is a marked advance in AI attention span and task persistence, letting the model work through complex programming challenges without human intervention for far longer than previous systems could.
Record-Breaking Performance on Industry Benchmarks
Claude Opus 4 scored 72.5% on SWE-bench, a rigorous benchmark used to evaluate AI coding abilities. This score sets a new record in the industry and places Anthropic's model ahead of OpenAI's GPT-4.1, which had previously led in this area.
SWE-bench tests an AI's ability to understand, modify, and debug real-world codebases—skills that are directly applicable to professional software development environments. The benchmark evaluates:
Code comprehension across multiple programming languages
Bug identification and resolution
Implementation of new features in existing codebases
Adherence to coding standards and best practices
The 72.5% score indicates that Claude Opus 4 can complete nearly three-quarters of the complex software engineering tasks presented to it, a substantial improvement over previous AI models.
From Assistant to Collaborator: A New Paradigm
"This represents a fundamental shift in how AI can participate in software development," said a software engineer who has tested the new model. "Instead of just answering questions or writing small code snippets, these systems can now maintain context and work through multi-hour projects."
The extended operational timeframe of seven hours means Claude Opus 4 can tackle larger, more complex programming tasks that require sustained attention and iterative problem-solving. This capability transforms the role of AI in programming from a quick reference tool to a genuine collaborator that can:
Maintain awareness of project requirements throughout extended development sessions
Remember earlier decisions and their rationale
Apply consistent coding styles and patterns across an entire project
Debug issues that require a deep understanding of the codebase
Implications for Software Development
The advancement in AI coding capabilities has significant implications for the development of software. For professional developers, Claude Opus 4 could serve as a productivity multiplier, handling routine coding tasks while allowing humans to focus on higher-level design and architecture decisions.
For organizations, this technology could help address the persistent shortage of skilled developers by augmenting existing teams and potentially making software development more accessible to those with limited coding experience.
However, questions remain about how these systems will integrate into existing development workflows and what impact they might have on programming jobs. While some experts suggest these tools will primarily augment human developers rather than replace them, the rapid advancement in capabilities has prompted discussions about the changing nature of software engineering work.
The competition between Anthropic and OpenAI continues to drive rapid innovation in AI capabilities, with each company pushing the boundaries of what their models can accomplish. As these systems become more capable of extended, complex work, they may find applications beyond coding in other fields that require sustained problem-solving and creativity.

AI risks 'broken' career ladder for college graduates, some experts say

Artificial intelligence could upend entry-level work as recent college graduates enter the job market, eliminating many positions at the bottom of the white-collar career ladder or at least reshaping them, some experts told ABC News.
Such forecasts follow yearslong advances in AI-fueled chatbots, and declarations from some company executives about the onset of AI automation. Dario Amodei, CEO of Anthropic, which created an AI model called Claude, told Axios last week that technology could cut U.S. entry-level jobs by half within five years. When Business Insider laid off 21% of its staff last week, CEO Barbara Peng said the company would go "all in on AI" in an effort to "scale and operate more efficiently."
Analysts who spoke to ABC News said AI could replace or reorient entry-level jobs in some white-collar fields targeted by college graduates, such as computer programming and law. Current job woes for this cohort, they added, likely owe in part to economic conditions beyond technology. Many blue-collar and other hands-on jobs will remain largely untouched by AI, they said, noting that tech-savvy young workers may be best positioned to fill new jobs that do incorporate AI.
"We're in the flux of dramatic change," said Lynn Wu, a professor of operations, information and decisions at the University of Pennsylvania. "I sympathize with college graduates. In the short run, they may stay with mom and dad for a while. But in the long run, they'll be fine. They're AI natives."
Over the early months of 2025, the job market for recent college graduates "deteriorated noticeably," the New York Federal Reserve said in April. It did not provide a reason for the trend. The unemployment rate for recent college graduates reached 5.8%, its highest level since 2021, while the underemployment rate soared above 40%, the New York Fed said.
Youth unemployment likely stems from trends in the broader economy rather than AI, Anu Madgavkar, the head of labor market research at the McKinsey Global Institute, told ABC News. The softening job market coincides with business uncertainty and gloomy economic forecasts elicited by President Donald Trump's tariff policy.
"It's not surprising we're seeing this unemployment for young people," Madgavkar said. "There is a lot of economic uncertainty."
Still, entry-level tasks in white-collar professions stand at serious risk from AI, analysts said, pointing to the technology's capacity to perform written and computational tasks as opposed to manual work. AI could replace work previously performed by low-level employees, such as legal assistants compiling relevant precedent for a case or computer programmers writing a basic set of code, Madgavkar said.
"Is the bleeding edge or the first type of work to be hit a little more skewed toward entry-level, more basic work getting automated right now? That's probably true," Madgavkar said. "You could have fewer people getting a foothold."
Speaking bluntly, Wu said: "The biggest problem is that the career ladder is being broken."
For the most part, however, Madgavkar said entry-level positions would change rather than disappear. Managers will prize problem-solving and analysis over tasks dependent on sheer effort, she added, noting the required set of skills will likely include a capacity to use AI.
"I don't think it means we'll have no demand for entry-level workers or massively less demand," Madgavkar said. "I just think expectations for young people to use these tools will accelerate very quickly."
Some jobs and tasks remain largely immune to AI automation, analysts said, pointing to hands-on work such as manual labor and trades, as well as professional roles like doctors and upper management.
Isabella Loaiza, a researcher at the Massachusetts Institute of Technology who studies AI and the workforce, co-authored a study examining the shift in jobs and tasks across the U.S. economy between 2016 and 2024. Rather than dispense with qualities like critical thinking and empathy, workplace technology heightened the need for workers who exhibit those attributes, Loaiza said, citing demand for occupations like early-education teachers, home health aides and therapists.
"It is true we're seeing AI having an impact on white-collar work instead of more blue-collar work," Loaiza said.

Trump's AI czar says UBI-style cash payments are 'not going to happen'

Business Insider

Americans probably won't be getting a universal basic income as long as President Donald Trump's AI czar has a say in the matter.
David Sacks, the cofounder of Craft Ventures and a member of the so-called "PayPal Mafia," which includes Elon Musk and Peter Thiel, is now a top White House policy advisor for AI. It's an important role as rapid advances in AI bring about generational changes in how the world lives and works. The technology is already reshaping the job market, as chatbots like ChatGPT begin to do the work of entry-level employees.
Those at the forefront of the AI revolution have long warned about the risk AI poses to jobs, and have called for a universal basic income to soften the blow. A UBI is a government program that distributes no-strings-attached checks to all residents to spend how they please. Numerous cities and states are already experimenting with its humble cousin, a guaranteed basic income, which distributes checks to specific populations in need. The idea has a long history, and support for these kinds of programs has skyrocketed at the local level in recent years.
Any consideration of a basic income at the federal level, however, will likely have to wait. Sacks is not a fan. The AI czar said on X this week that such government "welfare" is a "fantasy."
"The future of AI has become a Rorschach test where everyone sees what they want. The Left envisions a post-economic order in which people stop working and instead receive government benefits," Sacks wrote. "In other words, everyone on welfare. This is their fantasy; it's not going to happen."
Although reports from recipients who participate in basic income programs are overwhelmingly positive, they have faced political pushback. Last year, Republicans in Arizona voted to ban basic income programs in the state, and similar opposition efforts have gained traction in Iowa, Texas, and South Dakota. Lawmakers in several states have argued that the checks increase reliance on the government and dissuade recipients from working.
OpenAI CEO Sam Altman helped fund one of the largest basic income studies, which found, in part, that it encouraged recipients to work harder. Elon Musk, who until recently was the face of Trump's effort to reduce government spending, has said a basic income will likely play a role in future economies as AI continues to rapidly develop.
Sacks' comments came as another prominent AI leader, Google DeepMind CEO Demis Hassabis, called for not just a universal basic income, but a "universal high income" at SXSW in London this week. When asked about AI's impact on jobs, Hassabis said there would be a "huge amount of change," but that "new, even better" jobs could replace affected positions and boost productivity.
"Beyond that, we may need things like universal high income or some way of distributing all the additional productivity that AI will produce in the economy," Hassabis said.

JPMorgan raises its year-end S&P 500 target, but sees little upside left

CNBC

Strong earnings and a resilient economy amid extreme policy uncertainty this year have lifted JPMorgan's view on U.S. equities. Head of global markets strategy Dubravko Lakos-Bujas raised his year-end S&P 500 price target to 6,000 from 5,200.
The strategist cited an "encouraging fundamental backdrop" that alleviated investors' concerns about tariffs' impact on corporate growth. He added that results such as Nvidia's confirmed that the AI theme and capital spending boom remain strong.
"Absent major policy surprises, the path of least resistance is to new highs supported by Tech / AI led strong fundamentals, a steady bid from systematic strategies on improving volatility and momentum signals, and flows from active investors on dips," Lakos-Bujas wrote in a note to clients.
Lakos-Bujas sees double-digit S&P 500 earnings expectations for 2026 lifting stocks in the second half of this year. He also highlighted the U.S. Court of International Trade's recent ruling against the Trump administration's tariffs as a potential tailwind for equities.
"Going into 2H25, investors should begin to anchor equities to 2026 EPS growth potential ... which is significantly higher and should help support the relative valuation case for U.S. equities," Lakos-Bujas wrote.
To be sure, Lakos-Bujas' target implies just 1% upside from Thursday's close of 5,939.30.
"Elevated valuation could limit market upside, but it is rarely a sell catalyst on its own, especially if the U.S. continues to deliver stronger growth relative to DM peers and the AI story remains intact," the strategist said.
Lakos-Bujas had slashed his original year-end target from 6,500 to 5,200 as investors navigated the throes of the April tariff scare. The S&P 500 at one point traded around 20% below its record high set in February. However, pauses to the steep duties announced by President Donald Trump helped the market recover. The benchmark sits around 3% below its all-time high.
Three other strategists featured in the CNBC Market Strategist Survey also raised their forecasts this week:
RBC's Lori Calvasina: to 5,730 from 5,550
Deutsche Bank's Binky Chadha: to 6,550 from 6,150
Barclays' Venu Krishna: to 6,050 from 5,900