Latest news with #ClaudeSonnet3.7


Axios
22-05-2025
- Business
Anthropic unveils the latest Claudes with claim to AI coding crown
Anthropic debuted the first models in its latest Claude 4 series Thursday — including one, Claude 4 Opus, that it says is the world's best at coding.

Why it matters: Competition is hot between Anthropic, Google and OpenAI for the "best frontier model" crown as questions persist about the companies' ability to push current AI techniques to new heights.

Driving the news: At the high end, Anthropic announced Claude 4 Opus, its "powerful, large model for complex challenges," which it says can perform thousands of steps over hours of work without losing focus. Claude Sonnet 4 — Anthropic's "smart, efficient model for everyday use" — is designed to replace Claude Sonnet 3.7 with improved coding abilities and better adherence to instructions. Both are hybrid models, meaning they can answer users straightaway or take additional compute time to perform reasoning steps. In addition to the new models, Anthropic announced a series of other new capabilities for developers who connect to its models, including the ability to combine reasoning with the use of the Web and other tools.

What they're saying: "AI agents powered by Opus 4 and Sonnet 4 can analyze thousands of data sources, execute long-running tasks, write human-quality content, and perform complex actions," Anthropic said in a statement.

Between the lines: Anthropic is making one change in its reasoning mechanics — it will now aim to show summaries of the models' thought processes rather than trying to document each step.

The big picture: The announcements, made at Anthropic's first-ever developer conference, come after a busy week in AI that saw Microsoft announce a new coding agent and a partnership to host Elon Musk's Grok, Google expand its AI-powered search efforts, and OpenAI announce a $6.5 billion deal to buy io, Jony Ive's secretive AI hardware startup.
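The "hybrid model" behavior described above corresponds to an explicit reasoning budget in Anthropic's developer API. Below is a minimal sketch, assuming the Anthropic Python SDK and its extended-thinking option; the model name, token budget, and prompt are illustrative rather than taken from the article.

```python
# Minimal sketch of the "hybrid" behavior described above, assuming the Anthropic
# Python SDK's Messages API with extended thinking enabled. The model identifier,
# token budget, and prompt are illustrative, not taken from the article.
import anthropic

client = anthropic.Anthropic()  # reads ANTHROPIC_API_KEY from the environment

response = client.messages.create(
    model="claude-sonnet-4-20250514",  # illustrative model name
    max_tokens=2048,
    # Allow the model to spend extra compute on intermediate reasoning steps.
    thinking={"type": "enabled", "budget_tokens": 1024},
    messages=[
        {"role": "user", "content": "Plan a multi-step refactor of a large codebase."}
    ],
)

# The response interleaves summarized thinking blocks with the final answer;
# print only the answer text here.
for block in response.content:
    if block.type == "text":
        print(block.text)
```

Omitting the thinking parameter yields an immediate answer, which is the "straightaway" mode the article mentions.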


Indian Express
08-05-2025
- Business
Mistral announces new AI model Medium 3 at 8x lower cost
French AI startup Mistral has introduced a frontier-level AI model, Mistral Medium 3. The new model from the Paris-based company is said to outperform models like Claude Sonnet 3.7 and GPT-4o on numerous benchmarks, while reportedly costing less than DeepSeek V3. Organisations can use the new model through Mistral's new AI assistant, Le Chat Enterprise, which features an agent builder and allows full integration with a variety of apps. Mistral has also teased a more powerful model to be introduced in the coming weeks.

Mistral says Medium 3 pushes the efficiency and usability of language models even further, claiming it represents a new class of models that balances state-of-the-art performance with roughly 8x lower cost and simple deployability, aimed at accelerating enterprise usage. The model is also said to lead in professional use cases such as coding and multimodal understanding.

On the enterprise side, Medium 3 offers hybrid or on-premises in-VPC deployment, custom post-training, and integration into enterprise tools and systems. According to the company, the model performs at or above 90 per cent of Claude Sonnet 3.7 on benchmarks across the board at a considerably lower cost: $0.40 per million input tokens and $2 per million output tokens. Medium 3 has also surpassed models such as Llama 4 Maverick and enterprise models like Cohere Command A. On pricing for both API and self-deployed systems, the model beats DeepSeek V3, and it can be deployed on any cloud, including self-hosted environments of four GPUs and above.

The company claims the model is designed to be frontier-class, particularly for professional use. On benchmarks, Mistral Medium 3 delivers top performance in instruction following (ArenaHard: 97.1%) and math (Math500: 91%), with strong results in long-context tasks (RULER 32K: 96%). In human evaluations, Medium 3 outperforms competitors, especially in coding, beating Claude Sonnet 3.7, DeepSeek 3.1, and GPT-4o in several cases.
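To make the quoted pricing concrete, here is a small worked example at $0.40 per million input tokens and $2 per million output tokens; the request sizes are hypothetical and the helper function is not part of any Mistral SDK.

```python
# Hypothetical cost estimate at the rates quoted in the article:
# $0.40 per million input tokens and $2 per million output tokens.
INPUT_USD_PER_M = 0.40
OUTPUT_USD_PER_M = 2.00

def estimate_cost(input_tokens: int, output_tokens: int) -> float:
    """Return the estimated USD cost of one request at the quoted rates."""
    return (input_tokens / 1_000_000) * INPUT_USD_PER_M + (
        output_tokens / 1_000_000
    ) * OUTPUT_USD_PER_M

# Example: a 3,000-token prompt that produces a 1,000-token answer.
print(f"${estimate_cost(3_000, 1_000):.4f}")  # -> $0.0032
```

At those illustrative rates, a million such requests would cost roughly $3,200, the kind of scale at which the claimed 8x cost advantage becomes material.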


Forbes
26-03-2025
- Entertainment
Too Good To Be Human? AI's Surprising Bias Against Quality Writing
The first Turing Test may have been conducted at the ball in My Fair Lady. Professor Higgins has wagered with his friend, Pickering, that he can transform a flower girl into a lady through the science of language. He knows that people judge others by their manner of speech, and he'll use his skills as a professor of elocution to pull off the ruse. The final test comes when a rival professor conducts his own appraisal of Eliza on the dance floor. His verdict: 'She is a fraud!' His logic is captured in the song 'You Did It!'

Artificial Intelligence has faced a similar 'fool the inspector' challenge since Alan Turing first posed his famous test in a 1950 paper titled 'Computing Machinery and Intelligence.' Turing's very practical test proposes that a computer is intelligent if a person cannot distinguish between the computer and another person during an online chat. Many experts believe we've passed Turing's test with generative AI models. The latest version of Claude (Claude Sonnet 3.7) was just released, and it writes remarkably well. I provided Claude with an outline for an article, including the key points to stress, along with an interview text, and it wrote a clear, interesting, coherent article. It was (almost) indistinguishable from something that I might have written.

I decided to try a reverse Turing test. My question was whether other AIs thought a given article was written by a person or by an AI. Gemini was certain the article I gave it was written by an AI. ChatGPT thought it plausible that the article was written by either a human or a machine (or a combination of both). Claude credited the human.

Interested, I put six of my Forbes columns through the test by asking, 'Was this written by an AI?' The articles are, of course, written by a human (me). In five out of six cases, Gemini thought they were AI-written. The model was transparent about its logic and about the 'tells' it uses to identify AI-written text. Several of these fit the category of what might be called good writing: structured argumentation; use of data and statistics; referencing sources; focus on practical solutions; and a concluding call to action. Ironically, these are the aspirations of many an essay writer! In some cases, unfortunately, Gemini also found that the writing 'lacks a distinct personality or voice…which is often characteristic of AI-generated text.' Oh, well. Gemini's summary for the article How to Jump Start Learning At Work reminded me of the song from My Fair Lady: 'This writing is too good, it said. That clearly indicates that it is AI…'

ChatGPT seemed confused. It considered three authorship possibilities for each article: Purely AI-Generated; Human + AI Collaboration; and Purely Human-Written. In most cases, it favored a human collaborating with an AI, but it hedged by finding that all three options were plausible in five out of six cases.

Claude identified half of the articles as 'indeterminate' and half as human-generated (phew!). It did this based on the presence of personal voice, individual experience cited in the article, and the nuance of the argument (as perceived by the first so-called third-generation LLM).

A few observations.

1. AI generally assumes that well-written articles are written by an AI. In other words, AI has a low regard for human-written text!

2. AI-written text is, indeed, getting very good. We should use it where we can to make writing better – but without delegating the thinking. AI will increase both the efficiency and clarity of business communications.

3. There will be an art to the collaboration between AIs and people as they work together to create good writing. The partnership is likely to involve iteration and the use of several tools. The best way of learning to do this will be by doing.

The ability of AI to write well creates another challenge. Content that sounds good but is entirely derivative will become very easy to create (and to promote using AI). It will be easy to become even more overwhelmed by marginally useful information. For centuries, we lived in a curated media world, where content was scarce and editors were in control. That world was disrupted in less than a generation by user-generated content like blogs, podcasts, and YouTube, which began to overwhelm our ability to process it. AI will move us into another era, one in which the volume of this user-generated content has increased so dramatically that it will inevitably alienate readers.

What will be the consequence? I think that people's media preferences will revert from open, public content to curated, paid content. The model originally spawned from scarcity is likely to be recovered as a consequence of abundance (or a scarcity of attention).