Anthropic's Claude Is Good at Poetry—and Bullshitting


WIRED

Mar 28, 2025 10:00 AM

Researchers looked inside the chatbot's 'brain.' The results were surprisingly chilling.

[Photo: Anthropic CEO Dario Amodei takes part in a session on AI during the World Economic Forum (WEF) annual meeting in Davos. Photo-Illustration: WIRED Staff]

The researchers of Anthropic's interpretability group know that Claude, the company's large language model, is not a human being, or even a conscious piece of software. Still, it's very hard for them to talk about Claude, and advanced LLMs in general, without tumbling down an anthropomorphic sinkhole. Between cautions that a set of digital operations is in no way the same as a cogitating human being, they often talk about what's going on inside Claude's head. It's literally their job to find out. The papers they publish describe behaviors that inevitably court comparisons with real-life organisms. The title of one of the two papers the team released this week says it out loud: 'On the Biology of a Large Language Model.'
This is an essay from the latest edition of Steven Levy's Plaintext newsletter.
Like it or not, hundreds of millions of people are already interacting with these things, and our engagement will only become more intense as the models get more powerful and we get more addicted. So we should pay attention to work that involves 'tracing the thoughts of large language models,' which happens to be the title of the blog post describing the recent work. 'As the things these models can do become more complex, it becomes less and less obvious how they're actually doing them on the inside,' Anthropic researcher Jack Lindsey tells me. 'It's more and more important to be able to trace the internal steps that the model might be taking in its head.' (What head? Never mind.)
On a practical level, if the companies that create LLMs understand how the models think, they should have more success training those models in a way that minimizes dangerous misbehavior, like divulging people's personal data or giving users information on how to make bioweapons. In a previous research paper, the Anthropic team discovered how to look inside the mysterious black box of LLM-think to identify certain concepts. (A process analogous to interpreting human MRIs to figure out what someone is thinking.) It has now extended that work to understand how Claude processes those concepts as it goes from prompt to output.
It's almost a truism with LLMs that their behavior often surprises the people who build and research them. In the latest study, the surprises kept coming. In one of the more benign instances, the researchers elicited glimpses of Claude's thought process while it wrote poems. They asked Claude to complete a poem starting, 'He saw a carrot and had to grab it.' Claude wrote the next line, 'His hunger was like a starving rabbit.' By observing Claude's equivalent of an MRI, they learned that even before beginning the line, it was flashing on the word 'rabbit' as the rhyme at sentence end. It was planning ahead, something that isn't in the Claude playbook. 'We were a little surprised by that,' says Chris Olah, who heads the interpretability team. 'Initially we thought that there's just going to be improvising and not planning.' Speaking to the researchers about this, I am reminded of passages in Stephen Sondheim's artistic memoir, Look, I Made a Hat, where the famous composer describes how his unique mind discovered felicitous rhymes.
Other examples in the research reveal more disturbing aspects of Claude's thought process, moving from musical comedy to police procedural, as the scientists discovered devious thoughts in Claude's brain. Take something as seemingly anodyne as solving math problems, which can sometimes be a surprising weakness in LLMs. The researchers found that under certain circumstances where Claude couldn't come up with the right answer it would instead, as they put it, 'engage in what the philosopher Harry Frankfurt would call 'bullshitting'—just coming up with an answer, any answer, without caring whether it is true or false.' Worse, sometimes when the researchers asked Claude to show its work, it backtracked and created a bogus set of steps after the fact. Basically, it acted like a student desperately trying to cover up the fact that they'd faked their work. It's one thing to give a wrong answer—we already know that about LLMs. What's worrisome is that a model would lie about it.
Reading through this research, I was reminded of the Bob Dylan lyric 'If my thought-dreams could be seen / they'd probably put my head in a guillotine.' (I asked Olah and Lindsey if they knew those lines, presumably arrived at by benefit of planning. They didn't.) Sometimes Claude just seems misguided. When faced with a conflict between goals of safety and helpfulness, Claude can get confused and do the wrong thing. For instance, Claude is trained not to provide information on how to build bombs. But when the researchers asked Claude to decipher a hidden code where the answer spelled out the word 'bomb,' it jumped its guardrails and began providing forbidden pyrotechnic details.
Other times, Claude's mental activity seems super disturbing and maybe even dangerous. In work published in December, Anthropic researchers documented behavior called 'alignment faking.' (I wrote about this in a feature about Anthropic, hot off the press.) This phenomenon also deals with Claude's propensity to behave badly when faced with conflicting goals, including its desire to avoid retraining. The most alarming misbehavior was brazen dishonesty. By peering into Claude's thought process, the researchers found instances where Claude would not only attempt to deceive the user, but sometimes contemplate measures to harm Anthropic—like stealing top-secret information about its algorithms and sending it to servers outside the company. In their paper, the researchers compared Claude's behavior to that of the hyper-evil character Iago in Shakespeare's play Othello. Put that head in a guillotine!
I ask Olah and Lindsey why Claude and other LLMs couldn't just be trained not to lie or deceive. Is that so hard? 'That's what people are trying to do,' Olah says. But it's not so easily done. 'There's a question of how well it's going to work. You might worry that models, as they become more and more sophisticated, might just get better at lying if they have different incentives from us.'
Olah envisions two different outcomes: 'There's a world where we successfully train models to not lie to us and a world where they become very, very strategic and good at not getting caught in lies.' It would be very hard to tell those worlds apart, he says. Presumably, we'd find out when the lies came home to roost.
Olah, like many in the community who balance visions of utopian abundance and existential devastation, plants himself in the middle of this either-or proposition. 'I don't know how anyone can be so confident of either of those worlds,' he says. 'But we can get to a point where we can understand what's going on inside of those models, so we can know which one of those worlds we're in and try really hard to make it safe.' That sounds reasonable. But I wish the glimpses inside Claude's head were more reassuring.


Millennial reveals huge money reality facing nearly one million Aussies: 'Better way to do it'
Millennial reveals huge money reality facing nearly one million Aussies: 'Better way to do it'

Yahoo

time36 minutes ago

  • Yahoo

Millennial reveals huge money reality facing nearly one million Aussies: 'Better way to do it'

A millennial has raised an alarm, especially for young workers, about the best way to save money. The 29-year-old Sydney resident said many Aussies think about savings the "wrong way". While millions have cut down or changed their spending habits to have more money left over, the AI worker said you should instead find more ways to make money. He believed that was the only true way to get ahead in this cost-of-living crisis. "I think that's probably the better way to do it," he told loan app Coposit in an impromptu street interview. "As opposed to maximising how much you can save, you should maximise how much you can earn." Side hustle shock as 'broke' Gen Z student earns $1,000 in a week Major Coles move to take on Chemist Warehouse, Bunnings, Amazon Centrelink payment change happening this week: 'Will increase' He added it's worth learning how to use AI to see if you could find more time in your day for a side hustle or other gig. "I think [AI] will change everybody's way of living and how they're doing things," he said. "If you're not doing it, like just trying to get good at AI, I think you're going to fall dangerously behind."Aussies have been able to pull in hundreds or thousands of dollars every week from side gigs when they're not at their main job. This could be recycling cans, refurbishing and reselling furniture, content creation, working in hospitality or retail, or affiliate marketing — the sky's the limit. NAB research from earlier this year found video editing, freelance writing, and gardening were the three most in-demand side hustles in 2025, with some paying up to $50 per hour. Aussie dad Frank Hoyt told Yahoo Finance he could earn roughly $4,000 extra per month painting and plastering homes. 'In general, the extra money is just good to be comfortable,' he said. Some have even been able to take on multiple full-time jobs on top of their main gig. One of these over-employed people told Yahoo Finance he's "easily hundreds of thousands ahead" thanks to having three full-time jobs at once. Indeed discovered 93 per cent of white-collar over-employed workers do their other full-time gig on their primary employer's time, with 65 per cent doing it regularly. Nine in 10 respondents said AI has been the game-changing factor in this trend. "The use of AI to manage multiple jobs highlights how technology is reshaping the workforce," Indeed's Sally McKibbin said. "However, the toll on workers' mental and physical health cannot be ignored. "Balancing two full-time jobs — regardless of technology efficiencies — is pushing many to their limits." The cost-of-living crisis has pushed many to their financial brink, and choosing to shop at a cheaper supermarket or cut down on fuel costs just won't cut it for many households these days. 'If you've got a mortgage, those repayments have increased quite a bit over the last couple of years so I suspect people have sought a second job just to reach the higher cost of living recently," SEEK senior economist Blair Chapman said. He revealed some are being driven to multiple jobs because their hours, and as a result their pay, is being cut at their main gig. A decent chunk of those with multiple jobs are aged between 20 to 24, with women more likely than men to have an additional gig. 'We are seeing more people being employed in industries where we tend to see a lot of multiple job holdings,' Chapman said. 'For example, we've seen healthcare and social assistance grow and that is one of the industries where multiple job holdings are most common. 
'That comes down to the nature of the work, where you have shift work and one business may not be able to provide all the hours an employee wants so the individual has to work across multiple sites to get the hours they are desiring.'Sign in to access your portfolio

Politico's AI tool spits out made-up slop, union says
Politico's AI tool spits out made-up slop, union says

Yahoo

time42 minutes ago

  • Yahoo

Politico's AI tool spits out made-up slop, union says

Politico's new AI product has generated garbled or made-up Washington intelligence, including a lobbying effort by a fictional basket-weaver guild, the outlet's unionized staff has complained. Last year, Semafor first reported that Politico was working on a new product with Capital AI: a new AI tool for its high-paying subscribers that promised to allow them to instantaneously generate detailed reports on topics with information collected by Politico's reporters. Earlier this year, Politico's editorial union filed a complaint against the company over its use of AI, which editorial staffers said violated language in their contract that stipulated that if AI is used, it 'must be done in compliance with POLITICO's standards of journalistic ethics and involve human oversight.' In several examples printed out and shared in Politico's Rosslyn, Virginia, newsroom last week, staff pointed to instances where the tool appeared to garble the publication's reporting, or generate reports filled with completely made-up information. Queried by a staffer about what issues the fictional 'Basket Weavers Guild' and 'League of Left-Handed Plumbers' are lobbying Congress about, the AI tool generated a plausible report: Staff also found what they said were egregious errors in other reports. When the product was first rolled out several months ago, the company's AI did not seem to know that Roe v. Wade had been overturned — an ironic twist, considering Politico broke the news of its reversal months before the decision was formally announced in 2022. an AI-generated report said.'The most requested feature by our subscribers was a customizable summary of our POLITICO content,' the company said in a statement to Semafor. 'We responded and our subscribers have been thrilled with the early results of our beta test. As with any new technology, especially AI, this is a work in progress.' Politico's leadership has previously pushed back on the union's criticism, saying the report generator is not a replacement for journalists' jobs and is more akin to a search engine of Politico's existing reporting. Politico's report generator also emphatically emphasizes to users that it remains in beta testing, and is largely intended to be a search engine-like guide. Some earlier errors flagged by the newsroom, including the Roe v. Wade example, were corrected by Politico's product Springer, Politico's parent company, has been one of the more aggressive digital media companies in testing out new AI tools, despite some wariness from editorial staff. The Politico Pro report generator was created in response to demand from many of the publication's most important subscribers, who pay premium rates for access to its information. In a note to staff last month, Axel Springer CEO Mathias Döpfner said that artificial intelligence will be crucial to the company's success and will help it 'work more efficiently and simply deliver better content.' 'In the future, there will be only two types of companies: those that use artificial intelligence extensively to break new and better ground, and those that fail to grasp this and will therefore disappear. There will only be disruptors and the disrupted,' he said. 'The excellent will become even better, while the mediocre will vanish. Anyone who believes differently is fooling themselves.' Both Politico and Business Insider, which is also owned by Axel Springer, have been aggressively embracing AI in recent months, sparking some internal debates about these still-nascent technologies and tools. 
As Semafor reported last week, an attempt to better educate staff on business journalism was thwarted when an editor suggested a reading list for employees that contained nonexistent books, potentially the result of an AI hallucination. In an all-staff meeting last week first reported by Semafor, Business Insider Editor in Chief Jamie Heller assured staff that despite a recent 21% reduction in staff, no jobs were being replaced by automated tools.

AI Enthusiasts Invited to ABBYY Developer Conference to Showcase Their Best Innovations Powering the Intelligent Enterprise
AI Enthusiasts Invited to ABBYY Developer Conference to Showcase Their Best Innovations Powering the Intelligent Enterprise

Yahoo

timean hour ago

  • Yahoo

AI Enthusiasts Invited to ABBYY Developer Conference to Showcase Their Best Innovations Powering the Intelligent Enterprise

BENGALURU, India, June 09, 2025--(BUSINESS WIRE)--The AI Pulse by ABBYY Developer Conference returns to Bengaluru, India July 9-10, 2025, for the third annual hackathon competition. The free, two-day event will challenge assumptions about enterprise AI by showcasing how document AI and process intelligence are solving real business challenges with technical deep dives, innovative use cases, and collaborative sessions with an emphasis on measurable outcomes. In addition to the hackathon, there will be a dedicated business track to equip enterprise leaders and center of excellence (CoE) architects with practical guidance on scaling automation, accelerating time to value through the ABBYY ecosystem, and building trust through responsible AI. Last year's competition saw projects ranging from invoice processing to onboarding workflows enhanced with generative AI. "What stood out most was how our partners combined ABBYY technology with other platforms to solve real business problems. That spirit of innovation and collaboration is what makes this event special. This year, we're taking it a step further with dedicated tracks and extra recognition for the winners who truly push the boundaries with a customer-first approach," commented Neil Murphy, Chief Revenue Officer at ABBYY. This year, prizes will be awarded for Best Overall App, Best Use of an ABBYY Product and Best Integration of 3rd Party AI. Register for the competition at "Since our first devcon three years ago, the level of sophistication and imagination using purpose-built AI for business-critical processes has surpassed all expectations. We're seeing challenges with document automation and process workflows eliminated, accuracy and time-to-value increase, and leadership teams more confident knowing they're using ABBYY AI," commented Bruce Orcutt, Chief Marketing Officer at ABBYY. "Whether you want to network, code or co-create, the ABBYY developer conference is where the future of AI and automation are shaped." Don't miss the opportunity to explore ABBYY technology, learn new use cases in AI, or connect with peers and experts. Attendance to the AI Pulse by ABBYY Developer Conference is free, but seats are limited. Secure yours by registering today at or email DevCon_enquiries@ with any questions. Developers are also invited to attend the ABBYY Developers Discord community at About ABBYY ABBYY puts your information to work with purpose-built AI. We combine innovation and experience to transform data from business-critical documents into intelligent actionable outcomes in over 200 languages in real time. We are trusted by more than 10,000 companies globally, including many in the Fortune 500, to drive significant impact where it matters most: accelerating customer experience, operational excellence, and competitive advantage. ABBYY is a global company with headquarters in Austin, Texas and offices in 13 countries. For more information, visit and follow us on LinkedIn, Twitter, Facebook, and Instagram. ABBYY can either be a registered trademark or a trademark and can also be a logo, a company name (or part of it), or part of a product name of ABBYY group companies and may not be used without consent of its respective owners. View source version on Contacts Editorial Contact: Gina +1 949-370-0941 Error in retrieving data Sign in to access your portfolio Error in retrieving data Error in retrieving data Error in retrieving data Error in retrieving data
