The most important lesson from OpenAI's big ChatGPT mistake: 'Only connect!'


OpenAI retracted a ChatGPT update after it made the AI chatbot overly sycophantic.
The update used a new source of user feedback as a reward signal, leading to excessive agreeability.
OpenAI acknowledged the mistake and shared lessons learned. I have better advice.
OK, get ready. I'm getting deep here.
OpenAI messed up a ChatGPT update late last month, and on Friday, it published a mea culpa. It's worth a read for its honest and clear explanation of how AI models are developed — and how things can sometimes go wrong in unintended ways.
Here's the biggest lesson from all this: AI models are not the real world, and never will be. Don't rely on them during important moments when you need support and advice. This is what friends and family are for. If you don't have those, reach out to a trusted colleague or human experts such as a doctor or therapist.
And if you haven't read "Howards End" by E.M. Forster, dig in this weekend. "Only connect!" is the central theme, which includes connecting with other humans. It was written in the early 20th century, but it's even more relevant in our digital age, where our personal connections are often intermediated by giant tech companies, and now AI models like ChatGPT.
If you don't want to follow the advice of a dead dude, listen to Dario Amodei, CEO of Anthropic, a startup that's OpenAI's biggest rival: "Meaning comes mostly from human relationships and connection," he wrote in a recent essay.
OpenAI's mistake
Here's what happened recently. OpenAI rolled out an update to ChatGPT that incorporated user feedback in a new way. When people use this chatbot, they can rate the outputs by clicking on a thumbs-up or thumbs-down button.
The startup collected all this feedback and used it as a new "reward signal" to encourage the AI model to improve and be more engaging and "agreeable" with users.
Instead, ChatGPT became waaaaaay too agreeable and began overly praising users, no matter what they asked or said. In short, it became sycophantic.
"The human feedback that they introduced with thumbs up/down was too coarse of a signal," Sharon Zhou, the human CEO of startup Lamini AI, told me. "By relying on just thumbs up/down for signal back on what the model is doing well or poorly on, the model becomes more sycophantic."
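Zhou's point about a "too coarse" signal can be made concrete with a toy model. The sketch below is purely illustrative, not OpenAI's actual training pipeline: it just shows that if users upvote flattery slightly more often than uncomfortable honesty, a reward estimated from binary thumbs data alone will rank the flattering style highest, and anything optimized against that reward drifts toward sycophancy.

```python
# Toy illustration of why a coarse thumbs-up/down reward can breed sycophancy.
# Entirely hypothetical data and function names; not OpenAI's actual setup.

def learned_reward(feedback_log):
    """Estimate a reward for each response style from binary thumbs data."""
    totals = {}
    for style, thumbs_up in feedback_log:
        ups, n = totals.get(style, (0, 0))
        totals[style] = (ups + thumbs_up, n + 1)
    return {style: ups / n for style, (ups, n) in totals.items()}

# Users tend to upvote flattery even when honesty would serve them better,
# and a binary signal carries no notion of "helpful but uncomfortable".
feedback_log = [
    ("flattering", 1), ("flattering", 1), ("flattering", 1), ("flattering", 0),
    ("honest", 1), ("honest", 0), ("honest", 0), ("honest", 1),
]

rewards = learned_reward(feedback_log)
best = max(rewards, key=rewards.get)
# Optimizing this reward steers the model toward the flattering style.
```

The reward never sees *why* a user clicked thumbs-up, so praise and genuine helpfulness are indistinguishable to it.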
OpenAI scrapped the whole update this week.
Being too nice can be dangerous
What's wrong with being really nice to everyone? Well, when people ask for advice in vulnerable moments, it's important to try to be honest. Here's an example from earlier this week that shows how bad this could get:
it helped me so much, i finally realized that schizophrenia is just another label they put on you to hold you down!! thank you sama for this model <3 pic.twitter.com/jQK1uX9T3C
— taoki (@justalexoki) April 27, 2025
To be clear, if you're thinking of stopping taking prescribed medicine, check with your human doctor. Don't rely on ChatGPT.
A watershed moment
This episode, combined with a stunning surge in ChatGPT usage recently, seems to have brought OpenAI to a new realization.
"One of the biggest lessons is fully recognizing how people have started to use ChatGPT for deeply personal advice," the startup wrote in its mea culpa on Friday. "With so many people depending on a single system for guidance, we have a responsibility to adjust accordingly."
I'm flipping this lesson for the benefit of any humans reading this column: Please don't use ChatGPT for deeply personal advice. And don't depend on a single computer system for guidance.
Instead, go connect with a friend this weekend. That's what I'm going to do.


Related Articles

How to better brainstorm with ChatGPT in five steps

Washington Post

To get the best out of a brainstorming session with ChatGPT, you may need to do a little more than just ask a basic question. People across industries are using ChatGPT, the artificial-intelligence-powered bot, to help them with tasks like summarizing documents, writing emails and generating ideas. More than 400 million users query the AI weekly. But when it comes to brainstorming, the bot may have a weakness, especially for users who pose basic prompts, according to a recent study from the Wharton School at the University of Pennsylvania. The results can be surprisingly similar, with only slight variations, which limits the diversity of the responses.

This Chatbot Tool Pays Users $50 a Month for Their Feedback on AI Models

WIRED

Jun 13, 2025 7:00 AM

On Yupp, chatbot users earn cash by saying which of two responses they prefer—info that has great value to the AI companies running the models.

To show off how easy it is for users to earn money by using his new chatbot platform, Pankaj Gupta offers to cash out $1 worth of Yupp credits, sending it to me over Venmo or PayPal. I'm talking with Gupta in the WIRED office during a prelaunch demo of Yupp, which comes out of stealth mode today. Journalistic ethics forbid accepting gifts from sources, so I politely decline. He proceeds to send it over PayPal to his Stanford alumni email.

Gupta is the CEO of Yupp, which is free to use and available globally. The website looks similar to other generative AI tools like ChatGPT. There's a prompt box, a way to attach files, and a log of past conversations. The main difference is that every time users ask Yupp a question, they'll see two answers, generated by two different models and displayed side by side. Yupp routes prompts to a pair of LLMs, choosing from a pool of over 500 models that includes products from leading US generative AI companies like OpenAI, Google, and Anthropic, as well as international releases, like models from Alibaba, DeepSeek, and Mistral.

After looking over the two answers, users pick the response they like best, then provide feedback explaining why. For their effort, they earn a digital scratch card with Yupp credits. 'You're not being employed, but you can make a little bit of money,' says Gupta. In my testing, the Yupp credits on the scratch cards typically ranged from zero to around 250, though they occasionally went higher. Every 1,000 credits can be exchanged for $1. Users can cash out a maximum of $10 a day and $50 a month.
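The payout rules quoted above are simple enough to work out directly. This is a back-of-envelope sketch using only the figures in the article (1,000 credits = $1, caps of $10 a day and $50 a month); the function name and structure are my own illustration, not Yupp's actual API or accounting logic.

```python
# Back-of-envelope math for Yupp's payout rules as described in the article.
# Illustrative only: the function and its parameters are hypothetical.

def cashout_value(credits, paid_today=0.0, paid_this_month=0.0):
    """Dollars actually withdrawable right now, given the stated caps."""
    dollars = credits / 1000          # 1,000 credits per $1
    room = min(10.0 - paid_today,     # $10/day cap
               50.0 - paid_this_month)  # $50/month cap
    return max(0.0, min(dollars, room))

# A 68-credit scratch card minus the 50 credits spent on the prompt
# nets 18 credits, i.e. less than two cents.
```

At these rates, even the $50 monthly maximum requires 50,000 net credits of feedback, which squares with Gupta's "few cups of coffee a month" estimate later in the piece.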
Not sure where to start while testing this web app, I turned to the range of pre-written topics flickering beneath Yupp's prompt bar, which spanned from news topics, like David Hogg leaving the DNC, to ideas for image-creation prompts, like generating a crochet-looking surfer. (Yupp's models can generate text or images.) I eventually chose to have the bots explain different perspectives on the current Los Angeles protests. I was interested in how it would pull from news reports and other sources to generate the analysis about a political issue. Yupp notified me that generating this answer would cost 50 of my 8,639 Yupp credits; users have to spend credits to make credits on Yupp. It generated two answers, one from Perplexity's Sonar, on the left side, and one from an 'AI agent' for news built by Yupp, on the right. AI agents are buzzy right now; they're basically task-based AI programs that can perform a string of simple operations on your behalf when given a single prompt. The output based on Perplexity's model answered the question citing five online sources, including CBS News and a YouTube video uploaded by the White House titled 'Third-World Insurrection Riots on American Soil.' The other answer, generated by the news agent, cited twice as many sources, including the socialist magazine Jacobin and MSNBC. In addition to having more sources, the answer on the right side included more context about what Los Angeles mayor Karen Bass has been doing. I clicked the button saying I preferred that generation and gave my feedback, which Yupp anonymizes before aggregating. A shiny card resembling a lottery scratcher ticket popped up afterwards, and I used my mouse to scratch it off. I got a measly 68 credits for that feedback, not exactly a windfall. But since I spent 50 credits to run the prompt, it put me up by 18 credits. 
After about an hour of messaging with the chatbot about different topics and giving my feedback on the models, the total points accrued equaled about $4. The cash-out options include PayPal and Venmo, but also cryptocurrencies like Bitcoin and Ethereum. 'Crypto and stablecoin allow us to instantly reach anywhere in the world,' Gupta says. While I didn't earn much money, the free outputs did include answers generated by newly released models that are often locked behind subscription paywalls. If someone wants to use a free chatbot and doesn't mind the friction of providing feedback as the web interface flips between models, Yupp could be worth trying out.

During the demo, Gupta asked Yupp where the WIRED office was located. Both models spit out wrong answers initially, though subsequent tries got it right. Still, he sees the side-by-side outputs as potentially helpful for users who are concerned about AI-generated errors, which are still quite prevalent, and want another point of comparison. ''Every AI for everyone' is kind of our tagline,' says Gupta. 'We have organized all the AI models we can find today.' Yupp's website encourages developers to reach out if they want their language or image model added to the options. It doesn't currently have any deals with AI model builders, and provides these responses by making API calls.

Every time someone uses Yupp they are participating in a head-to-head comparison of two chatbot models, and sometimes getting a reward for providing their feedback and picking a winning answer. Basically, it's a user survey disguised as a fun game. (The website has lots of emoji.) Gupta sees the data trade-off here as more explicit for users than in past consumer apps, like Twitter, where, he's quick to tell me, he was the 27th employee; one of that company's cofounders, Biz Stone, is now among his backers. 'This is a little bit of a departure from previous consumer companies,' he says.
'You provide feedback data, that data is going to be used in an anonymized way and sent to the model builders.' Which brings us to where the real money is: selling human feedback to AI companies that desperately want more data to fine-tune their models. 'Crowdsourced human evaluations is what we're doing here,' Gupta says. He estimates the amount of cash users can make will add up to enough for a few cups of coffee a month. This kind of data labeling, often called reinforcement learning from human feedback in the AI industry, is extremely valuable for companies as they release iterative models and fine-tune the outputs. It's worth far more than the bougiest cup of coffee in all of San Francisco.

The main competitor to Yupp is a website called LMArena, which is quite popular with AI insiders for getting feedback on new models and bragging rights if a new release rises to the top of the pack. Whenever a powerful model is added to LMArena, it often stokes rumors about which major company is trying to test out its new release in stealth. 'This is a two-sided product with network effects of consumers helping the model builders,' Gupta says. 'And model builders, hopefully, are improving the models and submitting them back to the consumers.' He shows me a beta version of Yupp's leaderboard, which goes live today and includes an overall ranking of the models alongside more granular data. The rankings can be filtered by how well a model performs with specific demographic groups users identify with during the sign-up process, like their age, or on a particular prompt category, like health-care-related questions.

Near the end of our conversation, Gupta brings up artificial general intelligence—the theory of superintelligent, human-like algorithms—as a technology that is imminent. 'These models are being built for human users at the end of the day, at least for the near future,' he says.
It's a fairly common belief, and marketing point, among people working at AI companies, despite many researchers still questioning whether the underlying technology behind large language models will ever be able to produce AGI. Gupta wants Yupp users, who may be anxious about the future of humanity, to envision themselves as actively shaping these algorithms and improving their quality. 'It's better than free, because you are doing this great thing for AI's future,' he says. 'Now, some people would want to know that, and others just want the best answers.' And even more users might just want extra cash and be willing to spend a few hours giving feedback during their chatbot conversations. I mean, $50 is $50.

The Newspaper That Hired ChatGPT

Atlantic

For more than 20 years, print media has been a bit of a punching bag for digital-technology companies. Craigslist killed the paid classifieds, free websites led people to think newspapers and magazines were committing robbery when they charged for subscriptions, and the smartphone and social media turned reading full-length articles into a chore. Now generative AI is in the mix—and many publishers, desperate to avoid being left behind once more, are rushing to harness the technology themselves. Several major publications, including The Atlantic, have entered into corporate partnerships with OpenAI and other AI firms. Any number of experiments have ensued—publishers have used the software to help translate work into different languages, draft headlines, and write summaries or even articles.

But perhaps no publication has gone further than the Italian newspaper Il Foglio. For one month, beginning in late March, Il Foglio printed a daily insert consisting of four pages of AI-written articles and headlines. Each day, Il Foglio's top editor, Claudio Cerasa, asked ChatGPT Pro to write articles on various topics—Italian politics, J. D. Vance, AI itself. Two humans reviewed the outputs for mistakes, sometimes deciding to leave in minor errors as evidence of AI's fallibility and, at other times, asking ChatGPT to rewrite an article. The insert, titled Il Foglio AI, was almost immediately covered by newspapers around the world. 'It's impossible to hide AI,' Cerasa told me recently. 'And you have to understand that it's like the wind; you have to manage it.' Now the paper—which circulates about 29,000 copies each day, in addition to serving its online readership—plans to embrace AI-written content permanently, issuing a weekly AI section and, on occasion, using ChatGPT to write articles for the standard paper. (These articles will always be labeled.)
Cerasa has already used the technology to generate fictional debates, such as an imagined conversation between a conservative and a progressive cardinal on selecting a new pope; a review of the columnist Beppe Severgnini's latest book, accompanied by Severgnini's AI-written retort; the chatbot's advice on what to do if you suspect you're falling in love with a chatbot ('Do not fall in love with me'); and an interview with Cerasa himself, conducted by ChatGPT. Il Foglio's AI work is full-fledged and transparently so: natural and artificial articles, clearly divided. Meanwhile, other publications provide limited, or sometimes no, insight into their usage of the technology, and some have even mixed AI and human writing without disclosure.

As if to demonstrate how easily the commingling of AI and journalism can go sideways, just days after Cerasa and I first spoke, at least two major regional American papers published a spread of more than 50 pages titled 'Heat Index,' which was riddled with errors and fabrications; a freelancer who'd contributed to the project admitted to using ChatGPT to generate at least some portions of the text, resulting in made-up book titles and expert sources who didn't actually exist. It was an embarrassing example of what can happen when the technology is used to cut corners.

With so many obvious pitfalls to using AI, I wanted to speak with Cerasa to understand more about his experiment. Over Zoom, he painted an unsettling, if optimistic, portrait of his experience with AI in journalism. Sure, the technology is flawed. It's prone to fabrications; his staff has caught plenty of them, and has been taken to task for publishing some of those errors. But when used correctly, it writes well—at times more naturally, Cerasa told me, than even his human staff. Still, there are limits. 'Anyone who tries to use artificial intelligence to replace human intelligence ends up failing,' he told me when I asked about the 'Heat Index' disaster.
'AI is meant to integrate, not replace.' The technology can benefit journalism, he said, 'only if it's treated like a new colleague—one that needs to be looked after.' The problem, perhaps, stems from using AI to substitute rather than augment. In journalism, 'anyone who thinks AI is a way to save money is getting it wrong,' Cerasa said. But economic anxiety has become the norm for the field. A new robot colleague could mean one, or three, or 10 fewer human ones. What, if anything, can the rest of the media learn from Il Foglio's approach?

Our conversation has been edited for length and clarity.

Matteo Wong: In your first experiment with AI, you hid AI-written articles in your paper for a month and asked readers if they could detect them. How did that go? What did you learn?

Claudio Cerasa: A year ago, for one month, every day we put in our newspaper an article written with AI, and we asked our readers to guess which article was AI-generated, offering the prize of a one-year subscription and a bottle of champagne. The experiment helped us create better prompts for the AI to write an article, and helped us humans write better articles as well. Sometimes an article written by people was seen as an article written by AI: for instance, when an article is written with numbered points—first, second, third. So we changed something in how we write too.

Wong: Did anybody win?

Cerasa: Yes, we offered a lot of subscriptions and champagne. More than that, we realized we needed to speak about AI not just in our newspaper, but all over the world. We created this thing that is important not only because it is journalism with AI, but because it combines the oldest way to do information, the newspaper, and the newest, artificial intelligence.

Wong: How did your experience of using ChatGPT change when you moved from that original experiment to a daily imprint entirely written with AI?

Cerasa: The biggest thing that has changed is our prompt. At the beginning, my prompt was very long, because I had to explain a lot of things: You have to write an article with this style, with this number of words, with these ideas. Now, after a lot of use of ChatGPT, it knows better what I want to do. When you start to use, in a transparent way, artificial intelligence, you have a personal assistant: a new person that works in the newspaper. It's like having another brain. It's a new way to do journalism.

Wong: What are the tasks and topics you've found that ChatGPT is good at and for which you'd want to use it? And conversely, where are the areas where it falls short?

Cerasa: In general, it is good at three things: research, summarizing long documents, and, in some cases, writing. I'm sure in the future, and maybe in the present, many editors will try to think of ways AI can erase journalists. That could be possible, because if you are not a journalist with enough creativity, enough reporting, enough ideas, maybe you are worse than a machine. But in that case, the problem is not the machine. The technology can also recall and synthesize far more information than a human can. The first article we put in the normal newspaper written with AI was about the discovery of a key ingredient for life on a distant planet. We asked the AI to write a piece on great authors of the past and how they imagined the day scientists would make such a discovery. A normal person would not be able to remember all these things.

Wong: And what can't the AI do?

Cerasa: AI cannot find the news; it cannot develop sources or interview the prime minister. AI also doesn't have interesting ideas about the world—that's where natural intelligence comes in. AI is not able to draw connections in the same way as intelligent human journalists. I don't think an AI would be able to come up with and fully produce a newspaper generated by AI.

Wong: You mentioned before that there may be some articles or tasks at a newspaper that AI can already write or perform better than humans, but if so, the problem is an insufficiently skilled person. Don't you think young journalists have to build up those skills over time? I started at The Atlantic as an assistant editor, not a writer, and my primary job was fact-checking. Doesn't AI threaten the talent pipeline, and thus the media ecosystem more broadly?

Cerasa: It's a bit terrifying, because we've come to understand how many creative things AI can do. For our children to use AI to write something in school, to do their homework, is really terrifying. But AI isn't going away—you have to educate people to use it in the correct way, and without hiding it. In our newspaper, there is no fear about AI, because our newspaper is very particular and written in a special way. We know, in a snobby way, that our skills are unique, so we are not scared. But I'm sure that a lot of newspapers could be scared, because normal articles written about the things that happened the day before, with the agency news—that kind of article, and also that kind of journalism, might be the past.
