'This is coming for everyone': A new kind of AI bot takes over the web
People are replacing Google search with artificial intelligence tools like ChatGPT, a major shift that has unleashed a new kind of bot on the web.
To offer users a tidy AI summary instead of Google's '10 blue links,' companies such as OpenAI and Anthropic have started sending out bots to retrieve and recap content in real time. They are scraping webpages and loading relevant content into the AI's memory and 'reading' far more content than a human ever would.
According to data shared exclusively with The Washington Post, traffic from retrieval bots grew 49 percent in the first quarter of 2025 from the fourth quarter of 2024. The data is from TollBit, a New York-based start-up that helps news publishers monitor when AI companies use their content and make money from that use.
TollBit's report, based on data from 266 websites - half of which are run by national and local news organizations - suggests that the growth of bots that retrieve information when a user prompts an AI model is on an exponential curve.
'It starts with publishers, but this is coming for everyone,' Toshit Panigrahi, CEO and co-founder of TollBit, said in an interview.
Panigrahi said that this kind of bot traffic, which can be hard for websites to detect, reflects growing demand for content, even as AI tools devastate traffic to news sites and other online platforms. 'Human eyeballs to your site decreased. But the net amount of content access, we believe, fundamentally is going to explode,' he said.
A spokesperson for OpenAI said that referral traffic to publishers from ChatGPT searches may be lower in quantity but that it reflects a stronger user intent compared with casual web browsing.
To capitalize on this shift, websites will need to reorient themselves to AI visitors rather than human ones, Panigrahi said. But he also acknowledged that squeezing payment for content when AI companies argue that scraping online data is fair use will be an uphill climb, especially as leading players make their newest AI visitors even harder to identify.
Debate around the AI industry's use of online content has centered on the gargantuan amounts of text needed to train the AI models that power tools like ChatGPT. To obtain that data, tech companies use bots that scrape the open web for free, which has led to a raft of lawsuits alleging copyright theft from book authors and media companies, including a New York Times lawsuit against OpenAI. Other news publishers have opted for licensing deals. (In April, The Washington Post inked a deal with OpenAI.)
In the past eight months, as chatbots have evolved to incorporate features like web search and 'reasoning' to answer more complex queries, traffic for retrieval bots has skyrocketed. It grew 2.5 times as fast as traffic for bots that scrape data for training between the fourth quarter of 2024 and the first quarter of 2025, according to TollBit's report.
Panigrahi said TollBit's data may underestimate the magnitude of this change because it doesn't reflect bots that AI companies send out on behalf of AI 'agents' that can complete tasks on a user's behalf, like ordering takeout from DoorDash.
The start-up's findings also add a new dimension to mounting evidence that the modern internet - optimized for Google search results and social media algorithms - will have to be restructured as the popularity of AI answers grows.
'To think of it as, "Well, I'm optimizing my search for humans" is missing out on a big opportunity,' he said.
Installing TollBit's analytics platform is free for news publishers, and the company has more than 2,000 clients, many of which are struggling with these seismic changes, according to data in the report.
Although news publishers and other websites can implement blockers to prevent various AI bots from scraping their content, TollBit found that more than 26 million AI scrapes bypassed those blockers in March alone. Some AI companies claim bots for AI agents don't need to follow bot instructions because they are acting on behalf of a user.
Mark Howard, chief operating officer for the media company Time, a TollBit client, said the start-up's traffic data has helped Time negotiate content licensing deals with AI companies including OpenAI and the search engine Perplexity.
But the market to fairly compensate publishers is far from established, Howard said. 'The vast majority of the AI bots out there absolutely are not sourcing the content through any kind of paid mechanism. … There is a very, very long way to go.'