A new, 'diabolical' way to thwart Big Tech's data-sucking AI bots: Feed them gibberish

• Bots now generate more internet traffic than humans, according to cybersecurity firm Thales.
• This is being driven by web crawlers from tech giants that harvest data for AI model training.
• Cloudflare's AI Labyrinth misleads and exhausts bots with fake content.
A data point caught my eye recently. Bots generate more internet traffic to websites than humans now, according to cybersecurity company Thales.
This is being driven by a swarm of web crawlers unleashed by Big Tech companies and AI labs, including Google, OpenAI, and Anthropic, that slurp up copyrighted content for free.
I've warned about these automated scrapers before. They're increasingly sophisticated and persistent in their quest to harvest information to feed the insatiable demand for AI training datasets. Not only do these bots take data without permission or payment, but they're also causing traffic surges in some parts of the internet, increasing costs for website owners and content creators.
Thankfully, there's a new way to thwart this bot swarm. If you're struggling to block them entirely, you can send them down digital rabbit holes where they ingest garbage content instead. One software developer recently called this "diabolical" — in a good way.
Absolutely diabolical Cloudflare feature. love to see it pic.twitter.com/k8WX2PdANN
— hibakod (@hibakod) April 25, 2025
It's called AI Labyrinth, and it's a tool from Cloudflare. Described as a "new mitigation approach," AI Labyrinth uses generative AI not to inform, but to mislead. When Cloudflare detects unauthorized activity, typically from bots ignoring "no crawl" directives, it deploys a trap: a maze of convincingly real but irrelevant AI-generated content designed to waste bots' time and chew through AI companies' computing power.
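For context, the "no crawl" directives in question live in a site's robots.txt file. A minimal example that asks the major AI crawlers to stay out entirely might look like this (the user-agent tokens below are the ones the crawler operators publish; AI Labyrinth is aimed at bots that fetch pages despite them):

    User-agent: GPTBot
    Disallow: /

    User-agent: ClaudeBot
    Disallow: /

    User-agent: Google-Extended
    Disallow: /

    User-agent: CCBot
    Disallow: /

Compliance with robots.txt has always been voluntary, which is precisely the gap AI Labyrinth targets: a crawler that honors the file never encounters the maze.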
Cloudflare pledged in a recent announcement that this is only the first iteration of using generative AI to thwart bots.
Digital gibberish
Unlike traditional honeypots, AI Labyrinth creates entire networks of linked pages invisible to humans but highly attractive to bots. These decoy pages don't affect search engine optimization and aren't indexed by search engines. They are specifically tailored to bots, which get ensnared in a meaningless loop of digital gibberish.
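Cloudflare hasn't published how AI Labyrinth is built, but the basic shape of such a trap is easy to sketch. Below is a minimal, hypothetical Python (Flask) illustration, not Cloudflare's implementation: every decoy URL serves plausible-looking filler plus links that lead only deeper into the maze, tagged noindex/nofollow so well-behaved search engines never index it.

    # Hypothetical sketch of a decoy maze in the spirit of AI Labyrinth;
    # an illustration of the concept, not Cloudflare's code.
    import hashlib

    from flask import Flask

    app = Flask(__name__)

    # A real deployment would draw from a large pool of AI-generated
    # filler; three canned paragraphs keep this sketch self-contained.
    FILLER = [
        "<p>The migratory habits of deep-sea sponges remain poorly charted.</p>",
        "<p>Early telegraph operators often tuned their keys by candlelight.</p>",
        "<p>Basalt columns form as thick lava flows cool slowly and contract.</p>",
    ]

    def child_token(token: str, i: int) -> str:
        # Derive child URLs deterministically so repeat crawls see a stable maze.
        return hashlib.sha256(f"{token}/{i}".encode()).hexdigest()[:12]

    @app.route("/maze/<token>")
    def maze_page(token: str):
        # Pick filler deterministically from the URL token.
        body = FILLER[hashlib.sha256(token.encode()).digest()[0] % len(FILLER)]
        # Every page links only deeper into the maze; nothing on the real
        # site links here, so humans never stumble into these URLs.
        links = " ".join(
            f'<a href="/maze/{child_token(token, i)}">read more</a>'
            for i in range(2)
        )
        # noindex/nofollow keeps compliant search engines out entirely.
        return (
            "<html><head><meta name='robots' content='noindex, nofollow'></head>"
            f"<body>{body} {links}</body></html>"
        )

In a setup like this, links into the maze would only be injected into responses served to clients already flagged as suspicious, which is the detection step Cloudflare describes.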
When bots follow the maze deeper, they inadvertently reveal their behavior, allowing Cloudflare to fingerprint and catalog them. These data points feed directly into Cloudflare's evolving machine learning models, strengthening future detection for customers.
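That fingerprinting step can be sketched the same way. Because no human and no compliant crawler should ever reach a maze URL, every hit is effectively a labeled bot sample. A hypothetical feature extractor (all names here are invented for illustration) might summarize each client like this:

    # Hypothetical illustration of the fingerprinting idea; the field and
    # function names are invented, not Cloudflare's.
    from collections import Counter
    from dataclasses import dataclass

    @dataclass
    class MazeHit:
        user_agent: str
        depth: int                  # levels descended into the maze
        seconds_since_first: float  # time since the client's first hit

    def fingerprint(hits: list[MazeHit]) -> dict:
        # Summarize one client's maze behavior for a bot-detection model.
        duration = max(h.seconds_since_first for h in hits) or 1.0
        return {
            "max_depth": max(h.depth for h in hits),
            "requests_per_second": len(hits) / duration,
            "distinct_user_agents": len(Counter(h.user_agent for h in hits)),
            "reached_maze": True,  # true by construction: only bots get here
        }

Features along these lines are the kind of signal that could feed the machine learning models the company describes.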
Will Allen, VP of Product at Cloudflare, told me that more than 800,000 domains have turned on the company's general AI bot-blocking tool. AI Labyrinth is the next weapon to wield when sneaky AI companies get around those blockers.
Cloudflare hasn't released data on how many customers use AI Labyrinth, suggesting adoption is still in its early days. "It's still very new, so we haven't released that particular data point yet," Allen said.
I asked him why AI bots are still so active if most of the internet's data has already been scraped for model training.
"New content," Allen replied. "If I search for 'what are the best restaurants in San Francisco,' showing high-quality content from the past week is much better than information from a year or two prior that might be out of date."
Turning AI against itself
Bots are not just scraping old blog posts; they're hungry for the freshest data to keep AI outputs relevant.
Cloudflare's strategy flips this demand on its head. Instead of serving up valuable new content to unauthorized scrapers, it offers them an endless buffet of synthetic articles, each more irrelevant than the last.
As AI scrapers become more common, innovative defenses like AI Labyrinth are becoming essential. By turning AI against itself, Cloudflare has introduced a clever layer of defense that doesn't just block bad actors but exhausts them.
For web admins, enabling AI Labyrinth is as easy as toggling a switch in the Cloudflare dashboard. It's a small step that could make a big difference in protecting original content from unauthorized exploitation in the age of AI.
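For those managing zones programmatically, Cloudflare also exposes bot-management settings through its v4 API. The sketch below is a guess at what flipping the setting might look like: the /zones/{zone_id}/bot_management endpoint is real, but the "ai_labyrinth" field name is an assumption, so check Cloudflare's current API documentation before relying on it.

    # Hypothetical: enable AI Labyrinth via Cloudflare's v4 API.
    # The bot_management endpoint exists; the exact field name for
    # AI Labyrinth ("ai_labyrinth" below) is an assumption.
    import requests

    def enable_ai_labyrinth(zone_id: str, api_token: str) -> None:
        resp = requests.put(
            f"https://api.cloudflare.com/client/v4/zones/{zone_id}/bot_management",
            headers={"Authorization": f"Bearer {api_token}"},
            json={"ai_labyrinth": True},  # assumed field name
        )
        resp.raise_for_status()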


Related Articles

Meta signs deal with nuclear plant to power AI and datacenters for 20 years (Yahoo)

Meta on Tuesday said it had struck an agreement to keep one nuclear reactor of a US utility company in Illinois operating for 20 years. Meta's deal with Constellation Energy is the social networking company's first with a nuclear power plant.

Other large tech companies are looking to secure electricity as US power demand rises significantly, in part due to the needs of artificial intelligence and datacenters. Google has reached agreements to supply its datacenters with nuclear power via a half-dozen small reactors built by a California company. Microsoft's similar contract will restart the Three Mile Island nuclear plant, the site of the most serious nuclear accident and radiation leak in US history.

Illinois helps subsidize Constellation Energy's nuclear plant, the Clinton Clean Energy Center, with a ratepayer-funded zero-emissions credit program that awards benefits for the generation of power virtually free of carbon emissions. That program expires in 2027, when Meta's power purchase agreement will begin supporting the plant with an unspecified amount of money to help with relicensing and operations.

The deal allows Constellation to expand Clinton, which has a capacity of 1,121 megawatts, by 30MW. The plant powers the equivalent of about 800,000 US homes. Clinton began operating in 1987, and last year Constellation applied to the US Nuclear Regulatory Commission to renew its license through 2047.

The deal could serve as a model for other big tech companies to support existing nuclear plants while they also plan to power datacenters with new nuclear and other energy sources. Urvi Parekh, head of global energy at Meta, said: 'One of the things that we hear very acutely from utilities is they want to have certainty that power plants operating today will continue to operate.'

Joe Dominguez, CEO of Constellation, said: 'We're definitely having conversations with other clients, not just in Illinois, but really across the country, to step in and do what Meta has done, which is essentially give us a backstop so that we could make the investments needed to relicense these assets and keep them operating.'

Bobby Wendell, an official at a unit of the International Brotherhood of Electrical Workers, said the agreement will deliver a 'stable work environment' for workers at the plant.

Report: Meta taps Scale AI's Alexandr Wang to join new 'superintelligence' lab (TechCrunch)

Meta plans to unveil a new AI research lab dedicated to 'superintelligence' as the company works to compete in the AI race, according to several reports. Meta has tapped Scale AI founder and CEO Alexandr Wang to join the new lab, The New York Times reports. Meta has been in talks to invest billions into Scale AI as part of a deal that would bring Scale AI employees to Meta. Meta has also been poaching lead researchers from OpenAI and Google, per the Times.

The new lab comes as Meta CEO Mark Zuckerberg grows frustrated with his company's AI shortfalls. Bloomberg reports he has been meeting with AI researchers and engineers at his homes in Lake Tahoe and Palo Alto to personally recruit a team of around 50 people, including a new head of AI research. Sources told Bloomberg that Zuckerberg believes Meta can and should outpace other tech companies gunning to achieve AGI, the still-undefined idea that AI systems could exceed human performance in many tasks. Meta AI last month reached 1 billion monthly active users.

OpenAI and Anthropic are getting cozy with government. What could possibly go wrong? (Fast Company)

While the world and private enterprise are adopting AI rapidly in their workflows, government isn't far behind. The U.K. government has said early trials of AI-powered productivity tools can shave two weeks of labor off a year's work, and AI companies are adapting to that need. More than 1,700 AI use cases have been recorded in the U.S. government, long before Elon Musk's DOGE entered the equation and accelerated AI adoption throughout the public sector. Federal policies introduced in April on AI adoption and procurement have pushed this trend further.

It's unsurprising that big tech companies are rolling out their own specialist models to meet that demand. Anthropic, the maker of the Claude chatbot, announced last week a series of models tailored for use by government employees. These include features such as the ability to handle classified materials and understand some of the bureaucratic language that plagues official documents. Anthropic has said its models are already deployed by agencies 'at the highest level of U.S. national security, and access to these models is limited to those who operate in such classified environments.' The announcement follows a similar one by OpenAI, the maker of ChatGPT, which released its own government-tailored AI models in January to 'streamline government agencies' access to OpenAI's frontier models.'

But AI experts worry about governments becoming overly reliant on AI models, which can hallucinate information, inherit biases that discriminate against certain groups at scale, or steer policy in misguided directions. They also express concern over governments being locked into specific providers, who may later increase prices that taxpayers would be left to fund.

'I worry about governments using this kind of technology and relying on tech companies, and in particular, tech companies who have proven to be quite untrustworthy,' says Carissa Véliz, who researches AI ethics at the University of Oxford. She points out that the generative AI revolution so far, sparked by the November 2022 release of ChatGPT, has seen governments scrambling to retrofit rules and regulations in areas such as copyright to accommodate tech companies after they've bent those rules.

'It just shows a power relationship there that doesn't look good for government,' says Véliz. 'Government is supposed to be the legislator, the one making the rules and enforcing the rules.'

Beyond those moral concerns, she also worries about the financial stakes involved. 'There's just a sheer dependency on a company that has financial interests, that is based in a different country, in a situation in which geopolitics is getting quite complicated,' says Véliz, explaining why countries outside the United States might hesitate to sign on to use ClaudeGov or ChatGPT Gov. It's the same argument the U.S. uses about overreliance on TikTok, which has Chinese ties, amid fears that figures like Donald Trump could pressure U.S.-based firms to act in politically motivated ways.

OpenAI didn't respond to Fast Company's request for comment. A spokesperson for Anthropic says the company is committed to transparency, citing published work on model risks, a detailed system card, and collaborations with the U.S. and U.K. governments to test AI systems.

Some fear that AI companies are securing 'those big DoD bucks,' as programmer Ashe Dryden put it on Mastodon, and could perpetuate that revenue by fostering dependency on their specific models.

The rollout of these models reflects broader shifts in the tech landscape that increasingly tie government, national security, and technology together. For example, defense tech firm Anduril recently raised $5 billion in a new funding round that values the company at over $30 billion. Others have argued that the release of these government-specific models by AI companies 'isn't [about] national security. This is narrative laundering,' as one LinkedIn commenter put it. The idea is that these moves echo the norms already set by big government rather than challenging them, potentially reinforcing existing issues.

'I've always been a sceptic of a single supplier for IT services, and this is no exception,' says Andres Guadamuz, an AI researcher at the University of Sussex. Guadamuz believes the development of government-specific AI models is still in its early phase, and urges decision-makers to pause before signing deals. 'Governments should keep their options open,' he says. 'Particularly with a crowded AI market, large entities such as the government can have a better negotiating position.'
