Latest news with #BrowserBase

Cloudflare Accuses Perplexity of Bypassing Bot Blocks; AI Firm Denies Claims

Hans India

3 days ago

Business

A fresh controversy has erupted in the tech world as Cloudflare, a leading internet infrastructure provider, has accused AI startup Perplexity of engaging in deceptive web scraping practices. The company alleges that Perplexity accessed content from websites that had explicitly restricted such activity, drawing attention once again to the murky boundaries of AI, data access, and internet ethics.

In a blog post published Monday, Cloudflare claimed it detected Perplexity scraping content from sites that had added rules in their robots.txt files to block such bots. According to Cloudflare, the AI firm circumvented these blocks by disguising its crawler's identity, using tactics such as changing its user-agent strings and using multiple IP addresses to evade detection. 'This activity was observed across tens of thousands of domains and millions of requests per day,' the blog post stated.

Cloudflare said it relied on a combination of machine learning tools and traffic analysis to pinpoint Perplexity as the source of the behavior. It added that some of the requests impersonated legitimate browsers, including Google Chrome on macOS. Cloudflare said the scraping came to its attention after several of its clients reported suspicious traffic coming from Perplexity, despite efforts to block it. In response, Cloudflare has removed Perplexity's bots from its list of verified crawlers and introduced additional measures to prevent similar activity in the future.

Perplexity has strongly denied the accusations, pushing back in a detailed rebuttal. The AI startup dismissed the claims as a 'sales pitch,' arguing that Cloudflare's blog post reflects a fundamental misunderstanding of how AI assistants function. 'When Perplexity fetches a webpage, it's because a user asked a specific question,' the company stated. It emphasized that its AI platform does not engage in traditional web crawling or mass data harvesting. Instead, it claimed its system only retrieves real-time information when prompted by user queries and does not store or use that content to train its AI models.

Further defending itself, Perplexity said that Cloudflare had wrongly attributed some of the automated traffic to its systems. It pointed to a third-party service, BrowserBase, suggesting that only a minor portion of the requests in question originated from there. 'This is a basic traffic analysis failure,' Perplexity argued, accusing Cloudflare of presenting misleading data and diagrams.

The dispute comes at a time when the lines between helpful AI tools and unauthorized bots are increasingly blurred. As more AI applications rely on real-time data, concerns are growing among website operators over how their content is accessed and used. While Cloudflare has yet to issue a follow-up to Perplexity's rebuttal, the clash has already fueled broader discussions about ethical web scraping, AI transparency, and the urgent need for standardized guidelines on digital content access. With both companies standing firm on their positions, this incident may become a touchstone case in the ongoing struggle between open web advocates and those demanding tighter content controls in the AI era.

Perplexity accused of bypassing blocks to secretly scrape websites, says Cloudflare

India Today

3 days ago

Business

Internet infrastructure company Cloudflare has accused artificial intelligence startup Perplexity of scraping websites that had explicitly blocked such activity, sparking a fresh debate over the ethics and transparency of AI data collection practices.

In a blog post published on Monday, Cloudflare said it had observed Perplexity circumventing website restrictions by disguising its identity. The company claimed Perplexity's crawler was 'ignoring web standards' and 'hiding its identity' through methods such as changing its user agent, the string that identifies a browser to a website, and using multiple network addresses to bypass blocks. 'This activity was observed across tens of thousands of domains and millions of requests per day,' Cloudflare said.

The infrastructure giant said it had used machine learning and network signals to confirm the source of the behaviour, which included impersonating legitimate browsers such as Google Chrome on macOS. According to Cloudflare, the behaviour first came to light after multiple customers complained that Perplexity was crawling their websites despite having added specific rules in their robots.txt files to block such activity. The company has since removed Perplexity's bots from its verified list and implemented new measures to block them.

Perplexity, however, has pushed back against the claims, calling them a 'sales pitch' and arguing that Cloudflare fundamentally misunderstands how modern AI assistants work. In a detailed rebuttal, Perplexity argued that it does not engage in traditional web crawling to build massive data sets. Instead, it said, its platform uses 'user-driven agents' that fetch content only when a person asks a question requiring real-time information. 'When Perplexity fetches a webpage, it's because a user asked a specific question,' the company explained, insisting that the fetched data is not stored or used for training AI models.

Perplexity also accused Cloudflare of misattributing automated traffic from a third-party service, BrowserBase, to its own systems. It claimed only a small fraction of traffic came from this service and denied using it for general web scraping. 'This is a basic traffic analysis failure,' Perplexity said, accusing Cloudflare of publishing inaccurate diagrams and misleading data.

The dispute highlights growing tensions between AI companies that rely on open web data and website operators seeking to control how their content is used. As AI tools become more sophisticated and reliant on real-time information, the line between helpful digital assistant and unwanted bot has become increasingly blurred. Cloudflare has not responded to Perplexity's latest statements, and the controversy is likely to fuel further discussion on the need for clearer guidelines around AI data access and ethical web scraping.
