Scale AI left thousands of confidential AI training docs for Google, Meta, xAI publicly accessible: Report

Scale AI, the data-labelling startup at the heart of some of the biggest artificial intelligence projects in the world, has reportedly left sensitive documents and contractor data accessible to the public through unsecured Google Docs links, according to an investigation by Business Insider. The revelation comes at a time when the company is already under intense scrutiny following Meta's recent $14.3 billion investment for a 49 per cent stake. While Scale AI maintains that it operates independently, clients like Google and OpenAI are reportedly reconsidering their ties to the company amid increasing security concerns.

The Business Insider report found that at least 85 Google Docs containing thousands of pages of sensitive content were publicly accessible to anyone who had the link. These documents reportedly included confidential instructions and internal assessments for AI training projects involving major tech firms like Google, Meta, and Elon Musk's xAI. Scale AI, which helps label and process training data for AI models, has come under fire not only for exposing project documents but also for leaving the personal data of its contractors public. Spreadsheets reviewed by Business Insider showed private email addresses, pay disputes, and even lists categorising workers by performance, including some accused of 'cheating'.
Some of the exposed files contain details about Google's efforts to refine its Bard (now Gemini) chatbot using data from OpenAI's ChatGPT, while others include feedback on Bard's weaknesses, such as struggling with complex queries. The documents were reportedly labelled 'confidential' by Google but remained open for public access.

Among the leaked documents was reportedly a confidential xAI project sent to Scale AI contractors, called Project Xylophone. It reportedly involved 700 prompts, which were meant to help train conversational abilities in scenarios ranging from casual banter to a zombie apocalypse. Meta, too, apparently had multiple 'confidential' documents left accessible, including training audio files to help improve the emotional tone and safety of its chatbot responses.

One file titled 'Good and Bad Folks' openly tagged dozens of contractors as low quality or flagged them for suspicious behaviour. Another document titled 'move all cheating taskers' listed hundreds of names and emails flagged for misconduct. Shockingly, some of these sheets were editable, meaning anyone with the URL could add or change entries.

'We are conducting a thorough investigation and have disabled any user's ability to publicly share documents from Scale-managed systems,' a Scale AI spokesperson told Business Insider. The company said it remains committed to customer trust and has robust technical and policy safeguards in place to protect data.

Some current and former contractors also told Business Insider that this system of using shared Google Docs was common at Scale and helped speed up operations across its army of freelance workers. Some workers claimed they still had access to old projects even after being reassigned, with documents continuing to receive updates from clients. Some also described the system as 'incredibly janky.'

Google launches Gemini model for robots to run without internet connectivity

Time of India

24 minutes ago

Google has launched a new Gemini model that is specifically designed to run on robots without requiring internet connectivity. The tech giant describes the "Gemini Robotics On-Device" model as an efficient, on-device robotics model offering 'general-purpose dexterity' and faster task adaptation. This new version builds on the Gemini Robotics VLA (vision language action) model, which was introduced in March and brought Gemini 2.0's multimodal reasoning and real-world understanding to physical applications. By operating independently of a data network, the on-device model will support latency-sensitive applications and ensure robustness in environments with unreliable or no connectivity.

Google is also providing a Gemini Robotics SDK to assist developers. This SDK will allow them to evaluate Gemini Robotics On-Device for their specific tasks and environments, test the model within Google's MuJoCo physics simulator, and adapt it to new domains with a limited number of demonstrations (as few as 50 to 100). Developers can gain access to the SDK by signing up for Google's trusted tester program.

Google's new Gemini model for robots: Capabilities and performance

Google claims that Gemini Robotics On-Device is a lightweight robotics foundation model designed for bi-arm robots that enables advanced dexterous manipulation with minimal computational overhead. Built on the capabilities of Gemini Robotics, it supports rapid experimentation, fine-tuning for new tasks, and local low-latency inference. The company also says that the model demonstrates strong generalisation across visual, semantic, and behavioural tasks, effectively following natural language instructions and completing complex actions like unzipping bags or folding clothes, all while operating directly on the robot. In Google's tests, Gemini Robotics On-Device outperforms other on-device models, especially in challenging, out-of-distribution and multi-step scenarios.

It can be fine-tuned with just 50 to 100 demonstrations, making it highly adaptable to new applications. Originally trained on ALOHA robots, the model was successfully adapted to the bi-arm Franka FR3 and the Apollo humanoid robot, completing tasks like folding dresses and belt assembly. This marks the first availability of a VLA model for on-device fine-tuning, offering powerful robotics capabilities without cloud dependency.

Android is currently optimised for…: Why Perplexity AI CEO Aravind Srinivas wants Google to rebuild its operating system

Time of India

an hour ago

Perplexity AI CEO Aravind Srinivas wants Google to rebuild its Android operating system. He noted that Android is more optimised for the tech giant's ad-driven business model than for enabling AI-powered experiences for smartphone users. Srinivas took to the social media platform X (formerly Twitter) to share an opinion that highlights a potential conflict as AI assistants become more common in smartphones. In the post, he questions whether current platforms, particularly those tied to advertising like Android, can evolve into intelligent, agentic systems that primarily serve users, and whether Android's current priorities are aligned with the emerging era of AI agents, which are designed to interact proactively with users.

What Perplexity AI CEO Aravind Srinivas said about Android

In his X post, Srinivas wrote: 'Android needs to be rebuilt for AI. It's currently optimised for preserving Google's ad business rather than a truly agentic OS.' He suggests that to achieve significant advancements in AI-first mobile computing, Google may need to make fundamental changes to the operating system itself, rather than merely adding AI features as layers. This suggestion comes as Perplexity develops Comet, an AI browser that will compete with Google by offering query responses with inline citations.

This criticism comes at a time when Google is under increasing pressure on several fronts. According to a recent report by Bloomberg, Apple executives have internally discussed the possibility of acquiring Perplexity AI, with M&A chief Adrian Perica reportedly raising the idea with senior leaders, including services head Eddy Cue.

Recently, Srinivas also suggested that Google's key weakness lies in its heavy reliance on high-margin search advertising, which remains far more profitable than its other businesses, like YouTube, cloud services, or AI initiatives. At the recently held Sohn Investment Conference, Srinivas explained how the Android-maker is trapped by its success. He noted, 'This is the first time in two decades that Google is extremely vulnerable.'

Google rolls out AI mode search in India, powered by Gemini 2.5

Hans India

2 hours ago

Google has officially launched its AI Mode search experience in India, bringing advanced, natural search capabilities to users through voice, visuals, and text. Initially piloted in the U.S., this experimental feature is now accessible in English via Search Labs—a platform that allows users to try early Google Search features and provide feedback. Once activated, AI Mode introduces a dedicated tab in the Google app and search interface. It leverages the powerful Gemini 2.5 model to enhance reasoning and allow for more complex, multi-step queries. Users can input questions through typing, voice commands, or even images, enabling deeper and more intuitive interactions with Search. This rollout is especially significant in India, where voice and visual search see widespread use. Hema Budaraju, Vice President of Product Management for Search at Google, noted that India leads globally in monthly Google Lens usage. AI Mode is part of Google's mission to make information universally accessible—regardless of how users choose to ask. The feature also enables follow-up questions and provides links to diverse sources, offering multiple perspectives on any topic. This aligns with the growing trend of users seeking not just answers, but richer understanding. Google highlighted that more than 1.5 billion people globally now interact with AI Overviews each month. These AI-generated summaries—featured at the top of search results—have led to a 10% increase in engagement for certain query types in both India and the U.S.