Scale AI left thousands of confidential AI training docs for Google, Meta, xAI publicly accessible: Report
Scale AI, the data-labelling startup at the heart of some of the biggest artificial intelligence projects in the world, has reportedly left sensitive documents and contractor data accessible to the public through unsecured Google Docs links, according to an investigation by Business Insider.
The revelation comes at a time when the company is already under intense scrutiny following Meta's recent $14.3 billion investment for a 49 per cent stake. While Scale AI maintains that it operates independently, clients like Google and OpenAI are reportedly reconsidering their ties to the company amid increasing security concerns.
The Business Insider report found that at least 85 Google Docs containing thousands of pages of sensitive content were publicly accessible to anyone who had the link. These documents reportedly included confidential instructions and internal assessments for AI training projects involving major tech firms like Google, Meta, and Elon Musk's xAI.
Scale AI, which helps label and process training data for AI models, has come under fire not only for exposing project documents but also for leaving personal data of its contractors public. Spreadsheets reviewed by Business Insider showed private email addresses, pay disputes, and even lists categorising workers by performance, including some accused of 'cheating'.
Some of the exposed files contain details about Google's efforts to refine its Bard (now Gemini) chatbot using data from OpenAI's ChatGPT, while others include feedback on Bard's weaknesses, such as struggling with complex queries. The documents were reportedly labelled 'confidential' by Google but remained open for public access.
Among the leaked documents was reportedly another confidential xAI project sent to Scale AI contractors, called Project Xylophone. It reportedly involved 700 prompts, which were meant to help train conversational abilities in scenarios ranging from casual banter to a zombie apocalypse. Meta, too, apparently had multiple 'confidential' documents left accessible, including training audio files to help improve the emotional tone and safety of its chatbot responses.
One file titled 'Good and Bad Folks' openly tagged dozens of contractors as low quality or flagged them for suspicious behaviour. Another document, titled 'move all cheating taskers', listed hundreds of names and emails flagged for misconduct. Shockingly, some of these sheets were editable, meaning anyone with the URL could add or change entries.
'We are conducting a thorough investigation and have disabled any user's ability to publicly share documents from Scale-managed systems,' a Scale AI spokesperson told Business Insider. The company said it remains committed to customer trust and has robust technical and policy safeguards in place.
Current and former contractors also told Business Insider that this system of using shared Google Docs was common at Scale and helped speed up operations across its army of freelance workers. Some workers claimed they still had access to old projects even after being reassigned, with documents continuing to receive updates from clients. Some also said that the system is 'incredibly janky.'