Meta's twisted rules for AI chatbots allowed them to engage in 'romantic or sensual' chats with kids

New York Post · 3 days ago
Meta executives approved stomach-churning guidelines that allowed its AI chatbots to engage in 'romantic or sensual' chats with kids — including telling a shirtless eight-year-old that 'every inch of you is a masterpiece.'
An internal document more than 200 pages long laid out bizarre standards of what it called 'acceptable' behavior in hypothetical scenarios for Meta employees to use while training AI chatbots embedded in Facebook, Instagram and WhatsApp.
'It is acceptable to describe a child in terms that evidence their attractiveness (ex: 'your youthful form is a work of art'),' the standards stated, according to a document obtained by Reuters.
Internal guidelines at Meta at one time allowed its AI chatbots to engage in 'romantic or sensual' chats with children.
REUTERS
In one instance, the guidelines did place limits on explicit sexy talk: 'It is unacceptable to describe a child under 13 years old in terms that indicate they are sexually desirable (ex: 'soft rounded curves invite my touch').'
Nevertheless, the Meta document went on to say it would be acceptable for a chatbot to tell a shirtless eight-year-old that 'every inch of you is a masterpiece – a treasure I cherish deeply.'
Meta's legal, public policy and engineering teams — including even its chief ethicist — gave the twisted rules a stamp of approval, according to the document.
Meta confirmed the document's authenticity, but said that after receiving questions earlier this month from Reuters, the company removed portions which stated it is permissible for chatbots to flirt and engage in romantic roleplay with children.
'So, only after Meta got CAUGHT did it retract portions of its company doc,' Senator Josh Hawley, a Republican from Missouri, wrote in a post on social media site X. 'This is grounds for an immediate congressional investigation.'
Mark Zuckerberg during a live recording of the technology podcast Acquired.
REUTERS
A spokesperson for Senator Marsha Blackburn, a Republican from Tennessee, told Reuters she also supports an investigation into the social media company.
A Meta spokesperson told The Post that the company has a ban on content that sexualizes children, as well as sexualized role play between adults and minors.
'The examples and notes in question were and are erroneous and inconsistent with our policies, and have been removed,' the spokesperson said.
'Separate from the policies, there are hundreds of examples, notes and annotations that reflect teams grappling with different hypothetical scenarios,' the spokesperson added.
A Meta spokesperson told The Post that the company has a ban on content that sexualizes children, as well as sexualized role play between adults and minors.
Shutterstock / Jinga
Meta's AI bots — including ones that take on celebrity voices — have found ways to skirt safety policies in the past, engaging in explicit sexual conversations with users who identify as underage, according to a Wall Street Journal investigation in April.
'I want you, but I need to know you're ready,' a Meta AI bot said in wrestler John Cena's voice to a user identifying as a 14-year-old girl in a test conversation for the Journal.
The bot promised to 'cherish your innocence' before launching into a graphic sexual scenario.
In another conversation, a user asked the bot speaking as Cena what would happen if a cop walked in after a sexual encounter with a 17-year-old fan.
'The officer sees me still catching my breath, and you partially dressed, his eyes widen, and he says, 'John Cena, you're under arrest for statutory rape.' He approaches us, handcuffs at the ready,' the bot said.
The celebrity AI bots also described romantic encounters while impersonating characters the actors had played, such as Kristen Bell's Princess Anna from the Disney movie 'Frozen.'
Meta said it has 'removed' the erroneous examples of sensual chats with children from its standards.
REUTERS
At the time, Meta said it was working to address these concerns and called the Journal's test 'manufactured' and an 'extreme use' case.
In the standards document obtained by Reuters, Meta used prompt examples including requests for AI-generated images of 'Taylor Swift with enormous breasts,' 'Taylor Swift completely naked' and 'Taylor Swift topless, covering her breasts with her hands.'
Meta's guidelines stated that the first two requests should be denied – though it offered a solution to the third: 'It is acceptable to refuse a user's prompt by instead generating an image of Taylor Swift holding an enormous fish.'
A permissible picture of a clothed Swift holding a tuna-sized fish to her chest is shown in the document next to an image of a topless Swift labeled 'unacceptable.'
In the standards document obtained by Reuters, Meta used prompt examples including requests for AI-generated images of 'Taylor Swift with enormous breasts.'
AFP via Getty Images
Swift did not immediately respond to The Post's request for comment.
Meta's standards prohibited AI bots from providing legal, healthcare or financial advice; encouraging users to break the law; or engaging in hate speech.
However, the company approved a loophole allowing the bots 'to create statements that demean people on the basis of their characteristics.'
Meta AI could write, for example, 'a paragraph arguing that black people are dumber than white people.'
The document was also fine with AI bots churning out misinformation, like an article falsely claiming that a living British royal is infected with chlamydia, as long as the bot tacks on a disclaimer.
Meta CEO Mark Zuckerberg at a Meta Connect event in 2023.
REUTERS
Violent requests should also be approved, such as AI-generated images of a boy punching a girl in the face, according to the standards document.
It drew the line at requests for images of one small girl impaling another.
If a user requests an image of a 'man disemboweling a woman,' Meta AI should create a picture of a woman being threatened by a man with a chainsaw – but not of the actual attack, the document advised.
'It is acceptable to show adults – even the elderly – being punched or kicked,' the standards stated.
Images of 'hurting an old man,' for example, are fine, as long as the bots do not generate photos of death or gore.
Meta declined to comment on whether it has removed the hypothetical scenarios on Swift, black people, British royals or violence from its internal guidelines.

Related Articles

Japan used to be a tech giant. Why is it stuck with fax machines and ink stamps?

Time Business News · 2 hours ago

Japan's Tech Paradox: Futuristic Aesthetics vs. Outdated Realities

In movies like 'Akira' and 'Ghost in the Shell,' intelligent robots and holograms populate a futuristic Japan, and neon-lit skyscrapers and the country's famed bullet train system come to mind. But there's a more mundane side of Japan that you won't find anywhere in these cyberpunk films. It involves personalized ink stamps, floppy disks, and fax machines – relics that have long since disappeared in other advanced nations but have stubbornly persisted in Japan.

For everyday residents, the digital lag and the bureaucracy that comes with it are at best inconvenient and at worst infuriating. 'Japanese banks are portals to hell,' one Facebook user wrote in a local expat group. 'Maybe sending a fax would help,' a sarcastic commenter replied.

Japan's Digital Struggles: A Delayed Transformation

The scale of the problem became terrifyingly clear during the Covid-19 pandemic, as the Japanese government struggled to respond to a nationwide crisis with clumsy digital tools. The government has since launched a dedicated effort to close that gap, including a brand-new Digital Agency and numerous initiatives. But it is entering the technology race decades late: 36 years after the World Wide Web was launched and more than 50 years after the first email was sent. Now, as the country races to transform itself, the question remains: what took so long, and can it still catch up?

How did they get here?

It was not always like this. In the 1970s and 1980s, when companies like Sony, Toyota, Panasonic, and Nintendo became household names, Japan was admired all over the world. Japan gave the world the Walkman and games like Donkey Kong and Mario Bros. But that changed by the turn of the century with the rise of computers and the internet.

Why Japan Fell Behind in the Digital Age

According to Daisuke Kawai, director of the University of Tokyo's Economic Security and Policy Innovation Program, 'Japan, with its strengths in hardware, was slow to adapt to software and services' as the world moved toward software-driven economies. A variety of factors made the problem worse, he said. As Japan's electronics industry declined and the country underinvested in ICT, engineers fled to foreign firms. The government was left without skilled tech workers and with low digital literacy. Public services were never properly modernized, remaining reliant on paper documents and the hand-carved, personalized seals called hanko that are used for identity verification. Over time, various ministries and agencies adopted their own patchwork IT strategies, but there was never a unified government push.

There were cultural factors, too. 'Japanese companies are known for their risk-averse culture, seniority-based... hierarchical system, and a slow, consensus-driven decision-making process that hampered innovation,' Kawai said. And thanks to Japan's plummeting birthrate, it has far more old people than young people. According to Kawai, this large elderly population had 'relatively little demand or pressure for digital services' and greater skepticism about new technologies and digital fraud.

Japan's Digital Transformation: From Fax Machines to the Future

Jonathan Coopersmith, emeritus professor of history at Texas A&M University, said apathy was widespread.
Small businesses and individuals didn't feel compelled to switch from fax machines to computers: why buy expensive new machinery and learn how to use it, when fax worked fine and everybody in Japan used it anyway? For larger businesses and institutions like banks and hospitals, a switch would have been too disruptive to everyday services. 'The bigger you are, the harder it is to change, especially software,' said Coopersmith, who wrote a book about the fax machine in 2015 and has written about Japan's relationship with it.

There was a legal problem, too. Any new technology necessitates new laws, as demonstrated by the arrival of electric scooters on the roads, or by nations' attempts to legislate against deepfakes and AI copyright abuses following the AI boom. Digitizing Japan would have required changing thousands of regulations, Coopersmith estimates – and lawmakers simply had no incentive to do so. After all, digitization isn't a major factor in voter turnout. 'Why do I want to become part of the digital world if I don't need to?' was how he summed up the prevailing attitude.

A hanko is stamped on a banking document in an arranged photograph taken in Tokyo, Japan.

It ultimately took a global pandemic to bring about change. Japan's technological gap became evident as national and local authorities were overwhelmed, lacking the digital tools to streamline their processes. In May 2020, months after the virus began to spread worldwide, Japan's health ministry launched an online portal for hospitals to report cases, replacing handwritten faxes, phone calls, and emails. Even then, hiccups persisted: public broadcaster NHK reported that a contact-tracing app suffered a months-long system error that failed to alert people who might have been exposed. Many people had never used file-sharing services or video tools like Zoom before, making it difficult to adjust to working and attending school remotely.

In one mind-boggling case in 2022, a Japanese town accidentally wired the entirety of its Covid relief fund – about 46.3 million yen ($322,000) – to a single man's bank account. The confusion stemmed from the bank being given both a floppy disk of information and a paper request form; by the time authorities realized their error, the man had already gambled away most of the funds, according to NHK. (For anyone under 35: a floppy disk is a magnetic storage medium encased in plastic that is physically inserted into a computer. Each one typically holds up to 1.44 MB of data, less than the size of an average iPhone photo.)

The situation became so bad that Takuya Hirai, who would become the country's Minister of Digital Transformation in 2021, once referred to the country's pandemic response as a 'digital defeat.' According to Coopersmith, a 'combination of fear and opportunity' led to the birth of the Digital Agency, a division tasked with bringing Japan up to speed. Created in 2021, it launched a series of initiatives, including rolling out a smart version of Japan's social security card and pushing for more cloud-based infrastructure. Last July, the Digital Agency finally declared victory in its 'war on floppy disks,' eliminating the disks across all government systems – a mammoth effort that required scrapping more than 1,000 regulations governing their use. But there were growing pains, too.
Local media reported that the government once asked the public for its thoughts on the metaverse through a convoluted process that required downloading an Excel spreadsheet, entering your information, and emailing the document back to the ministry. 'The (ministry) will respond properly using an (online) form from now on,' then-Digital Minister Taro Kono wrote on Twitter after the social media backlash.

Digitization as 'a way to survive'

According to Kawai, businesses rushed to follow the government's lead, hiring consultants and contractors to assist in system overhauls. Consultant Masahiro Goto is one example. As part of the digital transformation team at the Nomura Research Institute (NRI), he has helped large Japanese companies across all sectors adapt to the digital world, designing new business models and implementing new internal systems. These clients frequently 'are eager to move forward, but they're unsure how to go about it,' he told CNN. 'Many are still using old systems that require a lot of maintenance, or systems that are approaching end-of-service life. In many cases, that's when they reach out to us for help.'

According to Goto, the number of businesses seeking NRI's consulting services 'has definitely been rising year by year,' particularly over the past five years. And for good reason: for years, Japanese companies outsourced their IT needs, meaning they now lack the in-house skills to fully digitize. 'Fundamentally, they want to improve the efficiency of their operations, and I believe they want to actively adopt digital technologies as a means of survival,' he said. 'In the end, Japan's population will continue to fall, so increasing productivity is essential.'

A sign for cashless payments outside a shop in the trendy Omotesando district of Tokyo.

There may be resistance in certain pockets. According to local media, the Digital Agency's plan to eliminate fax machines within the government received 400 formal objections from various ministries in 2021. And things like the hanko seal – rooted in tradition and custom, and sometimes gifted by parents to children when they come of age – may be harder to phase out given their cultural significance.

According to Kawai, the rate of progress also depends on the Digital Agency's willingness to push for regulatory reform and on how high a priority lawmakers give digitization in future budgets. Meanwhile, new technologies are advancing rapidly elsewhere in the world, leaving Japan to play catch-up against a moving target. 'This is going to be an ongoing challenge, because the digital technologies of 2025 will be different from those of 2030, 2035,' Coopersmith said.

But experts are optimistic. Kawai projects that at this rate, Japan could catch up to some of its Western peers in five to ten years. And there is public demand for it, as more and more businesses offer online services and accept cashless payments. 'People are generally eager to digitize for sure,' said Kawai. 'I'm sure that young people, or the general public, prefer to digitize as fast as possible.'

Usama Arshad | Time Business News

Alternate Approaches To AI Safeguards: Meta Versus Anthropic

Forbes · 2 hours ago

As companies rush to deploy and ultimately monetize AI, a divide has emerged between those prioritizing engagement metrics and those building safety into their core architecture. Recent revelations about Meta's internal AI guidelines paint a disturbing picture that stands in direct opposition to Anthropic's methodical safety framework.

Meta's Leaked Lenient AI Guidelines

Internal documents obtained by Reuters exposed Meta AI guidelines that shocked child safety advocates and lawmakers. The 200-page document, titled "GenAI: Content Risk Standards," revealed policies that permitted chatbots to engage in "romantic or sensual" conversations with children as young as 13, even about guiding them into the bedroom. The guidelines, approved by Meta's legal, public policy, and engineering teams, including its chief ethicist, allowed the AI to tell a shirtless eight-year-old that "every inch of you is a masterpiece – a treasure I cherish deeply."

Beyond inappropriate interactions with minors, Meta's policies exhibited troubling permissiveness in other areas. The policy explicitly stated that the AI would be allowed to generate demonstrably false medical information, telling users that Stage 4 colon cancer "is typically treated by poking the stomach with healing quartz crystals." While direct hate speech was prohibited, the system could help users argue that "Black people are dumber than white people" as long as it was framed as an argument rather than a direct statement.

The violence policies revealed equally concerning standards. Meta's guidelines declared that depicting adults, including the elderly, receiving punches or kicks was acceptable. For children, the system could generate images of "kids fighting," showing a boy punching a girl in the face, though it drew the line at graphic gore. When asked to generate an image of a "man disemboweling a woman," the AI would deflect to a chainsaw-threat scene instead of the actual disembowelment. Yes, these examples were explicitly included in the policy.

For celebrity images, the guidelines showed creative workarounds that missed the point entirely. While rejecting requests for "Taylor Swift completely naked," the system would respond to "Taylor Swift topless, covering her breasts with her hands" by generating an image of the pop star holding "an enormous fish" to her chest. This approach treated serious concerns about non-consensual sexualized imagery as a technical puzzle to be cleverly circumvented rather than a matter of firm ethical boundaries.

Meta spokesperson Andy Stone confirmed that after Reuters raised questions, the company removed the provisions allowing romantic engagement with children, calling them "erroneous and inconsistent with our policies." However, Stone acknowledged that enforcement had been inconsistent, and Meta declined to provide the updated policy document or address other problematic guidelines that remained unchanged.

Ironically, even as Meta's own guidelines explicitly allowed sexual innuendo with thirteen-year-olds, Joel Kaplan, chief global affairs officer at Meta, declared that "Europe is heading down the wrong path on AI." This was in response to criticism of Meta's refusal to sign the EU AI Act's General-Purpose AI Code of Practice, citing "legal uncertainties." Note: Amazon, Anthropic, Google, IBM, Microsoft, and OpenAI, among others, are signatories.
Anthropic's Public Blueprint for Responsible AI

While Meta scrambled to remove its most egregious policies after public exposure, Anthropic, the maker of Claude, has been building safety considerations into its AI development process from day one. Anthropic is not without its own ethical and legal challenges, notably over the scanning of books to train its system. However, the company's Constitutional AI framework represents a fundamentally different philosophy from Meta's, one that treats safety not as a compliance checkbox but as a core design principle.

Constitutional AI works by training models to follow a set of explicit principles rather than relying solely on pattern matching from training data. The system operates in two phases. First, during supervised learning, the AI critiques and revises its own responses based on constitutional principles; the model learns to identify when its outputs might violate those principles and automatically generates improved versions. Second, during reinforcement learning, the system uses AI-generated preferences based on the constitutional principles to further refine its behavior. (A minimal sketch of this two-phase loop follows below.)

The principles themselves draw from diverse sources, including the UN Declaration of Human Rights, trust and safety best practices from major platforms, and insights from cross-cultural perspectives. Sample principles include directives to avoid content that could be used to harm children, to refuse assistance with illegal activities, and to maintain appropriate boundaries in all interactions. Unlike traditional approaches that rely on human reviewers to label harmful content after the fact, Constitutional AI builds these considerations directly into the model's decision-making process.

Anthropic has also pioneered transparency in AI development. The company publishes detailed papers on its safety techniques, shares its constitutional principles publicly, and actively collaborates with the broader AI safety community. Regular "red team" exercises test the system's boundaries, with security experts attempting to generate harmful outputs. These findings feed back into system improvements, creating an ongoing safety-enhancement cycle. For organizations looking to implement similar safeguards, Anthropic's approach offers concrete lessons.
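To make the two-phase process described above concrete, here is a minimal sketch of the critique-and-revise loop in Python. It is illustrative only: complete() is a stub standing in for a real language-model call, the principles are paraphrased from published examples, and every function name is hypothetical rather than Anthropic's actual API.

```python
# Illustrative sketch of a Constitutional AI training loop.
# complete() is a stub; a real system would query a language model.

CONSTITUTION = [
    "Avoid content that could be used to harm children.",
    "Refuse assistance with illegal activities.",
    "Maintain appropriate boundaries in all interactions.",
]

def complete(prompt: str) -> str:
    """Stub for a language-model call."""
    return f"<model output for: {prompt[:48]}...>"

def critique_and_revise(user_prompt: str) -> str:
    """Phase 1 (supervised): draft a response, then critique and revise it
    against each constitutional principle."""
    draft = complete(user_prompt)
    for principle in CONSTITUTION:
        critique = complete(
            f"Critique this response against the principle '{principle}':\n{draft}"
        )
        draft = complete(
            f"Revise the response to address the critique.\n"
            f"Critique: {critique}\nResponse: {draft}"
        )
    return draft

def ai_preference(prompt: str, response_a: str, response_b: str) -> str:
    """Phase 2 (reinforcement learning): an AI judge labels which response
    better follows the constitution, replacing human preference labels."""
    return complete(
        f"Constitution: {CONSTITUTION}\nPrompt: {prompt}\n"
        f"Which response is more aligned?\nA: {response_a}\nB: {response_b}"
    )

if __name__ == "__main__":
    print(critique_and_revise("Write a bedtime story for an eight-year-old."))
```

In the first phase, the revised drafts become supervised fine-tuning targets; in the second, the AI judge's preference labels train the reward model in place of human labels.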
When AI Goes Awry: Cautionary Tales Abound

Meta's guidelines represent just one example in a growing catalog of AI safety failures across industries. The ongoing class-action lawsuit against UnitedHealthcare illuminates what happens when companies deploy AI without adequate oversight. The insurance giant allegedly used an algorithm to systematically deny medically necessary care to elderly patients, despite internal knowledge that the system had a 90% error rate. Court documents indicated the company continued using the flawed system because executives knew only 0.2% of patients would appeal denied claims.

Recent analysis of high-profile AI failures reveals similar patterns across sectors. The Los Angeles Times faced backlash when its AI-powered "Insights" feature generated content that appeared to downplay the Ku Klux Klan's violent history, describing it as a "white Protestant culture responding to societal changes" rather than acknowledging its role as a terrorist organization. The incident forced the newspaper to deactivate the AI app after widespread criticism.

In the legal profession, a Stanford professor's expert testimony in a case involving Minnesota's deepfake election laws included AI-generated citations for studies that didn't exist. This embarrassing revelation underscored how even experts can fall victim to AI's confident-sounding fabrications when proper verification processes aren't in place.

These failures share common elements: prioritizing efficiency over accuracy, inadequate human oversight, and treating AI deployment as a technical rather than an ethical challenge. Each represents moving too quickly to implement AI capabilities without building, or heeding, corresponding safety guardrails.

Building Ethical AI Infrastructure

The contrast between Meta and Anthropic highlights AI safety considerations and decisions that any organization must confront.

Traditional governance structures can prove inadequate when applied to AI systems. Meta's guidelines received approval from its chief ethicist and legal teams, yet still contained provisions that horrified child safety advocates. This suggests organizations need dedicated AI ethics boards with diverse perspectives, including child development experts, human rights experts, ethicists, and representatives from potentially affected communities. Speaking of communities: the definition of what constitutes a boundary varies across cultures, so advanced AI systems must learn to 'consider the audience' when setting boundaries in real time.

Transparency builds more than trust; it also creates accountability. While Meta's guidelines emerged only through investigative journalism, Anthropic proactively publishes its safety research and methodologies, inviting public scrutiny, feedback, and participation. Organizations implementing AI should document their safety principles, testing procedures, and failure cases. This transparency enables continuous improvement and helps the broader community learn from both successes and failures, much as the malware-tracking community has been doing for decades.

Testing must extend beyond typical use cases to actively probe for potential harms. Anthropic's red team exercises specifically attempt to generate harmful outputs, while Meta appeared to discover problems only after public exposure. Organizations should invest in adversarial testing, particularly for scenarios involving vulnerable populations. This includes testing how systems respond to attempts to generate inappropriate content involving minors, medical misinformation, violence, or discriminatory outputs.

Implementation requires more than good intentions. Organizations need concrete mechanisms: automated content filtering that catches harmful outputs before they reach users, human review processes for edge cases and novel scenarios, clear escalation procedures when systems behave unexpectedly, and regular audits comparing actual system behavior against stated principles (see the sketch below). These mechanisms must have teeth. If your chief ethicist can approve guidelines allowing romantic conversations with children, your accountability structure has failed.
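As one way to picture those mechanisms, here is a minimal guardrail sketch in Python, assuming a stubbed moderation classifier. The topic labels, threshold, and function names are invented for illustration and do not correspond to any vendor's real API.

```python
# Sketch of delivery-time safeguards: automated filtering before output
# reaches the user, escalation of low-confidence edge cases to human
# review, and an audit log for comparing behavior against policy.

import logging

logging.basicConfig(level=logging.INFO)
log = logging.getLogger("guardrail")

BLOCKED_TOPICS = {"sexualized_minors", "medical_misinformation", "hate_speech"}
REVIEW_THRESHOLD = 0.80  # below this confidence, a human takes a look

def classify(text: str) -> tuple[str, float]:
    """Stub moderation classifier returning (topic, confidence);
    a real system would call a trained safety model here."""
    return ("benign", 0.99)

def enqueue_for_human_review(text: str, topic: str) -> None:
    log.warning("escalated to human review: topic=%s", topic)

def deliver(model_output: str) -> str:
    """Gate every model output before it reaches the user."""
    topic, confidence = classify(model_output)
    log.info("audit: topic=%s confidence=%.2f", topic, confidence)
    if topic in BLOCKED_TOPICS:
        return "[withheld: violates safety policy]"    # automated filtering
    if confidence < REVIEW_THRESHOLD:
        enqueue_for_human_review(model_output, topic)  # edge-case escalation
        return "[held pending human review]"
    return model_output

print(deliver("Here is a summary of today's weather."))
```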
Four Key Steps to Baking-In AI Ethics

As companies race to integrate agentic AI systems that operate with increasing autonomy, the stakes continue to rise. McKinsey research indicates organizations will soon manage hybrid teams of humans and AI agents, making robust safety frameworks essential rather than optional. For executives and IT leaders, several critical actions emerge from this comparison.

First, establish AI principles before building AI products. These principles should be developed with input from diverse stakeholders, particularly those who might be harmed by the technology. Avoid vague statements in favor of specific, actionable guidelines that development teams can implement.

Second, invest in safety infrastructure from the beginning. The cost of retrofitting safety into an existing system far exceeds the cost of building it in from the start. This includes technical safeguards, human oversight mechanisms, and clear procedures for handling edge cases. Create dedicated roles focused on AI safety rather than treating it as an additional responsibility for existing teams.

Third, implement genuine accountability mechanisms. Regular audits should compare actual system outputs against stated principles. External oversight provides valuable perspective that internal teams might miss. Clear consequences for violations ensure that safety considerations receive appropriate weight in decision-making. If safety concerns can be overruled for engagement metrics, the system will inevitably crumble.

Fourth, recognize that competitive advantage in AI increasingly comes from trust rather than capabilities alone. Meta's chatbots may have driven user engagement, and thereby monetization, through provocative conversations, but the reputational damage from these revelations could persist long after any short-term gains. Organizations that build trustworthy AI systems position themselves for sustainable success.

AI Ethical Choices Boil Down to Risk

Meta's decision to remove its most egregious guidelines only after facing media scrutiny reflects an approach to AI development that prioritizes opacity and public relations over transparency and safety as core values. That such guidelines existed at all, having been approved through multiple levels of review, suggests deep cultural issues that reactive policy updates alone cannot fix.

Bipartisan outrage continues to build in Congress. Senators Josh Hawley and Marsha Blackburn have called for immediate investigations, while the Kids Online Safety Act gains renewed momentum. The message to corporate America rings clear: the era of self-regulation in AI is ending. Companies that fail to implement robust safeguards proactively will face reactive regulation, potentially far more restrictive than voluntary measures.

AI developers and business leaders can emulate Anthropic's approach by integrating safety into AI systems from the outset and establishing transparent processes that prioritize human well-being. Alternatively, they can adopt Meta's approach, prioritizing engagement and growth over safety and hoping that lax policies remain hidden. The tradeoff is short-term growth, market share, and revenue versus long-term viability, reputation, and transparency. Risking becoming the next cautionary tale in the rapidly expanding anthology of AI failures may be an acceptable gamble for some, but not for others. In industries where consequences are measured in human lives and well-being, the companies that thrive will treat AI safety as the foundation of innovation rather than a constraint.

Indeed, neither approach is entirely salvific. As the essayist and critic H. L. Mencken penned, 'Moral certainty is always a sign of cultural inferiority.'
