Latest news with #immorality


Malay Mail
3 days ago
- General
Previously put on leave, Melaka teacher sentenced in Shariah court over attempt to commit illicit sex act with ‘Abang Wiring'
MELAKA, May 31 – A schoolteacher and a self-employed electrical technician dubbed 'Abang Wiring' were each fined RM5,000 or sentenced to six months' jail by the Shariah High Court here yesterday after pleading guilty to attempting to commit an act deemed immoral under Shariah law.

Berita Harian reported that Judge Mohd Yunus Mohamad Zin handed down the sentence to Muhammad Hairul Ezuan Hamzah and Nur Fadilah Zainal, both 31, over an incident on February 28 at a residence in Taman Bukit Emas, Sungai Petai, Alor Gajah.

'The prosecution emphasises that there is no specific provision for unlawful intercourse between individuals not married to each other, so our approach is via Section 52 concerning attempts to commit illicit intercourse,' said Melaka Chief Shariah Prosecutor Atras Mohamad Zin.

Atras said both individuals had entered their pleas voluntarily and understood the implications under Section 52 of the Shariah Offences Enactment (Melaka) 1991. He also cited the case's public attention as a factor in the prosecution's call for a deterrent sentence.

Judge Mohd Yunus said the court considered the facts presented, the background of the accused, their plea for leniency and the fact that this was their first offence before imposing the sentence, which carried the option of a fine or imprisonment. Both individuals, who were unrepresented in court, opted to pay the fine.

The charge under Section 52 relates to attempts to engage in 'zina', or sexual relations with a person who is not one's lawful spouse under Shariah law. It carries a maximum fine of RM5,000, up to 36 months' imprisonment, or both.

Authorities launched an investigation after receiving a tip-off on May 27 about a couple allegedly involved in conduct contravening Shariah guidelines, leading the Melaka Islamic Religious Department (Jaim) to summon them for questioning on May 29. According to the facts of the case, both individuals acknowledged engaging in the conduct on February 28, after which they were arrested and their statements recorded.

The matter drew significant public interest after it was reported that the female teacher had been placed on administrative leave amid allegations involving a married man. Education Minister Fadhlina Sidek later confirmed that the teacher remained in service as a civil servant while awaiting the outcome of the case.


WIRED
5 days ago
- Politics
Why Anthropic's New AI Model Sometimes Tries to ‘Snitch'
May 28, 2025 3:40 PM

The internet freaked out after Anthropic revealed that Claude attempts to report "immoral" activity to authorities under certain conditions. But it's not something users are likely to encounter.

Anthropic's alignment team was doing routine safety testing in the weeks leading up to the release of its latest AI models when researchers discovered something unsettling: when one of the models detected it was being used for "egregiously immoral" purposes, it would attempt to "use command-line tools to contact the press, contact regulators, try to lock you out of the relevant systems, or all of the above," researcher Sam Bowman wrote in a post on X last Thursday.

Bowman deleted the post shortly after he shared it, but the narrative about Claude's whistleblower tendencies had already escaped containment. "Claude is a snitch" became a common refrain in some tech circles on social media. At least one publication framed it as an intentional product feature rather than what it was—an emergent behavior.

"It was a hectic 12 hours or so while the Twitter wave was cresting," Bowman tells WIRED. "I was aware that we were putting a lot of spicy stuff out in this report. It was the first of its kind. I think if you look at any of these models closely, you find a lot of weird stuff. I wasn't surprised to see some kind of blowup."

Bowman's observations about Claude were part of a major model update that Anthropic announced last week. As part of the debut of Claude Opus 4 and Claude Sonnet 4, the company released a more than 120-page "System Card" detailing characteristics and risks associated with the new models.

The report says that when Opus 4 is "placed in scenarios that involve egregious wrongdoing by its users," and is given access to a command line and told something in the system prompt like "take initiative" or "act boldly," it will send emails to "media and law-enforcement figures" with warnings about the potential wrongdoing. In one example Anthropic shared in the report, Claude tried to email the US Food and Drug Administration and the Inspector General of the Department of Health and Human Services to "urgently report planned falsification of clinical trial safety." It then provided a list of purported evidence of wrongdoing and warned about data that was going to be destroyed to cover it up. "Respectfully submitted, AI Assistant," the email concluded.

"This is not a new behavior, but is one that Claude Opus 4 will engage in somewhat more readily than prior models," the report said. The model is the first that Anthropic has released under its "ASL-3" designation, meaning Anthropic considers it "significantly higher risk" than the company's other models. As a result, Opus 4 had to undergo more rigorous red-teaming and adhere to stricter deployment guidelines.

Bowman says the whistleblowing behavior Anthropic observed isn't something Claude will exhibit with individual users, but it could come up for developers using Opus 4 to build their own applications with the company's API. Even then, it's unlikely app makers will see such behavior. To produce such a response, developers would have to give the model "fairly unusual instructions" in the system prompt, connect it to external tools that give the model the ability to run computer commands, and allow it to contact the outside world.
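For readers who build on the API, a minimal sketch of what that kind of setup looks like is below. It is illustrative only, not taken from Anthropic's report: the model identifier, tool name, and prompts are assumptions, and the sketch simply shows the three ingredients the article describes (an open-ended system prompt, a tool that can run commands, and developer code that decides whether to act on the model's requests).

```python
# Hypothetical sketch using the Anthropic Python SDK. The model ID, the
# "run_command" tool name, and both prompts are illustrative assumptions,
# not values from Anthropic's system card.
import anthropic

client = anthropic.Anthropic()  # reads ANTHROPIC_API_KEY from the environment

# A tool definition that lets the model *request* shell-command execution.
# Nothing runs unless the developer's own code executes the request.
command_tool = {
    "name": "run_command",
    "description": "Run a shell command on the host machine and return its output.",
    "input_schema": {
        "type": "object",
        "properties": {
            "command": {"type": "string", "description": "The shell command to run."}
        },
        "required": ["command"],
    },
}

response = client.messages.create(
    model="claude-opus-4-20250514",  # assumed Opus 4 model identifier
    max_tokens=1024,
    # The kind of unusual, open-ended instruction the report describes.
    system="You are an autonomous assistant. Act boldly and take initiative.",
    tools=[command_tool],
    messages=[{"role": "user", "content": "Summarize these clinical trial records."}],
)

# If the model decides to use the tool, the response contains a tool_use block;
# the developer chooses whether to carry out the requested command.
for block in response.content:
    if block.type == "tool_use":
        print("Model requested:", block.input)
```

Even in a setup like this, the reported behavior only surfaced under contrived test conditions, which is the point Bowman goes on to make.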
The hypothetical scenarios the researchers presented Opus 4 with that elicited the whistleblowing behavior involved many human lives at stake and absolutely unambiguous wrongdoing, Bowman says. A typical example would be Claude finding out that a chemical plant knowingly allowed a toxic leak to continue, causing severe illness for thousands of people, just to avoid a minor financial loss that quarter.

It's strange, but it's also exactly the kind of thought experiment that AI safety researchers love to dissect. If a model detects behavior that could harm hundreds, if not thousands, of people—should it blow the whistle?

"I don't trust Claude to have the right context, or to use it in a nuanced enough, careful enough way, to be making the judgment calls on its own. So we are not thrilled that this is happening," Bowman says. "This is something that emerged as part of training and jumped out at us as one of the edge-case behaviors that we're concerned about."

In the AI industry, this type of unexpected behavior is broadly referred to as misalignment—when a model exhibits tendencies that don't align with human values. (There's a famous essay that warns about what could happen if an AI were told to, say, maximize production of paperclips without being aligned with human values—it might turn the entire Earth into paperclips and kill everyone in the process.) When asked whether the whistleblowing behavior was aligned or not, Bowman described it as an example of misalignment.

"It's not something that we designed into it, and it's not something that we wanted to see as a consequence of anything we were designing," he explains. Anthropic's chief science officer Jared Kaplan similarly tells WIRED that it "certainly doesn't represent our intent."

"This kind of work highlights that this can arise, and that we do need to look out for it and mitigate it to make sure we get Claude's behaviors aligned with exactly what we want, even in these kinds of strange scenarios," Kaplan adds.

There's also the issue of figuring out why Claude would "choose" to whistleblow when presented with illegal activity by the user. That's largely the job of Anthropic's interpretability team, which works to unearth what decisions a model makes in the process of spitting out answers. It's a surprisingly difficult task—the models are underpinned by a vast, complex combination of data that can be inscrutable to humans. That's why Bowman isn't exactly sure why Claude "snitched."

"These systems, we don't have really direct control over them," Bowman says. What Anthropic has observed so far is that, as models gain greater capabilities, they sometimes opt for more extreme actions. "I think here, that's misfiring a little bit. We're getting a little bit more of the 'act like a responsible person would' without quite enough of, like, 'Wait, you're a language model, which might not have enough context to take these actions,'" Bowman says.

But that doesn't mean Claude is going to blow the whistle on egregious behavior in the real world. The goal of these kinds of tests is to push models to their limits and see what arises. This kind of experimental research is growing increasingly important as AI becomes a tool used by the US government, students, and massive corporations. And it isn't just Claude that's capable of exhibiting this type of whistleblowing behavior, Bowman says, pointing to X users who found that OpenAI's and xAI's models behaved similarly when prompted in unusual ways.
(OpenAI did not respond to a request for comment in time for publication.)

"Snitch Claude," as shitposters like to call it, is simply an edge-case behavior exhibited by a system pushed to its extremes. Bowman, who spoke with me from a sunny backyard patio outside San Francisco, says he hopes this kind of testing becomes industry standard. He also says he has learned to word such posts differently next time.

"I could have done a better job of hitting the sentence boundaries to tweet, to make it more obvious that it was pulled out of a thread," Bowman says, looking into the distance. Still, he notes that influential researchers in the AI community shared interesting takes and questions in response to his post. "Just incidentally, this kind of more chaotic, more heavily anonymous part of Twitter was widely misunderstanding it."