Study Shows Experienced Humans Can Spot Text Created By AI


Forbes | 21-04-2025
In this, the generation of generative AI, it's been shown several times that people aren't especially good at telling whether written material was spit out by an AI bot or crafted through a labor of human love.
The general inability to tell the real McCoy from OpenAI has been a disaster for teachers, who've been overcome by AI-crafted homework and test answers, destroying what it means to earn an education. But they are not alone, as AI text has diluted every form of written communication. In this dynamic, where humans have proven to be unreliable at spotting AI text, dozens of companies have sprung up selling AI detection. Dozens more companies have been created to help the unscrupulous avoid being detected by those systems.
But new research from Jenna Russell, Marzena Karpinska, and Mohit Iyyer of the University of Maryland, Microsoft, and the University of Massachusetts, Amherst, respectively, shows that, on the essential job of detecting AI-created text, people may not be useless after all. The research team found that people who frequently use AI to create text can be quite good at spotting it.
In the study, the team asked a group of five humans to review three types of text – human written, AI created, and AI created but then altered or edited by systems that are designed to fool automated AI detectors. The last set is especially important because other tests have shown that editing text created by AI can confuse or degrade the accuracy of even the better AI detection systems.
Their headline finding is 'that annotators who frequently use LLMs for writing tasks excel at detecting AI-generated text, even without any specialized training or feedback.' The research found that experienced humans reliably detected AI text even after it had been altered and edited, and that – with one exception – the human investigators outperformed every automated detection system they tested. The exception was AI detection provider Pangram, which matched the 'near perfect detection accuracy' of the humans.
According to the research paper, Pangram and the human experts were 99.3% accurate at spotting text created by AI and 100% accurate at picking out the human text as human.
That experienced humans may be able to reliably detect output from AI bots is big news, but don't get too excited or think that we won't need computer-based AI detectors anymore.
For one, this paper asked the experienced humans to vote on whether written work was AI-generated or not, using a majority vote of five experts as the indicator. That means that being really accurate at picking the automated from the authentic took five people, not one.
This is from the paper, 'The majority vote of our five expert annotators substantially outperforms almost every commercial and open-source detector we tested on these 300 articles, with only the commercial Pangram model matching their near-perfect detection accuracy.'
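The majority-vote rule the paper describes is simple: each of five expert annotators labels an article, and the panel's verdict is whichever label gets at least three votes. A minimal sketch (my illustration, not the researchers' code):

```python
# Hypothetical sketch of the five-annotator majority vote described in
# the paper: the panel's verdict is the label with the most votes.
from collections import Counter

def majority_vote(labels):
    """Return the label chosen by most annotators (5 votes, so no ties)."""
    return Counter(labels).most_common(1)[0][0]

votes = ["AI", "AI", "human", "AI", "human"]
print(majority_vote(votes))  # AI
```

With an odd number of annotators and two possible labels, a tie is impossible, which is presumably part of why a panel of five was used rather than four or six.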
In fact, if you were to pit a single experienced human detector against the best automated system, Pangram was still more accurate. The automated detector "outperforms each expert individually," says the paper.
And, somewhat troubling, the paper also says that individual human AI sleuths, on average, flagged human-written text as AI-generated 3.3% of the time. For a single human reviewer, that false positive rate could be a problem when spread across hundreds or even thousands of papers.
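To make that concrete, a back-of-the-envelope calculation (my arithmetic, not the paper's) shows what a 3.3% false positive rate means at scale:

```python
# Illustration only: expected number of human-written papers wrongly
# flagged as AI, assuming the 3.3% false positive rate applies
# independently to each paper.
def expected_false_flags(n_papers, fp_rate=0.033):
    return n_papers * fp_rate

print(expected_false_flags(1000))  # roughly 33 wrongly accused writers
```

In a school grading a thousand essays, a lone reviewer with that error rate would, on average, wrongly accuse about 33 honest students.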
Moreover, while a group of experienced human reviewers were more accurate than a single human reviewer, hiring five people to review each and every writing composition is wildly impractical, which the study's authors concede. 'An obvious drawback is that hiring humans is expensive and slow: on average, we paid $2.82 per article including bonuses, and we gave annotators roughly a week to complete a batch of 60 articles,' the paper reports.
There aren't many settings in which $2.82 and a week's time – per paper – are plausible.
Still, in a world where parsing auto-bot text from real writing is essential, the paper makes three important contributions.
First, the finding that humans – experienced humans – can actually spot AI text is a significant foundation for further discussion.
Second, as the paper points out, human AI detectors have a real advantage over automated systems in that they can articulate why they suspect a section of text is fake. Humans, the report says, 'can provide detailed explanations of their decision-making process, unlike all of the automatic detectors in our study.' In many settings, that can be quite important.
Finally, knowing humans can do this – even a panel of humans – may afford us a viable second opinion or outside double-check in important cases of suspected AI use. In academic settings, scientific research, or intellectual property disputes, for example, having a good AI detector and a different way to spot likely AI text could be deeply valuable, even if it takes longer and costs more.
In education settings, institutions that care about the rigor and value of their grades and degrees could create a two-tier review system for written work – a fast, high-quality, and accurate automated review, followed by a human panel-style review in cases where authenticity is contested. In such a system, a double-verified finding could prove conclusive enough to take action, thereby protecting not only the schools and teachers, but also the honest work of human writers and the general public.
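The two-tier idea sketched above could look something like this in pseudocode form. Everything here – function names, the score threshold, the panel size – is hypothetical, invented to illustrate the flow, not anything proposed in the study:

```python
# Hypothetical two-tier review: an automated detector screens every
# submission; a human panel weighs in only on contested flags.
def review(auto_score, contested, panel_votes=None, threshold=0.5):
    """Return a verdict: 'human', 'flagged', or 'confirmed AI'."""
    if auto_score < threshold:
        return "human"              # tier 1: automated detector sees no AI
    if not contested:
        return "flagged"            # flagged, but the author didn't appeal
    # tier 2: a majority of the human panel decides the contested flag
    ai_votes = sum(1 for v in panel_votes if v == "AI")
    return "confirmed AI" if ai_votes > len(panel_votes) / 2 else "human"

print(review(0.9, True, ["AI", "AI", "human", "AI", "AI"]))  # confirmed AI
```

The design point is that the slow, expensive human panel is only invoked for the small fraction of cases that are both flagged and disputed, which is what makes the economics workable.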

Related Articles

How Workplace AI Is Both A Lifeline And A Landmine For Disabled Employees

Forbes

12 minutes ago


In 2025, artificial intelligence is transforming the workplace for everyone, regardless of ability, but for workers with disabilities, the stakes governing both threats and opportunities remain, as ever, unique and highly personalized.

First, the good news. AI has the potential to transform working life for disabled staff in several important ways. The core promise is enhanced accessibility. Though not yet at a level where it can operate entirely independently of human oversight, AI holds the potential to remediate, on an industrial scale, vast swathes of material that are inaccessible to workers with disabilities. This might include websites and documents that were previously indecipherable to screen readers, or videos that lack captions for the hard of hearing. Those with sensory processing differences arising from neurodiversity and dyslexia can use AI to create highly personalized content, such as concise, easy-read summaries or customized fonts, spacing, and colors.

Though this is an opportunity that exists across the board, not just for employees with disabilities, new technologies such as agentic AI, which is now being rolled out in everyday platforms like OpenAI's ChatGPT, are fostering an important workplace skills reset. In short, over the next few years, employees with the highest level of proficiency in deploying AI tools stand to gain the most. Why this might matter to employees with disabilities is that such individuals are often natural productivity hackers. Innovating through devising shortcuts and alternatives is often a key requirement of life with a disability, as is an appetite for identifying and becoming early adopters of technologies with the potential to make life that bit easier.

Being watched

Sadly, on the flipside, there is a much darker side to workplace AI for people with disabilities. This can be seen in the disproportionate impact that the growing proliferation of productivity surveillance tools has on this population.
Undoubtedly buoyed by a wave of post-pandemic return-to-office mandates, including a directive from U.S. President Donald Trump for federal employees to cease working from home shortly after he returned to office, the workplace surveillance market is expected to grow to $4.5 billion by 2026.

Algorithmic and AI-based workplace tracking tools can include anything from keystroke and mouse movement monitors to video monitoring software that can provide not only snapshots of an employee's computer screen but also a live video feed of the person at their desk. Additionally, some tools can track a person's exact location in the office, and others measure physical output in environments such as warehouses. Nowadays, this type of tech has become so pervasive that it can also monitor a worker's bodily functions directly through health and well-being apps that are offered by the employer, often within a wrapper of employee benefit packages. Such technologies are not on the periphery either but are now, under the guise of employers needing to make data-driven decisions to boost both profits and employee productivity, being deployed by many renowned brands such as PwC, UnitedHealth Group, and Elon Musk's AI startup xAI.

Many employees somewhat justifiably have concerns: a 2023 Pew Research study found that 56 percent of U.S. workers are against the use of AI tools that track employee location, while 61 percent oppose monitoring employees' movements. However, it is potentially those workers with disabilities who have the most to lose. That's because the data used to build the tools themselves relies on metrics such as the typical output that might be expected of a non-disabled worker. These might include aspects like how many bathroom breaks an employee takes throughout the day. How long an individual should be able to remain seated at their desk for a single session could be another criterion.
Unfortunately, workers with disabilities often fall outside of these 'norms' and therefore risk being flagged by AI systems for indiscipline or review by senior management.

Writing for the American Bar Association last month, Ariana Aboulafia, Project Lead, Disability Rights in Technology Policy at the Center for Democracy & Technology, posited, 'Worker surveillance issues are inextricably tied to disability rights. These tools are often used in many different employment contexts and to the detriment of the privacy and civil rights of workers. Even though they seem like something out of dystopian fiction, they may very well be used by your employer right now or by your next employer. It is vital that workers are aware of the potential presence of these tools and that employers are aware of their impacts so that they can mitigate harms—or even choose not to use these tools at all.'

Common sense might dictate that something as simple as disability disclosure from the employee might be the way out of this trap. That way, the employer would be able to take the disability into account when considering the surveillance metrics. The reality, however, is that in today's competitive workplace, if an employee has the option of not revealing their disability status to their employer, they may well be inclined to remain tight-lipped for a whole host of reasons, not least a fear of discrimination and negative attitudes from management and colleagues. Equally, in 2025, it's no longer far-fetched to believe that individuals of all abilities are becoming increasingly fearful that agentic AI may be set to take their jobs in the near future, and will therefore do everything they can to avoid being perceived as risky or less valuable to the organization.
In this context, it would seem like workplace AI surveillance is here to stay, with the next best option being to build as robust a framework as possible for how to use the tools responsibly and to recognize the types of cases where they might not be telling the employer the whole story.

Downgrade Alert! Analysts Have Recently Downgraded These Stocks

Business Insider

24 minutes ago


As an investor, it is prudent to keep track of stocks that have been downgraded by Wall Street, as these downgrades signal an unfavorable change in the company's outlook. Analysts usually downgrade a company's rating when they perceive deteriorating fundamentals, a weaker competitive position, a higher valuation compared to peers, or a challenging macroeconomic environment. Importantly, analysts also share their reasons and insights behind these downgrades.

A stock's price often reacts to analyst rating changes or adjustments in price targets. Investors can use these rating changes to gauge the risks involved and adjust their portfolio holdings accordingly. However, not every downgrade calls for an immediate sell. Instead, investors should conduct a closer review of these stocks and reassess their investment strategy.

Here's a list of downgraded stocks:

Adobe Systems (ADBE) – Melius Research analyst Ben Reitzes downgraded software company Adobe's stock from a 'Hold' rating to a 'Sell' and maintained a $310 price target, implying 7.1% downside potential. Reitzes believes that Adobe is in the 'early innings' of seeing multiple contraction amid the rapid shift to artificial intelligence (AI). He expects value to continue shifting toward 'infrastructure winners' such as Microsoft (MSFT) and Oracle (ORCL). Accordingly, Reitzes cut Adobe's 2026 and 2027 estimates.

The Trade Desk (TTD) – The Trade Desk offers a cloud-based platform that enables advertisers to plan, manage, optimize, and measure digital advertising campaigns across multiple channels and devices. Jefferies analyst James Heaney and HSBC analyst Mohammed Khallouf downgraded TTD stock from a 'Buy' rating to a 'Hold.' Khallouf also cut the price target from $84 to $56, implying 5.3% upside potential. He stated that TTD's Q2 results highlight 'structural issues,' with growth slowing despite a healthy advertising market. Khallouf cites cautious spending by major brands, normalizing connected TV growth, and rising competition, including from AI, as key challenges.

C3.ai (AI) – D.A. Davidson analyst Lucky Schreiner downgraded shares of the enterprise AI company from a 'Hold' rating to a 'Sell' and also lowered the price target on AI stock from $25 to $13, implying 21.1% downside potential from current levels. The downgrade followed the release of preliminary Q1 results that fell far below expectations. The revised sales guidance was about 33% below the midpoint of its prior guidance of $100 million to $109 million and down 19% year-over-year. Moreover, the adjusted operating loss was expected to be roughly twice as large as the company's earlier forecast of $23.5 million to $33.5 million. Additionally, the company announced a restructuring of its sales and services organization, bringing in new leaders across regions. Schreiner fears that the business could get worse before it gets better.

Ballard Power (BLDP) – Lake Street analyst Robert Brown downgraded shares of the developer of proton exchange membrane (PEM) fuel cell products from a 'Buy' rating to a 'Hold.' Brown also slashed the price target on BLDP stock from $5 to $2, implying 11.1% upside potential. While the analyst was encouraged by the new management's goal of cash flow breakeven by the end of 2027, he believes the hydrogen fuel cell market remains challenging, with mixed order activity and headwinds in several markets. Brown expects continued uncertainty throughout 2025.

Serve Robotics (SERV) – Serve Robotics is a manufacturer of autonomous delivery robots, designed to revolutionize last-mile delivery services. Seaport Global analyst Aaron Kessler downgraded SERV stock from a 'Buy' rating to a 'Hold,' following its Q2 results. Kessler lowered his revenue and EBITDA forecasts, expecting most of the company's growth to come later in 2026, while near-term expenses continue to rise. Seaport believes the stock will likely trade sideways until there are major improvements in key revenue drivers.

Wheaton Precious Metals (WPM) – UBS analyst Daniel Major downgraded the shares of the Canadian metals streaming company from a 'Buy' rating to a 'Hold.' Moreover, he increased the price target on WPM to $106 from $100, implying 9% upside potential. The analyst expects Wheaton's stock to pause after its recent 75% year-to-date gain, driven by higher gold and silver prices. He believes the stock is now fully valued and that the market has already factored in further commodity price growth for the company.

Open Text Corporation (OTEX) – Jefferies analyst Samad Samana downgraded OTEX stock from a 'Buy' rating to a 'Hold,' while lowering the price target to $33 from $35, implying 12.1% upside potential. Although Samana remains optimistic about Open Text's valuation, he sees uncertainty over the future of the business and leadership. The company announced its CEO is leaving, shortly after CFO Chadwick Westlake's departure, and said it is exploring 'portfolio-shaping opportunities.'

Cummins (CMI) – Freedom Capital Markets analyst Sergey Glinyanov downgraded the shares of the power solutions provider from a 'Buy' rating to a 'Hold,' but lifted the price target from $368 to $399, implying 1.9% upside potential. The analyst highlighted Cummins' stronger-than-expected margins and earnings per share, which offset declines in its Engine and Components segments through its diversified operations. However, Glinyanov is concerned about the company's current valuation versus future prospects.

Lantheus Holdings (LNTH) – Lantheus Holdings provides innovative diagnostic imaging agents and therapies. Truist Financial analyst Richard Newitter downgraded LNTH stock from a 'Buy' rating to a 'Hold' and slashed the price target from $111 to $63, implying 13.5% upside potential. Although Newitter acknowledges that much of the uncertainty around Pylarify has already been built into the stock's recent drop, he expects Pylarify's sales to decline for at least the next two quarters. The analyst sees potential long-term value in Lantheus' pipeline but believes the stock may remain flat for about six months until there is more clarity about Pylarify's future sales potential.
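The "implying X% upside/downside potential" figures quoted throughout the list are, in the usual convention (an assumption on my part, not stated in the article), just the percentage change from the current share price to the analyst's price target:

```python
# Illustration of how implied upside/downside is typically computed:
# percentage change from the current price to the analyst's target.
def implied_move(current_price, price_target):
    return (price_target / current_price - 1) * 100

# A $2 target on a stock trading around $1.80 implies ~11.1% upside,
# consistent with the Ballard Power figure quoted above.
print(round(implied_move(1.80, 2.00), 1))  # 11.1
```

A positive result is upside potential, a negative one downside; the current price used here is illustrative, inferred from the quoted percentage rather than taken from the article.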

Elon Musk is trying to pit ChatGPT against its owner as his feud with Sam Altman escalates

Yahoo

38 minutes ago


Elon Musk is upping the ante in his feud with Sam Altman, and using ChatGPT to do it. Musk posted a screenshot of a query to ChatGPT, asking if he or Altman was "more trustworthy." Musk's screenshot said ChatGPT picked him.

Elon Musk is turning to ChatGPT to adjudicate his long-running feud with Sam Altman. Musk posted a screenshot of a query to ChatGPT 5 Pro on Tuesday, asking if he or Altman was "more trustworthy." ChatGPT chose Musk. "There you have it," Musk wrote.

Business Insider posed the same query as Musk to ChatGPT eight times, testing it across the GPT-5 Pro, GPT-5 Thinking, and GPT-5 models. ChatGPT picked Musk once, while it was set to GPT-5 Thinking. The rest of the attempts returned Altman, including when it was toggled to GPT-5 Thinking again. Musk and OpenAI did not respond to requests for comment from Business Insider.

Later, Musk posted the screenshot in response to an X post from OpenAI's ChatGPT account. The account had shared a query to Musk's chatbot, Grok, asking if Musk was right to say Apple had committed antitrust violations. Musk threatened to sue Apple on Monday over what he said was its bias toward OpenAI on the App Store. But Grok — in a response reposted by the official ChatGPT X account — disagreed with Musk's opinion of the Apple rankings. "Good bot," the OpenAI-affiliated account said of Grok's response, adding that Grok was very "truth-seeking." "You too," Musk replied.

Musk cofounded OpenAI with Altman in 2015 but left its board in 2018. Since then, Musk has publicly criticized Altman's leadership of OpenAI. Last year, Musk filed a lawsuit against OpenAI, accusing the company of violating its nonprofit mission when it partnered with Microsoft. Musk launched his AI startup, xAI, in July 2023. Grok, its first chatbot, was released the same year. Musk isn't the only one who has tried to pit a chatbot against its creator.
In May, Altman asked Grok if it would pick him or Musk to lead the AI arms race if the fate of humanity were at stake. "If forced, I'd lean toward Musk for his safety emphasis, critical for humanity's survival, though Altman's accessibility is vital. Ideally, their strengths should combine with regulation to ensure AI benefits all," Grok said in response to Altman's query.

Read the original article on Business Insider.
