Problematic Paper Screener: Trawling for fraud in the scientific literature
Have you ever heard of the Joined Together States? Or bosom peril? Kidney disappointment? Fake neural organizations? Lactose bigotry? These nonsensical, and sometimes amusing, word sequences are among thousands of 'tortured phrases' that sleuths have found littered throughout reputable scientific journals.
They typically result from using paraphrasing tools to evade plagiarism-detection software when stealing someone else's text. The phrases above are real examples of bungled synonyms for the United States, breast cancer, kidney failure, artificial neural networks, and lactose intolerance, respectively.
We are a pair of computer scientists at Université de Toulouse and Université Grenoble Alpes, both in France, who specialize in detecting bogus publications. One of us, Guillaume Cabanac, has built an automated tool that combs through 130 million scientific publications every week and flags those containing tortured phrases.
The Problematic Paper Screener also includes eight other detectors, each of which looks for a specific type of problematic content.
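To illustrate the general idea behind the tortured-phrases detector, here is a minimal sketch in Python: it scans a paper's full text for a small dictionary of known tortured phrases and flags the paper once it finds several of them. The phrase list and the threshold of five come from the examples and figures in this article; the real screener's code and phrase list are far more extensive, and this sketch is not its implementation.

```python
import re

# A few known tortured phrases and the established terms they mangle
# (examples taken from this article; the real screener relies on a much
# longer, community-curated list).
TORTURED_PHRASES = {
    "joined together states": "United States",
    "bosom peril": "breast cancer",
    "kidney disappointment": "kidney failure",
    "fake neural organizations": "artificial neural networks",
    "lactose bigotry": "lactose intolerance",
}

def find_tortured_phrases(full_text: str) -> list[str]:
    """Return every known tortured phrase that appears in the text."""
    text = full_text.lower()
    return [phrase for phrase in TORTURED_PHRASES
            if re.search(r"\b" + re.escape(phrase) + r"\b", text)]

def is_suspect(full_text: str, threshold: int = 5) -> bool:
    """Flag a paper containing at least `threshold` tortured phrases,
    mirroring the 'at least five' criterion mentioned in this article."""
    return len(find_tortured_phrases(full_text)) >= threshold
```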
Several publishers use our paper screener, which has been instrumental in more than 1,000 retractions. Some have integrated the technology into the editorial workflow to spot suspect papers upfront. Analytics companies have used the screener for things like picking out suspect authors from lists of highly cited researchers. It was named one of 10 key developments in science by the journal Nature in 2021.
So far, we have found:
Nearly 19,000 papers containing at least five tortured phrases each.
More than 280 gibberish papers – some still in circulation – written entirely by the spoof SCIgen program that Massachusetts Institute of Technology students came up with nearly 20 years ago.
More than 764,000 articles that cite retracted works and may therefore be unreliable themselves. About 5,000 of these articles have at least five retracted references listed in their bibliographies. We called the software that finds these the 'Feet of Clay' detector, after the biblical dream story in which a hidden flaw is found in what seems to be a strong and magnificent statue (a simplified sketch of this check appears after this list). These articles need to be reassessed and potentially retracted.
More than 70 papers containing ChatGPT 'fingerprints' with obvious signs such as 'Regenerate Response' or 'As an AI language model, I cannot …' in the text. These articles represent the tip of the tip of the iceberg: They are cases where ChatGPT output has been copy-pasted wholesale into papers without any editing (or even reading) and has also slipped past peer reviewers and journal editors alike. Some publishers allow the use of AI to write papers, provided the authors disclose it. The challenge is to identify cases where chatbots are used not just for language-editing purposes but to generate content – essentially fabricating data.
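In principle, the Feet of Clay check boils down to cross-referencing a paper's bibliography against a list of retracted works. Here is a minimal sketch of that idea, assuming we already have the DOIs cited by each paper and a set of DOIs known to be retracted (for example, drawn from the Retraction Watch Database); it is an illustration of the concept, not the screener's actual code, and the loader functions in the usage comment are hypothetical.

```python
def count_retracted_references(reference_dois, retracted_dois):
    """Count how many of a paper's references appear in the retracted set."""
    # Normalise DOIs to lowercase so trivially different variants still match.
    refs = {doi.strip().lower() for doi in reference_dois}
    retracted = {doi.strip().lower() for doi in retracted_dois}
    return len(refs & retracted)

def needs_reassessment(reference_dois, retracted_dois, threshold=5):
    """Flag papers with at least `threshold` retracted references, the
    criterion behind the roughly 5,000 articles mentioned above."""
    return count_retracted_references(reference_dois, retracted_dois) >= threshold

# Hypothetical usage:
# retracted = load_retracted_dois()       # e.g. from the Retraction Watch Database
# refs = extract_reference_dois(paper)    # from the paper's bibliography metadata
# if needs_reassessment(refs, retracted):
#     print("Feet of Clay: reassess this paper")
```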
There's more detail about our paper screener and the problems it addresses in this presentation for the Science Studies Colloquium.
Read The Conversation's investigation into paper mills here: Fake papers are contaminating the world's scientific literature, fueling a corrupt industry and slowing legitimate lifesaving medical research
This article is republished from The Conversation, a nonprofit, independent news organization bringing you facts and trustworthy analysis to help you make sense of our complex world. It was written by: Guillaume Cabanac, Institut de Recherche en Informatique de Toulouse; Cyril Labbé, Université Grenoble Alpes (UGA), and Frederik Joelving, Retraction Watch
Read more:
Fake papers are contaminating the world's scientific literature, fueling a corrupt industry and slowing legitimate lifesaving medical research
When scientific citations go rogue: Uncovering 'sneaked references'
A new 'AI scientist' can write science papers without any human input. Here's why that's a problem
Guillaume Cabanac receives funding from the European Research Council (ERC) and the Institut Universitaire de France (IUF). He is the administrator of the Problematic Paper Screener, a public platform that uses metadata from Digital Science and PubPeer via no-cost agreements. Cabanac has been in touch with most of the major publishers and their integrity officers, offering pro bono consulting regarding detection tools to various actors in the field including ClearSkies, Morressier, River Valley, Signals, and STM.
Cyril Labbé receives funding from the European Research Council. He has also received funding from the French National Research Agency (ANR) and the U.S. Office of Research Integrity. Labbé has been in touch with most of the major publishers and their integrity officers, offering pro bono consulting regarding detection tools to various actors in the field including STM-Hub and Morressier.
Frederik Joelving does not work for, consult, own shares in or receive funding from any company or organization that would benefit from this article, and has disclosed no relevant affiliations beyond their academic appointment.