Date: 2021-08-14

Article by Holly Else in Nature (8/5/21):

"‘Tortured phrases’ give away fabricated research papers

Analysis reveals that strange turns of phrase may indicate foul play in science"

Here are the beginning and a few other selected portions of the article:

In April 2021, a series of strange phrases in journal articles piqued the interest of a group of computer scientists. The researchers could not understand why researchers would use the terms ‘counterfeit consciousness’, ‘profound neural organization’ and ‘colossal information’ in place of the more widely recognized terms ‘artificial intelligence’, ‘deep neural network’ and ‘big data’.

Further investigation revealed that these strange terms — which they dub “tortured phrases” — are probably the result of automated translation or software that attempts to disguise plagiarism. And they seem to be rife in computer-science papers.

Research-integrity sleuths say that Cabanac* and his colleagues have uncovered a new type of fabricated research paper, and that their work, posted in a preprint on arXiv on 12 July, might expose only the tip of the iceberg when it comes to the literature affected.

[*VHM: Guillaume Cabanac, a computer scientist at the University of Toulouse, France]

…

Scientific term            Tortured phrase
Big data                   Colossal information
Artificial intelligence    Counterfeit consciousness
Deep neural network        Profound neural organization
Remaining energy           Leftover vitality
Cloud computing            Haze figuring
Signal to noise            Flag to commotion
Random value               Irregular esteem

Suspecting that the tortured phrases are the result of automated translation or software that rewrites existing text, Cabanac and colleagues ran a selection of abstracts from Microprocessors and Microsystems and other journals through a tool that can identify whether texts have been generated by the artificial-intelligence tool GPT. Of the Microprocessors and Microsystems papers flagged by the tool, manual checks revealed “critical flaws” in some of them, such as nonsensical text, as well as plagiarized text and images.

To dig deeper, the group downloaded all papers published in Microprocessors and Microsystems between 2018 and 2021, a time frame they chose because an upgraded version of GPT was released in 2019. They identified around 500 “questionable articles” based on various factors. Their analysis revealed that papers published after February 2021 had an acceptance time that was five times shorter, on average, than those published before that date. A high proportion of these papers came from authors in China. And a subset of papers had identical submission, revision and acceptance dates, the majority of which appeared in special issues of the journal. This is suspicious, the authors say. Unlike standard issues, overseen by the editor-in-chief, special issues are usually proposed and overseen by a guest editor, and focus on a specific area of research.

…

The sentence that I have highlighted ("A high proportion of these papers came from authors in China.") is the only mention of China, but I dare say, as a long-term investigator of Chinglish, that this technique of using awkward machine-translated phraseology is easy to spot as a specialty of writers from China. Phrases such as "Flag to commotion" for "Signal to noise" and "Irregular esteem" for "Random value" just reek of Chinglish.

The article goes on for nine more paragraphs and offers much additional information about the modus operandi of these machine-assisted plagiarists and the means whereby they are investigated and identified.
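For readers curious how such phrases might be caught automatically, here is a minimal sketch in Python. It is not the tool used by Cabanac and colleagues; it simply assumes that a fixed list of known tortured phrases is available (here, only the seven pairs from the table quoted above) and scans a text for them. The names `TORTURED_PHRASES` and `find_tortured_phrases` are hypothetical, chosen for illustration.

```python
import re

# Pairs taken from the table quoted above: tortured phrase -> expected term.
# A real screening list would presumably be much longer (assumption).
TORTURED_PHRASES = {
    "colossal information": "big data",
    "counterfeit consciousness": "artificial intelligence",
    "profound neural organization": "deep neural network",
    "leftover vitality": "remaining energy",
    "haze figuring": "cloud computing",
    "flag to commotion": "signal to noise",
    "irregular esteem": "random value",
}


def find_tortured_phrases(text):
    """Return (phrase, expected term, character offset) for each match, in text order."""
    hits = []
    for phrase, expected in TORTURED_PHRASES.items():
        # Let the words be separated by any run of whitespace or hyphens,
        # so "flag-to-commotion" and line-wrapped phrases still match.
        pattern = r"\b" + r"[\s-]+".join(map(re.escape, phrase.split())) + r"\b"
        for match in re.finditer(pattern, text, flags=re.IGNORECASE):
            hits.append((phrase, expected, match.start()))
    return sorted(hits, key=lambda hit: hit[2])


if __name__ == "__main__":
    sample = "We improve the flag-to-commotion ratio using counterfeit consciousness."
    for phrase, expected, offset in find_tortured_phrases(sample):
        print(f"offset {offset}: '{phrase}' (expected '{expected}')")
```

A fixed list like this can only flag phrases that someone has already noticed; it says nothing about how a given paper came to contain them, which still calls for the kind of manual checking the article describes.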
