Rampant plagiarism in the Chinese literary world

"It cannot read the human heart" by Yan Ge (b. 1984), London Review of Books Blog (2/20/26)

Since November 2024, a book influencer on RedNote has been publishing posts featuring side-by-side excerpts from works by different authors that contained similar, and in many cases identical, sentences and paragraphs. Among those whose sentences, similes, descriptions, scenes and plotlines appeared to have been copied and pasted were Eileen Chang, Hsien-yung Pai, William Faulkner, Orhan Pamuk, Annie Proulx and Gabriel García Márquez. The perpetrators of the apparent plagiarism were a number of contemporary Chinese authors.

‘Why are so many writers “borrowing” from others’ work?’ my friend asked. ‘Is this some kind of open secret in the literary world?’

I had no answer. In more than twenty years as a writer, I have previously encountered only a couple of incidents of outright literary theft (as opposed to quotation or allusion). Both times, I was baffled by it. Plagiarism, it seems to me, is a humiliating admission of artistic failure.

Digging deeper into the causes of the widespread plagiarism she was encountering, Yan considered one potential reason for the rapid rise in such cases:

The discovery was made possible by AI-powered plagiarism-checking applications, but some people have suggested that the plagiarism itself may have been fostered by the use of large language models. Given the data that AI models are trained on, wasn’t it possible – inevitable, even – that any writer who used AI for prompting or editing would end up copying, inadvertently, the work of others? The trouble is that much of the apparent plagiarism was published in the early 2000s or the 1990s. So unless someone invents a time machine, the theory doesn’t hold.

Moreover, says Yan, 

If plagiarism is defined as having sentences flagged as identical by a checker, then so be it. But the software can only scan texts mechanically; it cannot read the human heart … This so-called reader who exposed the identical texts, you are not a reader in any real sense. You just used the software, being too lazy to read anything yourself … You are merely a reader who is not illiterate.

There is yet one more outré hypothesis about what may have served to promote plagiarism:

Other online analysts noted that a number of the authors involved had attended creative writing MFA programmes, which have been a feature of Chinese universities for the last fifteen years or so. ‘So this is how they teach writing in the universities,’ people speculated. ‘They simply get the students to memorise the classics and graft the masters’ sentences into their imitations.’ The opinion echoed a long-running scepticism towards the institutionalisation – or, as some would have it, the industrialisation – of writing.

In the final analysis, after consulting with another friend, Yan came to the conclusion that the plagiarizers were doing it for money. Creative writing, especially for state-funded journals, is so lucrative that, if you steadily churn out one or two stories a month for them, before long you will be in the top five per cent income bracket.

Yan has been writing in English in addition to Mandarin and Sichuanese. Her first book in English is the 2023 short story collection Elsewhere: Stories. Reviewer Chelsea Leu wrote:

"Yan Ge’s English debut is preoccupied with language, its failures, and its relationship to human emotions and the raw reality – the 'food' – of life. … These stories map out the distance between the head and the gut – the way language can fail to convey the deepest, most visceral facts of life."

Reviewer Sindya Bhanoo wrote that the stories "explore the power of language across the Chinese diaspora to either bring people together or push them apart."

(Wikipedia)

If there's not a dramatic turnaround soon, these practices will take all of the fun out of writing — and reading.

[h.t. John Rohsenow and thanks to Jing Hu]



38 Comments »

  1. Martin Schwartz said,

    February 25, 2026 @ 11:19 pm

    Why should AI have to plagiarise?? It now writes stories online,
    e.g. dramas about children foiled at trying to rip off their aged parents,
    with robotic narrators, but one would not necessarily guess.
    I've seen some AI poetry which may pass muster (for at least a plastic hotdog). Last night BBC reported a robot conductor for the Swedish Philharmonic Orch.; a reviewer complained about the robot's
    expressionless face. That may be resolved by now, easy enough.
    And no, I'm not a
    Martin Schwartz
    robot
    (yet)
    Oy.

  2. Lucas Christopoulos said,

    February 25, 2026 @ 11:47 pm

    AI does not have the ability to achieve the same specificity of analysis from different perspectives. This is part of what makes the human mind so remarkable. The way an individual’s mind works is inherent to that person, and AI cannot follow the same stream of thought. Diversity of knowledge certainly helps, but the perspective of expression is intrinsic to the biological mind.

  3. John Swindle said,

    February 26, 2026 @ 2:41 am

    Not only sentences but whole books can be copied, in which case the English term is "pirating."

    I don't remember whether you've discussed the remarkable 2012 novel 花冠病毒 (huāguān bìngdú, styled “The Corolla Virus” in English), by Bi Shumin 毕淑敏. Bi, a Beijing psychiatrist and writer, fought the SARS epidemic and in her novel predicted the next one, which did occur and became known as COVID-19. When I was able to find a copy to read (nope, my Chinese wasn't good enough) and donate to my local library, it claimed to be written by someone else entirely. I forget who. The library accepted the donation and listed the author correctly as Bi Shumin. What little I could find online confirmed that it was the same work. As far as I know it hasn't been translated into English yet.

  4. Scott P. said,

    February 26, 2026 @ 8:44 am

    Why should AI have to plagiarise?? It now writes stories online,
    e.g. dramas about children foiled at trying to rip off their aged parents,
    with robotic narrators, but one would not necessarily guess.

    That's how AI works: it digests the content of the training material and uses that to predict what is most likely to follow a prompt. So everything an LLM produces is plagiarized.

  5. ajay said,

    February 26, 2026 @ 9:21 am

    If plagiarism is defined as having sentences flagged as identical by a checker, then so be it. But the software can only scan texts mechanically; it cannot read the human heart … This so-called reader who exposed the identical texts, you are not a reader in any real sense. You just used the software, being too lazy to read anything yourself … You are merely a reader who is not illiterate.

    It might be worth clarifying that this quote is not from Yan Ge herself, but from one of the plagiarists!

    Yan's own complaint also seems a little unfair:
    When I lived in China, I fretted over the fact that I couldn’t write more than a couple of short stories a year, failing to realise that I was missing out on making money unless I broadened my resources.
    I don't think there has ever been a time when any professional writer could realistically expect to survive on an output of two short stories a year.

  6. ajay said,

    February 26, 2026 @ 9:22 am

    John Swindell: out of curiosity, how does a novel that has not been translated into English have an English title?

  7. Victor Mair said,

    February 26, 2026 @ 10:22 am

    That's a reasonable question, and it even passed through my mind fleetingly as I read John Swindell's comment, but I quickly recalled that it is not unusual for Chinese books, films, etc. to have English titles when their primary content is in Chinese.

  8. ajay said,

    February 26, 2026 @ 10:54 am

    Ah, I did not know that. On searching I see it has a Chinese title as well, and both appear on the cover of the book. Interesting! Why do they do that, do you know?

  9. Ted McClure said,

    February 26, 2026 @ 11:05 am

    @ajay: It's Library of Congress cataloging practice to create a translated title of a foreign language work for its English language record if the work doesn't already provide one.

  10. cervantes said,

    February 26, 2026 @ 11:17 am

    This is not really anything new. It's a fairly modern standard that you don't copy from extant works. Check out the Gospels if you don't believe me. Tristam Shandy is a famous novel and it's heavily plagiarized. I could go into this at greater length but you get the idea.

  11. Chas Belov said,

    February 26, 2026 @ 3:40 pm

    I seem to recall many years ago reading that China did not have copyright. ¿Was that true? ¿Is it true today?

    I can't speak for books, but it is quite common in Asia for music groups and albums to have English names which appear on the cover rather than an Asian-language name, even though the content is not in, or mostly not in, English.

    For example, groups:

    Japan: The Boom, Lindberg, Soft Ballet
    Thailand: Big Ass, Potato, Bodyslam, So Cool
    China: Black Panthers (this one appears only in Chinese on some releases)
    Taiwan: Mayday (this one often appears bilingually or just in Chinese), Cherry Boom, Fun4
    Malaysia: Estranged

    I believe The Seeds, a group from Laos, would also fall under this list but I can't seem to find any trace of them at the moment to verify.

  12. David Marjanović said,

    February 26, 2026 @ 4:14 pm

    Yan's own complaint also seems a little unfair:

    When I lived in China, I fretted over the fact that I couldn’t write more than a couple of short stories a year, failing to realise that I was missing out on making money unless I broadened my resources.

    I don't think there has ever been a time when any professional writer could realistically expect to survive on an output of two short stories a year.

    Even within English, though, "a couple" doesn't mean "two" for everyone; many use it for "a few".

    I'm not aware of any other language that has a separate expression for unstressed "two".

  13. Bob Ladd said,

    February 26, 2026 @ 4:36 pm

    David Marjanović – Surely German "ein Paar" is used often enough in contexts very much like English "a couple" to mean "a few", not exactly two. Or am I missing something in your comment about "unstressed two"?

  14. Chas Belov said,

    February 26, 2026 @ 4:37 pm

    Correction:
    Laos: Cells (bilingual on some covers)
    That explains why I couldn't find them.

    Also, Japan: The Blankey Jet City.

  15. JPL said,

    February 26, 2026 @ 5:19 pm

    Just to point out, from the text provided above, that our commenter is called "John Swindle", not "Swindell".

  16. Steve Morrison said,

    February 26, 2026 @ 9:20 pm

    Hmm. I wonder what title the Library of Congress uses for Les Misérables?

  17. Martin Schwartz said,

    February 26, 2026 @ 10:25 pm

    @Bob Ladd: While Yiddish por / pur can mean 'pair', I primarily heard
    a por/pur (cf. German ein Paar) for 'a few'.
    Martin Schwartz

  18. 번하드 said,

    February 27, 2026 @ 3:21 am

    @ajay: I think the OP talks about one or two stories a month, not a year, which would make the income claim sound less improbable.

  19. ajay said,

    February 27, 2026 @ 4:50 am

    I think the OP talks about one or two stories a month, not a year, which would make the income claim sound less improbable.

    The linked post talks about both, if you read it closely. Yan Ge bemoans that she was not able to write more than a couple of stories A YEAR, and was therefore not able to emulate other authors who made a comfortable living by writing one or two stories A MONTH.

    It's Library of Congress cataloging practice to create a translated title of a foreign language work for its English language record if the work doesn't already provide one.

    Ah, thank you – so this is for the benefit of libraries which have English (or some Roman alphabet language) as their working language but might still want to have books written in Chinese in their collections, and so will need to be able to log them as accessions with a Roman-alphabet title? That makes sense.

  20. Pierre Menard said,

    February 27, 2026 @ 4:50 am

    This is not really anything new. It's a fairly modern standard that you don't copy from extant works. Check out the Gospels if you don't believe me. Tristam Shandy is a famous novel and it's heavily plagiarized. I could go into this at greater length but you get the idea.

  21. ajay said,

    February 27, 2026 @ 4:57 am

    our commenter is called "John Swindle", not "Swindell".

    Apologies! I have never come across the former spelling and clearly my brain autocorrected.

    I wonder what title the Library of Congress uses for Les Misérables?

    Whatever the title of that particular edition is, I should think – most of the translations keep the original French title.

    What other books are there which are universally known by their original-language titles even in English translation? "Das Kapital" is the only one that comes immediately to mind.

  22. ajay said,

    February 27, 2026 @ 5:07 am

    This is not really anything new. It's a fairly modern standard that you don't copy from extant works. Check out the Gospels if you don't believe me. Tristam Shandy is a famous novel and it's heavily plagiarized. I could go into this at greater length but you get the idea.

    I regret deeply that the site did not allow me to post this comment again under the name "Pierre Menard".

  23. Bob Ladd said,

    February 27, 2026 @ 5:35 am

    @ajay: nice question. A few possibilities are Caesar's De Bello Gallico and Dante's Divina Commedia. Otherwise I can only think of works whose titles involve the names of characters, like Don Quixote or Buddenbrooks.

  24. Philip Taylor said,

    February 27, 2026 @ 5:40 am

    Not really a circle in which I move, but personally speaking I have never heard of Dante’s work being referred to as anything other than "[The] Divine Comedy".

  25. Robot Therapist said,

    February 27, 2026 @ 5:53 am

    The other day, I came across a youtube video of a story purporting to be a "Rumpole" story. I took that to mean it was one of John Mortimer's lovely legal "Rumpole of the Bailey" stories. However, on listening, it seemed to be an AI copy. And it kind of had the right tone, but the plot simply did not make any sense.

  26. Robot Therapist said,

    February 27, 2026 @ 5:54 am

    If a friend suggested we "pop" into the bar for "a couple" of beers, that would mean four beers.

  27. ajay said,

    February 27, 2026 @ 10:21 am

    A few possibilities are Caesar's De Bello Gallico and Dante's Divina Commedia

    Yes, colloquially you might use those in speech, but I don't think I've seen an English translation of Dante which is titled "La Divina Commedia" rather than "The Divine Comedy". (And the original title was just "Comedia", anyway.) Loeb calls its edition "The Gallic War". On the other hand all the English editions of "Les Miserables" seem to be titled "Les Miserables".

    But Xenophon is quite often published in English translation as "Anabasis" rather than as "The March Up-Country" or whatever the English version would be. And would "The Iliad" and "The Odyssey" count? I know that the exact titles would be "Ilias" and "Odysseia" in Roman script, but I haven't seen any translations that try to translate the titles into English as, say, "Odysseus' Tale" and "Troy Story".

  28. KevinM said,

    February 27, 2026 @ 7:41 pm

    @cervantes, @Pierre Menard. I see what you did there.

  29. Michael said,

    February 27, 2026 @ 7:47 pm

    I doubt it applies, but I'm reminded of the movie "The 400 Blows," in which the teenage protagonist is deeply inspired reading Hugo, and, wanting to make an impression with his essay, gets carried away and writes an essay from Hugo as his assignment. The professor shames him in front of the class, reinforcing his rejection of formal education, missing the fact that a reading assignment had moved him deeply (though obviously he needed guidance on creative writing).

  30. Michael Vnuk said,

    February 28, 2026 @ 2:32 am

    Adolf wrote a book known mainly by its German title.

  31. ajay said,

    February 28, 2026 @ 4:04 pm

    Good point. And I suppose the Torah is still the Torah whatever language it is in, as are the Koran and the Talmud and the Bhagavad Gita and the Shahnameh.

  32. Rodger C said,

    March 1, 2026 @ 10:36 am

    I thought that only the Qur'an in Arabic is "really" the Qur'an.

  33. Philip Taylor said,

    March 1, 2026 @ 11:27 am

    But "The القرآن that can be recited is not the eternal القرآن; The القرآن that can be named is not the eternal القرآن" —

    The nameless is the origin of Heaven and Earth
    The named is the mother of myriad things
    Thus, constantly without desire, one observes its essence
    Constantly with desire, one observes its manifestations
    These two emerge together but differ in name
    The unity is said to be the mystery
    Mystery of mysteries, the door to all wonders

    (if one may be permitted to mix one's belief systems).

  34. Philip Taylor said,

    March 1, 2026 @ 11:38 am

    Oh. I fear I have almost certainly used a tautology — I assume that "القرآن" already subsumes the definite article … Perhaps one of our revered moderators would be kind enough to remove the four "the"s in the first sentence of my previous message and then delete this comment.

  35. ajay said,

    March 2, 2026 @ 7:27 am

    I thought that only the Qur'an in Arabic is "really" the Qur'an.

    To an observant Muslim, yes. But if you buy a translation of the Quran into English, the title on the front will probably be "The Quran" or "The Koran" – like this, for example https://www.penguin.co.uk/books/35169/the-koran-by-translated-with-notes-by-nj-dawood/9780141393841

    Penguin didn't publish it as "The Recitation" or something.

  36. Rodger C said,

    March 2, 2026 @ 10:40 am

    Mohammed Mar

    Pickthall (a convert) published it as "The Meaning of the Glorious Koran," and his was the usual English version for a long time.

  37. Rodger C said,

    March 3, 2026 @ 10:44 am

    *Mohammed Marmaduke Pickthall

  38. ajay said,

    March 4, 2026 @ 4:20 am

    Mohammed Marmaduke Pickthall

    This, surely, is a PG Wodehouse character.
