In 2019, the Office of Scientific Integrity at our institution, Duke University, held a town hall meeting on plagiarism. It began with an overview that included material on self-plagiarism: the reuse of an author’s previously published material. A dean from the graduate school then spoke about the plagiarism-detection platform iThenticate, which compares submitted papers against published papers and identifies passages that are identical or nearly so. Many scientific journals and major granting agencies, we were told, are also using iThenticate. Now that Duke had adopted the software, those in attendance were encouraged to use it to check their manuscripts and grants for plagiarism and self-plagiarism prior to submission.
To humanities scholars, this situation might seem nonsensical; you wouldn’t need a computer to tell you if you’d plagiarized another writer or inappropriately reused your own material. But the situation is different in science, technology, engineering and mathematics fields. One difference is that research progresses in incremental steps through multiple closely related publications. While publishers expect each paper to offer a novel and substantive advance over previous work, STEM researchers often have reason to repeat some details from their earlier papers, such as definitions, overviews of previous research or methodological details.
Another difference is that STEM research articles usually have multiple authors -- various combinations of faculty members, graduate students, research faculty members and postdocs. Some junior researchers may not yet have learned the norms for reusing materials in their discipline, and even experienced researchers may have different sensibilities about what’s appropriate when it comes to reuse. Further, because authors might draft different parts of an article, a principal investigator may not notice if a co-author has reused material in ways that they would deem inappropriate. In such situations, iThenticate-generated reports could help researchers not just to discover problems but also to facilitate learning regarding what is and is not considered acceptable practice.
But for authors looking for guidance on how to reuse their previously published material appropriately, resources are limited. Interestingly -- and, as it turns out, problematically -- one of the most widely accessed guides is iThenticate’s own 2011 white paper “The Ethics of Self-Plagiarism.” In a Google search for “self-plagiarism” we conducted this past August, iThenticate’s paper was the second hit out of 310,000. Its influence has extended to graduate schools’ Responsible Conduct of Research training materials, university library guidelines on plagiarism, ethical guidelines of scholarly societies, educational blogs and even an Albanian government policy paper. The document is also cited in multiple educational columns, editorials and author guidelines of scientific journals.
While iThenticate’s paper may be perceived as an authoritative guide, it is also the product of a for-profit plagiarism-detection business with a virtual monopoly across academe, scholarly publishing and government. (IThenticate’s parent company, Turnitin, was sold last year for an estimated $1.75 billion.) Given this clear conflict of interest, the academic community should know exactly what this document says, as well as what it is: a misrepresentation of the realities and ethics of academic research -- and a guide that leads to worse, not better, writing.
Misrepresentation in ‘The Ethics of Self-Plagiarism’
IThenticate’s paper begins with this hypothetical scenario:
Leslie is an assistant professor going through tenure review with significant pressure to publish. An article she is writing for a journal piggybacks on a recent conference presentation that was also published by the conference sponsor. Leslie would like to integrate the writing from the conference presentation into the article. She faces an ethical dilemma: to repurpose her own writing from one text and use it for another, thereby increasing her number of publications for tenure, but from the same work. Doing so, Leslie might commit what [Rochester Institute of Technology communications professor Patrick] Scanlon … calls "academic fraud," a form of self-plagiarism.
(Yes, we noted that, in the last words of this passage, the authors of the document have oddly made the broader category, academic fraud, a subset of an item within that category, self-plagiarism.)
While the scenario states that Leslie faces an “ethical dilemma,” its details suggest something rather different. In many STEM fields, journals published by scholarly societies routinely solicit submissions from authors of proceedings papers from their own conferences -- often asking authors to add material or revise prior to submission. For instance, the Electrochemical Society and the IEEE Robotics and Automation Society both provide instructions for authors of conference proceedings papers on how to submit versions of those papers to their journals. To associate such reuse with the hyperbolic category of “academic fraud” is clearly misleading.
Ironically, this document might itself be considered fraudulent in its misuse of the 2007 source it cites on this very point. In describing a comparable situation, Patrick Scanlon actually states in his essay that an academic committee on which he served was “unanimous in finding that [the situation before them] did not constitute academic fraud” (emphasis ours).
This erroneous analysis of academic reuse extends to the white paper’s conceptual framing of “self-plagiarism.” According to the authors, “the ethical issue of self-plagiarism is significant, especially because self-plagiarism can infringe upon a publisher’s copyright.” But copyright infringement is a matter of law, not of ethics. Further, just as soon as the topic of law is raised, it is abandoned for a cursory and amateurish survey of definitions of plagiarism -- drawn not from the extensive scholarship on plagiarism but from online dictionaries.
But, the paper then tells us, we shouldn’t worry too much about definitions since what we’re really interested in “is the ethics behind self-plagiarism.” And yet only a single, one-page section of this document, titled “Ethical Issues of Self-Plagiarism,” is even putatively devoted to actual ethical matters -- and even this soon wanders back to an inaccurate discourse on copyright law.
The remainder of the section is no better, providing unsupported and misleading speculation about the American Psychological Association’s publication manual. Readers learn that while the fifth edition did not address self-plagiarism, the sixth edition (the most recent in 2011) did -- which is true. The interpretation of this addition as evidence that the APA “has taken a recent position against the practice” is, however, specious. In fact, that edition of the APA manual, as well as the recent seventh edition, articulates a rather nuanced policy on reuse. It establishes that authors should not recycle material in ways that mislead readers or result in duplicate publication of main content, but also that they may appropriately recycle in specific situations with, or even sometimes without, self-citation.
Much of the paper’s penultimate section, “Avoiding Self-Plagiarism,” is merely an advertisement for iThenticate’s services -- a sales pitch that is more subtly reprised in the paper’s conclusion: “Organizations and individual authors and researchers can take preventative measures in their writing practices and editing processes, including the use of technology that helps detect potential self-plagiarism before submitting their work for publication” (emphasis ours).
It does not take a careful reading of this paper to see that it is neither a scholarly nor an educational document. It is, rather, a marketing tool for a brilliant if questionable business strategy in which publishers pay for the detection service, and research institutions pay to ensure that their employees’ papers won’t get flagged. Profit on both ends.
A Better Alternative: ‘Text Recycling’
What exactly does iThenticate expect to happen when clients find instances of “self-plagiarism” identified in their papers prior to submission? As scholars in writing studies, we can tell you: many will use the software to game the system -- manipulating flagged passages by swapping in synonyms, changing tenses and rearranging clauses until the program generates a clean enough report. (Why else would Turnitin, the student version of this software, give students three tries for each paper?)
Rather than learning how to reuse material from their prior works thoughtfully and transparently in ways that align with the expectations of their fields, many researchers are learning that they should disguise it. While writing teachers across the country are instructing their students in how not to patch write, iThenticate is teaching faculty how.
The result will almost certainly be worse writing. When the aim of revision is simply to avoid detection, the passing version will likely be less clear and pleasing than the original. The document’s most important readers -- those who are following the authors’ research closely -- may have difficulty determining precisely how the new study differs from the prior one. These outcomes are especially likely for papers written by the many -- if not the majority of -- scientists on the planet for whom English is not their first language but who must write in English for publication. It is easy to say, “One can always find another way to express an idea,” but clearly communicating science poses a considerable challenge in itself. Expecting researchers to come up with different ways to express complex ideas for the sole purpose of appeasing software seems unnecessarily burdensome.
In place of the fear-evoking label of “self-plagiarism,” we instead advocate for “text recycling.” By this we refer to the reuse of one’s previously written materials without the syntax of quotation and regardless of its acceptability. Unlike plagiarism, text recycling may be, as the Committee on Publication Ethics puts it, “unavoidable” or even “desirable.” Professional organizations, publishers and scholarly societies such as the American Psychological Association and the Association for Computing Machinery have also formally stated that some uses of text recycling can be appropriate.
The shift from self-plagiarism to text recycling offers a neutral framework for learning about and teaching this common discursive practice. With a theory of text recycling, researchers, research integrity officers and editors can think openly and honestly about the conditions under which researchers can reuse their previously written materials in ways that serve the advancement of their science without compromising integrity. Mentors can teach their recycling practices, without worry, as part of the research-writing process. And authors would no longer find themselves pressured to reformulate perfectly effective prose merely to satisfy the plagiarism-detection machine.
We do not mean to imply that text recycling doesn’t involve complicated ethical and practical issues. It most certainly does. One of them is how iThenticate’s hypothetical scholar, Leslie, should represent the pair of papers in her CV and tenure portfolio. In situations like those, scholars should be mindful not to misrepresent such documents as distinct intellectual contributions. Given how common such situations are, institutions should make formal policies stating how faculty members should signal the relationship between “derivative” works and their precursors in their reappointment and promotion materials.
In addition to advocating for increased attention to responsible text recycling, we urge iThenticate to remove this white paper from its website. And given iThenticate’s conflict of interest, the parent corporation should refrain from participating in the ethically and legally complex conversations surrounding text recycling. They should leave it to the research and publishing communities to sort out best practices based on careful, unbiased and informed consideration of the relevant ethical and legal matters.
Those who wish to understand the ethics of text recycling should avoid iThenticate’s paper altogether and instead read one of the many well-written and well-informed papers on the topic. Our first suggestion is Scanlon’s smart and nuanced essay. Readers will find, at its conclusion, not a sales pitch but this thoughtful statement, which also sums up our own views: “We do and should give writers legal and ethical latitude for limited self‐copying, although certainly not for egregious duplication.”