Cite Check

A scholarly paper finds that a significant proportion of academic citations are faulty, suggesting that many researchers don't read the articles they reference.
July 8, 2008

Citations figure prominently in academic promotion and peer review. In theory, scholarly references serve a dual purpose: They show an author's familiarity with the established literature and assign credit to previous work, while the citations a paper receives, many would argue, signal its relevance and standing within a discipline.

That, of course, is the theory. The reality may surprise many academics who might not stop to think about the system they rely on for the production of knowledge, or who studiously ignore those little superscript numbers that say (again, in theory): "Read the referenced paper to learn why the preceding assertion is correct."

What if it isn't correct? What if the authors didn't even read half of the papers they cited?

Like any self-enclosed, loosely policed network, citations are far from perfect. It's well documented, for example, that researchers tend to cite papers that support their conclusions and downplay or ignore work that calls them into question. Scholars also have ambitions and reputations, so it's not surprising to hear that they might weave in a few citations to articles written by colleagues they're trying to impress -- or fail to cite work by competitors. Maybe they overlook research written in other languages, or aren't familiar with relevant work in a related but different field, or spell an author's name wrong, or list the wrong journal.

All of these shortcomings are reviewed and discussed in an article published this year 1 in the management science journal Interfaces along with the critical responses to it. 2

As it turns out, scholars have already done some work quantifying problem citations, divided into two categories, "incorrect references" and "quotation errors." The authors of the paper, J. Scott Armstrong of the University of Pennsylvania's Wharton School and Malcolm Wright of the Ehrenberg-Bass Institute at the University of South Australia, Adelaide, write of the former type, "This problem has been extensively studied in the health literature ... 31 percent of the references in public health journals contained errors, and three percent of these were so severe that the referenced material could not be located."

More serious than such botched references are articles that incorrectly quote a cited paper or, as the authors put it, "misreport findings." For example, in the same study of health literature 3, they write, "authors’ descriptions of previous studies in public health journals differed from the original copy in 30 percent of references; half of these descriptions were unrelated to the quoting authors’ contentions."

It wasn't until Wright noticed that a paper Armstrong co-authored in 1977 had been inaccurately cited that they realized the extent of the problem, Armstrong said in an interview. It was Wright who suggested investigating the problem further in a more systematic way. So they focused on that specific article, which outlines a precise method for estimating the extent to which non-responses to mail surveys bias the results. Since the article has been heavily cited and the method it describes can be identified, they were able to trace how well articles that reference it represent the original material.

Using a combination of Google Scholar searches and the ISI Citation Index, Armstrong and Wright found results that are disconcerting, even allowing for the fact that Google doesn't crawl every academic paper: Among academic studies using mail surveys, only 6 percent mentioned the non-response problem at all, and of those, 2.1 percent (339 articles) cited the 1977 paper. The authors found 36 variations of the paper's citation among those that referenced it, with an "overall error rate" of 7.7 percent.

By analyzing a sample of 50 papers (out of 1,184) that cite the 1977 article (including the 30 most frequently cited of the bunch), the authors also found significant inaccuracies. By their standards, those papers didn't fare especially well either: "In short, although there were over 100 authors and more than 100 reviewers, all the papers failed to adhere to the [1977 paper's] procedures for estimating nonresponse bias. Only 12 percent of the papers mentioned extrapolation, which is the key element of [the paper's] method for correcting nonresponse bias."

The paper concludes: "Given the understandability of the recommendations and the fact that no one contacted Armstrong or [his 1977 co-author] for clarification, one might question whether the citing authors read the ... paper. To present their studies in a more favorable light, some authors may have wanted to dispel concerns about nonresponse bias; thus, they cited [it] for support for their own procedures. Interestingly, one of our colleagues said that it is common knowledge that authors add references that they have not read in order to gain favor with reviewers. One wonders: If it is possible to write a paper without reading the references, why should the authors expect readers to read the references?"

If the problem is as widespread as Armstrong and Wright suggest -- and Armstrong said he believes the findings generalize to other scientific fields -- then a more systemic fix might be warranted. They offer several common-sense remedies for a problem that the peer review system, it appears, is currently unable to counteract ("My experience is most peer reviewers don’t seem to be competent to do the job," Armstrong says). "When an author uses prior research that is relevant to a finding, that author should make an attempt to contact the original authors to ensure that the citation is properly used," they write.

"As I point out in the paper, I’ve been doing this for years, and it doesn’t really require that much work," Armstrong said. "Generally, I found it to be easy to do. I do it as an author; I hardly get anybody asking me -- they just go ahead and quote me incorrectly."

The paper also argues that researchers should have to verify to journal editors that they've tried to contact the relevant authors, and that they've read the papers they cited. Furthermore, they suggest, there could be a solution waiting on the Web -- one that sounds a lot like a cross between the Wikipedia model and reviews: "Journals should open Web sites (free to nonsubscribers) that allow people to post key papers that have been overlooked, along with a brief explanation of how the findings relate to the published study."

Already, Interfaces and a journal Armstrong co-founded, the International Journal of Forecasting, are planning to introduce those suggestions into their editing processes. Rob J. Hyndman, the forecasting journal's editor-in-chief, said in an e-mail that within two weeks, the Web submission system will include a check box with this text: "Confirm that the list of references has been checked carefully for accuracy and that each of the references has been read by at least one of the authors."

And the 2008 paper? For the record, Armstrong said he and Wright followed their own advice in publishing their research: "Oh yeah, we talked about that. We had to make sure that each one of us had read every one of the papers."

1. Armstrong, J. S., M. Wright. 2008. The Ombudsman: Verification of Citations: Fawlty Towers of Knowledge? Interfaces 38(2) 125-139.

2. This paper is available to download in PDF format.

3. See previous paragraph.

