New Metric System?

Measuring scholarly influence by citations made sense... once. Scott McLemee looks at the emerging alternatives.

Two or three times a month, I head to a research library to spend at least a couple of hours at a computer terminal tracking down journal articles or papers from various databases -- and then printing them out, since I can’t really absorb an argument without pen in hand to mark up the text. A few months ago, a German graduate student showed me what he was taking back home after several weeks here on a fellowship: a zip drive on his keychain, containing thousands of pages of text. I respect the efficiency, and do feel bad about the dead trees. But my cognitive wiring is not readily upgradable; and anyway, there is evidence to suggest that it has its own benefits.

As for the routine of trawling through databases, the latest issue of Against the Grain puts it into an interesting context. I’ve mentioned the magazine in this column from time to time as one focused on the “inside baseball” of scholarly publishing and research-library life. You could call Against the Grain a publication of record for intellectual infrastructure. But that seems too solemn for a journal that, besides the serious articles, runs its fair share of gossip and general quirkiness. The new issue includes a poem that parodies the one by William Blake about the "tyger" -- updated so that it is about Kindle. (Let none but hardcore nerds enter here.)

The September issue looks at the topic of metrics for scholarly journals. Usage statistics are a factor in decisions about what a library should be carrying. You can see where this would tend to have a self-reinforcing effect. Most of us have drawn up a mental list of the major journals in our fields of interest -- a rough estimate of their relative prestige and influence. That, not surprisingly, can be quantified. But changes in scholarly publishing have called into question just how this is done.

“There is huge diversity among scholars and the ways in which they use and cite scholarly publishing,” writes the guest editor, Peter Shepherd, in his introduction to the issue. Shepherd is director of Project COUNTER, one of various recent initiatives to create new metrics suited for the way research is done now. (The acronym stands for Counting Online Usage of Networked Electronic Resources.)

A well-established metric in the natural and social sciences is the Journal Impact Factor, which assesses how often articles in a given publication are cited by other journals. This figure has implications going well beyond decision-making about whether or not to renew subscriptions, of course. The impact factor of the journals in which a researcher publishes can influence funding for a department or a project. Some concerns about the influence of this metric were listed in a statement by the European Association of Science Editors. And the impact factor has “limitations that originate from the inherent properties of citation data,” writes Johan Bollen, an associate professor of informatics and computing at Indiana University, in the new Against the Grain. “It can take anywhere from six months to several years to publish an article and for it to become ‘citable,’” he writes -- making the impact factor “a delayed indicator of current scholarly activity,” at best.

But more fundamental are the changes in what counts as a scholarly publication now, and in how scholars work. It is now possible to track not just when an article is cited, but how often it is read, or at least accessed. As we sit at our terminals downloading material for research papers -- whether from journals, digital repositories, or commercial databases -- we create usage data that can be used to analyze how scholarly material is disseminated.

One advantage of this, writes Bollen, is that “usage data can be recorded for a wide variety of participants in the scholarly communication process, not merely those who publish journal articles, and can in principle be recorded for any online resource including books, data files, software, images, and sound files.” It also matters that this usage data “is recorded at a very large scale that may exceed the magnitude of all existing citations by several orders of magnitude” -- allowing for “a more reliable assessment of scholarly activity and impact” than is codified in bibliographies.

Of course, none of the records have much value in their raw state, sitting there on university servers. Bollen is the principal investigator of the MESUR (Metrics from Scholarly Usage of Resources) project, which is “aggregating otherwise separately recorded usage data sets from the world’s most significant publishers, aggregators, and institutional consortia.” The data is threshed and sorted, allowing for “the reconstruction of user clickstreams, i.e. the sequence of how a user moves from one article (and journal) to the next in a session.”

This makes it possible to calculate how likely it is that a researcher downloading a given paper will go on to look at another.
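
For readers who want to see the arithmetic, here is a minimal sketch of that kind of calculation. It is not MESUR's actual method, which the issue does not spell out at this level; the session data, the article labels, and the function name transition_probabilities are all hypothetical. It simply counts how often one article is followed by another within a session and turns the counts into proportions.

```python
from collections import Counter, defaultdict


def transition_probabilities(sessions):
    """Estimate how often each article is followed by each other article.

    `sessions` is a list of clickstreams; each clickstream is the ordered
    list of article identifiers one user accessed in a single session.
    (Hypothetical input format, assumed for illustration only.)
    """
    pair_counts = defaultdict(Counter)
    for session in sessions:
        # Count every consecutive (current, next) pair within one session.
        for current, following in zip(session, session[1:]):
            pair_counts[current][following] += 1

    probabilities = {}
    for current, followers in pair_counts.items():
        total = sum(followers.values())
        # Share of onward moves from `current` that went to each next article.
        probabilities[current] = {
            nxt: count / total for nxt, count in followers.items()
        }
    return probabilities


# Made-up sessions: "A", "B", and "C" stand in for article identifiers.
sessions = [
    ["A", "B", "C"],
    ["A", "B"],
    ["A", "C"],
    ["B", "C"],
]

probs = transition_probabilities(sessions)
print(probs["A"])
```

Note that these proportions describe only readers who went on to another article; a session that ends on a paper contributes nothing to that paper's onward counts. A real analysis would also need to decide how to define a session and how to handle repeat downloads.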

Other contributors describe work on data-mining initiatives such as PIRUS (Publisher and Institutional Repository Usage Statistics) and SNIP (Source Normalized Impact per Paper). As with much else in Against the Grain, these are developments that ordinary civilians remain blithely unaware of -- but that will quietly reshape the landscape of research.