Editors' note: this guest entry has been kindly contributed by Pablo Achard (University of Geneva). After a PhD in particle physics at CERN and the University of Geneva (Switzerland), Pablo Achard moved to the universities of Marseilles (France), then Antwerp (Belgium) and Brandeis (MA) to pursue research in computational neuroscience. He currently works at the University of Geneva, where he supports the Rectorate on bibliometrics and strategic planning issues. Our thanks to Dr. Achard for this insider's take on the challenges of making sense of world university rankings.
Kris Olds & Susan Robertson
While national rankings of universities can be traced back to the 19th century, international rankings appeared only at the beginning of the 21st century. Shanghai Jiao Tong University’s and Times Higher Education’s (THE) rankings were among the pioneers and remain among the most visible ones. But you might have heard of similar league tables designed by the CSIC, the University of Leiden, the HEEACT, QS, the University of Western Australia, RatER, Mines Paris Tech, etc. Such a proliferation certainly responds to a high demand. But what are they worth? I argue here that rankings are blurry pictures of the academic landscape. As such, they are much better than complete blindness but should be used with great care.
The image of the academic landscape captured by the rankings is always a bit out of focus. This is improving with time, and we should acknowledge the rankers who make considerable efforts to improve the sharpness. Nonetheless, the sharp image remains an ideal that is impossible to reach.
First of all, it is very difficult to get clean and comparable data on such a large scale. Reality is always grey; the act of counting is black or white. Take such a central element as a “researcher”. What should you count? Heads or full-time equivalents? Full-time equivalents based on their contracts or on the effective time spent at the university? Do you include PhD “students”? Visiting scholars? Professors on sabbatical? Research engineers? Retired professors who still run a lab? Deans who don’t? What do you do with researchers affiliated with non-university research organizations that are still loosely connected to a university (think of Germany or France here)? And how do you collect the data?
This difficulty in obtaining clean and comparable data is the main reason for the lack of any good indicator of teaching quality. To do it properly, one would need to evaluate the level of knowledge of the students upon graduation, and possibly compare it with their level when they entered the university. To this end, the OECD is launching a project called AHELO, but it is still in its pilot phase. In the meantime, some rankers use poor proxies (like the percentage of international students) while others focus their attention on research outcomes only.
Second, some indicators are very sensitive to “noise” due to small statistics. This is the case for the number of Nobel prizes used by the Shanghai ranking. No doubt having 20 of them in your faculty says something about its quality. But having one, obtained years ago, for work partly or fully done elsewhere? Because of the long-tailed distribution of university rankings, such a unique event won’t push a university ranked 100 into the top 10, but a university ranked 500 can gain more than a hundred places.
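To make the long-tail argument concrete, here is a toy sketch. Everything in it is an assumption made for illustration: the power-law shape of the scores, the exponent, the size of the bonus and the function names are hypothetical, and none of it reproduces the Shanghai methodology. It only shows why the same small bonus moves a university much further in the flat tail of a distribution than near the steep top.

```python
# Toy model: ASSUMED power-law scores, not real ranking data.
def toy_score(rank, exponent=0.5):
    """Hypothetical long-tailed score: steep at the top, flat in the tail."""
    return 100.0 * rank ** -exponent


def rank_after_bonus(rank, bonus, n_universities=1000):
    """New rank of the university at `rank` once `bonus` points are added."""
    boosted = toy_score(rank) + bonus
    # Count the other universities that still score strictly higher.
    still_ahead = sum(
        1
        for other in range(1, n_universities + 1)
        if other != rank and toy_score(other) > boosted
    )
    return still_ahead + 1


BONUS = 0.6  # hypothetical points awarded for a single prize
for start in (100, 500):
    end = rank_after_bonus(start, BONUS)
    print(f"rank {start} -> rank {end} (gains {start - end} places)")
```

Under these assumed numbers, the university starting at rank 100 stays far from the top 10, while the one starting at rank 500 gains more than a hundred places from exactly the same bonus.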
This dynamic seemed to occur in the most recent THE ranking. In their new methodology, the “citation impact” of a university counts for one third of the final score. Not many details were given on how this impact is calculated. But the description on THE’s website, and the way this impact is calculated by Thomson Reuters (which provides the data to THE) in its commercial product InCites, make me believe that they used the so-called “Leiden crown indicator”. This indicator is a welcome improvement on the raw ratio of citations per publication since it takes into account the citation behaviours of the different disciplines. But it becomes unstable if you look at a small set of publications, or at publications in fields where you don’t expect many citations: the denominator can become very small, leading to sky-high ratios. This is likely what happened with Alexandria University. According to this indicator, Alexandria ranks 4th in the world, surpassed only by Caltech, MIT and Princeton. This is an unexpected result for anyone who knows the world research landscape.
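The small-denominator effect can be illustrated with a simplified field-normalized ratio: total citations divided by the total number of citations expected for papers in the same fields. This is a sketch under stated assumptions, not the exact formula used by THE or Thomson Reuters, which was not published in detail. The function name and all figures below are hypothetical.

```python
def normalized_impact(papers):
    """Total citations divided by total field-expected citations.

    `papers` is a list of (citations, expected_citations) pairs, where the
    expected value is the (assumed) world average for that paper's field.
    """
    total_citations = sum(cites for cites, _ in papers)
    total_expected = sum(expected for _, expected in papers)
    return total_citations / total_expected


# Hypothetical large university: 5,000 papers, each 50% above field average.
large = [(12, 8.0)] * 5000

# Hypothetical small set: 10 papers in a low-citation field (0.5 expected
# citations each), one of which happens to collect 45 citations.
small = [(0, 0.5)] * 9 + [(45, 0.5)]

print("large:", normalized_impact(large))
print("small:", normalized_impact(small))
```

In this made-up case, a single well-cited paper in a field where few citations are expected lifts the small set far above the large, consistently strong university, although nine of its ten papers were never cited.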
Third, it is well documented that the act of measuring triggers the act of manipulating the measure. And this is made easy when the data are provided by the universities themselves, as for the THE or QS rankings. One can only be suspicious when reading the cases highlighted by Bookstein and colleagues. “For whatever reason, the quantity THES assigned to the University of Copenhagen staff-student ratio went from 51 (the sample median) in 2007 to 100 (a score attained by only 12 other schools in the top 200) […] Without this boost, Copenhagen’s […] ranking would have been 94 instead of 51. Another school with a 100 student-staff rating in 2009, Ecole Normale Supérieure, Paris, rose from the value of 68 just a year earlier, […] thus earning a ranking of 28 instead of 48.”
Pictures of a landscape are taken from a given point of view
But let’s suppose that the rankers can improve their indicators to obtain perfectly focused images. Let’s imagine that we have clean, robust and hard-to-manipulate data to rely on. Would the rankings give a neutral picture of the academic landscape? Certainly not. There is no such thing as “neutrality” in any social construct.
Some rankings are built with a precise output in mind. The most laughable example of this was Mines Paris Tech's ranking, placing itself and four other French “grandes écoles” in the top 20. This is probably the worst flaw of any ranking. But other types of biases are always present, even if less visible.
Most rankings are built with a precise question in mind. Let’s look at the evaluation of the impact of research. Are you interested in finding the key players, in which case the volume of citations is one way to go? Or are you interested in finding the most efficient institutions, in which case you would normalize the citations to some input (number of articles, number of researchers or budget)? Different questions need different indicators, hence different rankings. This is the approach followed by Leiden, which publishes several rankings at a time. However, this is not the sexiest or most media-friendly approach.
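A minimal sketch shows how the same data answer the two questions differently. The three institutions and their figures are invented for illustration, and citations per article is only one of the possible normalizations mentioned above.

```python
# Hypothetical institutions: (name, total citations, number of articles).
institutions = [
    ("Big U", 90000, 30000),
    ("Mid U", 40000, 8000),
    ("Small U", 12000, 1500),
]

# Question 1: who are the key players? Rank by sheer volume of citations.
by_volume = sorted(institutions, key=lambda row: row[1], reverse=True)

# Question 2: who is most efficient? Rank by citations per article.
by_efficiency = sorted(institutions, key=lambda row: row[1] / row[2], reverse=True)

volume_order = [name for name, _, _ in by_volume]
efficiency_order = [name for name, _, _ in by_efficiency]

print("by volume:    ", volume_order)
print("by efficiency:", efficiency_order)
```

With these made-up numbers the two orderings are exact opposites: the institution that dominates by volume comes last once citations are normalized by output.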
Finally, all rankings are built with a model in mind of what a good university is. “The basic problem is that there is no definition of the ideal university”, a point made forcefully today by University College London's Vice-Chancellor. Often, the Harvard model is the implicit model. In this case, getting Harvard on top is a way to check for “mistakes” in the design of the methodology. But the missions of the university are many. One usually talks about the production (research) and the dissemination (teaching) of knowledge, together with a “third mission” towards society that can in turn have many different meanings, from the creation of spin-offs to the reduction of social inequities. For these different missions, different indicators are to be used. The salary of fresh graduates is probably a good indicator to judge MBAs and certainly a bad one for liberal arts colleges.
To pursue the metaphor with photography, every single snapshot is taken from a given point of view and with a given aim. Points of view and aims can be visible, as is the case in artistic photography. They can also lay claim to neutrality, as in photojournalism. But this neutrality is wishful thinking. The same applies to rankings.
Rankings are nevertheless useful pictures. Insiders who have a comprehensive knowledge of the global academic landscape understandably laugh at rankings’ flaws. However, the increase in the number of rankings and in their use tells us that they fill a need. Rankings can be viewed as the dragon of New Public Management and accountability assaulting the ivory tower of disinterested knowledge. They certainly participate in a global shift in the contract between society and universities. But I can hardly believe that the Times would spend thousands if not millions for such a purpose.
What then is the social use of rankings? I think they are the most accessible vision of the academic landscape for millions of “outsiders”. The CSIC ranks around 20,000 (yes twenty thousand!) higher education institutions. Who can expect everyone to be aware of their qualities? Think of young students, employers, politicians or academics from not-so-well connected universities. Is everyone in the Midwest able to evaluate the quality of research at a school strangely named Eidgenössische Technische Hochschule Zürich?
Even to insiders, rankings tell us something. Thanks to improvements in the picture’s quality and to the multiplication of points of view, rankings form an image that is not uninteresting. If a university is regularly in the top 20, this is significant. You can expect to find there one of the best research and teaching environments. If it is regularly in the top 300, this is also significant. You can expect to find one of the few universities where the “global brain market” takes place. If a country, like China, increases its share of good universities over time, this is significant too: it suggests that a long-term 'improvement' (at least in the direction of what is being ranked as important) of its higher education system is under way.
Of course, any important decision concerning where to study, where to work or which project to embark on must be taken with more criteria than rankings. Just as one would never go mountain climbing based solely on blurry snapshots of the mountain range, one should not use rankings as the sole source of information about universities.
See Ben Wildavsky, The Great Brain Race: How Global Universities are Reshaping the World, Princeton University Press, 2010, and more specifically its chapter 4, “College rankings go global”.
The Leiden researchers have recently decided to adopt a more robust indicator for their studies (http://arxiv.org/abs/1003.2167). But whatever the indicator used, the problem will remain for small statistical samples.
See recent discussions on the University Ranking Watch weblog for more details on this issue.