Michael Lewis’s 2003 book, Moneyball -- later made into a movie starring Brad Pitt -- tells the story of how predictive analytics transformed the Oakland Athletics baseball team and, eventually, baseball itself. Data-based modeling has since transcended sport. It’s used in hiring investment bankers, for example. But is academe really ready for its own “moneyball moment” in terms of personnel decisions?
A group of management professors from the Massachusetts Institute of Technology think so, and they’ve published a new study on a data-driven model they say is more predictive of faculty research success than traditional peer-based tenure reviews. In fact, several of the authors argue in a related essay in MIT Sloan Management Review that it’s “ironic” that “one of the places where predictive analytics hasn’t yet made substantial inroads is in the place of its birth: the halls of academia. Tenure decisions for the scholars of computer science, economics and statistics -- the very pioneers of quantitative metrics and predictive analytics -- are often insulated from these tools.”
Many professors oppose the use of bibliometrics in hiring, tenure and promotion decisions, saying that scholarly potential can’t be captured in a formula most often applied to pursuits with a bottom line, like winning games or playing the stock market. Such a system inevitably will be “gamed” by academics, critics say, and time-consuming but ground-breaking research will be sidelined in favor of “sure things” in terms of publishing -- the academic equivalent of clickbait.
But in Sloan Review, the researchers argue that making data-based personnel decisions is in the public interest. “These decisions impact not just the scholars’ careers but the funding of universities and the overall strength of scientific research in private and public organizations as well,” they say. “On an individual level, a tenured faculty member at a prestigious university will receive millions of dollars in career compensation. At a broader scope, these faculty will bring funding into the universities that house them. The National Science Foundation, for instance, provided $5.8 billion in research funding in 2014, including $220 million specifically for young researchers at top universities.”
Bringing predictive analytics to any new industry means “identifying metrics that often have not received a lot of focus” and “how those metrics correlate with a measurable definition of success,” they note. In the case of academics, they say, important metrics are not just related to the impact of scholars’ past research but also details about their research partnerships and how their research complements the existing literature.
Their study, published in Operations Research, suggests that operations research scholars recommended for tenure by the new model had better future research records, on average, than those granted tenure by the tenure committees at top institutions. Defining future success as the volume and impact of a scholar’s future research, the researchers built models around a concept called “network centrality,” which measures how connected a scholar is within networks associated with success: citations, co-authorship and a combination of the two. Drawing on a large-scale bibliometric database of 198,310 papers published from 1975 to 2012 in the field of operations research, they trained statistical models to predict whether a scholar would perform well on a number of future success metrics, using only data from the scholar’s first five years of publication -- a subset of the information available to tenure committees.
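The study’s published details of the models aren’t reproduced here, but the core idea -- scoring scholars by how connected they are in a co-authorship network -- can be sketched in a few lines. This is a minimal illustration using degree centrality on an invented toy network; the authors’ actual models use richer centrality measures and citation data as well.

```python
# Illustrative sketch only (not the authors' model): degree centrality
# in a toy co-authorship network. Author names and papers are invented.
from itertools import combinations

# Each paper is the set of its co-authors; every pair of co-authors
# on a paper gets an edge in the network.
papers = [
    {"A", "B"},
    {"A", "C"},
    {"B", "C", "D"},
    {"D", "E"},
]

# Build the co-authorship graph as an adjacency map.
graph = {}
for authors in papers:
    for u, v in combinations(sorted(authors), 2):
        graph.setdefault(u, set()).add(v)
        graph.setdefault(v, set()).add(u)

def degree_centrality(g):
    """Fraction of all other scholars each scholar has co-authored with."""
    n = len(g)
    return {node: len(nbrs) / (n - 1) for node, nbrs in g.items()}

centrality = degree_centrality(graph)
# Scholars ranked by how connected they are in the network.
ranked = sorted(centrality, key=centrality.get, reverse=True)
```

In the study’s framing, a score like this (computed over a scholar’s first five years of publications) would be one input feature to a statistical model of future research success, not a decision rule on its own.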
Regarding tenure decisions, the authors curated a data set of 54 scholars who obtained Ph.D.s after 1995 and held assistant professorships at top operations research programs by 2003. They found that their statistical models made different -- read: better -- decisions than did tenure committees for 16 candidates, or 30 percent of the sample, when constrained to selecting the same number of candidates from the pool to tenure as did the committees.
The result was "a set of scholars who, in the future, produced more papers published in the top journals and research that was cited more often than the scholars who were actually selected by tenure committees,” the researchers say in Sloan Review. More precisely, they had better A-journal paper counts, citation counts and h-indexes.
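Of the metrics named above, the h-index is the most compact: it is the largest number h such that the scholar has h papers each cited at least h times. A minimal sketch (with hypothetical citation counts, not data from the study):

```python
# The h-index: the largest h such that h of the scholar's papers
# have at least h citations each.
def h_index(citations):
    """citations: citation count for each of a scholar's papers."""
    cites = sorted(citations, reverse=True)
    h = 0
    for rank, c in enumerate(cites, start=1):
        if c >= rank:
            h = rank  # the paper at this rank still clears the bar
        else:
            break
    return h

# Hypothetical citation counts for one scholar's six papers.
h_index([25, 8, 5, 4, 3, 1])  # -> 4: four papers with at least 4 citations
```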
The authors note some limitations of their study -- namely, that it doesn’t consider service, which, along with teaching, is a facet of academic life that can’t be easily captured in numbers.
While they also say that their data set is small and confined to tenure in one field, they argue that “similar models could be used in a variety of academic contexts, such as hiring new professors, evaluating candidates for grants and awards, and hiring scholars who previously held tenure-track positions at other institutions.”
Prediction models also need to be “separately calibrated for a broad range of academic disciplines using a large-scale database,” they say. “One possibility is to develop and distribute the models as a complementary service to an existing bibliometric database like Google Scholar or Web of Science. Models would need to be updated periodically, as patterns of publication change over time.”
Over all, they say, “though further evaluation is needed, the demonstrated effectiveness of these predictive analytic models in the field of operations research suggests that data-driven analysis can be helpful for academic personnel committees. Subjective assessments alone no longer need to rule the day.”
The study was written by Dimitris Bertsimas, Boeing Leaders for Global Operations Professor of Management and professor of operations research; Erik Brynjolfsson, Schussel Family Professor of Management; John Silberholz, a lecturer in management; and Shachar Reichman, assistant professor of management at Tel Aviv University and research affiliate at MIT.
Brynjolfsson said more research is needed, but that he imagined it’s “very likely” the method would work across fields.
Academic “moneyball” already has fans, including Brad Fenwick, senior vice president of global strategic alliances at Elsevier, who made his own baseball analogy in an interview with Inside Higher Ed about faculty analytics earlier this year. Of the new study, he said Monday that it’s clear to him from years of experience related to university faculty hiring and evaluation that “more and better data would be helpful.” Yet data “should inform, not make the decision,” he cautioned.
“We are not at the point of using [artificial intelligence] to replace human judgment. I would note that unless the tenure period is lengthened, citations are a lagging indicator and focused only on value by the academy,” Fenwick added. Downloads -- how many people are actually reading the study, not just citing it -- would also have to be included.
Yves Gingras, professor and Canada Research Chair in history and sociology of science at the University of Quebec at Montreal and author of Bibliometrics and Research Evaluation: Uses and Abuses, has argued that most bibliometrics are meaningless. He had a different take on the study. Unfortunately, he said, we will in the near future hear more and more such “simplistic attempts” at “automatizing” decisions about faculty members.
“The subtle but fundamental problem that the authors of all these kinds of algorithms seem to miss is the fact that hiring is an individual process, not a statistical one on a large group for which we would like to get a good mean,” he said.
The authors imply that tenure committees made a wrong choice, Gingras said, but that’s “based on the assumption that their criteria are the only right ones, whereas one can sensibly imagine that other criteria than the ones they have retained do play an important role in hiring and tenure decisions.” After all, he added, “a university is not only a paper production factory, but a place where people teach and train students.”
Recalling concerns at Rutgers and Georgetown Universities about the accuracy of the bibliometric service Academic Analytics, Gingras said a more serious concern is that it’s “unethical” to use algorithms to make decisions about candidates “without the possibility of checking the accuracy and validity of the chosen indicators.”
Gingras said via email that with “simplistic views of university life curiously coming from supposedly renown institutions, academics will have to constantly recall some basic principles: academic institutions hire individuals with a variety of skills in order to create departments that are opened to different views of things and ways of thinking. One can predict that using 'algorithms' to make decision will only lead to purely homogeneous organizations populated by clones doing all the same kind of research published in the so-called [top] journals.” (MIT Press published the English edition of Gingras’s book.)
Of such criticism, Brynjolfsson said, “If people read the paper carefully, most people will agree with it.” It doesn’t propose replacing tenure committees with quantitative metrics or even having its metrics become the primary source of information, he said.
“Real decisions are based on many factors, which we can’t quantify, including teaching and graduate student advising, helping colleagues, service to the school, shaping research and research agendas in ways that we don’t measure,” he added. “That said, we think having this kind of information can help tenure committees make better decisions, just as ‘moneyball’ helped managers find better players for their baseball teams. When committees are better informed and have objective data, they can weigh it in their ultimate decisions, alongside other factors.”