Britain Tries to Evaluate Teaching Quality

Controversial effort sorts universities into gold, silver and bronze. Some prestigious institutions didn’t get gold.

June 22, 2017

Gold, silver and bronze.

The British government releases the results today of its new three-tiered rating system of teaching quality at universities. The government rating exercise sorting institutions into gold, silver and bronze categories has been controversial on a number of counts and echoes similar accountability movements in the U.S., including performance-funding initiatives at the state level and the Obama administration’s scuttled attempt to create a national college ratings system.

It remains to be seen how much influence the British ratings will have with students and their families, but results of the Teaching Excellence Framework, or TEF, as it is known, could eventually have financial consequences for universities. Future TEF results could be linked to universities’ ability to raise tuition by differential amounts as early as the 2020-21 academic year, after the completion of an independent study.

In the meantime, it will no doubt be widely remarked upon that the results of the first round of the exercise do not follow traditional reputational hierarchies. The Universities of Cambridge and Oxford both scored a gold, but so did many less well-known, regionally focused universities, while the internationally recognized London School of Economics and Political Science settled for a bronze. Two other members of Britain’s elite club of Russell Group universities, the Universities of Liverpool and Southampton, also were rated bronze. A total of 295 institutions opted to participate in the ratings. Excluding those that earned provisional ratings due to insufficient data, roughly a quarter of participating institutions earned gold, another quarter bronze, and the remaining half silver. No institution got anything less than a bronze.

The TEF ratings are based on relative, rather than absolute, measures of quality: universities are compared on six core quantitative metrics against benchmarks calculated to account for the demographic profile of their students and the mix of programs offered. In other words, a university rated gold doesn’t necessarily have better student satisfaction data, retention rates or employment outcomes -- all core metrics factored into the ratings -- than a university rated bronze. Rather, a university with a gold rating may have been judged to perform better on those measures than would have been predicted based on the profile of the students it serves and the programs it offers.

All this means that some teaching-intensive universities that do a good job teaching students from a wide array of backgrounds but don’t factor into the international rankings, which largely reward reputation and research output, have a chance to rise to the top. Indeed, the British government says that its purposes for the TEF include raising esteem for teaching and recognizing excellence in the classroom. “The Teaching Excellence Framework is refocusing the sector’s attention on teaching -- putting in place incentives that will raise standards across the sector and giving teaching the same status as research,” Universities Minister Jo Johnson said in a statement.

But in the views of many observers, the TEF suffers from the same problem that perpetually plagues efforts to put in place meaningful university ranking systems, including in the U.S. -- a lack of adequate data about teaching quality and student learning gains.

“The Teaching Excellence Framework would have comprehensively failed if it had simply replicated existing hierarchies,” said Nick Hillman, the director of the Higher Education Policy Institute, a British think tank. “It was always designed to do something different to other league tables and rankings -- namely, to show where there are pockets of excellence that have been ignored and to encourage improvements elsewhere.”

Hillman said the gold ratings are hard-won and well deserved. “Nonetheless, in this early guise, the TEF is far from a perfect assessment of teaching and learning,” he said. “While it tells us a lot of useful things, none of them accurately reflects precisely what goes on in lecture halls. I hope university applicants will use the results in their decision making, but they should do so with caution, not least because the ratings are for whole universities rather than individual courses.”

A Controversial Rating

The methodology for the TEF includes both quantitative and qualitative components. There are six core quantitative metrics: retention rates; student satisfaction measures related to teaching, assessment and academic support, taken from the National Student Survey; and rates of employment or postgraduate study six months after graduation, taken from the Destination of Leavers From Higher Education survey.

Universities are judged on their performance on these metrics, both overall and in relation to various demographic groups, in a statistical calculation intended to control for different universities’ student characteristics, admissions requirements and academic programs. This process generates a “hypothesis” of gold, silver or bronze, which a panel of assessors then tests against additional evidence submitted for consideration by the university (higher education institutions can submit up to 15 pages of supporting material to TEF assessors). Ultimately the decision of gold, silver or bronze is a human judgment, not the pure product of a mathematical formula.
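The benchmark-relative logic described above can be illustrated with a toy calculation. To be clear, this is a sketch under assumptions, not the actual TEF methodology: the metric names, the two-percentage-point threshold and the two-metric rule below are all invented for illustration, and the real exercise uses a more elaborate flagging system plus the panel's human judgment.

```python
# Illustrative sketch only: a benchmark-relative rating "hypothesis."
# Threshold and decision rules are invented, not the real TEF formula.

def initial_hypothesis(actual, benchmark, threshold=2.0):
    """Compare a university's metrics (as percentages) against its own
    calculated benchmarks and return a provisional rating.

    actual, benchmark: dicts mapping metric name -> value.
    threshold: percentage-point gap treated as meaningful (assumed).
    """
    above = sum(1 for m in actual if actual[m] - benchmark[m] >= threshold)
    below = sum(1 for m in actual if benchmark[m] - actual[m] >= threshold)
    if above >= 2 and below == 0:
        return "gold"
    if below >= 2:
        return "bronze"
    return "silver"

# A university with modest absolute scores can still draw a gold
# hypothesis if it outperforms the benchmarks computed for its intake,
# which is why raw league-table position and TEF rating can diverge.
metrics = {"satisfaction": 82.0, "retention": 88.0, "employment": 90.0}
benchmarks = {"satisfaction": 79.0, "retention": 85.0, "employment": 87.5}
print(initial_hypothesis(metrics, benchmarks))  # -> gold
```

The key design point the sketch captures is that the rating depends on each gap between actual and benchmarked performance, never on the absolute values alone.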

Chris Husbands, the chair of the TEF panel and the vice chancellor of Sheffield Hallam University, acknowledged that the process has been controversial, with a number of different objections raised.

“The first is a philosophical one as to whether you can make reliable judgments about the quality of teaching across an institution,” he said. “The second has been the government’s decision to classify the institutions as gold, silver or bronze. You’ve got some very complex institutions. My own institution has 33,000 students across something like 20 different departments, and we’re expressing a single judgment.”

“And the third, this is not an inspection-based system. So the panel have not looked at teaching in any lecture room in any of these universities. What we have done is to look at the outcomes of teaching and to project what were the institutional processes that produced this outcome.”

All that said, Husbands argues that the rating process brings value. “What the TEF does is to focus attention on the relationship between institutional policies, what institutions say they do, institutional practices -- which may or may not turn out to be the same as policies -- and student outcomes,” he said.

“It’s forcing universities to think clearly about the relationship between the activities they undertake and the way they describe them and the outcomes the students achieve. And although I can be self-critical about many aspects of the metrics, that connection between what an institution says it does, what an institution actually does and what outcomes it achieves for its students seems to me to be worth having.”

Husbands added, “What the TEF has been is a massive pebble chucked into the pond of U.K. higher education. I suspect that though we could have had years of piloting, actually just making the decision -- 'we are going to do this and we’re going to make this happen' -- is the way to make real change happen.”

“What I do believe is the TEF says something about the environment that we create for our students, the sorts of students we can attract here and what we do in terms of adding value to them by the time they leave,” said Robert Allison, the vice chancellor and president of Loughborough University, which received a gold rating. Allison added that he has no doubt the U.K. government will make continual improvements to the TEF, as it has with a research-oriented equivalent, the REF.

Others are less convinced of the TEF’s value. The National Union of Students issued a statement describing it as "another meaningless university ranking system no one asked for, which the government is introducing purportedly in the name of students. Yet students have walked away from it, with thousands boycotting one of its key components, the National Student Survey."

The student union accused the government of having "ignored the concerns of students, academics and experts across the country who have warned against the introduction of the TEF, arguing that its measurements fail to capture anything about teaching quality. Until this is addressed, this ranking system is nothing but a Trojan horse to justify raised tuition fees and treat the higher education sector like any other market, to be ineptly measured and damagingly sold."

“Crucially, this is a pilot year for an exercise that is really untried and untested,” said Tim Bradshaw, the acting director of the Russell Group. “A lot of the measures that make up the fundamental baseline of the TEF are proxies, and not all of them proxies for anything to do with teaching excellence.”

Bradshaw pointed out that half of the quantitative metrics that feed into TEF come from a student satisfaction survey, and he said that a student who took a particularly challenging course might well be unhappy, but for a good reason -- “We were challenging them; we were stretching them.” Bradshaw also said he was concerned about the potential that the nuances of benchmarking and the fact that the ratings measure relative versus absolute performance might be overlooked by students, including prospective international students who just see a gold, silver or bronze rating attached to an institution.

“The TEF is a pilot year; it’s one amongst many different sorts of sets of data one can look at,” Bradshaw said. He noted for example the release last week of new graduate earnings data: the data, for example, show alumni of LSE, which got a bronze on the TEF, toward the very top of the income strata five years after graduation compared to other graduates of social science and economics programs.

Universities UK, the umbrella association for university leaders, also stressed in its response to the TEF that this is a trial year for the exercise. "These new Teaching Excellence Framework ratings are based on a number of publicly available data and are intended to complement the range of other information available to students. They are not a comprehensive assessment of a university's academic quality," the group's president, Julia Goodfellow, the vice chancellor at the University of Kent, said in a statement.

"It is important that the data used are appropriate, robust and take account of the considerable diversity within our university sector. The challenge will be to develop the system to ensure the information is properly communicated and helpful to students in the decision-making process."

A U.S. Perspective

Could -- should -- something like the TEF be replicated in the U.S.? Former President Obama's administration got major pushback from universities when it proposed creating a college rating system, an effort it eventually abandoned in favor of releasing a revamped and expanded consumer information tool called the College Scorecard.

Robert Kelchen, an assistant professor of higher education at Seton Hall University who researches finance and accountability policies, said he sees parallels to state-level performance-funding formulas, which tie some funding to measures of outcomes, and to the push by some in Congress to set minimum quality standards for accreditors. "U.S. higher education policy has focused more on trying to identify the worst actors than [doing] finer gradation among higher-performing institutions," he said.

"It would be logistically difficult and expensive to do certain parts" of what the U.K. is doing, Kelchen continued. "For example, doing national surveys of former students would be expensive. We would need to be better at being able to track student outcomes. The College Scorecard was a step, the program-level gainful-employment data represents another step, and individual states have the kinds of data systems that are needed, but not all those systems talk across state lines. If the federal government wanted to do something like this and was willing to invest significant time and money, they could do this probably in about five years or so, but I don’t think anyone in the federal government wants to do this sort of systematic look at all colleges. I think it’s much more at the federal level about trying to identify the lowest-performing institutions, while maybe some states may try to be more nuanced in their approaches."

"If we ever tried to do red light, green light or they’re trying gold, bronze and silver, I think a lot of heads would roll," said Mark Schneider, a vice president and institute fellow at the American Institutes for Research and a former commissioner of the National Center for Education Statistics under the George W. Bush administration.

"Colleges and universities are really very powerful, and the organizations that represent them, especially the not-for-profits, are very powerful and they are trying very hard not to be judged. You saw what happened with the Scorecard. It was going to be a ranking system and then it turned into an informational system, and that was the end of that. Theoretically it was going to be tied to Pell Grant and Title IV [federal financial aid awards]; that got scotched almost immediately."

Jamienne Studley, a former deputy under secretary at the Department of Education during the Obama administration, said from her perspective the federal college ratings effort foundered on data limitations, specifically constraints having to do with data definitions that only captured first-time, full-time students who start in the fall and a lack of sufficient information about student preparation levels.

"Those two constraints meant that time and again, even when we had a good idea like clusters, like red, yellow, green, or gold, silver, bronze or any other permutations, as close as we would get to, 'oh, we could do it this way,' it would founder on the data that was available to us, and our fear that we would do something that would work backward for students and the schools we were trying to serve instead of forward," Studley said.

Backward, she said, in that “if you can't include information about student preparation that’s wide enough and informative enough, then the danger that places that can cherry-pick students would just not take students who are more challenging to educate or that cost more to educate was very frightening.”

"What we’re all looking for are ways to understand what people know and can do when they finish an educational experience, and that’s still very hard to get at," said Studley, who's now an independent consultant and national policy adviser for the nonprofit organization Beyond 12. "So many of the metrics that we look at have to be proxies for the basic know and can do: Is your employer satisfied, do you report five years later that you feel prepared for the things you’re called on to do in the workplace, did you pass the licensing test in your field that tests practical knowledge of nursing or engineering? It's very hard to get at that fundamental [question of] where do people learn important things and where do they learn them in ways that have the most effect and significance. That’s still the thing we’re circling around."

