Many colleges are abandoning or downgrading student evaluations during coronavirus crisis. Will that stick?

The onset of COVID-19 has turned higher education (like the rest of the world) upside down, forcing colleges, their staffs and their students to adapt on the fly. Sometimes that entails doing many "normal" things in new and often unaccustomed ways, like delivering mental health services virtually, or holding online commencements.

Other times it means more colleges taking steps that some peers had taken previously, such as temporarily suspending the use of standardized tests in admissions.

The realm of teaching and learning, this column’s domain, has been dramatically transformed. The most obvious example is this spring’s big shift of all instruction into remote or virtual formats, which was the topic of the last few iterations of this column.

But that’s far from the only aspect of the learning process that has undergone significant change in the wake of the novel coronavirus. My Inside Higher Ed colleagues, for instance, have documented the spreading embrace of pass/fail grading to give students more flexibility and reduce their anxiety during this nerve-racking time.

Students aren’t the only ones feeling anxious. The closing of physical classrooms has forced the majority of instructors who’ve never taught anything but a face-to-face course to teach remotely under less-than-ideal circumstances, to say the least. In the shift to remote instruction, many professors are themselves using entirely new tools and pedagogical techniques -- almost certainly with varying degrees of success.

That reality has led some colleges to decide -- and some organizations and commentators to recommend -- that institutions alter their approach to the student evaluations that many use to assess professors’ performance. Numerous colleges and universities are allowing instructors to opt out of collecting student ratings of their teaching for the winter and spring terms, while others have said they will continue to collect the evaluations during this time but won’t consider them in assessing faculty performance.

How are colleges approaching their use of student reviews during the coronavirus crisis -- and how should they be?

Will their approach during this unusual time change how they view (and use) student evaluations of teaching going forward? Will this accelerate the momentum against how they are used?

Read on.

***

Even BC (before coronavirus), skepticism about the value of student evaluations of teaching was mounting. Two years ago, the University of Southern California stopped using the ratings in tenure, promotion or other high-stakes personnel decisions, as Inside Higher Ed's Colleen Flaherty reported, citing mounting evidence of bias in student reviews of instructors and the lack of correlation between student ratings and learning outcomes.

Selection of Recent Changes to Faculty Evaluation Policies

Dalhousie University: Student ratings "entirely formative," no results provided to administrators.

DePaul University: Continuing with evaluations as planned. Provost: "I make a commitment to you" that spring 2020 course reviews "will not be held against you in any review, or promotion and tenure decisions."

U of California, Irvine: Evaluations "excluded from future review files unless" faculty member chooses to include them.

U of Georgia: Teaching and learning center officials "strongly recommend" that instructors not engage in "typical course evaluations."

Since then, the arguments against overdependence on student reviews have grown. Last year 18 scholarly associations urged colleges and universities to stop using the evaluations as the primary way of judging teaching effectiveness in personnel decisions. And in February a new study made the case that even "good" student evaluations of teaching -- those that aren't biased, and that are correlated -- are not effective in differentiating between better and worse instructors.

And yet student evaluations are ubiquitous in higher education. "Because these instruments are cheap, easy to implement, and provide a simple way to gather information, they are the most common method used to evaluate faculty teaching for hiring, tenure, promotion, contract renewal and merit raises," the American Sociological Association and other groups said in their statement last year.

Even colleges that lean heavily on student ratings sometimes waive the use of (or dependence on) them in extreme circumstances, such as when a professor becomes ill during a course, or when an instructor makes a major modification in a course that might diminish the significance of students' responses.

So what should happen in a moment like this one, when just about every instructor is facing extreme circumstances -- teaching in the middle of a pandemic, having to adopt a significantly (if not entirely) new way of teaching in a new mode of instruction?

"Faculty also need to be judged on a pass/fail basis," David M. Perry, senior academic adviser for the University of Minnesota's history department, wrote in a CNN op-ed last month recommending that colleges ease their grading policies for students. "Online teaching is uniquely hard and cannot be mastered as a skill in a couple of hours or days. Student evaluations -- which for pre-tenure or pre-promotion faculty are vitally important to their career advancement -- will be useless in this new arrangement."

The American Association of University Professors issued guidelines last month reiterating that decisions about how to assess instructional quality are the faculty's domain, and that "under these extraordinary circumstances, the faculty may wish to consider whether temporary adjustments in faculty evaluation, including suspending the administration of student evaluations, may be appropriate."

Joshua Eyler, director of faculty development at the University of Mississippi and author of How Humans Learn: The Science and Stories Behind Effective College Teaching, endorsed that strong anti-ratings stance in a recent blog post entitled "Cancel the Teaching Evaluations Too!"

"While much of the conversation in higher ed at this time has rightly focused on students -- because of the major disruption caused by the coronavirus, the emotional and psychological turmoil brought on by the crisis, and the sudden transition to a new learning environment -- faculty are facing all of this too. Universities should grant faculty the same grace they are extending to students and administrators should not evaluate their teaching in the same way they would in other semesters."

Eyler laid out several possible approaches that colleges might take to rethink the use of student evaluations during a time when "[faculty] are being asked to do so much with tools, parameters, and environments that are -- in many cases -- unfamiliar."

The first, keeping the status quo by collecting reviews of individual faculty members and continuing to use them to judge professors' teaching, he openly rejected as "unjustifiable." A second, continuing to collect the ratings but either issuing a blanket declaration not to use them in faculty evaluations or giving professors a choice of whether to include them in future reviews, "look[s] reasonable on the surface" but "share[s] a common flaw," he argued.

That flaw is that once the data are collected, there's no way to ensure that they won't ultimately be used to evaluate faculty members for tenure or promotion, even if administrators currently commit that they won't.

"As a person, I'm optimistic to a fault, and I believe that everyone will act in good faith," Eyler said in an interview. "But as someone involved in decision making, I can’t create policy based on that assumption. In an ideal world, we would collect this information and let faculty choose how it's used. But practically and realistically, we want to protect those faculty who are most vulnerable" -- adjuncts, instructors early on the tenure track, etc.

Another possibility might be to create a new student evaluation precisely for this moment, but that is impractical given the compressed timeline, Eyler wrote.

The last option -- "do not collect SETs during COVID-affected semesters" -- he described as "the most equitable model." "This is the option with the least possibility for misuse (because there will not be anything to misuse) and it is the only one that truly levels the playing field."

Not collecting information about individual faculty members through student evaluations would forfeit information that could help instructors understand how they served their students during this strange and uncomfortable time, and "what they can improve upon for themselves." But in Eyler's view, that's a small price to pay for avoiding any unfair impact on professors' careers.

A Partial Counterpoint

Thomas J. Tobin shares many of the oft-voiced concerns about the use of student evaluations -- so much so that he declines to use that standard term to describe them.

"We call them 'student ratings,' not 'student evaluations,' because students are not qualified to evaluate their professors on almost anything," says Tobin, program area director for distance teaching and learning at the University of Wisconsin at Madison. (Asking students whether a professor assigned them the "appropriate" amount of work is silly, he suggests: How would students know how much work is pedagogically appropriate?)

Tobin believes that student ratings have a role in a well-rounded evaluation of instructors by other professional educators, along with observations by peers of an instructor's teaching and administrative perceptions of his or her instructional skills.

But because most institutions "usually don’t have enough people and time to be able to perform multivariate evaluation of professors," Tobin says, most depend way too much on student ratings. As a result, he adds, "the current system of use of student ratings is deeply flawed."

Yet he doesn't suggest throwing a flawed system out in a time like this, for a variety of reasons.

"Under any circumstances, but especially under emergency circumstances, we should strive to do as much as possible in the exact same way we always did," to provide students with "as much familiarity as possible," Tobin says. It will be useful to have "apples-to-apples comparisons" of students' experiences in the Remote Learning 1.0 courses that most institutions stood up on the fly this spring.

While he recognizes Eyler's concern about possible misuse of the information, he notes that most colleges and universities already have mechanisms in place for "collecting but not counting student ratings data." Professors can apply for what's often called a "provost's letter" when they try out something new or get sick, which will keep the data from being used against them.

For Tobin, ultimately, the potential harm to a vulnerable instructor that Eyler worries about must be balanced against the "aggregate benefit" of the "unique window" that student ratings offer into students' experiences of their courses at this unusual time.

"We can get insights about students' well-being, about how students as a group and as small subgroups experience this shift, so we can make informed shifts in our remote instruction," Tobin says. "This is the kind of information we absolutely need, maybe more than ever in a time of emergency."

Eyler agrees that colleges should be collecting information about students' learning experiences during the coronavirus crisis -- "about hiccups along the way," impediments students found to their ability to learn, and the like. In fact, Mississippi's Center for Excellence in Teaching and Learning is producing just such a survey now to distribute to students and faculty members alike.

"If institutions aren't collecting that [information], they should, but it doesn't need to be -- really shouldn't be -- done through an individual faculty member's evaluation," he said.

***

Despite their differences, Tobin and Eyler end up in roughly the same place when they're asked how the current moment might affect the overall debate about the role of teaching evaluations.

"In times like this, I'd like to think we'd focus on what's essential, and be asking, 'What does this actually do for us? What's actually valid?'" Tobin said. "There's an opportunity to rethink what we're doing."

Says Eyler, "What I hope happens is that it forces a spotlight on the process itself, and that we ask really important and hard questions about how we use them. Yes, you can get some valuable information from them, but they should be one piece of a larger puzzle. If places get by this semester without evaluations, they will have to find other pieces to look at [to evaluate instructors]. Maybe that will help us establish at least the foundation of a new kind of [faculty evaluation] process."