As colleges and universities grapple with both the financial fallout of the pandemic and the overdue recognition of our nation’s long-standing history of racial injustice, especially anti-Black and anti-Indigenous racism, the question of who should teach is of paramount importance.
In our just-published paper, “Staffing the Higher Education Classroom,” in the Journal of Economic Perspectives, we presented our own empirical analyses and those of other scholars, exploring whether there is a better way to measure teaching effectiveness than having students fill out course evaluations. We also specifically addressed the following questions:
- Are charismatic teachers better teachers?
- Is there a trade-off between faculty research and teaching excellence?
- Does the rise of non-tenure-eligible faculty relative to tenure-line faculty impact student learning?
- In what ways do instructor gender, race and ethnicity matter?
We began with the issue of how to measure teaching effectiveness, because a large literature shows that the results of the commonplace practice of students filling out evaluation forms are biased by gender, race and nationality. White American men are often given higher ratings than others, and without objective measures of student learning, it is impossible to evaluate whether those ratings are actually “earned.” Concerns about bias have led the American Sociological Association to caution against the overreliance on student evaluations of teaching, and at least 17 other scholarly associations have endorsed that view.
So how might we better evaluate teaching excellence? Using data on Northwestern University students and the faculty members who taught them during their first quarter in entry-level courses at Northwestern, we identified two different measures in our research. The first considers the ability of professors to encourage students to take additional courses in a subject area. Compelling and charismatic teachers presumably inspire students to pursue further study in the discipline, whether or not those students were predisposed to doing so.
The second teaching measure examines the grades a student earns while pursuing further study, measuring the unexpected deviation in the grade received by a student in follow-up courses in that subject. Successful undergraduate instructors in, say, introductory psychology not only induce their students to take additional psychology courses but also prepare those students to do unexpectedly well in those additional classes (based on what we would have predicted given their standardized test scores, other grades, grading standards in that field and the like).
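To make the second measure concrete, here is a minimal sketch of the "unexpected deviation" idea: predict each student's follow-up grade, subtract the prediction from the grade actually earned, and average those residuals by introductory instructor. The single-predictor linear fit, the function names and the toy records are all assumptions for illustration. The published analysis draws on a richer set of predictors (test scores, other grades, field grading standards), and its exact specification is not given here.

```python
from collections import defaultdict
from statistics import mean


def fit_line(xs, ys):
    """Ordinary least squares for y = a + b * x with a single predictor."""
    x_bar, y_bar = mean(xs), mean(ys)
    sxx = sum((x - x_bar) ** 2 for x in xs)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    b = sxy / sxx
    a = y_bar - b * x_bar
    return a, b


def instructor_deviations(records):
    """Average (actual - predicted) follow-up grade per intro instructor.

    records: list of (instructor, prior_score, followup_grade) tuples.
    """
    xs = [prior for _, prior, _ in records]
    ys = [grade for _, _, grade in records]
    a, b = fit_line(xs, ys)

    residuals = defaultdict(list)
    for instructor, prior, grade in records:
        predicted = a + b * prior
        residuals[instructor].append(grade - predicted)
    return {name: mean(vals) for name, vals in residuals.items()}


# Hypothetical toy data: (intro instructor, test score, follow-up course grade).
records = [
    ("A", 1200, 3.0), ("A", 1400, 3.6),
    ("B", 1200, 2.6), ("B", 1400, 3.2),
]
deviations = instructor_deviations(records)
for name, dev in sorted(deviations.items()):
    print(f"Instructor {name}: mean unexpected deviation {dev:+.2f}")
```

In this toy data, students of instructor A earn follow-up grades above what their test scores predict and students of instructor B fall below, so A has the positive deviation.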
In our analysis of 170 tenured faculty at Northwestern, we have found that teachers who inspire many new majors appear to be no better or worse at teaching the material than their less captivating counterparts. Instructors who are exceptional at conveying course material -- as proxied by our second method based on subsequent grades in the subject -- are no more likely than others to encourage students to take more courses in the subject area.
Of course, increasing enrollment may be a reward in itself, at least at the department level, where it might translate into additional faculty lines. But for an institution, it is basically a zero-sum game, as adding a major in one department typically means losing one in another. And since the more charismatic faculty who attract majors don’t have any advantage in raising future grades, they might be piquing students’ interest in subjects, but they don’t seem to be setting those students up for disproportionate success in that subject.
Research and Teaching Excellence: A Trade-Off?
The two measures of teaching excellence we’ve described allowed us to address empirically whether those faculty who do particularly well in the undergraduate classroom pay a price in terms of their scholarship. But first, a word about our scholarship measures.
While measuring scholarly excellence is somewhat less contentious than evaluating teaching effectiveness, it is nonetheless fraught. In some fields, well-received books indicate success; in others, it is artistic performances; and in still others, it is highly cited articles or the awarding of grants.
We employed two very different scholarship measures. Each year since 1988, Northwestern has tasked a committee, composed of distinguished professors from a wide range of disciplines, to review the scholarly recognition of the faculty over the previous academic year and select a subset to be honored for their research excellence at an annual dinner. We reviewed those faculty as examples of achieving scholarly excellence. As an alternative measure, we followed a more traditional approach and constructed a citation index for each faculty member.
With the two measures of teaching quality and two measures of research quality, we made four comparisons of teaching quality and research quality among the tenured Northwestern faculty in our sample.
And we have found that, regardless of which measure is used, top teachers are no more or less likely to be especially productive scholars than their less accomplished teaching peers. That is encouraging for those who fear that great teachers specialize in pedagogy at the expense of research. But it is disappointing to observe that weak undergraduate teachers do not make up for their limitations in the classroom with disproportionate research excellence.
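The four comparisons described above come from crossing the two teaching measures with the two research measures. The sketch below shows that structure using a simple Pearson correlation for each pair. The choice of statistic, the measure names and the toy numbers are assumptions for illustration only, not the method or data used in the paper.

```python
from itertools import product
from math import sqrt
from statistics import mean


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    x_bar, y_bar = mean(xs), mean(ys)
    cov = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    var_x = sum((x - x_bar) ** 2 for x in xs)
    var_y = sum((y - y_bar) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


# Hypothetical values for four faculty members (one list entry per person).
teaching = {
    "major_conversion": [0.10, 0.25, 0.15, 0.30],  # share taking further courses
    "grade_deviation": [0.2, -0.1, 0.0, 0.1],      # unexpected follow-up grades
}
research = {
    "honored": [1, 0, 0, 1],                       # recognized by the committee
    "citation_index": [40, 12, 25, 33],
}

# Two teaching measures x two research measures = four comparisons.
comparisons = {
    (t, r): pearson(teaching[t], research[r])
    for t, r in product(teaching, research)
}
for (t, r), rho in comparisons.items():
    print(f"{t} vs. {r}: correlation {rho:+.2f}")
```

A finding like the one reported here would correspond to all four correlations being statistically indistinguishable from zero in the real data.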
Non-Tenure-Eligible Faculty and Classroom Quality
In 1975, 57 percent of all faculty members -- full-time and part-time, excluding graduate students -- were in the tenure system. By 2010, that percentage had been cut in half, and it continued to decline over the subsequent decade.
While some people have argued that this trend has implications for academic freedom and governance, we ask a different question: What does this mean for learning in the undergraduate classroom? We analyzed the teaching results of around 1,200 faculty and their students in introductory courses at Northwestern to see whether undergraduates taught by non-tenure-eligible faculty learn as much as those taught by tenure-line faculty.
Our findings show that, on average, students taught by non-tenure-eligible faculty learned more. Those faculty are not only better on average than tenure-line faculty at encouraging students in introductory classes to take additional courses in their discipline, but their students are also more likely to outperform grade expectations in those courses.
It also turns out that the exceptional performance by non-tenure-eligible faculty members in introductory classes at Northwestern is driven entirely by the fact that the bottom quarter of the tenure-line faculty have much lower teaching effectiveness than the bottom quarter of the non-tenure-eligible faculty. That difference is especially large for the bottom 13 percent of the distribution (the weakest 150 professors).
That isn’t all that surprising. Non-tenure-eligible faculty are hired to teach, and those who perform relatively poorly are presumably less likely to be renewed than are those who perform well. In contrast, tenure-line faculty who are relatively poor instructors may be promoted and retained for reasons other than their teaching ability.
Our results differ from much of the existing literature that finds that students often rate non-tenure-eligible faculty more poorly and that greater numbers of those instructors lead to lower graduation rates. Perhaps the difference here is that non-tenure-eligible faculty at Northwestern, unlike adjuncts at some other institutions, tend to have stable, longer-term relationships with the university, and a substantial majority are full-time.
The Impact of Instructor Gender, Race and Ethnicity
A large literature regarding K-12 education suggests that the demographic match between teachers and students influences outcomes like test scores, attendance and graduation rates. For example, if a Black male student has at least one Black teacher in the third, fourth or fifth grade, that student is significantly less likely to drop out of high school and more likely to aspire to attend a four-year college. Those effects are particularly pronounced if the student comes from an economically disadvantaged background.
Evidence also suggests that a similar pattern exists at the collegiate level: having a woman professor in a STEM field substantially increases the likelihood that other women will take more classes and eventually graduate with a STEM degree. Similarly, racial and ethnic minority faculty members reduce the minority achievement gap in class performance and dropout rate.
It would seem likely, then, that hiring a more racially, ethnically and gender-diverse faculty will improve learning outcomes for many of our students.
But national data show that we have a long way to go before the faculty are as diverse as the students they teach. Not only are women and minorities underrepresented among faculty members in general, they are especially underrepresented among tenure-line faculty, particularly in the more senior ranks.
Women compose 52 percent of assistant professors but just 33 percent of full professors, while Black, Hispanic, Indigenous and mixed-race faculty represent 13 percent of assistant professors and only 8 percent of full professors. And while there are 1.8 white male full professors for every white male assistant professor, there are only 0.4 Black female full professors for every Black female assistant professor and 0.5 Hispanic female full professors for every Hispanic female assistant professor.
And if anything, the disproportionate impacts of the pandemic on female faculty, especially those of color, seem likely to exacerbate current gender inequities.
Our results on teaching effectiveness indicate that, while universities facing multitasking problems have good reasons to recruit and reward faculty members on the basis of research, doing so comes at the cost of having a fraction of tenure-line faculty be disproportionately poor performers in the undergraduate classroom -- both in terms of converting majors and in our direct learning measure.
So why have those high-priced scholars in the undergraduate classroom in the first place? Some people may conclude that it would surely be more cost-efficient to replace them either with lower-paid assistant professors or, cheaper still, faculty not on the tenure line. The latter is what has, in fact, been happening throughout American higher education for the past several decades.
One answer might be that even if the teaching of illustrious research faculty isn’t exceptional, their presence often is. Having outstanding scholars teaching first-year students sends a signal to the community that the institution takes undergraduate education seriously -- that it isn’t just research and the production of Ph.D. students that matter.
What about those all-too-often maligned non-tenure-eligible faculty who, at least in our analysis, are exceptionally good teachers? While following the path of universities that now offer positions such as “lecturer with security of employment” seems inconsistent with our results (if you effectively grant tenure to teaching faculty, you may be forced to reappoint the weakest of the teachers), longer-term three- to five-year contracts, sabbaticals, increased pay and titles such as “professor of instruction” seem long overdue.
We caution, however, against institutions using our finding that non-tenure-eligible faculty perform as well as or better than tenure-line faculty in the classroom as an excuse to address faculty diversity concerns by hiring female faculty and faculty of color into non-tenure-eligible ranks. It is essential that institutions work diligently to increase representation of women and faculty of color at all ranks, and to create systems that nurture junior female faculty and faculty of color so that they can thrive and ultimately become leaders on campus.
We want to reiterate that our empirical analysis is limited to introductory courses, and we imagine that some of those tenure-line faculty do a much better job teaching either upper-level undergraduate courses or classes at the graduate or professional school level. (At least, we can hope.)
We also recognize that our analysis takes place in the relatively rarefied setting of Northwestern University. However, we note that scholars have since replicated our results in other settings, including Florence Ran and Di Xu, who studied an anonymous state’s two- and four-year public university system. And in our own work, we’ve found that our results are stronger for the Northwestern students who are less traditionally academically qualified than for those who are more traditionally qualified. Those findings give us more confidence that our results will transfer to other institutions. And, of course, at the very least, our approaches provide other institutions with ideas for carrying out their own analyses.
We don’t pretend to offer a definitive formula for staffing undergraduate classrooms. But we believe that empirical analyses such as those we’ve explored here can help us discover the right answers.