Observations of Professors: Tread Lightly

Student evaluations of teaching are suspect -- but increasing classroom observation of professors as an alternative has its own set of problems, write Jonathan Golding and Philipp Kraemer.

July 24, 2015

The controversy over the value of teaching evaluation surveys completed by students has led to increased calls to include some type of faculty observation as part of one’s teaching dossier. Those against student evaluations argue that direct observations of teaching avoid the questionable validity of student opinions, which are heavily influenced by popularity and are vulnerable to faculty pandering. Even those who still consider student evaluations valuable for faculty evaluation see gaining additional data through observation as a worthy goal.

We argue that although there may be a place for direct observation of teaching (e.g., professional development -- such as helping new faculty members master a particular classroom technique), this type of evaluation raises a number of questions. Moreover, we believe that after careful consideration of the complexities of faculty observation, what now seems like a reasonable alternative or addition to student evaluations is not a worthwhile pursuit. Most important in this regard is that the observation of faculty does not help us deal effectively with the critical issue of faculty accountability in the classroom.

There are both conceptual and methodological problems with direct observation of teaching. One concern is that it violates a sense of professionalism and academic freedom, both of which have been cornerstones of teaching in higher education. College professors, unlike secondary education teachers, are expected to assert their agency as scholars. Professors are trained, credentialed and expected to be responsible for their professional work. Both teaching and research depend on the assumption that professors do not require top-down management of their work. As a core principle of our professionalism, academic freedom is intended to preclude direct interference in our teaching and research.

College teachers are to be held accountable, yes, but as professionals who are governed by intrinsic dedication to teaching rather than extrinsic management. Having an outsider to the class drop in to watch seems to disrespect the professionalism of college teaching.

The classroom is not the factory floor; college teaching should neither require nor tolerate efforts to manage the process from beyond the agency of the teachers themselves. The idea that we need outsiders to watch us teach is the kind of assumption that can transform a teacher into a mere knowledge worker. Evaluating a college professor is not the same as evaluating a teacher in grades 1-12.

Teaching is ultimately an intimate affair, regardless of whether it occurs with a single student or a class of 500. Students and teachers engage in a personal dance; we do not teach classes, we teach students. Direct observation of classroom performance can be a major intrusion that disrupts the very nature of the teaching moment.

Methodologically, there are several practical questions about faculty observation that are rarely addressed. First, who will actually observe? Perhaps it will be leaders of academic units (i.e., department chair, associate dean or dean). The problems with such a strategy are that administrators have likely not been fully engaged in the classroom for some time, and (like most college faculty) they probably have had little or no formal training in theories of teaching or pedagogical techniques. There is no more harmful evidence than that generated by an uninformed, incompetent observer.

Especially troubling is the prospect of an administrator observing and evaluating faculty from disciplines that use pedagogies different from those of the administrator's home department. Alternatively, some or all faculty in a unit could serve as observers. The problems here are that some faculty will balk at the idea of judging their peers, especially without anonymity, and others may be unwilling to participate as observers because it takes away valuable time from other scholarly activities. Another problem with faculty observers is that the amount of teaching experience among observers will likely vary a great deal. This raises questions such as, “Should an untenured faculty member evaluate a senior tenured colleague?”

Second, how should what we will call “observer bias” be handled? This bias may occur because each observer is likely set in their own way of teaching. Different faculty members have radically different teaching philosophies, and our attitudes toward our philosophies will taint our regard for different approaches. In addition, an observer may be biased because they have a limited repertoire of teaching experience. For example, an observer may not have taught the course being observed or may not have taught a class of the same size. How valid are judgments about teaching in a class of 500 made by an observer who has only taught seminars?

Third, how many peers should observe each faculty member and how many observations should occur? There is considerable evidence that observational data can be easily distorted. For example, both the number of observers and the number of observations can dramatically influence the validity of the resulting judgments.

Also, should the teacher being observed know in advance of the observational session? It would seem that unannounced observations would better serve the evaluation process, but that strategy introduces its own problems. For example, variations in course content (some topics are more difficult than others), external factors that impact students (the stress of midterm exams), and the physical health of students and teachers can all affect teaching and learning. To be observed on the wrong day by the wrong observer could easily produce a meaningless assessment.

Fourth, how should the observer evaluate the faculty member? There are many possibilities for this type of evaluation, ranging from some type of behavioral assessment (e.g., how many students attended lecture) to tallies of how many active learning techniques were used during a class period. However, given the amount of variability among college courses, it is simply unclear what evaluation technique(s) would be best.

As accountability in higher education continues to grow, fueled by both internal and external institutional forces, expect to hear louder calls for improved teaching. Consequently, expect more hand-wringing over our inability to effectively measure teaching competence, and watch the momentum rise for implementing quick fixes such as direct observation of teaching.

Of course, instead of measuring the process of teaching, we could adopt the more reasonable tack of measuring what students actually learn. Rather than having faculty invest time and creativity in improving assessment of teaching as a proxy for measuring learning, we should be more deeply committed to the latter. Given the impressive progress shown by mind sciences in understanding learning, memory and thinking, and given the many new tools available as a result of the digital revolution, we believe it is possible to do just that.

It is our opinion that this measurement does not require reliance on standardized tests, simplistic rubrics or other conventions of the assessment movement. Rather, if we trust that faculty can design effective measurements of learning in their classes (e.g., through exams), then we should be able to develop ways of using similar measures to evaluate teaching effectiveness. As we move toward developing these measures, we should never forget that teaching is merely a means to an end, and it is that end for which we need to be accountable.


Jonathan M. Golding and Philipp J. Kraemer are professors of psychology at the University of Kentucky.
