In a rare moment of inattention a couple of years ago, I let myself get talked into becoming the chair of my campus’s Institutional Review Board. Being IRB chair may not be the best way to endear oneself to one’s colleagues, but it does offer an interesting window into how different disciplines conceive of research and the many different ways that scholarly work can be used to produce useful knowledge.
It has also brought home to me how utterly different research and assessment are. I have come to question why anyone with any knowledge of research methods would place any value on the results of typical learning outcomes assessment.
IRB approval is required for any work that involves both research and human subjects. If both conditions are met, the IRB must review it; if only one is present, the IRB can claim no authority. In general, it’s pretty easy to tell when a project involves human subjects, but distinguishing nonresearch from research, as it is defined by the U.S. Department of Health and Human Services, is more complicated. It depends in large part on whether the project will result in generalizable knowledge.
Determining what is research and what is not is interesting from an IRB perspective, but it has also forced me to think more about the differences between research and assessment. Learning outcomes assessment looks superficially like human subjects research, but there are some critical differences. Among other things, assessors routinely ignore practices that are considered essential safeguards for research subjects as well as standard research design principles.
A basic tenet of ethical human subjects research is that the research subjects should consent to participate. That is why obtaining informed consent is a routine part of human subjects research. In contrast, students whose courses are being assessed are typically not asked whether they are willing to participate in those assessments. They are simply told that they will be participating. Often there is what an IRB would see as coercion. Whether it’s 20 points of extra credit for doing the posttest or embedding an essay that will be used for assessment in the final exam, assessors go out of their way to compel participation in the study.
Given that assessment involves little physical or psychological risk, the coercion of assessment subjects is not that big of a deal. What is more interesting to me is how assessment plans ignore most of the standard practices of good research. In a typical assessment effort, the assessor first decides what the desired outcomes in his course or program are. Sometimes the next step is to determine what level of knowledge or skill students bring with them when they start the course or program, although that is not always done. The final step is to have some sort of posttest or “artifact” -- assessmentspeak for a student-produced product like a paper rather than, say, a potsherd -- which can be examined (invariably with a rubric) to determine if the course or program outcomes have been met.
On some levels, this looks like research. The pretest gives you a baseline measurement, and then, if students do X percent better on the posttest, you appear to have evidence that they made progress. Even if you don’t establish a baseline, you might still be able to look at a capstone project and say that your students met the declared program-level outcome of being able to write a cogent research paper or design and execute a psychology experiment.
From an IRB perspective, however, this is not research. It does not produce generalizable knowledge, in that the success or, more rarely, failure to meet a particular course or program outcome does not allow us to make inferences about other courses or programs. So what appears to have worked for my students, in my World History course, at my institution, may not provide any guidance about what will work at your institution, with your students, with your approach to teaching.
If assessment does not offer generalizable knowledge, does assessment produce meaningful knowledge about particular courses or programs? I would argue that it does not. Leaving aside arguments about whether the blunt instrument of learning outcomes can capture the complexity of student learning or whether the purpose of an entire degree program can be easily summed up in ways that lend themselves to documentation and measurement, it is hard to see how assessment is giving us meaningful information, even concerning specific courses or programs.
First, the people who devise and administer the assessment have a stake in the outcome. When I assess my own course or program, I have an interest in the outcome of that assessment. If I create the assessment instrument, administer it and assess it, my conscious or even unconscious belief in the awesomeness of my own course or program is certain to influence the results. After all, if my approach did not already seem to be the best possible way of doing things, as a conscientious instructor, I would have changed it long ago.
Even if I were the rare human who is entirely without bias, my assessment results would still be meaningless, because I have no way of knowing what caused any of the changes I have observed. I have never seen a control group used in an assessment plan. We give all the students in the class or program the same course or courses. Then we look at what they can or cannot do at the end and assume that the course work is the cause of any change we have observed. Now, maybe this is a valid assumption in a few instances, but if my history students are better writers at the end of the semester than they were at the beginning of the semester, how do I know that my course caused the change?
It could be that they were all in a good composition class at the same time as they took my class, or it could even be the case, especially in a program-level assessment, that they are just older and their brains have matured over the last four years. Without some group that has not been subjected to my course or program to compare them to, there is no compelling reason to assume it’s my course or program that’s causing the changes that are being observed.
If I developed a drug and then tested it myself without a control group, you might be a bit suspicious about my claims that everyone who took it recovered from his head cold after two weeks and thus that my drug is a success. But these are precisely the sorts of claims that we find in assessment.
I suspect that most academics are either consciously or at least unconsciously aware of these shortcomings and thus uneasy about the way assessment is done. That no one says anything reflects the sort of empty ritual that assessment is. Faculty members just want to keep the assessment office off their backs, the assessment office wants to keep the accreditors at bay and the accreditors need to appease lawmakers, who in turn want to be able to claim that they are holding higher education accountable.
IRBs are not supposed to critique research design unless it affects the safety of human subjects. However, they are supposed to weigh the balance between the risks posed by the study and the benefits of the research. Above all, you should not waste the time or risk the health of human subjects with research that is so poorly designed that it cannot produce meaningful results.
So, acknowledging that assessment is not research and not governed by IRB rules, it still seems that something silly and wasteful is going on here. Why is it acceptable that we spend more and more time and money -- time and money that have real opportunity costs and could be devoted to our students -- on assessment that is so poorly designed that it does not tell us anything meaningful about our courses or students? Whose interests are really served by this? Not students. Not faculty members.
It’s time to stop this charade. If some people want to do real research on what works in the classroom, more power to them. But making every program and every faculty member engage in nonresearch that yields nothing of value is a colossal, frivolous waste of time and money.
Erik Gilbert is a professor of history at Arkansas State University.
Two months ago I started keeping a notebook about the presidential election -- in part to jot down my musings and fulminations in a real-time chronicle of the most terrifying length of track on this year’s roller-coaster ride, and in part to wean myself from the habit of snarling profanities at the cable television news. (It was scaring the cats.)
A nickname for the project suggested itself -- The Trump Dump. For it really has been just the one candidate -- his moods and his impulses, far more than his policies, insofar as they could ever be determined -- who set the terms and the pace of the entire contest. Making sense of 2016 meant making sense of Donald Trump, or, rather, of how he ever emerged as a serious political force. “He is impervious to every bullet he shoots into his own feet,” reads one of my notes from before the first debate. “It’s hard to keep thinking about this, but impossible to stop.”
Hillary Clinton, by contrast, is all ineluctability and no enigma. She became the de facto presumptive Democratic candidate for 2016 no later than 2009. Even the scandals linked to her name seem perennial. As a tough-minded and successful professional woman in her 60s, Clinton embodies a misogynist’s worst nightmare, but that just means that the psychodrama of recent months has all been on the part of the candidate with the Tic Tacs.
The Clinton campaign’s greatest advantage was never her aura of inevitability, of course, but rather the widespread suspicion that a Trump presidency would prove to be, like a game of Russian roulette, altogether too exciting for everyone involved. HRC would have guaranteed us the comforts of familiar crises: annual displays of government-shutdown brinksmanship for one, along with a shrinking Supreme Court as the justices die off, with confirmation hearings postponed until after the latest presidential impeachment attempt.
In reading a selection of the master’s theses and doctoral dissertations on Hillary Clinton that academics completed between 1994 and May of this year, I’ve had much the same feeling: most of the scholarly attention to her has come from two or three disciplines and focused on a small range of topics.
I made two collections of abstracts from an online repository of theses and dissertations -- 30 in all, although one item appeared in both sets, bringing the total down to 29. The degrees sought were about evenly divided between the M.A. and the Ph.D., along with one Ed.D. and two M.S. degrees. A plurality of the work -- 10 out of all the theses or dissertations -- was identified as conducted in communications departments, with three more in rhetoric. Departments of political science, sociology, education and leadership studies hosted one study each, while two were listed as done in liberal studies programs. Of the five theses or dissertations for which no disciplinary affiliation was given, at least two or three showed an affinity to the study of rhetoric and communications -- historically, closely associated fields.
In short, more than half of the work on Clinton was performed by students working in rhetoric/communications. In a rough analysis of the topics, I found that 14 were clearly marked as focusing on gender (an implicit emphasis in a number of others). Ten each were identified as studies of rhetoric and media; three specified a focus on communication in general and three on online communication specifically. Seven concentrated on Clinton as first lady and nine on her 2008 campaign. All that said, I should make clear that a single thesis or doctoral dissertation might fall under up to three of these topical headings.
Over all, the emphasis of the studies was overwhelmingly on Clinton either as a user of some form of communication media or as an object of media representation. To give two master’s theses as examples, respectively: Christina Young Guest’s M.A. thesis, “Political Feminine Style and the Feminist Implications of the Respective Convention Speeches: Hillary Rodham Clinton and Sarah Palin” (University of Central Missouri, 2010), and Heidi Johnson’s “Clinton as Matron, Palin as MILF in 2008 Political Cartoons: Transformation in the Caricature of Female Authority?” (Hawaii Pacific University, 2009). As these titles suggest, questions regarding communications and gender issues were interrelated: every dissertation or thesis specifically focused on gender also addressed some aspect of rhetoric, media or communication.
Much less common were studies focusing on Clinton and policy. By my count, only five did. To risk an overgeneralization, researchers have tended to be more interested in how Clinton challenged or was constrained by traditional female roles or implicit assumptions about the proper relationship between public and private identity than in her activity as a senator or secretary of state.
The most recent of the studies -- accepted for the master of arts in liberal studies at Wake Forest University in May of this year -- concerned a matter that proved especially persistent throughout this year’s campaign: Whitney Jessica Threatt’s “A Transparent Hillary Clinton Through the Lens of Apologia Discourse,” wherein Clinton’s email server and its vexed status are addressed with respect to the Obama presidency’s policy on transparency and open government.
Drawing on a specialist literature about apologia (discursive mitigation when accused of injury and/or failure to live up to a certain standard), Threatt considers how various routine responses (denial, corrective action, shifting of blame, etc.) can serve to improve or worsen the accused’s situation vis-à-vis an audience. Complicating apologia for a very public figure such as Clinton is the double problem of media repetition (asking the same question over and over “suggests that the charges brought are true”) and widespread “alienation from politicians as well as the political process.”
Between Whitewater, the Lewinsky affair and so forth, Clinton has spent much of the past quarter century negotiating the terms of “image repair.” (Let’s nod at the existence of an additional set of specialist typologies here and just continue.) Meanwhile, opposing political operatives have built entire careers around raising the earlier circumstances for discussion again at every opportunity. In the case of the private email server, the researcher finds Clinton using certain forms of apologia employed in earlier controversies, but relying on one mode in particular. Examining speeches and interviews with Clinton, Threatt notes that she “consistently attempts to demonstrate that she, herself, has been transparent about not only the investigation but throughout her time as secretary of state.” She assures listeners “that she is doing everything in her power to display transparency by providing the public with the actual emails …. The fact that she is doing more than what has been asked of her insinuates that she is being a leader.”
The upshot here is that Clinton has had an arsenal of rhetorical strategies at her disposal and considerable practice in using them -- with repetition and consistency as primary guiding principles, in part because the same questions and accusations return time and time again. On Tuesday, those strategies failed her.