How assessment falls significantly short of valid research (essay)

In a rare moment of inattention a couple of years ago, I let myself get talked into becoming the chair of my campus’s Institutional Review Board. Being IRB chair may not be the best way to endear oneself to one’s colleagues, but it does offer an interesting window into how different disciplines conceive of research and the many different ways that scholarly work can be used to produce useful knowledge.

It has also brought home to me how utterly different research and assessment are. I have come to question why anyone with any knowledge of research methods would place any value on the results of typical learning outcomes assessment.

IRB approval is required for any work that involves both research and human subjects. If both conditions are met, the IRB must review it; if only one is present, the IRB can claim no authority. In general, it’s pretty easy to tell when a project involves human subjects, but distinguishing nonresearch from research, as it is defined by the U.S. Department of Health and Human Services, is more complicated. It depends in large part on whether the project will result in generalizable knowledge.

Determining what is research and what is not is interesting from an IRB perspective, but it has also forced me to think more about the differences between research and assessment. Learning outcomes assessment looks superficially like human subjects research, but there are some critical differences. Among other things, assessors routinely ignore practices that are considered essential safeguards for research subjects as well as standard research design principles.

A basic tenet of ethical human subjects research is that the research subjects should consent to participate. That is why obtaining informed consent is a routine part of human subject research. In contrast, students whose courses are being assessed are typically not asked whether they are willing to participate in those assessments. They are simply told that they will be participating. Often there is what an IRB would see as coercion. Whether it’s 20 points of extra credit for doing the posttest or embedding an essay that will be used for assessment in the final exam, assessors go out of their way to compel participation in the study.

Given that assessment involves little physical or psychological risk, the coercion of assessment subjects is not that big of a deal. What is more interesting to me is how assessment plans ignore most of the standard practices of good research. In a typical assessment effort, the assessor first decides what the desired outcomes in his course or program are. Sometimes the next step is to determine what level of knowledge or skill students bring with them when they start the course or program, although that is not always done. The final step is to have some sort of posttest or “artifact” -- assessmentspeak for a student-produced product like a paper rather than, say, a potsherd -- which can be examined (invariably with a rubric) to determine if the course or program outcomes have been met.

On some levels, this looks like research. The pretest gives you a baseline measurement, and then, if students do X percent better on the posttest, you appear to have evidence that they made progress. Even if you don’t establish a baseline, you might still be able to look at a capstone project and say that your students met the declared program-level outcome of being able to write a cogent research paper or design and execute a psychology experiment.

From an IRB perspective, however, this is not research. It does not produce generalizable knowledge, in that the success or, more rarely, failure to meet a particular course or program outcome does not allow us to make inferences about other courses or programs. So what appears to have worked for my students, in my World History course, at my institution, may not provide any guidance about what will work at your institution, with your students, with your approach to teaching.

If assessment does not offer generalizable knowledge, does assessment produce meaningful knowledge about particular courses or programs? I would argue that it does not. Leaving aside arguments about whether the blunt instrument of learning outcomes can capture the complexity of student learning or whether the purpose of an entire degree program can be easily summed up in ways that lend themselves to documentation and measurement, it is hard to see how assessment is giving us meaningful information, even concerning specific courses or programs.

First, the people who devise and administer the assessment have a stake in the outcome. When I assess my own course or program, I have an interest in the outcome of that assessment. If I create the assessment instrument, administer it and assess it, my conscious or even unconscious belief in the awesomeness of my own course or program is certain to influence the results. After all, if my approach did not already seem to be the best possible way of doing things, as a conscientious instructor, I would have changed it long ago.

Even if I were the rare human who is entirely without bias, my assessment results would still be meaningless, because I have no way of knowing what caused any of the changes I have observed. I have never seen a control group used in an assessment plan. We give all the students in the class or program the same course or courses. Then we look at what they can or cannot do at the end and assume that the course work is the cause of any change we have observed. Now, maybe this is a valid assumption in a few instances, but if my history students are better writers at the end of the semester than they were at the beginning of the semester, how do I know that my course caused the change?

It could be that they were all in a good composition class at the same time as they took my class, or it could even be the case, especially in a program-level assessment, that they are just older and their brains have matured over the last four years. Without some group that has not been subjected to my course or program to compare them to, there is no compelling reason to assume it’s my course or program that’s causing the changes that are being observed.
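
To make the control-group point concrete, here is a minimal sketch in Python. It is an illustration only: the function names and every score below are hypothetical, invented for this example, and do not come from any real assessment. It contrasts the raw pretest-to-posttest gain that a typical assessment plan reports with the gain that remains after subtracting the change seen in a comparison group that did not take the course.

```python
from statistics import mean


def mean_gain(pre, post):
    """Average per-student change from pretest to posttest."""
    return mean(after - before for before, after in zip(pre, post))


def gain_net_of_control(course_pre, course_post, control_pre, control_post):
    """Course gain minus the gain of students who did not take the course.

    This is a simple difference-in-differences comparison. It assumes the
    two groups would otherwise have changed by similar amounts.
    """
    return mean_gain(course_pre, course_post) - mean_gain(control_pre, control_post)


# Hypothetical rubric scores, invented purely for illustration.
course_pre = [60, 65, 70, 55]
course_post = [70, 75, 78, 67]
control_pre = [62, 64, 68, 58]
control_post = [70, 73, 75, 66]

raw = mean_gain(course_pre, course_post)
net = gain_net_of_control(course_pre, course_post, control_pre, control_post)
print(f"Gain reported by a typical assessment: {raw}")
print(f"Gain left after subtracting the control group's change: {net}")
```

With these made-up numbers, most of the apparent improvement also shows up in the students who never took the course, which is exactly the possibility that an assessment plan without a comparison group cannot rule out.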

If I developed a drug and then tested it myself without a control group, you might be a bit suspicious about my claims that everyone who took it recovered from his head cold after two weeks and thus that my drug is a success. But these are precisely the sorts of claims that we find in assessment.

I suspect that most academics are either consciously aware or at least unconsciously aware of these shortcomings and thus uneasy about the way assessment is done. That no one says anything reflects the sort of empty ritual that assessment is. Faculty members just want to keep the assessment office off their backs, the assessment office wants to keep the accreditors at bay and the accreditors need to appease lawmakers, who in turn want to be able to claim that they are holding higher education accountable.

IRBs are not supposed to critique research design unless it affects the safety of human subjects. However, they are supposed to weigh the balance between the risks posed by the study and the benefits of the research. Above all, you should not waste the time or risk the health of human subjects with research that is so poorly designed that it cannot produce meaningful results.

So, acknowledging that assessment is not research and not governed by IRB rules, it still seems that something silly and wasteful is going on here. Why is it acceptable that we spend more and more time and money -- time and money that have real opportunity costs and could be devoted to our students -- on assessment that is so poorly designed that it does not tell us anything meaningful about our courses or students? Whose interests are really served by this? Not students. Not faculty members.

It’s time to stop this charade. If some people want to do real research on what works in the classroom, more power to them. But making every program and every faculty member engage in nonresearch that yields nothing of value is a colossal, frivolous waste of time and money.

Erik Gilbert is a professor of history at Arkansas State University.

Education Department releases gainful employment data for vocational programs

Graduates who earned certificates at public institutions have larger salaries, but there is wide variation between programs even at the same institutions.

Developing metrics and models that are vital to student learning and retention (essay)

Is English 101 really just English 101? What about that first lab? Is a B or C in either of those lower-division courses a bellwether of a student’s likelihood to graduate? Until recently, we didn’t think so, but more and more, the data are telling us yes. In fact, insights from our advanced analytics have helped us identify a new segment of at-risk students hiding in plain sight.

It wasn’t until recently that the University of Arizona discovered this problem. As we combed through volumes of academic data and metrics with our partner, Civitas Learning, it became evident that students who seemed poised to graduate were actually leaving at higher rates than we could have foreseen. Why were good students -- students with solid grades in their lower-division foundational courses -- leaving after their first, second or even third year? And what could we do to help them stay and graduate from UA?

There’s a reason it’s hard to identify which students fall into this group; they simply don’t exhibit the traditional warning signs as defined by the retention experts. These students persist into the higher years but never graduate despite the fact that they’re strong students. They persist past their first two years and over 40 percent have GPAs above 3.0 -- so how does one diagnose them as at risk when all metrics indicate that they’re succeeding? Now we’re taking a deeper look at the data from the entire curriculum to find clues about what these students really need and even redefine our notion of what “at risk” really means.

Lower-division foundational courses are a natural starting point for us. These are the courses where basic mastery -- of a skill like writing or the scientific process -- begins, and mastery of these basics increases in necessity over the years. Writing, for instance, becomes more, not less, important over students’ academic careers. A 2015 National Survey of Student Engagement at UA indicated that the number of pages of writing assigned in the academic year to freshmen is 55, compared to 76 pages for seniors. As a freshman or sophomore, falling behind even by a few fractions can hurt you later on.

To wit, when a freshman gets a C in English 101, it doesn’t seem like a big deal -- why would it? She’s not at risk; she still has a 3.0, after all. But this student has unintentionally stepped into an institutional blind spot, because she’s a strong student by all measures. Our data analysis now shows that this student may persist until she hits a wall, usually during her major and upper-division courses, which is oftentimes difficult to overcome.

Let’s fast forward two years, then, when that same freshman is a junior enrolled in demanding upper-level classes. Her problem, a lack of writing command, has compounded into a series of C’s or D’s on research papers. A seemingly strong student is now at risk of not persisting, and her academic path becomes much less clear. We all thought she was on track to graduate, but now what? From that point, she may change her major, transfer to another institution or even exit college altogether. In the past, we would never have considered wraparound support services for students who earned a C in an intro writing course or a B in an intro lab course, but today we understand that we have to be ready and have to think about a deeper level of academic support across the entire life cycle of an undergrad.

Nationally, institutions like ours have developed many approaches to addressing the classic challenges of student success, developing an infrastructure of broad institutional interventions like centralized tutoring, highly specialized support staff, supplemental classes and more. Likewise, professors and advisers have become more attuned to responding to the one-on-one needs of students who may find themselves in trouble. There’s no doubt that this high/low approach has made an impact and our students have measurably benefited from it. But to assist students caught in the middle, those that by all measurement are already “succeeding,” we have to develop a more comprehensive institutional approach that works at the intersections of curricular innovation and wider student support.

Today, we at UA are adding a new layer to the institutional and one-to-one approaches already in place. In our courses, we are pushing to ensure that mastery matters more than a final grade by developing metrics and models that are vital to student learning. This, we believe, will lead to increases in graduation rates. We are working hand in hand with college faculty members, administrators and curriculum committees, arming those partners with the data necessary to develop revisions and supplementary support for the courses identified as critical to graduation rather than term-over-term persistence. We are modeling new classroom practices through the expansion of student-centered active classrooms and adaptive learning to better meet the diverse needs of our students.

When mastery is what matters most, the customary objections to at-risk student intervention matter less. Grade inflation by the instructor and performance for grade by the student become irrelevant. A foundational course surrounded by the support that a student often finds in lower-division courses is not an additional burden to the student, but an essential experience. Although the approach is added pressure on the faculty and staff, it has to be leavened with the resources that help both the instructor and the students succeed.

This is a true universitywide partnership to help a population of students who have found themselves unintentionally stuck in the middle. We must be data informed, not data driven, in supporting our students, because when our data are mapped with a human touch, we can help students unlock their potential in ways even they couldn’t have imagined.

Angela Baldasare is assistant provost for institutional research. Melissa Vito is senior vice president for student affairs and enrollment management and senior vice provost for academic initiatives and student success. Vincent J. Del Casino Jr. is provost of digital learning and student engagement and associate vice president of student affairs and enrollment management at the University of Arizona.

Group releases draft quality standards for competency-based education

Group of colleges releases voluntary standards for competency-based education, which Education Department official says could help prevent the rise of bad actors.

Lumina Revises Plan for Completion Push

The Lumina Foundation on Monday released a revised strategic plan for achieving its goal of 60 percent of Americans holding a college degree, certificate or other high-quality credential by 2025. The foundation has released a new plan every four years since first proposing the goal in 2008.

The latest iteration provides a more detailed breakdown of the 16.4 million Americans who will need to earn a credential to meet the goal. About 4.8 million are traditional-age students who now are not likely to earn a college degree or certificate. Another 6.1 million are potential returning adult students, who attended college but did not earn a credential. The final group is 5.5 million adults with no college credits, drawn from the 64 million Americans who fit this description, Lumina said.

"Through the work we’ve done under our first two strategic plans, we have learned what it will take to reach the goal. But we also have learned that the changes that must be made are not mere tweaks. Modest, incremental improvement will not suffice. Indeed, fundamental redesign is required," the report said. "We must move from a system that is centered on institutions and organized around time to one that is centered on students, organized around high-quality learning and focused on closing attainment gaps. In short, we must build a true system of postsecondary learning from the disconnected and fragmented pieces we have now."

Obama administration releases final rules for teacher preparation programs

Federal regulations impose new standards on teacher education. Reformers endorse plan to link program evaluations to student performance, to dismay of teachers' groups.

Indiana creates student 'value index' while support builds for a federal student data system

While political support in Washington builds slowly for a federal student record database, Indiana and the University of Texas System get creative with their own data on how students fare after college.

B Lab Releases Standards for Colleges

B Lab is a nonprofit group that issues a seal of approval to companies across 120 industries that adhere to voluntary standards based on social and environmental performance, accountability and transparency. After two years of work, the group on Friday released a new benchmarking tool for colleges. The voluntary standards are designed to enable comparisons of both nonprofit and for-profit institutions.

"B Lab recognizes that the cost and outcomes of higher education, particularly regarding for-profit institutions, have become increasingly controversial, but regardless of structure institutions should put their students’ needs first," Dan Osusky, standards development manager at B Lab, said in a written statement. "We see our role as the promoter of robust standards of industry-specific performance that can be used by for-profits and nonprofits alike to create the greatest possible positive impact and serve the public interest, ultimately by improving the lives of their students."

A committee of experts, working with HCM Strategists and with funding from the Lumina Foundation, devised the standards. Laureate Education, a global for-profit chain, already uses the assessment tool.

An evaluation of whether performance funding in higher education works (essay)

More than 30 states now provide performance funding for higher education, with several more states seriously considering it. Under performance funding (PF), state funding for higher education is not based on enrollments and prior-year funding levels. Rather, it is tied directly to institutional performance on such metrics as student retention, credit accrual, degree completion and job placement. The amount of state funding tied to performance indicators ranges from less than 1 percent in Illinois to as much as 80 to 90 percent in Ohio and Tennessee.

Performance funding has received strong endorsements from federal and state elected officials and influential public policy groups and educational foundations. The U.S. Department of Education has urged states to “embrace performance-based funding of higher education based on progress toward completion and other quality goals.” And a report by the National Governors Association declared, “Currently, the prevailing approach for funding public colleges and universities … gives colleges and universities little incentive to focus on retaining and graduating students or meeting state needs …. Performance funding instead provides financial incentives for graduating students and meeting state needs.”

But with all this state activity and national support, does performance funding actually work? As we report in a book being published this week, Performance Funding for Higher Education (Johns Hopkins University Press), the answer is both yes and no.

Based on extensive research we conducted in three states with much-discussed performance funding programs -- Indiana, Ohio, and Tennessee -- we find evidence for the claims of both those who champion performance funding and those who reject it. In keeping with the arguments of PF champions, we find that performance funding has resulted in institutions making changes to their policies and programs to improve student outcomes -- whether by revamping developmental education or altering advising and counseling services.

Underpinning those changes have been increased institutional efforts to gather data on their performance and to change their institutional practices in response.

But we often cannot clearly determine to what degree performance funding is driving those changes. Many of the colleges we studied stated they were already committed to improving student outcomes before the advent of performance funding. Moreover, in addition to PF, the states often are simultaneously pursuing other policies -- such as initiatives to improve developmental education or establish better student pathways into and through higher education -- that push institutions in the same direction as their PF programs. As a result, it is nearly impossible to determine the distinct contribution of PF to many of those institutional changes.

Meanwhile, supporting the arguments of the PF detractors, we have not found conclusive evidence that performance funding results in significant improvements in student outcomes -- and, in fact, we’ve discovered that it produces substantial negative side effects. In reviewing the research literature on PF impacts, we find that careful multivariate studies -- which compare states with and without performance funding and control for a host of factors besides PF that influence student outcomes -- largely fail to find a significant positive impact of performance funding on student retention and degree attainment. Those studies do find some evidence of effects on four-year college graduation and community college certificates and associate degrees in some states and some years. However, those results are too scattered to allow anyone to conclude that performance funding is having a substantial impact on student outcomes.

Various organizational obstacles may help explain that lack of effect. Many institutions enroll numerous students who are not well prepared for college. In addition, state performance metrics often do not align well with the missions of broad-access institutions such as community colleges, and states do not adequately support institutional efforts to better understand where they are failing and how best to respond.

Even if performance funding ultimately proves to significantly improve student outcomes, the fact remains that it has serious unintended impacts that need to be addressed. Faced both by state financial pressures to improve student outcomes and substantial obstacles to doing so easily, institutions are tempted to game the system. By reducing academic demands and restricting the enrollment of less-prepared students, broad-access colleges can retain and graduate more students, but only at the expense of an essential part of their social mission of helping disadvantaged students attain high-quality college degrees. Policy makers should address such negative side effects, or they could well vitiate any apparent success that performance funding achieves in improving student outcomes.

In the end, performance funding, like so many policies, is complicated and even contradictory. To the question of whether it works, our answer has to be both yes and no. It does prod institutions to better attend to student outcomes and to substantially change their academic and student-service policies and programs. However, performance funding has not yet conclusively produced the student outcomes desired, and it has engendered serious negative side effects. The question is whether, with further research and careful policy making, it is possible for performance funding to emerge as a policy that significantly improves student retention, graduation and job placement without paying a stiff price in reduced academic quality and restricted admission of disadvantaged students. Time will tell.

Kevin Dougherty is a senior research associate at the Community College Research Center, Teachers College, Columbia University and an associate professor at Teachers College. Sosanya M. Jones is an assistant professor at Southern Illinois University. Hana Lahr is a research associate, Rebecca S. Natow is a senior research associate, Lara Pheatt is a former research associate and Vikash Reddy is a postdoctoral research associate, all with CCRC.

Report Proposes Alternate Form of Accreditation

The Center for American Progress today released a report that proposes a "complementary competitor" to the current system of accreditation.

The report describes three primary components for an outcomes-focused, alternative system, which, like current accreditors, would serve as a gatekeeper to federal financial aid:

  • High standards for student outcomes and financial health;
  • Standards set by private third parties;
  • Data definition, collection and verification, as well as enforcement of standards by the federal government.

"If implemented, this new system would provide a pathway to address America’s completion and quality challenges through desperately needed innovation," the report said. "And it would do so while establishing strong requirements to ensure that students and taxpayers get their money’s worth."
