New effort aims to standardize faculty-driven review of student work

The debate over how much actual learning is taking place on college campuses is a historically heated one, as is the related discussion about how to measure that learning.

At the risk of oversimplifying, opinions on the latter range between two extremes. On one end are those (typically policy makers, researchers and trustees) who believe faculty grading of academic work at individual campuses says little to nothing about whether students there are really learning. On the other are those (mostly on college faculties) who believe that attempts to standardize assessment of student learning (through a national exam, say) are seriously flawed because they are too distant from what happens in the classroom and define learning too narrowly, among other problems.

Finding common ground between those polar viewpoints (though there are many perspectives in between) has been difficult.

A fledgling effort by the Association of American Colleges and Universities and the State Higher Education Executive Officers, though, holds promise for bridging that gap. The initiative has a bulky title, Multi-State Collaborative to Advance Learning Outcomes Assessment, and like most things related to student learning, it is complicated and a bit hard to explain.

But by getting professors from around the country to (a) agree on a set of general education outcomes and a common rubric for scoring them and (b) use that rubric to judge actual classroom work from representative groups of students at colleges around the country, the AAC&U project could produce a cross-institutional method of judging student learning that can win the trust of instructors skeptical about most national forms of learning assessment. Lest one think the faculty-driven process will paint a rosy picture of student learning, however, the results from the effort's first, pilot year largely confirm other studies showing many students scoring low on key outcomes.

"This holds a lot of promise in grappling with these issues in a new, different way," said Corbin M. Campbell, an assistant professor of higher education at Columbia University's Teachers College who studies college educational quality.

Measuring Quality

Agitation over how much learning is taking place on college campuses probably peaked near the end of the George W. Bush administration, when the commission formed by then Education Secretary Margaret Spellings embraced the adoption of standardized measures of college achievement, to help determine "how much students learn in colleges and whether they learn more at one college than another," as the panel said in its final report. The 2011 publication of Academically Adrift -- which used data from transcripts and one such national test, the Collegiate Learning Assessment, to make the case that many students did not show learning gains at four-year colleges -- reinforced the commission's concerns.

The question of student learning outcomes has been largely relegated to the back burner of public policy in the last few years, displaced by recession-driven concerns over whether students are emerging from college prepared for jobs. Evidence of that new orientation can be found in the fact that the Obama administration's newly revamped College Scorecard, designed as a way to judge institutional performance, focuses almost wholly on economic-related and other nonacademic outcomes.

That's not entirely by choice, though; administration officials noted in a policy paper accompanying the Scorecard that while learning outcomes are "an important way to understand the results and quality of any educational experience … there are few recognized and comprehensive measures of learning across higher education, and no data sources exist that provide consistent, measurable descriptions across all schools or disciplines of the extent to which students are learning, even where frameworks for measuring skills are being developed."

At the crux of the debate over student learning outcomes in higher education is whether it is both necessary and possible to develop measures of learning that can be compared from institution to institution. Many colleges, and many individual professors at those colleges, have developed their own tools or approaches for measuring student outcomes, but the results mean little beyond that particular setting. Advocates for comparability argue that grade inflation has rendered professors' grades an inadequate way of assuring academic quality, and that a national measure is needed to help institutions benchmark their own performance and help prospective students judge colleges' performance.

The rebuttal: the many things we expect students to learn in college -- and the many differing things that different sorts of institutions seek to impart to their own students -- cannot be captured in a national assessment. (And besides, look at what No Child Left Behind has done to elementary and secondary education, they are often quick to add.)

While the learning outcomes discussion has moved out of the public policy spotlight, a lot of work has continued -- and AAC&U has been at the epicenter of much of it. It developed "essential learning outcomes" as part of its Liberal Education and America's Promise program, and then brought hundreds of faculty members together to develop a set of rubrics for gauging those outcomes through its Valid Assessment of Learning in Undergraduate Education (VALUE) initiative. Through those rubrics, academics from many institutions and disciplines essentially have developed a national set of expectations for what students should know and be able to do, touching on everything from critical and creative thinking to ethical reasoning to integrating learning.

The next step came in 2011, when (with the support of the Bill & Melinda Gates Foundation) AAC&U teamed up with SHEEO on the Multi-State Collaborative, which involved 60 institutions in nine states, mostly regional comprehensive universities and community colleges. The colleges shared 7,000 samples of actual student course work, collected in a digital platform created by the e-portfolio company Taskstream, which were independently scored by more than 125 professors who had been trained in using the VALUE common scoring rubrics. (Professors did not judge work from their own campuses.)

Katherine V. Wills, associate professor and English program director at Indiana University-Purdue University at Columbus, said she was drawn to participate in the multistate project because "I resist the notion that tests and pretests and posttests are [the] best way to assess students," as she put it. "But I also recognize that some things aren't fully captured by the standard grading that we're doing. So I saw an opportunity to look at some kind of way to assess the work that's being done in higher education through the expertise of educators and teachers."

Jeanne Mullaney, assessment coordinator and a professor of French and Spanish at the Community College of Rhode Island, said her institution experimented with one national assessment test and several homegrown tools with modest results. Her complaint about the standardized measurement was a common one: because the test was not tied to students' classroom work (and hence their grades), "students were not motivated to take it. And even if you can capture them, they're not necessarily going to do their best work."

Mullaney was drawn to the AAC&U project, she says, because it was based on "actual student work that faculty members clearly feel is important, and that students take seriously because they have to hand it in in class."

After her training in using the VALUE rubrics, Mullaney gathered nine faculty members on her campus to be the core of the two-year college's project group. They were previously unfamiliar with the rubrics, she says, but together they "went through them with a fine-toothed comb" and agreed "that these rubrics do represent an accurate way to assess these skills." The professors brought in their own (and their colleagues') assignments to see how well (or poorly) they aligned with the rubrics, Mullaney said. "Sometimes their assignments were missing things, but they could easily add them in and make them better."

The last step of the process at the institutional level, she said, was gathering a representative sample of student work, so that it came from all of CCRI's four campuses and 18 different disciplines, and mirrored the gender, racial and ethnic demographics and age of the community college's student body. Similar efforts went on at the other 60-odd campuses.

The Results

The faculty participants scored the thousands of samples of work (which all came from students who had completed at least 75 percent of their course work) in three key learning outcome areas: critical thinking, written communication and quantitative literacy. Like several other recent studies of student learning, including Academically Adrift, the results are not particularly heartening.

A few examples:

Fewer than a third of student assignments from four-year institutions earned a score of three or four on the four-point rubric for the critical thinking skill of "using evidence to investigate a point of view or reach a conclusion."
Nearly four in 10 work samples from four-year colleges scored a zero or one on how well students "analyzed the influence of context and assumptions" to draw conclusions.
While nearly half of student work from two-year colleges earned a three or four on "content development" in written communication, only about a third scored that high on their use of sources and evidence.
Fewer than half of the work from four-year colleges and a third of student work from two-year colleges scored a three or four on making judgments and drawing "appropriate conclusions based on quantitative analysis of data."

In some ways, those results may reassure skeptics of faculty grading who wonder whether a faculty-led effort like the collaborative can be sufficiently rigorous.

Richard Arum, a professor of sociology and education at New York University and one of the authors of Academically Adrift, said the new findings "are consistent with our results" and the results of other studies showing "that large numbers of students aren't really performing at the level that you'd hope they'd be performing at. This is as close to a consensus as one ever sees in social science about the nature and character of this problem."

Arum called the collaborative "a very ambitious, important and promising pilot" in which "faculty themselves are taking responsibility for assessment," in direct contrast, he noted, to the federal government's College Scorecard. "How can you design systems like that without faculty input being at the center? [President] Obama should know better," he said.

Arum said the AAC&U/SHEEO approach has the potential to be one of "multiple indicators" that higher education institutions and policy makers eventually embrace to understand student learning. "No one measure is going to be sufficient to capture student learning performance outcomes," he said. "Responsible parties know there's a place for multiple measures, multiple approaches."

Campbell, of Teachers College, agrees that "because [student learning] is such a complicated issue, any one method is going to have complications and potential limitations" -- and like Arum, she sees them in the AAC&U approach.

By focusing on those students who have completed at least 75 percent of their course work, for instance, the initial release of data focuses on the institutions' most successful students, Campbell noted, ignoring the many who might have dropped out or transferred. (Terrel Rhodes, vice president for quality, curriculum and assessment at AAC&U, said in an interview that the participating institutions chose the 75 percent mark for their initial, limited analysis because they wanted to focus on "students who were graduating and how well they were prepared for the workforce.")

Campbell also said that the project will be much more significant if it ultimately shows whether students' skills improve over time. "If you don't have some kind of comparison of change, showing what they could do when they came in and when they left," she said, "it may do exactly what the rankings do: reinforce the reality that great students produce great work, and great institutions have great students."

Rhodes said that in the project's forthcoming demonstration year (after the just-finished pilot year), some participating campuses will compare students with at least 75 percent of their credits with another group that has accumulated less than 25 percent of their credits, so that they can begin to show that sort of "value added" result.

Three more states will join the Multi-State Collaborative this year, Rhodes said, and AAC&U and SHEEO are in discussions with an unidentified research university about housing the project. The currently participating campuses have just received their own data, which Mullaney of the Community College of Rhode Island said she expects her nine-member faculty panel to analyze closely.

"We'll talk about the strengths and weaknesses that we see, and how should we move forward," she said. Asked if she thought professors on her campus would question the legitimacy of the project's results if the college performs badly on the metrics, she said no.

"I might have thought so before, but through this process our faculty has really connected with the idea that this is about student learning," she said. "When they see areas of weakness, I think they'll say, 'Wow, OK, how can we address this? What kinds of teaching strategies can we use?' "

Follow me on Twitter @dougledIHE.